Code generation for C-based applications

TL;DR

The core project is RAI (Racket Abstract Interpretation), currently used in the generation of C code for digital music instruments and effects.  http://zwizwa.be/rai

For DSPM, the Haskell experiments that predate RAI, have a look at the entries in this log, and the files in the meta archive: http://zwizwa.be/darcs/meta/dspm

-----------------------------------------------------------------------

These notes concentrate on techniques for manageable code generation (multi-stage programming, macros, domain-specific languages, ...) to be integrated with C-based applications in the domains of embedded systems and digital signal processing.  The focus is on approaches using Functional Programming languages and techniques.

For more info see entry://20111119-093057

The source code archive for this project is at: http://zwizwa.be/darcs/meta

Current - Jan 2012

The problem I'm trying to solve is to separate these two engineering tasks that pop up when creating (audio) DSP applications:

  * Mathematical models, expressed as (possibly nonlinear) state-space update equations[1].

  * Implementation: management of interconnect, persistent state and code sequencing.

Traditionally, one would verify correctness of models in some simulation package, and then perform the second step manually by translating the abstract structure into a specialized implementation.  The problem is that not recognizing the structure of this last step leads to ad-hoc implementation decisions and hard-to-manage code.

Put differently: the information difference between the abstract model and the implementation is quite substantial, i.e. there are many decisions to be made about where to put data and in what order to perform computations.  However, the inherent structure/symmetry of the design choices to be made does lend itself to abstraction, by employing the technique of (C) code generation.

The approach I'm using is the construction of a typed, embedded DSL in Haskell, with extensive use of generic programming (type classes).  Code generation in itself is not a new thing.  Typed code generation however is.  I'm mostly following the ideas described here[2].

[1] http://en.wikipedia.org/wiki/State_space_(controls)
[2] http://okmij.org/ftp/tagless-final/index.html

Entry: Generating/Processing C code
Date: Fri Feb 27 10:20:18 CET 2009

Bottom line: I'm never going to be happy with a macro package that doesn't understand C syntax.

NEXT: FrontC (godi, Ocaml).  This is what's used in MetaOcaml for C pretty-printing.

Getting started with FrontC:

  M-x tuareg-run-caml
  /usr/local/godi/bin/ocaml

  #use "topfind";;
  #require "FrontC";;
  open Frontc;;
  Frontc.parse_file "/tmp/test.c" stderr ;;

I had to look up the "stderr" part here: http://caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html

With "int boo(int a) { return a + 1; }" this gives:

  - : Frontc.parsing_result =
  PARSING_OK
  [FUNDEF ((INT (NO_SIZE, NO_SIGN), NO_STORAGE,
    ("boo", PROTO (INT (NO_SIZE, NO_SIGN),
      [(INT (NO_SIZE, NO_SIGN), NO_STORAGE,
        ("a", INT (NO_SIZE, NO_SIGN), [], NOTHING))], false),
     [], NOTHING)),
    ([], RETURN (BINARY (ADD, VARIABLE "a", CONSTANT (CONST_INT "1")))))]

Further experiments require a bit more Ocaml experience..

EDIT: there is now c.plt !

Entry: Getting Started with c.plt
Date: Sun Jun 28 15:45:49 CEST 2009

Problem:

  * Why doesn't c.plt parse the packetforth source?  Am I using constructs that are not C99?

Answer: no, this seems to be an incomplete c.plt grammar.
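(Aside, relating back to the ``Current - Jan 2012'' summary and its tagless-final reference [2] above: below is a minimal, purely illustrative sketch of that style, written in OCaml for brevity rather than in Haskell.  None of this is RAI or DSPM code and the names are made up; the point is only that one typed definition can be given several interpretations, one of which emits C expression text.)

  (* Illustrative only: the "tagless-final" idea.  A small expression
     language is defined as a signature; Eval interprets terms directly,
     GenC interprets the same terms as C expression text. *)
  module type EXP = sig
    type 'a repr
    val int : int -> int repr
    val add : int repr -> int repr -> int repr
    val mul : int repr -> int repr -> int repr
  end

  (* Interpretation 1: evaluate in the metalanguage. *)
  module Eval : EXP with type 'a repr = 'a = struct
    type 'a repr = 'a
    let int n = n
    let add = ( + )
    let mul = ( * )
  end

  (* Interpretation 2: emit a C expression as a string. *)
  module GenC : EXP with type 'a repr = string = struct
    type 'a repr = string
    let int n = string_of_int n
    let add a b = Printf.sprintf "(%s + %s)" a b
    let mul a b = Printf.sprintf "(%s * %s)" a b
  end

  (* One definition, reusable under both interpretations. *)
  module Poly (E : EXP) = struct
    let f x = E.add (E.mul x x) (E.int 1)     (* x*x + 1 *)
  end

  let () =
    let module E = Poly (Eval) in
    let module C = Poly (GenC) in
    Printf.printf "%d\n" (E.f 3);             (* prints 10 *)
    print_endline (C.f "in0")                 (* prints ((in0 * in0) + 1) *)

Programs that are ill-typed in the embedded language are rejected by the host type checker before any C is generated, which is what ``typed code generation'' buys you.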
Entry: int f(int *);
Date: Sun Jun 28 20:04:45 CEST 2009

  box> (parse-program (open-input-string "int f(int *);"))
  #f:1:11: parse: unexpected parenthesis (`)') in: C_PAREN

Time to read the grammar.  It's funny: reading BNF to see at a glance what's going on is a skill I've never mastered.  Maybe because of its extreme locality.  Maybe there is a tool out there that can help one view it using hyperlinks?  Also, a BNF form annotated with examples would be nice.  This way the visual pattern-recognition can help.

In private/parser.ss I find this:

  ;; XXX: what about DeclarationSpecifiers AbstractDeclarator?
  #;[(DeclarationSpecifiers !PushDeclarator AbstractDeclarator)
     (build-parameter-declaration $1 $3 (@ 3))]

Which looks like it is the problem.  Let's uncomment.  Ok, then it parses..  But I do get "1 shift/reduce conflict"..

Using one of the previous tests derived from forth.c I get the same problem as before (after commenting out the line above):

  box> (parse-program (open-input-file "test-1.c"))
  car: expects argument of type ; given ()

Crap..  It looks like there is no way around going into things a bit deeper.

[1] http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
[2] entry://../library/f0598571db600409ae2b7aefb43744dd

Entry: Staapl's ideas in a broader context
Date: Tue Jul 14 10:44:51 CEST 2009

I would like to attempt another introduction to the ideas behind Staapl from the viewpoint of the current state of affairs in the embedded C programming world as I perceive it.  This should be useful in a broader sense, outside of the specific (Forth, Scheme based) Staapl approach.

1. The Status Quo

One thing is sure at this point in time.  C and Unix (Linux) have won.  There is nothing that can reduce cost more than a standard platform.  Moreover, in places where Unix' memory footprint is too large, C is still surviving.

What saddens me is that this focus on C is stifling constructive creativity in other ways.  C might be a "good enough" platform for creating a vast collection of open source tools that can run anywhere.  It however is _not_ a nice language to write complex programs from scratch.

2. Doing it Differently

What I am trying to get an idea about in Staapl is to see what embedded programming is really about.  What are you left with once you eliminate the struggle with the tools?

Writing C code in the past, I have lost a tremendous amount of time dealing with its low-level nature.  I've found my way around most of these problems, but only _after_ careful study of high level programming systems.  A bit of knowledge about the implementation of high-level languages makes it possible to get enough of the necessary infrastructure into a C program to soften the development pain[1].

Working on Staapl, the conclusion I've come to is that an embedded programming system should have the following components well integrated:

  - a simple base language that allows you to get close to the machine
  - reusable libraries that take care of recurrent problems
  - a sane metaprogramming system (code generation from abstract descriptions) closely matched to the base language
  - a straightforward way to support debugging and profiling (introspection) in the development loop

What "well integrated" means in practice is to not cause extra problems on top of the problems you are already going to encounter in an embedded software setting: debugging and profiling.

Staapl tries to solve these problems by combining ideas from Forth and Scheme to acknowledge and solve the base language, metaprogramming and debugging/profiling problems.
The particularities of Scheme and Forth make it possible to include base _language_ extensions as libraries.  From a modular programming perspective this is a very strong plus[3].

3. Current Solutions

The embedded (Linux-based) software industry seems to have filled the 4-point list in the following way:

  - C (a GCC-based cross compiler)
  - a huge amount of available code
  - a lot of C preprocessor macros + ad hoc external scripts and test suites that make it all hold together
  - printf (or more sophisticated log / trace based debugging) and modifications to the source code that make it possible for GDB to be an effective tool using breakpoints and watchpoints

The first item is actually quite well covered by the C programming language.  C is close enough to the machine to not introduce many artificial limits when you need low-level control.

The second item is filled in the sense that there is a vast body of in principle reusable C code.  However, the problem with combining this code lies in the interfaces it has to present.  Because C's code and data API is so low-level, a lot of libraries depend on _other_ libraries to solve representation problems.  The general disorganized nature of the Open Source Software community turns this into quite a mess.  A typical Linux based workstation has a large amount of "middleware" duplication due to this.  I'm almost willing to bet that half of all C code written is simply glue code that could be eliminated were there a better standard interface.

The funny thing is that on a workstation this duplication doesn't really pose a problem due to an abundance of memory resources.  However, in special purpose embedded applications it leads to a serious hindrance: sometimes it is practically impossible to disentangle the web of dependencies and produce lean code, so people start re-inventing and re-implementing interfaces to add to the already huge pile of solutions.

Then, looking at metaprogramming, it is not unreasonable to call the C preprocessor an abomination.  It has caused a lot of people to write impenetrable macro code and resort to ad-hoc C code generator scripts.  There are examples of quite involved systems that are meant to generate C code.  However, the fact that there is no standard here is a genuine problem that wastes a lot of developers' time.  Closed-source solutions (such as the Mathworks' template system) don't really help either.

The last point, profiling and debugging, has been made simpler in recent years due to the development of open simulators.  The irony is that this is not done at the point where it makes most sense (a programming language's machine model) but at a _real_ machine's interface, a de-facto standard arising due to semiconductor market dynamics.  Point being that such a low-level tie point is often not very practical.  But hey, it's better than nothing, and it definitely has some use (i.e. valgrind).

4. So what is the real practical problem?

The lack of a standard approach for metaprogramming.  If there is one thing we can learn from the past, it is that standards cannot be imposed.  They emerge as a consequence of many small short-sighted local decisions, and are bound to be sub-optimal[2], so let's rephrase this:

The real problem is the lack of a standard for C metaprogramming.
If you start looking at C and its associated interfaces as a platform (a machine) then the problem is really simple: you can automate away a lot of tedious low-level C handling as long as your system can

  * generate C code from some meaningful high-level description, with whatever form of error-catching high-level semantics added (i.e. a static type system).

  * parse C code to make sure you can tie into the vast collection of open source software to extract what you need, using either their public interfaces, or possibly going deeper into their internal structure and picking and choosing there.

A tool that can read/write C code full of unnecessary cleverness and extract meaningful components in a way that can be used without unnecessary cleverness on the operator's side would open up quite a lot of possibilities.  Even without a background in language design, one can think of many applications that would make large classes of boilerplate code a thing of the past, if only the complexity of "managing the code reader / generator" can somehow be reduced.

Moral of the story: if C is there to stay, it's probably best to treat it as a "data format" instead of a programming language.

5. What to do with metaprogramming?

Once you have metaprogramming techniques, which essentially means control over the semantics of your languages, what can you do with them?  There is one single idea that keeps coming back to me: make sure your application can move gradually from a dynamic to a static structure.  This gives you an extra handle on your project to get to the right combination of correctness, observability and maintainability.

Dynamic features make development easier in the stage where you don't really know what you're doing yet.  Dynamic languages allow for ad-hoc debugging tools.  If there's one place where you need one-shot cleverness it's debugging.  It is really helpful to be able to change the semantics of your base language in specific ways to trace down problems.  What you don't want is non-observable parts hiding in your system because of rigid static constraints imposed by the programming system.  Code really is data, and so is machine state.  You should be able to look at every aspect of a program, dead (static code) or alive (runtime state).

Once you have an idea correctly implemented, try to move as much as possible to static code and eliminate all unnecessary scaffolding cleverness.  If your solution is any good, it probably has some kind of structure that can be expressed elegantly in a modern type system.

From my experience: early development in a static language is hard because the language tends to get in the way.  In a dynamic language, late development and maintenance are hard because dynamic languages allow too much implementation freedom and complexity and so leave much space for obscure errors to hide.  Dynamic languages leave the programmer's internal representation implicit: this is exactly the stuff you'll forget about when not working on the code for a couple of months, or the stuff that the other guy looking at your code doesn't know at all.  If you cast this structure in logic you're better off in the end.

Practically my message seems to boil down to two principles:

  - allow for on-target dynamic structures (embed an interpreter for a reflective scripting language so it's there when you need cleverness)

  - make sure you have a good static tool setup (compiler + verification) so most (all) unnecessary "moving parts" can be eliminated as soon as you have the basic structure figured out.

6. Links
[1] http://en.wikipedia.org/wiki/Greenspun%27s_Tenth_Rule
[2] http://en.wikipedia.org/wiki/Worse_is_better
[3] http://blog.plt-scheme.org/2007/05/macros-matter.html

Entry: The importance of platform
Date: Wed Jul 15 15:49:21 CEST 2009

"Substrate" is essential.  Without the right language to formulate your ideas, it is very easy to get bogged down in irrelevant details.

I'm thinking about bridging C and the Forth/Scheme world, by finding ways to bring the ideas within the reach of engineers that work mostly in C.  But, is this really desirable?  Does the very elegant combination of Forth and Scheme translate to a bridge between C and Scheme or some other functional programming language?  What exactly will one lose (or gain!) moving the target language from Forth to C?

What I notice in MetaOcaml is that the presence of side effects introduces non-trivial problems.  The fact that Forth has a large functional subset might be the real reason for the good Forth/Scheme mesh.

Entry: Two forms of broken base systems
Date: Sun Jul 19 13:33:30 CEST 2009

The need for metaprogramming comes from having to work with a system that is not appropriately flexible, while at the same time it is impossible to replace it with another for any (practical or theoretical) reason.  The strange thing is that this happens both for systems that are not complex enough and for systems that are too complex.

1. Getting it abstract: metaprogramming as a symptom of inflexible base systems.

In this case, the base system is not expressive enough to provide a substrate to encode a solution in a non-redundant way.  If you look at Ruby on Rails[1], the "irreparable" part is the fixed base systems used in web programming: XML/HTML, Javascript and SQL.  In Scheme[2], the inflexible part is the way in which new name bindings are introduced, and the order in which sub-expressions of an expression are reduced to produce a value.  In the case of Scheme this inflexibility is a choice: the base language is _designed_ to be simple, which renders it incomplete in these two areas.  Macros can be used to build abstractions on top of this.

In general: macros help to abstract `fixed' parts of a language: eliminate redundancy due to regularity of constructs that cannot be manipulated at run-time.

2. Getting it right: metaprogramming as constraint system.

In some cases the problem with a system might be that it is _too_ flexible, for a different meaning of flexible: the base system permits too many mistakes that one would like to prevent by specification in a less flexible language.  This is essentially what (domain specific) programming language design is about: providing higher level semantics that is easier to check for inconsistencies compared to a more flexible low-level system.

Often however it is a combination of both: i.e. a typed functional language provides more flexibility in some domains (i.e. providing an abstract memory model) but less flexibility in others (requiring that some constraints on the code structure are satisfied before translation to a low-level substrate).

[1] http://rubyonrails.org/
[2] http://en.wikipedia.org/wiki/Scheme_%28programming_language%29

Entry: Standard way for Metaprogramming C?
Date: Sun Jul 19 11:23:21 CEST 2009

One thing that I find ironic: electrical engineers (EE) are obsessed with macros.  In fact, all the electronic design tools I know are essentially template programming systems, where static (physical!)
structures are generated from high-level descriptions, and a lot of effort is put into making sure that what is generated is actually correct by "executing" the descriptions on different levels of abstraction (simulation / verification).

Why is it then that in current practice, the world of embedded programming, which is so close to the world of electronics design, doesn't really have a streamlined way to do metaprogramming of C?  Why do people just take C for granted?

Yes, there are macros in a macro assembler, and there is the C preprocessor, there is M4 in the autotools and there are plenty of cases where EEs "discover perl" or any other dynamic language and start generating input to the assembler or C compiler.  But there are no _serious_ open tools I know of that attempt to take control of the metacompiling process itself, abstracting away some of the red tape that people keep solving over and over again - the main one being: parsing and generating C code to and from abstract representations.

Not to say that there is no research in metaprogramming C.  Start with papers about MetaOcaml[1] and follow references.  Not to say either that there are no systems that "compile to C" as a platform.  I.e. the MathWorks' Real-Time Workshop[2] seems to attract a lot of investment from industry as a standard platform to escape from C and VHDL / Verilog, but the templating system is not really that much different from the C preprocessor or M4.  If the high-level tools don't work for you, you're still forced to dive into the internals of an ad-hoc system.

Most commercial effort simply focuses on providing _languages_ to the end user.  Most tools hide the metaprogramming behind the scenes, but nobody seems to be building open tools generic enough that they can be picked up as a standard, preventing the waste of time created by ad-hoc metaprogramming systems in places where these specific tools cannot be applied.

What I'm talking about is giving people the ability to _design languages_ and ridding them of the fear that this is in some sense "magic".  There should be a way to put these technologies in the hands of the "small guy".  This would involve:

  * A simple metaprogramming tool chain that takes care of the nitty-gritty to save programmers' time and avoid ad-hoc choices.  This in the form of a library that can be included in C programs and any other higher level language through an FFI.

  * A way to "trick" embedded engineers out of their focus on C and educate them about how high level programming languages and type systems are implemented, so they can make educated decisions when rolling their own.

[1] http://www.metaocaml.org/
[2] http://www.mathworks.com/products/rtw/

Entry: C is not a programming language
Date: Sun Jul 19 14:04:07 CEST 2009

C serves as an _interface standard_ that holds together a large part of the electronics and embedded software industry.  This is not a bad thing.  It looks like it is practically impossible to get at any kind of standard with such a huge span in a constructive way: nobody seems to be resourceful enough to get this organized.  (Or inclined to..)  So we're stuck with suboptimal, evolved and half broken yet somehow-good-enough standards...

However, this is no reason to confuse C with a good programming system!  Maybe it is Dijkstra's lost BASIC generation that has switched from BASIC to C after they got into electronics because they thought they already knew how to program?  (That's basically my story.)  C as a programming language behaves as a blindfold.
From my own experience it seems to be a local optimum that prevents one from looking outside one's peaceful little shire, because the distance to modern programming systems is simply too great.  Any road that leads away from C in the _right_ direction (from the embedded systems programmer's long-term sanity's point of view) seems to be fraught with secondary problems that prevent it from "sticking" as a standard approach.  The issues are usually those typical of the absence of any standard: there is too much freedom to make choices, so people roll their own and add to the overall noise.

Evolution will not save us here!  Something needs to be done.

Entry: Scheme and Electrical Engineering / Two kinds of programmers
Date: Sun Jul 19 14:06:32 CEST 2009

What I find ironic is that Scheme seems to have originated from a very EE-centric way of thinking (look at the constraint programming in SICP[1]), yet _all_ the EEs I know detest Lisp and Scheme, and simply ignore anything that "isn't as efficient as C".  The really, really sad thing is that at a time where understanding the "inside" of programming languages is more important than ever, it looks like a place like MIT has given up too[2] and focuses on programming as patching.

A recent LtU thread (which I can't find atm) talks about two families of programmers: those that stick to one tool and learn how to use it to their best ability (patchers, system builders), and those that build tools: libraries and composition systems (programming languages).  I've seen this very clearly in the Pure Data community[3].  Some people don't want to get out of the "patching" position, while others are attracted to building components in C.

The reason I am so interested in metaprogramming is exactly this.  The continuum between C and Scripting needs to be made available.  Precisely because C and VHDL/Verilog are not going anywhere, it is important to teach people how to build systems on top of these languages.  To make sure they don't fall into the pit of "let's build another patching system" without a good knowledge of programming language theory.

[1] http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-22.html#%_sec_3.3.5
[2] http://tech.mit.edu/V125/N65/coursevi.html
[3] http://en.wikipedia.org/wiki/Pure_Data

Entry: Survey of existing C parsers
Date: Sun Jul 19 16:43:43 CEST 2009

  * GCC [1]
  * Haskell language.c[2]
  * Ocaml / MetaOcaml frontc[3]
  * PLT Scheme c.plt[4]
  * LLVM Clang[5]

[1] http://gcc.gnu.org/ml/gcc-patches/2004-10/msg01969.html
[2] http://www.sivity.net/projects/language.c/
[3] http://pauillac.inria.fr/cdk/newdoc/htmlman/cdk_180.html#SEC206
[4] http://planet.plt-scheme.org/package-source/dherman/c.plt/3/1/planet-docs/c/index.html
[5] http://clang.llvm.org/

Entry: GCC's parser
Date: Sun Jul 19 17:08:37 CEST 2009

Before going to great lengths reinventing parsing, let's look at the mother of all first.  A better understanding of the gcc and binutils source code might shed some light here and there.  The overall impression I get reading about people trying to understand gcc is that it is quite big and unmanageable..

[1] http://gcc.gnu.org/ml/gcc-patches/2004-10/msg01969.html
[2] http://www.srcf.ucam.org/~jsm28/gcc/

Entry: First concrete goal
Date: Sun Jul 19 17:40:53 CEST 2009

On the packet Forth source tree: replace direct structure access with accessors.

Entry: Installing Ocaml + FrontC
Date: Sun Jul 19 17:58:22 CEST 2009

The simplest way seems to be to install using godi[1].
Install ocaml and ocaml-pcre, then do:

  wget http://download.camlcity.org/download/godi-rocketboost-20080630.tar.gz
  tar xf godi-rocketboost-20080630.tar.gz
  cd godi-rocketboost-20080630
  ./bootstrap --prefix /usr/local/godi
  export PATH=/usr/local/godi/bin:/usr/local/godi/sbin:$PATH
  ./bootstrap_stage2

Now use `godi_console' to install the package `godi-frontc'.  Then fire up ocaml as described here[3].

[1] http://godi.camlcity.org/godi/index.html
[2] http://www.cocan.org/tips_for_using_the_ocaml_toplevel
[3] entry://20090227-102018
[4] http://caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html
[5] http://www.podval.org/~sds/ocaml-sucks.html
[6] http://adam.chlipala.net/mlcomp/
[7] http://www.mpi-sws.org/~rossberg/sml-vs-ocaml.html
[8] http://www.ocaml-tutorial.org/
[9] http://delicious.com/doelie/ocaml

Entry: Clang - LLVM's C parser
Date: Sun Jul 19 21:19:36 CEST 2009

If there's one tool that's supposed to do a lot of tricks out of the box, it's clang..  C++ support is not ready yet, but C support is there and is "production quality".

When using LLVM it's probably better to generate one of the intermediate layers.  This looks like a nice starting point:

  clang-cc -ast-print-xml

This looks like a nice place to stop for now.  Next thing to figure out is how to plug into the compiler by generating some source code, or a C AST in XML.

[1] http://clang.llvm.org/
[2] http://clang-analyzer.llvm.org/
[3] http://clang.llvm.org/get_started.html

Entry: Writing an LLVM frontend
Date: Mon Jul 20 10:51:33 CEST 2009

So..  LLVM's C parser seems promising.  Now, when I want to generate code, at what point should I plug into LLVM?  From what I gather, LLVM's machine language is SSA.  I recently asked a question on the PLT Scheme list about LLVM[2].  Eli mentioned:

  Implementing an LLVM interface, however, should be very easy -- after
  spending some time looking at the various options, our conclusion was
  that it's really just easier to invoke LLVM on whole files (in the LLVM
  language), and using the LLVM jit to create an executable piece of
  code.  The first instinct was to use the LLVM commands directly, but
  that doesn't really have any advantage over using whole files.

[1] http://www.llvm.org/docs/LangRef.html
[2] http://osdir.com/ml/plt-scheme/2009-06/msg00070.html

Entry: future of performance (LtU)
Date: Tue Jul 21 16:11:45 CEST 2009

[5]: The future of performance is GLSL[1] / HLSL[2] / CUDA[3] / OpenCL[4], or dynamic code generation from a higher-level language.

[1] http://en.wikipedia.org/wiki/GLSL
[2] http://en.wikipedia.org/wiki/HLSL
[3] http://www.liebertonline.com/doi/abs/10.1089/cmb.2008.0157
[4] http://en.wikipedia.org/wiki/OpenCL
[5] http://lambda-the-ultimate.org/node/3518#comment-50100

Entry: ocaml + llvm
Date: Tue Jul 21 22:31:53 CEST 2009

[1] http://llvm.org/docs/tutorial/OCamlLangImpl1.html

Entry: How to use C
Date: Fri Jul 24 15:08:00 CEST 2009

An attempt to describe my C programming style.  The basic idea is: C isn't too bad as long as you get a hold on the memory management.  Keep data structures simple and as much as possible use stateless code with locally managed data (C-stack).  Outside of that: try to generate C from a higher level description.

- Avoid malloc: write _functional_ C code where possible (allocate data on the stack) and try to use a garbage collector or gc + embedded scripting language for representing long-lived data.  This greatly simplifies interactive testing in gdb.
My experience is that in a low-level project, usually a lot of data doesn't need malloc() for intermediate representation.  The part that does need persistence can usually be structured in such a way that it can be queried by stateless code without the need to transform it into alternative long-lived (cached) representations.

- If you need malloc, try to use _linear_ data structures: make everything look like a tree or a directed graph such that con(des)struction becomes simpler.  Code that is looking at inconsistent data because it's still building it or taking it apart _really_ needs to be kept minimal.

- Use _comprehensions_ (iterator macros) where you would use iterator objects or scheme-like closures.  The idea is that `downward closures' are free in C, so using them is often simpler than writing iterator objects.

- Use _single assignment_ variables.  Assign to variables at their declaration point, and declare a new name for every intermediate result.  The compiler is smart enough to optimize this.  In fact: the internal representation in GCC is exactly this!

- Write pretty-printing routines that print data structures, but have no other side-effects.  Very useful in gdb.

- Add a TRAP = kill(getpid(), SIGTRAP) routine at all places where you didn't implement something, instead of a print statement that reads "not implemented".  Then when running in the debugger, you can type it in directly at the trap point and recompile.

In a sense, this advice centers around a `functional' approach to programming, where the ability to form genuine long-lived lexical closures is forsaken for a C-stack approximation using a few simple tricks.

If you write code that doesn't do anything except translating descriptive data structures into more specific ones that are used by your application: use an asynchronous garbage collector, and manually call it when you have the specific data.  Realize that this configuration part is really a compiler.  Don't write a compiler without GC.  Alternatively: generate a simple data structure or a C program using a HLL.

Note: these ideas about data structures of course do not apply to problems that are inherently `algorithmic' or (mutable) data-structure centric.

A different point: don't abuse the C preprocessor.  Essentially this means that expansions shouldn't go too deep, since that leads to hard-to-debug code.  Factor out everything that can go into an inline function, and put the stuff you can't do in that way in a wrapper macro.  The type checker is your friend whenever you try to change something.  Macros can provide:

  - implicit bindings or context, i.e. the "current" object
  - abstractions over data and function definitions
  - abstractions over control structures
  - abstractions over (parts of) variable and function names

Entry: Real men program in C
Date: Fri Aug 14 08:45:20 CEST 2009

I didn't read it all, but some of the comments are quite funny[1].  There is such a huge gap between embedded software and HLL...  Another gem here[2].  Hey, it's OK to make fun of people from time to time ;)

[1] http://www.embedded.com/columns/barrcode/218600142?printable=true
[2] http://lambda-the-ultimate.org/node/120

Entry: Bison's getting extinct?
Date: Mon Aug 17 15:54:04 CEST 2009

I've used YACC style parser generation in Scheme before, for the wrong project (a wiki parser which can really do with just a scanner, as it doesn't use block structure).  Maybe an s-expression + string + number parser is a good exercise.  Then, I'm quite interested in parsing expression grammars[2].
While I'm still structurally procrastinating integrating libpth into the project, simply because my uclibc target doesn't have makecontext(), I can just as well see if it's difficult to implement.  Now that I do have a GC, be it a trivial one, it might not be so much of a stretch..

EDIT: recently I've used peg/leg[3] to build an s-expression parser[4] for libprim[5].

[1] http://www.acm.org/crossroads/xrds7-5/bison.html
[2] http://en.wikipedia.org/wiki/Parsing_expression_grammar
[3] http://piumarta.com/software/peg/
[4] http://zwizwa.be/darcs/libprim/ex/sexp.leg
[5] entry://../libprim

Entry: Progress in ideas: how to sell Lisp anyway.
Date: Sun Aug 23 13:32:01 CEST 2009

Context: You can't sell Lisp, but you can hide it!  (Note: `lisp' here also means `typed functional language')

I like to play with these things.  Staapl is my current state of affairs: using a dtyped (dynamically typed) language (Scheme) with a powerful macro & module system to implement a macro-based untyped Forth dialect.  This I intend to extend with some static analysis following the current approach to gradual typing in the Scheme literature.  A work in progress..

So, how to _not_ do that, but get something out there:

  * Syntax: I think I can sell a DSL as an x -> C compiler though, as long as x is either XML or looks like C.  Syntax is totally unimportant for getting the job done, but _essential_ to get something sold.

  * DSL semantics is something ``you don't talk about''.  It's part of the just-make-it-work part that is to be sold as a product: it _really_ needs to be _good_ but its goodness should be apparent from intuition.  If you have to explain semantics, you fail.  There needs to be a gradual build-up from well-known semantics (i.e. C or Java) and all the rest needs to sneak in under the radar.

  * Hide a dynamic language core as an ``embedded debugger''.  It's essential to have a fully dynamic, introspective language at the core of the application.  This doesn't need to be visible outside of the debugger though.  (See libprim: Scheme and PF).

  * Hide as much as possible in the form of ``a library performing magic.''  In this case it doesn't matter how it's written, as long as the API is clear and simple.  I'd call this: ``write leaf objects / components in a typed DSL''.

  * During development, move stuff from the dynamic to the static part once you understand it.  This makes maintenance easier.  Translating from a dtyped to a styped implementation usually forces you to be more formal, while it makes the code extremely brittle to changes.

When I ship a C code generator to a client, assuming they are mostly interested in the product, don't intend to modify the compiler intensively, but nonetheless want to have some chance to understand and slightly modify the code, what language should I use?

I'm getting more worried about lisp/scheme.  They are powerful tools but require a paradigm shift.  Especially macros.  And really, nobody likes parentheses.  So I'm thinking, for straightforward recursive data structure transformation, Ocaml might be a better match..  It's relatively easy to learn, and the type system makes it hard to make changes that work without at least making a bit of sense..

So, let's focus on the following workflow: use Ocaml programs to translate XML data structures into C programs.
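As a sketch of the last step of that workflow: the XML front end is left out here, and the description type, field names and generated layout are all made up for illustration; the OCaml side can be little more than a pretty-printer from a description value to C text.

  (* Hypothetical sketch: a description value standing in for the
     parsed XML, pretty-printed as a C struct definition. *)
  type field = { name : string; ctype : string }
  type descr = { struct_name : string; fields : field list }

  let gen_c d =
    let buf = Buffer.create 256 in
    Buffer.add_string buf (Printf.sprintf "struct %s {\n" d.struct_name);
    List.iter
      (fun f -> Buffer.add_string buf (Printf.sprintf "    %s %s;\n" f.ctype f.name))
      d.fields;
    Buffer.add_string buf "};\n";
    Buffer.contents buf

  let () =
    print_string
      (gen_c { struct_name = "params";
               fields = [ { name = "gain"; ctype = "float" };
                          { name = "rate"; ctype = "int" } ] })

Everything interesting happens in the description and in the checks performed on it; the C-emitting step itself stays deliberately dumb and easy to inspect.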
Entry: MetaOcaml
Date: Mon Aug 24 09:49:20 CEST 2009

  wget http://www.metaocaml.org/dist/old/MetaOCaml_309_alpha_030.tar.gz
  tar xf MetaOCaml_309_alpha_030.tar.gz
  cd MetaOCaml_309_alpha_030
  ./configure --prefix /usr/local
  make world
  make bootstrap    # for testing
  make opt
  make opt.opt      # faster native code compilers
  sudo make install

[1] entry://../staapl-blog/20080927-103024
[2] http://www.metaocaml.org/

Entry: CamlP4 and concatenative languages
Date: Sat Aug 29 15:12:31 CEST 2009

It looks like OCaml and MetaOCaml are going to be quite central to some of the approaches I'd like to take, so let's start to get acquainted with CamlP4[1].  The ``p4'' stands for ``Pre-Processor-Pretty-Printer''.  The first necessary thing is an s-expression parser.

[1] http://caml.inria.fr/pub/old_caml_site/camlp4/
[2] http://www.venge.net/graydon/talks/
[3] http://martin.jambon.free.fr/extend-ocaml-syntax.html

Entry: Dynamically Typed Metaprogramming
Date: Tue Sep 1 18:31:07 CEST 2009

So..  Been thinking a lot about static vs. dynamic typing for code generators (staging) for a DSL.  I've isolated 2 regimes:

  1. Fixed language semantics and implementation: code generators are small components that don't change.

  2. Complex code generators, possibly evolving.

In case 1. the benefit of statically typed generators is not so great.  Once the generators work they will remain constant, so the added benefits that types bring when modifying internal generator structure don't pay off.  Abstract interpretation is probably enough to perform checks at generator run time (DSL compile time), which is all the DSL _user_ really needs.

In case 2. you probably want MetaOCaml style typed staging.  I wonder what the complexity point is where typing starts to pay off.  One point I definitely see is complex DSP algorithms (i.e. video codecs).

Of course there is the constraint that metaprogramming needs to be useful: you want to _lower_ the abstraction level of the _implementation_ to get a clear view on resource usage, or dually you want to _raise_ the abstraction level of the _specification_ starting from a fixed platform (i.e. C / asm).

Entry: Abstract Interpretation and Scheme -> C mapping
Date: Tue Sep 1 19:21:08 CEST 2009

Essentially the same trick as is used in most MetaOCaml literature: if the homogeneously staged program doesn't use many of the high-level features of the metalanguage, it can be projected onto a language with lower level semantics.  This is called heterogeneous staging.

In the case of Scheme -> C projection, I already started with something like that in libprim, as an experiment to generate EX code (a dynamically typed async-GC language with C control flow).

A starting point: in [1] on p.8 a subset of the C grammar is specified, to which a subset of the OCaml grammar specified on p.9 is mapped.

[1] http://www.cs.rice.edu/~taha/publications/journal/ngc07.pdf

Entry: Your generator is my macro
Date: Tue Sep 1 22:03:26 CEST 2009

Popular notions
---------------

As pointed out to me recently, I should not forget the existence of the popular site[1].  To be fair, the site is quite broad.  But I can't help but get the feeling that a lot of the complexity is because of deficiencies in some high level programming languages, mostly Java.  It seems that lisp-style macros would make this kind of code generation a lot less painful.  To me (and my target domain) most of this seems a bit moot.  It would be interesting to learn about something more powerful and well-designed than hygienic macros and modules.
A spectrum of metaprogramming techniques
----------------------------------------

As mentioned in this LtU post[2], there is a spectrum of code generation techniques:

  rigid and static                                          flexible and dynamic
  <-- passive generators --- active generators -- macros -- compilers --- interpreters -->

I tend to stick to the macros-compilers part, with excursions into active code generation and interpreters.  Let me give my list.

  - AST interpreters (PF, embedded Scheme)
  - byte code interpreters
  - compilers -> C / asm / LLVM / VM (control / data structure flattening)
  - typed/untyped heterogeneous staged metaprogramming (MetaOCaml / Staapl)
  - typed homogeneous staged metaprogramming (MetaOCaml) and untyped lisp macros / AST tx tools (compile to Scheme/Lisp/OCaml/Haskell)
  - active code generators (simple generation of header files, macro defs, struct inits, ...)
  - passive generators (emacs keyboard macros)

Some remarks
------------

The last one, passive generators, should be used with care.  I use them only as `sanitizers': converting permanently to a new data format, and then taking that as a root.

What I call active generators is simple configuration file -> init file / interface definition file conversion.  Essentially flat, direct translation.

Lisp macros and AST tx tools are really powerful in-language metaprogramming.  Missing link: (gradually) typed macros.  Typed homogeneous staged programming fills the void left by restricted typed macros.  I'm not sure about the usefulness of this yet.

Heterogeneous staged programming (HSP) is useful for separating high-level program specification from implementation as a high -> low map.  This is a restricted form of compilation.  HSP comes close to `real' language design and implementation.  Full high level to low level (ASM) compilers are best left to experts.  Compilation to an intermediate level is probably best phrased as HSP (for C) or macros (for higher level targets).

Interpreters: anything that doesn't `stage' a solution is an interpreter.  Most complex data structures can be seen as languages.  Interpreters are best written in high level languages.

[1] http://www.codegeneration.net
[2] http://lambda-the-ultimate.org/node/1495#comment-17349

Entry: Staging Control Flow of Data Driven Languages
Date: Wed Sep 2 08:59:36 CEST 2009

I worked on a data flow language in Staapl[1] a couple of months ago, and I recently realized that because it is staged (all control flow can be known at compile time) it's worth it to generalize it to a constraint propagation[4] based approach (equations instead of functions).

The key insight is that Abstract Interpretation[2] can be used to capture the following nuance: at compile time, you know the availability of data asserts (inputs) but you don't know the values.  Abstractly interpreting the program with only this information allows you to 1. sequence rule evaluation and 2. directionalize the rules into functions.

It seems to me that this principle can be generalized.  Any kind of non-standard control flow can be statically sequenced as long as it depends only on availability of data, and you know this availability at compile time.  Note that you cannot do this in cases where the control flow depends on the values themselves, so attempting to stage `amb' (prolog / backtracking) might not be so useful, except for adding discrete constraints that are satisfiable at compile time.

An interesting implementation technique for Abstract Interpretation is staged macros[3].
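A hypothetical sketch of that directionalization step (representation and names made up, not the Staapl implementation): constraints of the form x + y = z, plus compile-time knowledge of which variables are available, are enough to both order the rules and turn each one into a C assignment.

  (* Sketch: turn undirected x + y = z constraints into a sequence of
     C assignments, given only which variables are known at compile time. *)
  type constr = Sum of string * string * string   (* x + y = z *)

  let directionalize known constraints =
    let rec loop known todo acc =
      let ready, blocked =
        List.partition
          (fun (Sum (x, y, z)) ->
             List.length (List.filter (fun v -> not (List.mem v known)) [x; y; z]) = 1)
          todo
      in
      match ready with
      | [] -> if blocked = [] then List.rev acc
              else failwith "under- or over-determined at compile time"
      | Sum (x, y, z) :: rest ->
          let v, code =
            if not (List.mem x known) then x, Printf.sprintf "%s = %s - %s;" x z y
            else if not (List.mem y known) then y, Printf.sprintf "%s = %s - %s;" y z x
            else z, Printf.sprintf "%s = %s + %s;" z x y
          in
          loop (v :: known) (rest @ blocked) (code :: acc)
    in
    loop known constraints []

  let () =
    (* a, b and e are inputs; c and d get computed, in dependency order. *)
    directionalize ["a"; "b"; "e"]
      [ Sum ("c", "d", "e"); Sum ("a", "b", "c") ]
    |> List.iter print_endline
    (* prints:  c = a + b;
                d = e - c;  *)

The generator only ever looks at the dependency structure, never at run-time values, which is what keeps the sequencing decidable at compile time.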
[1] http://zwizwa.be/staapl
[2] http://en.wikipedia.org/wiki/Abstract_interpretation
[3] entry://../plt/20090719-142813
[4] entry://../staapl/20090831-155006

Entry: Staged CSP/Pi ?
Date: Thu Sep 3 10:22:24 CEST 2009

Maybe I should ask Matt Jadud.  Control flow staging might somehow mesh with Occam.

Entry: Prototyping and DSLs
Date: Thu Sep 3 13:27:14 CEST 2009

I prefer my prototyping systems to have the semantics of full programming languages.  Compared to special purpose prototyping tools, this approach is both more powerful and `simpler' (in the sense of less actual complexity) because it avoids corner cases.  The cost of this is a higher abstraction level: you essentially have to also define the DSL.  In practice, with some experience, this cost can be amortized, especially when you use systems that make expressing language definitions easier (i.e. PLT Scheme).

Like anyone, I want the prototyping app (DSL).  Systems that are limited in generic power and shift to application-centric expressibility are definitely useful, but in addition I want an escape hatch into an enclosing general purpose programming language in cases where my DSL is too simple to express some corner cases.  (And I could write _another_ DSL if another consistent domain emerges.)

Entry: DSL: syntax and semantics
Date: Thu Sep 3 13:40:54 CEST 2009

The important thing in a DSL is the semantics: what structure does it express, and possibly how does this translate to something we already know how to implement?  Surface syntax (notation) is maybe only a time saver, but there is something to be said about a specific iconic language stimulating thought in specific ways.

Entry: C pretty printing
Date: Thu Sep 3 15:36:26 CEST 2009

It's possible to use LLVM's clang code formatter to parse non-pretty C code and produce pretty C code.  It allows one to side-step the pretty printing problem (cosmetics problem..)

Entry: A Methodology for Generating Verified Combinatorial Circuits
Date: Fri Sep 4 11:24:42 CEST 2009

This paper describes the combination of staging and abstract interpretation that seems to be key to MetaOcaml based specialization.

[1] http://www.cs.rice.edu/~taha/publications/conference/emsoft04.pdf

Entry: DSLs
Date: Thu Sep 3 21:28:25 CEST 2009

http://www.infoq.com/presentations/Truth-about-DSL

  Kathleen Fisher
  Charles Consel
  Gabor Karsai
  Juha-Pekka Tolvanen
  Marjan Mernik

Cameo of Walid Taha during Q&A.

What I take from this is the divide between DSLs for programmers and DSLs for end users.  The former mostly take the form of embeddings in general-purpose languages, while the latter tend to be more like advanced graphical configuration tools.  In other words, using the following two components:

  - a language L (formally specified or not)

  - an implementation map L->B which translates L to some base language B.  This map possibly serves as the only specification, and possibly encodes knowledge in the form of implementation strategies.

It is important to distinguish applications based on whether the end user will be changing the L->B map.  For the vast majority of commercial applications of DSLs this seems to be _not_ the case.  I.e. for RAP based abstractions, you're not only interested in L (a high-level description of problem/solution) but also in the implementation choices made by L->B.  Simply put, if user of DSL == creator of DSL, very different rules apply than when this is not the case.

One more remark: I think it was Gabor Karsai who mentioned that DSLs work really well, but the problem is integration: merging DSLs with other DSLs.
To me it seems this problem disappears when you embed a DSL in another language: for interfacing, there is always the host language to fall back to, and it, being a `real' language, will have decent, general composition mechanisms.

Entry: Synthesis of implicit models (optimization problems)
Date: Fri Sep 4 15:34:39 CEST 2009

From this and personal correspondence I'm surprised to see Maple has code generation abilities to C/Matlab/...

[1] http://www.cas.mcmaster.ca/~carette/newtongen/
[2] md5://ad463e80f1858f07b89a4072f5c9db46

Entry: Matlab parser
Date: Fri Sep 4 15:40:32 CEST 2009

It would be very helpful to have a frontend for Matlab syntax.  Maybe Octave can be used here?

Entry: IP - staged image processing
Date: Wed Sep 9 17:01:39 CEST 2009

I'm getting closer to making this a reality.  The most important part is building a collection of _parametric_ grid folds.  This allows one to separate correctness (express programs in terms of abstract folds with higher level semantics) from optimization (pick the parameters that steer the code generation, i.e. loop unroll, order, tiling, ...).

It seems that this is a significant design decision: don't put the intelligence in the system (optimizing compiler), especially not locally, but make alternatives explicit and bring them outside.  This allows for manual configuration and automated global optimization (i.e. a single week-long exhaustive search for an efficient implementation once the application is correct).  It also makes it possible to map the configuration parameters to resource use.

The bottom line is quite simple really: make _all_ design choices explicit, and write a specializer for those.

Entry: Algebra of Programs
Date: Thu Sep 10 21:28:13 CEST 2009

I ran into two papers about Mechanising Fusion [1][2].  I watched Backus' lecture on FL[3] yesterday.  What surprised me was that while 1. he firmly emphasizes the importance of having an algebra of programs (to _reduce_ the power of the object language, to be able to manipulate it algebraically without getting into undecidable territory), 2. he _still_ mentioned that automatic translation to an efficient representation is a hard problem.

As clearly expressed in [1], the big finger points to a solution where optimization needs to be specified.  Maybe I'm oversimplifying, but I have this itch that says that stuff like this is really about notation, as is often the case in anything algebraic..  What we need is a DSL for optimization and hardware mapping of constructs.  Something that can both be driven by a search algorithm / theorem prover AND has a meaningful semantics for an engineer trying to map an algorithm to specific hardware.

Now, the ``algebra of programs'' approach probably makes it possible to change this DSL from a dumb collection of choices to be made, to actual manipulations that can be expressed in language form (theorems about the relationship between the choices).

I need to do an application keeping this framework in mind.  Build the meta-layers up from the ground..  The image processing / audio DSP problems should be a good start: algorithms are usually straightforwardly expressed in some high level parallel form (data flow), but get complicated if serialization needs to be expressed manually.

I've had this hunch for quite some years now: there is structure there to be exploited, but there is no tool to capture the structure.  People seem to be very good at solving `application mapping' problems once they understand fully the source and target domains.
I take this mostly from my own experience in using and abusing fixed-structure devices in different ways to get to efficient solutions by finding abstractions that translate directly to assembly language.  However, doing this _formally_ has always been out of my reach due to the complexities involved.

People seem to be good at dealing with incomplete maps and leaky abstractions.  Maybe that's the _real_ problem?  It's not possible to formalize leaky abstractions?  If your abstraction has too many failing corner cases that you can't make explicit beforehand, but that are quite obvious when they occur in manual mapping, the human mind has an advantage over any formal method.  This is the frame problem in AI.

[1] http://progtools.comlab.ox.ac.uk/members/oege/publications/documents/fop03fusion.pdf
[2] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.124.5591&rep=rep1&type=pdf
[3] http://www.archive.org/details/JohnBack1987

Entry: Staging vs. Partial Evaluation
Date: Sun Sep 13 12:00:19 CEST 2009

Explain the practical difference between the techniques of Staging and Partial Evaluation, beyond the hand-waving ``PE is automated Staging.''

A partial evaluator can be defined as a program that consumes a general program and a fixed set of inputs, and produces a specific program specialized to the fixed set.  Staging can be defined as performing this specialization manually, by writing an explicit parameterized code generator that takes the fixed set and produces the specialized program.

  - A staged program can express design (implementation) decisions that a partial evaluator cannot resolve due to 1. undecidability or 2. practical computational cost.

  - A staged program can give you extra guarantees about the specialized form of the program.  I.e. it can express the condition ``failure to implement as required''.

  - A (limited) partial evaluator (one that avoids recursion) can be used to augment staged programming.

TODO: give some explicit examples to support these vague claims.  What I'm really trying to work out is the use of staging for control-flow guarantees (real-time programs, reduction to state machines) combined with limited data-flow partial evaluation (= mostly constant folding).

Glossing over a million details, there are essentially two forms of PE: those that simplify expressions at compile time, and those that modify arbitrary control structures.  The former are quite straightforward to handle, but the latter (because they involve unbounded recursion) are in general undecidable.  It is my thesis that moving control flow into the expression domain (in the form of _finite_ combinators, avoiding non-termination) is the way to go.

Entry: Formal Verification of Timed Systems: A Survey and Perspective
Date: Fri Sep 18 11:10:27 CEST 2009

[1] http://ntur.lib.ntu.edu.tw/bitstream/246246/44838/1/01316040.pdf

Entry: Extensions to C syntax
Date: Sat Sep 19 11:22:56 CEST 2009

Instead of trying to invent ad-hoc syntax, why not stick with C?  Now that c.plt is complete, it might be interesting to implement DSL features as simple extensions that fit in the original syntax.  I.e.

  solve(a * x + b * y == q,
        c * x + d * y == r);

This would parse as a function call to `solve' with two expressions as arguments, and could expand into a declaration for x and y in terms of a,b,c,d,q,r.  This could be used to build arbitrary tree structures:

  solve(vars(x,y),
        a * x + b * y == q,
        c * x + d * y == r);

I'm moving this into staapl[1].
[1] entry://../staapl/20090919-120436

Entry: SML as a DSL for writing compilers
Date: Sat Sep 19 15:36:55 CEST 2009

From this HN post[1].  It has a reference to Engineering a Compiler[2] and the Machine Toolkit[3].  It mentions SUIF[4] and links to some interesting papers[5].

[1] http://news.ycombinator.com/item?id=831568
[2] isbn://155860698
[3] http://www.cs.tufts.edu/~nr/toolkit/
[4] http://suif.stanford.edu/
[5] http://www.cs.utexas.edu/users/mckinley/20-years.html

Entry: Selling Lisp in a Bag
Date: Sun Sep 20 13:17:21 CEST 2009

From most of my communications lately it looks like selling Lisp[1] is not going to work.  The way to get anywhere with testing these ideas against some headwind is to use them where they are used best: backstage, without any need for explanations..

I'd like to focus on two architectures: the TI fixed point C64x as can be found in the DaVinci[4] and OMAP SoCs (i.e. the OMAP3530[2] in the BeagleBoard[3]), and the floating point cousins[5], i.e. the C6701[6].

I have the DM6446[8] (OSD2) which has a C64x[7] DSP.  The C67x[6] is quite similar to the C64x[7] in that it has 8 functional units.  In floating point mode: 4/2 (add/mul).  In fixed point mode: 6/2.  The FUs are partitioned into two paths of 4.  The C64x fixed point core in addition supports vector instructions (each FU can execute a vector of add/mul + complex mul + Galois field mul).  A presentation about the arch here[9].

When fixing on an architecture like that, there are 2 overall strategies that seem obvious:

  1. propagate CPU features upwards into abstractions

  2. compile a high-level description to the highest possible abstraction level provided by the vendor tools[10] (TI C compiler and assembly optimizer)

Getting that right will probably get you more than half way there.  The rest seems to be data memory management.  The tool architecture then looks something like:

  META  = special purpose code transformers for static DSL
  TOOLS = vendor toolchain
  LINK  = binary linking step (no inter-op optimization)

System software then looks like:

  * components (final + performance critical):

                      META                       TOOLS
    [ static DSL ] ---------> [ ASM / C / C++ ] -------> [ BIN ]

  * toplevel system (debug + proto + not performance critical):

                  LINK
    [ dyn DSL ] --------> [ BIN ]

One possible design flow is to move components from dynamic composition to static after the exploratory phase.  The dyn DSL could _interpret_ the static DSL (to provide aspect join points).

An important property of the C6000 architecture is loop buffer support for software pipelining.  Another interesting point is that the C6000 tools provide ``linear assembly'', which is a language level in between C and fully scheduled assembly code.  The assembly optimizer can perform register allocation, partitioning and scheduling.  It can perform software pipelining.  It looks like this is a very welcome target for compilation: full access to all functionality without having to deal with the intricacies of low-level resource allocation.
[1] entry://20090823-133201
[2] http://focus.ti.com/general/docs/gencontent.tsp?contentId=36915
[3] http://beagleboard.org/
[4] entry://../davinci
[5] http://focus.ti.com/paramsearch/docs/parametricsearch.tsp?family=dsp&sectionId=2&tabId=1948&familyId=1401
[6] http://focus.ti.com/general/docs/lit/getliterature.tsp?genericPartNumber=sm320c6701&fileType=pdf
[7] http://www.ti.com/lit/gpn/tms320dm6446
[8] http://focus.ti.com/docs/prod/folders/print/tms320dm6446.html
[9] www.imec.be/sips/pdf/inv_sp_noraz.pdf
[10] http://www-s.ti.com/sc/techlit/SPRU187

Entry: Monads in OCaml
Date: Mon Sep 21 12:00:04 CEST 2009

The package[1] is used in [2].  Rationale: for MetaOCaml style typed staging, effects are problematic.  I know of two solutions for this: keeping everything pure using monads, and using delimited control to avoid scope problems[3].

[1] http://www.cas.mcmaster.ca/~carette/pa_monad/index.html
[2] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf
[3] http://okmij.org/ftp/Computation/staging/circle-shift.pdf

Entry: Hardware Mapping
Date: Tue Sep 22 13:18:09 CEST 2009

Let's look at some of the current approaches in hardware mapping (mostly for special-purpose hardware like VLIW DSPs and ASICs).  This is essentially my main problem: given a high level description of an application, map it down to a particular hardware architecture.  This points to a certain meet-in-the-middle design, where the high-level description and the low-level platform are based on heuristics about the problem domain and general ways of implementing it, but a middle layer binds the particularities together (resource allocation).

Entry: MARTES
Date: Tue Sep 22 14:13:19 CEST 2009

I'm looking at local efforts in model-based design.  I ran into the MARTES[2] project which ran over 3 years (2005-2007):

  ``The aim of the MARTES project is the following: The definition, construction, experimentation, validation and deployment of a new model-based methodology and an interoperable toolset for Real-Time Embedded Systems development, and the application of these concepts to create a development and validation platform for the domain of embedded applications on heterogeneous platforms architectures.''

Local academics listed in [2] are [3] and [4].  The two companies that caught my eye are:

  - NXP for SCATE[5], MPC[6] and C3PO[7], which seem to be mostly under the care of Ondrej Popp[8]

  - Cofluent Design[9] for their MCSE and Cofluent Studio products.

Looks like what I'm looking for is Model Transformations.  I'm thinking: separate level connections from tool effects.  I.e. you're going to use different tools at different levels, but try to keep the link between them clear.

Generic Upsilon Transformations (GUT)[10] are mentioned in [2] as a modeling technique where each model TX is a merge of two models.

[1] http://www.martes-itea.org/public/publications.php
[2] https://lirias.kuleuven.be/bitstream/123456789/167594/1/D2.10_v1_ITEA-MARTES.pdf
[3] http://www.cs.kuleuven.be/~stefanv/
[4] http://www.cs.kuleuven.be/~aram/
[5] http://sourceforge.net/projects/scate/
[6] http://sourceforge.net/projects/mpsc/
[7] http://sourceforge.net/projects/c3po/
[8] http://www.linkedin.com/pub/ondrej-popp/4/637/56b
[9] http://www.cofluentdesign.com/

Entry: Local MetaOCaml
Date: Tue Sep 22 14:55:33 CEST 2009

Searching for local MetaOcaml references brought me to [1].
[1] http://swerl.tudelft.nl/bin/view/EelcoVisser/WebHome

Entry: FMTC & Model Based Design
Date: Tue Sep 22 15:21:38 CEST 2009

At FMTC[1] (a bridge between Flanders' mechatronics industry and academia, a do-tank instead of a think-tank) people have been working on model-based techniques[2]. This is mostly the work of Peter Soetens[3]. What I'm interested in is: 1) figuring out what the basic idea + representation is, and 2) how this can be related to techniques from functional programming (semantics, types, logics, proofs) that allow one to get a firm grip on the code translation process.

``The advanced control of the future encompasses the integration of simulation, code generation, network or fieldbus distribution, configuration, advanced task specification and validation of complex state machines.''

More specifically: how are simulation and code generation implemented? Peter's PhD was about Orocos[5].

[1] http://www.fmtc.be
[2] http://www.mechatronicamagazine.nl/nieuws/achtergrond/bekijk/artikel/fmtc-verkent-grenzen-modelgedreven-machineontwerp-met-open-source.html
[3] http://www.hightechmechatronica.nl/sprekers/peter-soetens.html
[4] http://thesourceworks.com/
[5] http://www.orocos.org/

Entry: NXP & SCATE/MPC/C3PO
Date: Tue Sep 22 15:33:00 CEST 2009

Who is behind this? How functional is it? Are there any success stories? Can this be extended upwards (models)?

SCATE[1], the Source Code Analysis and Transformation Environment, supports analysis, mapping and transformation of C++ source code. MPC[2], the Multi Purpose structural Compiler, is a tool for the generation of network descriptions in various formats or languages (VHDL/Verilog/..), from an efficient structural description language. C3PO[3] is a software synthesis tool that provides a solution for building maintainable, reliable and robust software infrastructures and/or compiler frontends from a set of attributed grammar rules in EBNF notation.

svn co https://scate.svn.sourceforge.net/svnroot/scate scate/trunk
svn co https://c3po.svn.sourceforge.net/svnroot/c3po c3po/trunk

[1] http://sourceforge.net/projects/scate/
[2] http://sourceforge.net/projects/mpsc/
[3] http://sourceforge.net/projects/c3po/

Entry: code generation from UML
Date: Wed Sep 23 10:41:20 CEST 2009

Since UML[1] seems to be the paradigm around which a whole lot of model-based approaches are built, it's time to give it a good look. From the more approachable FP camp, I find this LtU thread[2]. A characterization of modeling languages is that they give a platform to talk about strategies instead of implementations. Anton van Straaten complains about the representation-oriented approach in standard OO use (abstraction violations), and later about the evolutionary nature of these kinds of tools (the pyramid). Paul Snively notes that the gap between design and programming narrows as type systems get better. Marc Hammann talks about the ``undead monster'' (OO vs. FP) and calls the OO/FP distinction (object/ADT, class/module) an implementation detail.

``I think I understand what problem those people think they are solving: 1) Design is hard. 2) Many programmers lack good design skills. 3) Therefore, we must encode good design into tool and languages so they don't need to develop these skills.''

Allan McInnes points to FAD[3].
[1] http://en.wikipedia.org/wiki/Unified_Modeling_Language
[2] http://lambda-the-ultimate.org/node/1828
[3] http://www.cs.kent.ac.uk/pubs/2001/1152/
[4] http://www.ncgia.ucsb.edu/conf/interop97/program/papers/kuhn.html
[5] http://en.wikipedia.org/wiki/SCADE

Entry: MARTE & UML modeling
Date: Thu Sep 24 10:39:42 CEST 2009

( Important: do not confuse MARTE[1] with MARTES[2]. )

First, what makes my toe itch? When I hear ``model'' I think of a mathematical model: something that can be handled with mathematical (logic) tools, i.e. equations / relations: formal languages that can be equipped with some well-defined relations to other objects (semantics, implementation, ...). The essential idea is that you can make your model _interact_ with something. Another model type that has this property is the scale model: here you use physics instead of mathematics to interact with a scaled-down prototype.

The cool thing about _mathematical_ modeling is that you can have a weird bunch of people called logicians and mathematicians figure out how to build a meaningful structure (once and for all), and map it to your problem domain. I.e. as an engineer, you keep an intuitive view on the modeling, but can rest assured that the substrate works, and that you don't lose relational power between different abstraction layers due to ill-defined concepts. The mathematical construct used as a modeling language is an API for your ideas, without degrading into nonsense.

In UML-lore, modeling is more about providing a basis for communication and understanding, not necessarily using mathematical tools (or physical interactions). I.e. the _interaction_ part is mostly about software engineers talking to each other.

From the tutorial[3]: ``A model without its meta-model has no meaning.'' This translates to: the language (meta-model) you use to construct the description (model = instance) needs to have a semantics, i.e. it needs to be precisely linked to something that has meaning in some well-understood way (i.e. a mathematical structure, physical law, general belief, ...)

In the slide ``A Bit of Modern Software'' p.13 the question is posed: ``Can you spot the architecture'', with respect to a page of SystemC code that links a producer and a consumer as part of an imperative program. The next page has the UML diagram. In an FP approach, one would use higher order functions or staging to combine both the diagram and its implementation in the same source code (i.e. the diagram _is_ the code; see the small sketch at the end of this entry). A profile[4] is essentially a DSL: a set of high-level primitives and combining forms that compile to a lower-level implementation. This all smells so similar to FP/logic, to a point where mapping between the two representations should be feasible.

This reminds me a lot of the OO patterns approach: because the OO model itself has so little structure, one _imposes_ structure through interface patterns with implicit meaning. UML seems to try to add some meaning to such interface patterns in terms of code generators.

Some heated comments from the Haskell list[5]. The thread contains a link to Rosetta[6], a system level modeling language.

My conclusions:

1. As long as the semantics isn't clear (a function from syntax to some implementation: a compiler) this makes no sense at all.
2. It is too much rooted in an object-oriented view of the world.

If there are compilers, then probably a meta-programming framework can embed the ideas (interface _to_ OO/UML for tool integration, but don't use it as a base in the tools themselves).

I'm going to leave it at this humorous note [7].
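The small sketch referred to above, assuming nothing about SystemC or UML tooling (names invented): the producer/consumer wiring is an ordinary higher-order combinator, so the ``architecture'' is visible in, and checked with, the program text itself.

producer :: [Int]
producer = [0 ..]

consumer :: [Int] -> [String]
consumer = map (\x -> "got " ++ show x)

-- The "diagram": plug the producer into the consumer.
connect :: [a] -> ([a] -> [b]) -> [b]
connect src blk = blk src

main :: IO ()
main = mapM_ putStrLn (take 3 (producer `connect` consumer))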
[1] http://www.omgmarte.org [2] entry://20090922-141319 [3] http://www.omgmarte.org/Tutorial.htmZ [4] http://en.wikipedia.org/wiki/Profile_(UML) [5] http://www.mail-archive.com/haskell@haskell.org/msg18185.html [6] https://wiki.ittc.ku.edu/rosetta_wiki/index.php/Main_Page [7] http://archive.eiffel.com/doc/manuals/technology/bmarticles/uml/page.html Entry: DSL'09 conference Date: Thu Sep 24 14:32:21 CEST 2009 Things to look at from the DSL'09 conference[1]. The Sun paper about expertise gap[2] and the awards[3]. [1] http://dsl09.blogspot.com [2] http://homepage.mac.com/mlvdv/publications/squiresvdvvotta-pphec06-han.pdf [3] http://dsl09.blogspot.com/2009/07/best-of-dsl09-award.html Entry: Marc Leeman's thesis Date: Fri Sep 25 19:34:16 CEST 2009 The thesis[1] and a paper about dynamic memory management vs. footprint & power consumption[2]. [1] http://chiana.homelinux.org/~marc/phd.html [2] http://crichton.homelinux.org/~marc/downloads/journalvlsi2003.pdf Entry: DSP: the problem Date: Sun Sep 27 15:28:46 CEST 2009 Looking back at the last couple of years, I'm not sure if I really got much closer to solving my initial problem: ``higher level description of DSP applications which allows (dynamic) rapid prototyping and compilation to optimized (static) structure.'' I think I have a good idea now about the dynamic side of things. Tools like Pure Data and Simulink help a lot, though they require a huge library of tools to be useful. These tools tend to be specific and tied to the framework. It would be better if the prototyping environment and the specification language for efficient compilation are actually the same thing. The granularity is still too coarse, and the implementation too specific.. In the last couple of years I learned a bit about staging and partial evaluation. The thing that still puzzles me is sequencing and combinator transformations: conversion of high level operations to somewhat optimally sequenced code (vs. CPU feed rate and memory usage). This book[1] might bring some more insight in currently used methods. (``Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools''). Chapter 10, programming languages talks about extensions to C as: - restricted pointers (aliasing) - fixed point arithmetic - circular addressing - matrix referencing and operators (FORTRAN style) Remarks on writing for ILP: - avoid irregular loop kernels - write code for predication (avoid mem ref in if-statements) - provide aliasing hints (`restricted') - localize variables - eliminate redundant loads - enable prefetching and exploit locality properties - specialize code - LUTs and memoization [1] isbn://1558607668 Entry: HPC & CFD Date: Wed Oct 7 18:40:21 CEST 2009 I'm trying to get a better idea of the world of HPC, especially CFD, to see if it's possible to bring some of the things I've learned recently to fruition. A course on CFD[1]. HPC. In cluster applications, the state-of-the art tools seem to be MPI[2] and OpenMP[3]. A comparison[4]. MPI is based on message-passing / distributed memory while OpenMP is based on multithreading. 
[1] http://personalpages.manchester.ac.uk/staff/david.d.apsley/lectures/comphydr/index.htm
[2] http://en.wikipedia.org/wiki/Message_Passing_Interface
[3] http://en.wikipedia.org/wiki/Openmp
[4] http://en.wikipedia.org/wiki/Comparison_of_MPI,_OpenMP,_and_Stream_Processing

Entry: NASA Survey on Automatic Code Generation (ACG)
Date: Wed Oct 7 19:53:40 CEST 2009

Googling for "MetaOCaml FORTRAN" I found this[1], a survey about the use of automatic code generation tools. Not surprisingly, the control field and Real-Time Workshop stick out. MetaOCaml was mentioned by one of the companies in the survey. Let's try to figure out which one it is.

For some companies/problems ACG didn't help much due to inherent domain complexity, while for others it would have been impossible to get to market without it. Integrating ACG into an existing design flow seems to be non-trivial in many cases. Generator customization is sometimes necessary (33% of cases; in 60% of those this was done by the tool vendor). ACG tools are complex and bugs in them are common (60%), and verified generators are highly desirable. This gives a good rationale for typed metaprogramming. Also important are: presenting mathematical derivations, performing domain-specific analysis, and traceability between model and code.

[1] http://ti.arc.nasa.gov/m/pub/1219h/1219%20(Denney).pdf

Entry: C preprocessor functional programming
Date: Fri Oct 9 08:11:19 CEST 2009

Is it possible to embed (threaded) state in the C preprocessor?

[1] http://mail.perl.org.il/pipermail/perl/2003-July/002701.html

Entry: Address Generation
Date: Fri Oct 9 09:04:00 CEST 2009

Something I've never quite understood well.. Is it best to use pointer increments, or should one use indexed addressing? Also, on a VLIW, are the FUs that perform address calculations separate from the main units, or included? For the C6000 I find[1]:

Addressing modes—The C6000 performs linear and circular addressing. However, unlike most other DSPs that have dedicated address-generation units, the C6000 calculates addresses using one or more of its functional units.

[1] http://www.edn.com/index.asp?layout=article&stt=001&articleid=CA46710&pubdate=3/30/00

Entry: Code Gen @ Sioux (Dutch)
Date: Fri Oct 9 13:11:42 CEST 2009

It keeps amazing me how two strands can develop in parallel with little interaction (OO/UML vs. FP-based codegen/metaprogramming). However, this particular part of UML is not very object/class-centric, and fits quite well in the FP idea of pluggable modules joined by interfaces.

[1] http://www.sioux.be/index.php?option=com_docman&task=doc_download&gid=26&Itemid=274
[2] http://www.openarchitectureware.org/staticpages/index.php/oaw_eclipse_letter_of_intent
[3] http://www.eclipse.org/modeling

Entry: MetaOCaml -> Fortran90
Date: Sat Oct 10 10:44:33 CEST 2009

This seems to work (it generates sources just fine using Trx.run_f90); however, I have a problem with running the compiled .f90 code from MetaOCaml.

Entry: XML and XSLT
Date: Tue Oct 20 12:08:37 CEST 2009

Probably the most important category of model transformation is XSLT[1], a declarative language to perform XML -> XML transforms. There is a Scheme variant called XSieve[2].

[1] http://en.wikipedia.org/wiki/XSLT
[2] http://xsieve.sourceforge.net/index.html#preface

Entry: Process vs. Tools
Date: Wed Oct 21 13:19:43 CEST 2009

If I learned anything recently it's that process is more important than tools, and you want the tools to support the process.

Entry: UML
Date: Wed Oct 21 13:22:34 CEST 2009

So.. I still don't understand.
The problem really seems to be semantics: UML doesn't have any. UML in the first place is a tool to give people something to talk about as long as there is no code. What I mean: what does refinement in UML-style DSLs look like? Is there _some_ map between the UML model and an implementation? Where does the extra information necessary for code generation come from?

Why do you need code gen? When the compiler for the base language can't efficiently work out a specialized form of a generic piece of code.

Entry: Defunctionalization
Date: Sat Dec 26 09:26:19 CET 2009

A technique for converting higher order functional programs to first order ones.

[1] http://www.brics.dk/RS/01/23/BRICS-RS-01-23.pdf

Entry: OCaml monads: perform (do) notation
Date: Sat Jan 30 10:03:38 CET 2010

Reason: typed metaprogramming currently needs either a pure setting (i.e. monads) or some delimited control tricks to prevent scope extrusion.

Using OCaml / MetaOCaml based on 3.09.1:

make OCAMLC=ocamlc \
     CAMLP4=camlp4o \
     PP-EXT="-pp '\$(CAMLP4) -I . pa_extend.cmo q_MLast.cmo'" \
     SYNTAX-EXTENSION=pa_monad-camlp4-3.09.cmo

This produces pa_monad.cmo (bytecode; the native-code .cmx is not necessary). Using it (3.09):

ocamlc -pp 'camlp4o -I . pa_monad.cmo' -c ...

Now, to refresh that OCaml lore.

Entry: MetaOCaml and functors, or Haskell and typeclasses
Date: Wed Feb 3 17:01:04 CET 2010

The main question: is it really so beneficial to have type safety for generating numeric code? It's quite hard to make type errors in flat routines that deal with numbers only. From this perspective it might be more convenient to use the more powerful language for defining abstract operations that generate code as a side effect, i.e. in ANF, trivially translated to C?

EDIT: Maybe this typed MSP is not so all-important for a first order language... Let's see how far I get in Haskell and untyped ASTs.

Entry: Simple numeric abstract interpretation: implementation
Date: Thu Feb 4 07:26:20 CET 2010

The idea is to translate the code in staapl/algebra/staged-number.ss to a typed representation, such that operations respect the Num type class. The idea is that the "number" represented by the new datatype is actually an environment of memoized expressions. If an expression corresponds to a value, it can be reduced at compile time (Haskell run time), otherwise it needs to be postponed to run time (of the generated C program).

So, performing a binary operation consists of somehow joining two environments (possibly eliminating common subexpressions) and extending the result with a new expression. To avoid common subexpression elimination it might be required that the environment is always shared, i.e. that it is threaded through the computation in the same way it will be threaded at run-time when local variables are initialized sequentially.

So the question arises: how to represent the environment and values? It would be simplest to use a model that directly corresponds to the generated C code, i.e. to use symbolic names and shadowing. So, the datatypes are: Env, Expr. Can the environment be reused? (Is there a type class for environments?) Let's keep it simple:

type Op = [Char]
type Ref = [Char]
type Env = [(Ref, Expr)]
type Closure = (Expr, Env)

data Expr = Unop Op Ref
          | Binop Op Ref Ref
          | Integer
          deriving Show

Entry: Combining environments
Date: Fri Feb 5 07:55:22 CET 2010

When combining two expressions in a binary operation, the environments need to be joined. In order to make this operation simpler, it's probably best to model the environments as sets and not allow shadowing.
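A rough sketch of this join-as-union idea, under my own simplifying assumption that an environment is an association list without shadowing; the join succeeds only when the union is still a function.

import Control.Monad (foldM)

type Ref = String
type Env e = [(Ref, e)]

-- Join two environments; fail if the same name is bound to
-- different expressions (the union would not be a function).
joinEnv :: Eq e => Env e -> Env e -> Maybe (Env e)
joinEnv = foldM insert
  where insert acc (r, x) =
          case lookup r acc of
            Nothing -> Just ((r, x) : acc)
            Just y
              | x == y    -> Just acc
              | otherwise -> Nothing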
I'm thinking about why this problem of joining environments didn't happen in the Scheme version. Probably because it was implemented using context (Scheme parameters). In the Haskell approach I would like to represent context locally, i.e. make each term self-contained (a closure). It seems this would make it easier to use type classes: to define algebraic operations as pure functions instead of contextual functions.

So, let's stick to environments as finite functions, with the join operation obtained by the relational union of both functions, requiring that the result is again a function. Conclusion: purity requires self-contained terms.

Something doesn't add up though:

S) On the one hand there is a sequential view (ANF or CPS). Here the environment is incrementally built up moving from expression to expression.

L) On the other hand there is a local view where each term has an environment connected, and both environments are not necessarily the same.

How can both be connected? Can we assume in first approximation that for L) both environments are the same?

Suppose we want to evaluate (+ a b) in an initial environment that contains {a, b}. Writing both terms `a' and `b' as closed terms gives:

(a, [(a, #arg), (b, #arg)])
(b, [(a, #arg), (b, #arg)])

Where #arg means this variable is a C function argument and will be initialized at run time when the function is called. Adding both numbers gives:

(r1, [(r1 (+ a b)), (a, #arg), (b, #arg)])

Now the trouble begins when we want to do something like:

(* (+ a b) (- a b))

Both sides can be executed independently, giving:

((* r1 r2), [(r1 (+ a b)), (r2 (- a b)), (a, #arg), (b, #arg)])

The trouble is that r1 and r2 need to be unique. If this is the case, joining the two environments isn't such a problem.

Entry: Haskell: generating unique tags
Date: Fri Feb 5 08:34:26 CET 2010

I.e. gensym for Haskell. Alternatively, we could just work with object equality. This would give common subexpression elimination for free. So the real question: when are two terms equal? The answer would be: when they expand to the same original expression modulo some transformation rules like associativity or commutativity. This probably gives quite a combinatorial explosion.

The next questions are then:

- Can equality tests be memoized?
- For associative operators: can the association pattern be computed lazily?

For lazy association, the point being that intermediate results that are not observed are not needed in the rest of the computation. I.e. suppose we have (+ a b c) and factor this to (+ (+ a b) c). If (+ a b) is never observed, the decision to factor the original sum in one of three ways can be postponed until another constraint pops up that makes it obvious, i.e. the appearance of either (+ a b), (+ a c) or (+ b c) or their commutations. How do you express that?

Entry: Backtracking and lazy association
Date: Sat Feb 6 08:05:54 CET 2010

The idea: when converting an expression containing an associative operator without parentheses to ANF/CPS, extra information is necessary about how to combine the binary operations into a tree. The default association could be divide and conquer, as it shortens the critical path. However, it might be interesting to make different combinations based on what happens downstream (see previous post). The question is then: how to implement this? Can we get something for free due to laziness?
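A tiny sketch of the ``divide and conquer'' default association just mentioned (hypothetical Expr type, non-empty operand list assumed): it builds a balanced tree of binary additions, which keeps the critical path logarithmic in the number of operands.

data Expr = Leaf String
          | Add Expr Expr
          deriving Show

-- Associate an n-ary sum as a balanced binary tree.
balancedSum :: [Expr] -> Expr
balancedSum [e] = e
balancedSum es  = Add (balancedSum l) (balancedSum r)
  where (l, r) = splitAt (length es `div` 2) es

E.g. balancedSum (map Leaf ["a","b","c","d"]) gives Add (Add a b) (Add c d) instead of the left-nested default.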
Entry: Explicit memoization Date: Sat Feb 6 08:11:49 CET 2010 It seems that performing explicit memoization (introducing local variables) is one of the tricks that can make a code generation step simpler by taking common subexpression elimination work out of the hands of the compiler. I'm thinking particularly of automatic differentiation: it's a problem that ``uses'' sequential processing (deep dependency graphs due to memoization) to reduce the amount of operations compared to symbolic differentiation. So, what is common subexpression elimination (CSE)? It converts tree structures into directed acyclic graph structures. The big idea is that common subexpressions are often the result of macro expansion: macros can copy code. The rule then becomes: don't copy code, always memoize. Entry: Expression equality Date: Sat Feb 6 08:23:19 CET 2010 So, to build the ANF compiler, an equality operation for expressions (run-time value nodes) is necessary. - what is equality of references - how to express references Names are probably not a good idea yet OK: got something going. Now I'm thinking about removing Var (free variables) from the environment struct. Putting it like that makes it rather obvious they shouldn't be in there. Environment = intermediate nodes only. Maybe it's not necessary to perform environment joins? It could be done in a final pass over the whole expression tree. However, that might lead to complexity explosion if no memoization is used. Entry: ai.hs: base code works Date: Sun Feb 7 07:46:42 CET 2010 Type classes are a nice abstraction mechanism. See: *Main> 1 + (1 / var "a") ["a"] R1 <- (div 1.0 a) R2 <- (add 1.0 R1) return R2 Let's make it spit out C syntax. There is one thing to resolve: does the list ordering depend on the evaluation order? This shouldn't be the case, as semantics of the Haskell language is not dependent on order if only safe operations are used.. So why are the lists not always ordered? I can't reproduce. Memoization seems to work to: *Main> let a = var "a" in 1 + (1 / (a * a)) ["a"] R1 <- (mul a a) R2 <- (div 1.0 R1) R3 <- (add 1.0 R2) return R3 Let's make prettyprinting more abstract: OK, using intermediate AST of an ANF/SSA machine language. Entry: Commutativity and associativity Date: Sun Feb 7 08:50:18 CET 2010 Exploiting those laws is a bit less straightforward. However, how far can we get with local reasoning only? I.e. a problem is this: ((1 + a) + 2) Currently this doesn't add the two literals. Making that happen requires two transformations: ((1 + a) + 2) -> ((a + 1) + 2) -> (a + (1 + 2)) Using local reasoning (rewriting), the outer '+' should be able to reduce the expression. Using a normal form is probably best. Normalization: (1 + 1) -> 2 [e1] (1 + (1 + a)) -> (2 + a) [e2] (a + 1) -> (1 + a) [c] ((x + y) + z) -> (x + (y + z)) [a] ((1 + a) + 2) -> (1 + (a + 2)) [a] -> (1 + (2 + a)) [c] -> (3 + a) [e2] Entry: Problem with evaluation order Date: Sun Feb 7 13:04:58 CET 2010 Might be symptomatic of some conceptual error: *Main> putStr $ valuesShow $ let a = var "a" + 1 in [a,1/a] in: a R1 <- div 1.0 R2 R2 <- add a 1.0 out: R2 R1 Maybe the problem is in the use of union: this doesn't respect order? union [2,4] [1,4,5] => [2,4,1,5] Or not.. I don't see the direct connection though. About pointer equality: apparently that's not referentially transparent. There is a StableNames module that provides something like a pointer name, but it's in the IO monad. 
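For reference, a minimal sketch of the StableNames route mentioned above; the comparison is stuck in IO, which is exactly the referential-transparency caveat (the same trick shows up further down, wrapped in unsafePerformIO).

import System.Mem.StableName

-- Observational ("pointer") equality of two values.
ptrEq :: a -> a -> IO Bool
ptrEq x y = do
  nx <- makeStableName x
  ny <- makeStableName y
  return (nx == ny)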
Entry: Reconstructing environment
Date: Sun Feb 7 13:57:27 CET 2010

Maybe it's not necessary to thread the environment: only perform common subexpression elimination once? Can this be done without sacrificing sharing? This needs a different approach.. Maybe have a look at standard memoization techniques in Haskell.

Entry: Type classes are nice
Date: Sun Feb 7 14:13:34 CET 2010

Objective reached: I have a Code object that can be used as an ordinary number in Haskell expressions. Next:

- Lazy environment?
- Expression normalization (assoc, commut)

Entry: Vectors and Matrices
Date: Sun Feb 7 14:36:30 CET 2010

Can standard Haskell vectors and matrices be used?

Entry: Cleanup
Date: Sun Feb 7 15:36:56 CET 2010

I've added Add and Mul tags to the Prim datatype for simpler pattern matching. Next: write operations on Code as lifted operations on Ref. Maybe writing it as a monad is simpler? Using some lifting:

([Ref a] -> Prim a) -> ([Code a] -> Code a)

Conclusion is that the current representation is too cumbersome. The wrapping should be repackaged somehow such that environment threading isn't such a pain.

Entry: Threading environment
Date: Mon Feb 8 15:10:22 CET 2010

Some remarks:

* Compile time reductions are of the type:

  [Lit] -> Lit
  [Prim,Lit,Var] -> Prim

  How can computations on these expressions be lifted to computations that have an associated environment (for memoization modulo equivalence)?

* The resulting program is sequential. Can this be exploited?

I'm asking the wrong question. Why is it so straightforward in Scheme? The order of evaluation was fixed by Scheme: the environment is a stack. Once an operation is added it won't be removed. In the current implementation, expressions are independent (no context).

Current ideas:

+ An environment is necessary to be able to not lose sharing that's already present.
+ The current implementation does this by performing common subexpression elimination wrt. term equivalence (i.e. taking into account commutativity).
- Complexity is too high: term equality is defined recursively, as is subexpression elimination.

It might be best to simplify the environment model. But how to do this without losing the nice type class implementation (i.e. no shared context)? Are there any hard constraints?

1. Code is self-contained (closed).
2. Complexity needs to be minimal (i.e. definitely not 2^N, and N^2 only if necessary).

Conclusions for implementation:

1. A monadic approach might be useful for implementing the abstract evaluation rules with an implicit environment join, but the end user cannot be exposed to any form of shared context.
2. The current approach is OK but needs memoization of the term equality function since it is recursive.

Entry: Haskell and memoization
Date: Mon Feb 8 16:09:41 CET 2010

See [1] section 2) Memoization with recursion.

memoized_fib :: Int -> Integer
memoized_fib =
  let fib 0 = 0
      fib 1 = 1
      fib n = memoized_fib (n-2) + memoized_fib (n-1)
  in (map fib [0 ..] !!)

The '!!' operator is list indexing, which appears here as a partially applied section; i.e. the last line is the same as:

  in ((map fib [0 ..]) !!)

which returns a function that maps an Int to one of the elements in the list. In general, using a memoizing Y-combinator it is possible to memoize recursive functions by making sure they recurse into a memoized version of themselves.

[1] http://www.haskell.org/haskellwiki/Memoization

Entry: Expected kind `* -> *', but `Code a' has kind `*'
Date: Wed Feb 10 09:17:53 CET 2010

Some grave conceptual error in the following code?
instance Monad (Code a) where return = returnCode (>>=) = bindCode include p ps = if' (p `elem` ps) ps (p:ps) join xs ys = foldr include ys xs common op' [] = op' common op' (op:ops) = if' (op' == op) op (common op' ops) returnCode r = Code r (case r of Node n -> Env [] [n] Var v -> Env [v] [] Lit l -> Env [] []) bindCode (Code t (Env vs ps)) combine = case combine t of Code t' (Env vs' ps') -> let ps'' = join ps ps' vs'' = join vs vs' e'' = Env vs'' ps'' in case t' of Node p' -> Code (Node (common p' ps'')) e'' _ -> Code t' e'' Hmm.. Obscure message[1]. The correct line seems to be: instance Monad Code where return = returnCode (>>=) = bindCode I.e. Monad expects a type constructor, not a type. Ok, I see.. The problem is that an environment is a container for Term types, not the number type Terms are parameterized over. Solution: an environment is an addition to terms that tracks sharing between nodes. Terms themselves are syntax trees and are self-contained, except for external variable inputs which are treated as constants anyway. [1] http://hackage.haskell.org/trac/ghc/ticket/1633 Entry: Terms do not know of environments Date: Wed Feb 10 09:30:03 CET 2010 Trying to express the problem of merging environments as a monad lead me to think that mabye this needs a different terminology. Terms are self-contained tree structures, and the environment is only there to track sharing/memoization, i.e. to identify unique nodes. Let's just call it "Nodes" then. Next: refactor the code accordingly: the Nodes collection is parameterized by Term, and only needs some abstract operations on it. Entry: Structure of datatypes Date: Wed Feb 10 10:15:41 CET 2010 One thing I don't have an intuitive feel for yet is when to use projections in sum types. I.e. is one of the alternatives of a sum type a separate type or not? The conditions seems to be: there are operations that work only on the projection, and lifting is straightforward. Entry: Normal forms Date: Wed Feb 10 10:33:20 CET 2010 Using normal forms is useful for reducing nested expressions. Let's try commutativity: (1 + 1) -> 2 [e1] (1 + (1 + a)) -> (2 + a) [e2] (a + 1) -> (1 + a) [c] ((x + y) + z) -> (x + (y + z)) [a] It's probably simplest to represent the sum/product of a value with a fully reduced term as an intermediate object. What I'm thinking now is that it might be better to abstract Ring / Field first, such that abstract evaluation doesn't need any string comparisons[1]. [1] http://okmij.org/ftp/Computation/Generative.html#GE-generation Entry: One-pass generation Date: Wed Feb 10 11:34:58 CET 2010 Maybe it's better to stay a bit closer to [1]: ``Can we generate optimal code in one pass, without further transformations such as common-subexpression-elimination, without looking into the code at all after it has been generated?'' Next to abstract evaluation, memoization is really an essential part of the deal: generate code with let expressions, and don't try to fish them out later. So, can they be separated? Can we have a language that can be abstractly evaluated and memoized separately? Wow.. This stirs up a lot. For one: the rewrite rules for the abstract evaluation seem ill factored: asum (Lit l1) (Lit l2) = Lit (l1 + l2) asum (Lit l1) (LitAdd l2 t) = LitAdd (l1 + l2) t asum (LitAdd l1 t) (Lit l2) = LitAdd (l1 + l2) t asum t Zero = t asum Zero t = t asum (LitAdd l t1) t2 = LitAdd l (t1 + t2) asum t1 (LitAdd l t2) = LitAdd l (t1 + t2) I don't see how to factor patterns in the pattern matching rules; i.e. 
commutativity is duplicated in the rules above. Syntactic abstraction is so easy in Scheme -- not having it and having to express a problem differently is a challenge. [1] http://okmij.org/ftp/Computation/Generative.html#framework Entry: Commutation rewrite rule in CPS Date: Wed Feb 10 14:25:46 CET 2010 termAdd :: (Num l) => (Term l) -> (Term l) -> (Term l) termAdd = add (add Add) where add k Zero a = a add k One a = add k 1 a add k (Lit a) (Lit b) = Lit (a + b) -- evaluate add k a@(Lit _) (Add b@(Lit _) c) = Add (add k a b) c -- associate add k a b = k b a -- commute If maching fails, add calls its continuation k. The first k passed in is (add Add). If that version of add fails again it calls its continuation k = Add which returns the term unevaluated. With normalization and all association rules: termAdd :: (Num l) => (Term l) -> (Term l) -> (Term l) termAdd = add (add norm) where add k Zero a = a add k One a = add k (Lit 1) a add k (Lit a) (Lit b) = Lit (a + b) -- e add k (Lit a) (Add (Lit b) c) = Add (Lit (a + b)) c -- a1 add k (Add (Lit a) b) (Add (Lit c) d) = Add (Lit (a + c)) (Add b d) -- a2 add k a (Add b@(Lit _) c) = Add b (Add a c) -- a3 add k a b = k b a -- c norm a b@(Lit _) = Add b a -- n norm a b = Add a b I guess starting from equivalences (the algebraic rules) these rewrite rules could be derived using Knuth - Bendix? After thinking some more, this is the simplest set of rules I can find: termAdd :: (Num l) => (Term l) -> (Term l) -> (Term l) termAdd = add (add Add) where add k Zero a = a add k One a = add k (Lit 1) a add k (Lit a) (Lit b) = Lit (a + b) -- e add k (Lit a) (Add (Lit b) c) = Add (Lit (a + b)) c -- ae1 add k (Add (Lit a) b) (Add (Lit c) d) = Add (Lit (a + c)) (Add b d) -- ae2 add k a (Add b@(Lit _) c) = Add b (Add a c) -- an add k a@(Lit _) b = Add a b -- n add k a b = k b a -- c For multiplication, the rules will look almost exactly the same. Now, the question is: how to abstract over such a pattern match? Or.. How many combinations of rewrite rules should we allow? I.e. allowing Signum Entry: Typed functional programming (Haskell vs. Scheme) Date: Wed Feb 10 16:28:50 CET 2010 A matter of taste probably, but static type joy does come at a price. A lot of patterns I take for granted in Scheme (i.e. through macros) are not straightforward to express due to constraints of the type system. Once example is pattern matching: often I would like to parameterize type constructors. Probably I'm just not there yet. It's too early to say, but atm I tend towards Scheme. However, Haskell _is_ incredibly consise when you do find a way to stick to the bondage. Entry: Knuth - Bendix Date: Wed Feb 10 17:07:38 CET 2010 Let's see if I get this right. Given a set of _equalities_ and an order relation on terms, it is possible to construct a confluent rewriting system (if one exists?). The order relation on terms serves to define normal forms. [1] http://en.wikipedia.org/wiki/Knuth–Bendix_completion_algorithm Entry: Affine values Date: Wed Feb 10 17:25:05 CET 2010 I'm looking at this in the wrong way. The rewriting systems I mentioned before are too error-prone and low-level. Maybe the trick is to use a proper number system? I'm also looking at very specific optimizations: combining constants using the commutative and associative laws. Or how do you call numbers that look like: a + b x, where x is a variable and a, b are numbers. Such numbers allow similar bubble tricks. What about nesting two kinds of number systems? (Zero, One, Lit l) and (a + b x)? 
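A hypothetical sketch of that second number system, values of the form a + b*x over a single opaque variable: addition and scaling by a literal stay inside the domain, anything else would have to fall back to an opaque term.

-- Affine value a + b*x over one opaque variable x (invented names).
data Affine x = Affine Double Double x
  deriving Show

addAffine :: Eq x => Affine x -> Affine x -> Maybe (Affine x)
addAffine (Affine a b x) (Affine c d y)
  | x == y    = Just (Affine (a + c) (b + d) x)
  | otherwise = Nothing            -- different variables: give up

scaleAffine :: Double -> Affine x -> Affine x
scaleAffine s (Affine a b x) = Affine (s * a) (s * b) x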
Maybe the idea is to define operations on a certain abstract domain and lift them into more detailed domains of which they are subsets (terminology?). I'm starting to see the point: what you _really_ want is to make compositions of, or otherwise interactions between, different abstract domains. Anyway, it seems to be the case that you should use the abstract domain that works for you. What I've had in the back of my head is numbers + variables. So, factorizing might work by building a number type that contains multiple number types.

Entry: Nesting number types
Date: Wed Feb 10 18:09:43 CET 2010

It's going a bit fast.. After re-reading [1] a bit I see they do stack different abstract domains. I.e. in 4.1:

type maybeValue = Val of float code | Exp of float code
type abstract_code = Lit of float | Any of float * maybeValue

The first one determines whether it's cheap or expensive to duplicate a value, and the second allows operations on compile-time literals to be evaluated away. Compared to my approach, constant addition is not tracked, only constant multiplication. The concretization functions mentioned in [1] correspond to the "show" code in mine.

Maybe I should experiment a bit with number systems and see how they compose; see what kind of patterns emerge.

[1] http://www.cs.rice.edu/~taha/publications/conference/emsoft04.pdf

Entry: Typeclasses vs. multiple dispatch
Date: Wed Feb 10 18:29:56 CET 2010

While all this static poo-poo is quite nice, it might be interesting to look at how to translate it to a dynamic setting with multiple dispatch. I've always looked at operator overloading as fluffy extras that aren't really useful, but the real deal here is polymorphism: being able to work with different (layered!) number systems is quite a thrill.

Entry: Memoization and Control
Date: Wed Feb 10 18:33:52 CET 2010

So, with practical use of abstract interpretation more or less understood, it's time to work on the code graph problem. Create a monad for sharing, and try to express an algorithm in it, i.e. using the FFT described in [1]. Monadic multistage programming[2]. The monad suggested is the state-continuation monad:

'a monad = state -> (state -> 'a -> answer) -> answer

let ret a = fun s k -> k s a
let bind a f = fun s k -> a s (fun s' x -> f x s' k)

and a staged version:

let retN a = fun s k -> .<let z = .~a in .~(k s .<z>.)>.
let bindN a f = bind a (fun x -> bind (retN x) f)

The state is used for threading the memo table through the code at compile time. This table contains variable names like .<z>. above that represent cached function outputs, and is hashed by the input arguments of functions. In MetaOCaml the generated variable z in the code template will be automatically renamed to make it unique. If this renaming is desired in Haskell code that uses direct syntax trees, uniqueness needs to be guaranteed in a different way.

In a first attempt it might be interesting to forget about the memo-Y-combinator, and use the state monad only for passing variable names.

[1] http://www.cs.rice.edu/~taha/publications/conference/emsoft04.pdf
[2] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.495&rep=rep1&type=pdf

Entry: Monads and the incredible uselessness of intuition
Date: Wed Feb 10 19:13:48 CET 2010

Yes, a monad is a TYPE constructor, not a value constructor. I.e. it is a parameterized type, not necessarily a flat data structure that can be instantiated by calling "M x". I'm severely lacking in intuition here. Maybe it's time to do some more standard exercises.
It's mostly about this error message:

Kind mis-match
Expected kind `* -> *', but `Contstate' has kind `*'
In the instance declaration for `Monad Contstate'
Failed, modules loaded: none.

Monad apparently requires a type constructor with two arguments. (Wording?) No, check this[1]:

A 0-param type constructor has kind `*'
A 1-param type constructor has kind `* -> *'
...

[1] http://en.wikibooks.org/wiki/Haskell/Kinds

Entry: Continuation monad
Date: Fri Feb 12 08:40:39 CET 2010

Writing the definitions in one go doesn't seem to work yet. Maybe it's time to try one thing at a time. I.e. start from the continuation monad, and try to express some CPS code in it. The state monad part doesn't seem to be the problem.

Entry: Let's start again
Date: Sat Feb 13 09:25:21 CET 2010

What is the problem? I miss intuition diving into state+cps. Additionally, I'm still not sure whether it's a good idea to leave the expression realm. Isn't it possible to turn the problem upside down, i.e. when I write TERM1 + TERM2, this constructs a composition of two functions, i.e. make the arithmetic expressions do nothing but _serialize_ the computation, and feed any extra environment in later.

Goal:

- create a Num class that can partially evaluate (PE) a numeric algorithm expressed in straight Haskell.

So we're not really after staging, but after PE; that's the starting point. Maybe later we can add annotations that help the PE. What is then the problem to solve? It is memoization: how to know if two expressions are equal, either resulting from:

- explicit `let'
- equivalence (CSE)

Goals: don't lose sharing introduced by `let', and if possible allow arbitrary equivalences to lift out common expressions. So, what this does is take an underlying term type and an equivalence of terms, and construct a wrapper around it to keep track of expression sharing. So there are the following modules:

1) ai.hs / ainum.hs : term type and partial evaluation
2) nodes.hs : environment threading ( ANF/SSA conversion + CSE )

The important part is that both are separate. Let's stick to ai.hs for 1) and focus on building nodes.hs on top of an abstract term type.

Entry: Staged programming vs. partial evaluation
Date: Sat Feb 13 09:49:27 CET 2010

SP and PE are really two different approaches. The main difference is that SP is explicit: all reductions are expressed by the _programmer_, while in PE all reductions happen automatically. MetaOCaml seems better suited for SP because of the Code data type. I'm going to stop pursuing this route in Haskell due to the lack of proper quotation operations. Haskell seems better suited for PE because of the type classes / ad hoc polymorphism. Essentially, in Haskell you can take a datatype approach using generic code while in MetaOCaml you write specialized code operating on a specialized data type. Or something like that..

Entry: code.hs (nodes.hs)
Date: Sat Feb 13 11:05:38 CET 2010

Flattening an expression tree is really just a fold over that tree, where the accumulated state is the variable->term dictionary and each term is either reused from the map or creates a new mapping. This needs a (Term -> Term) function mapping terms to variable terms, and an equality operation on Terms. So, it's not necessary to have a Code datatype such that operations can be lifted: the accumulation/flattening can be completely hidden in the process, resulting in a map of (variable, term).

So.. what is needed is a code fold, and a way to introduce new bindings. I.e. foldTerm and letTerm. Or not. I don't see it..
Let's write it in its most concrete form first. The conversion is a tree traversal. One part is an ANF conversion. The second part is CSE. Entry: Flattening expressions into SSA Date: Sat Feb 13 19:48:52 CET 2010 -- Convert list of terms to dictionary and list of varrefs. type Dict a = [(Term a, Symbol)] anf :: [Term a] -> Dict a -> (Dict a, [Term a]) anf = f [] where f rs [] d = (d, reverse rs) f rs (t:ts) d = case t of Op rator rands -> let (d',rands') = f [] rands d -- recurse down terms sym = gensym d' -- create variable t' = Op rator rands' -- transform term d'' = (t', sym) : d' -- extend environment in f (Var sym : rs) ts d'' _ -> f (t:rs) ts d -- Generate a unique variable name gensym = Symbol . ("r" ++) . show . length -- This is just a dumb tree flattener. It won't enable sharing for -- copied expressions. Now, how to extend this to enable CSE? Entry: Memoizing original "merge" approach Date: Sat Feb 13 19:38:16 CET 2010 The more I think about this, the more I like the previous incremental approach. Maybe focussing on memoizing node-set membership (modulo equivalences like commutation for + and *) is a better approach? So what is a memo-term? - Term: an expression tree - [Term]: a list (set?) of all expression nodes (*) - Term -> Maybe Term: a (fast) is-member function If (*) is a list, the order should be one of the possible sequential program orders. Problem: naive node membership is too expensive, as the (==) is recursive, and recomputes equality of nodes multiple times. termNodes = f where f t@(Op _ ts) = t : (concat $ map f ts) f t = [t] firstWith p = f where f [] = Nothing f (x:xs) | p x = Just x f (_:xs) = f xs -- Naive node membership; nodeOf t0 = firstWith (t0 ==) . termNodes So.. Memoization is usually based on the equality function, but what if that's exactly the function to memoize? Entry: Pointer equality in Haskell Date: Sat Feb 13 21:19:32 CET 2010 Apparently this doesn't work in general because it breaks referential transparency[3]. Anyway, this[1] suggests that unsafePtrEq can be used to build memoization, where the lack of referential transparency is only locally observable. [1] http://www.haskell.org/ghc/docs/4.06/hslibs/sec-ioexts.html [2] http://cvs.haskell.org/Hugs/pages/libraries/base/System-IO-Unsafe.html [3] http://www.mail-archive.com/haskell@haskell.org/msg04239.html Entry: Term -> Closure Term Env Date: Sun Feb 14 10:27:38 CET 2010 -- A Closure represents a Term and its environment Env. An -- environment is a set of nodes and an association -- (membership/sharing) function. type Member a = Term a -> Maybe (Term a) data Env a = Env [Term a] (Member a) data Closure a = Closure {closureTerm :: Term a, closureEnv :: Env a } instance Show a => Show (Closure a) where show (Closure t (Env ts _)) = show (t,ts) -- To extend an environment with a Term is to either establish its -- membership and return a representative, or to add it to the set and -- membership functions. extend t e@(Env ts member) = case (member t) of Just t' -> Closure t' e Nothing -> Closure t (Env (t:ts) (compareTerm member t)) compareTerm :: Eq a => Member a -> Term a -> Member a compareTerm = compareTermModulo (==) noMember t = Nothing compareTermModulo eq member t = f where f t' | t `eq` t' = Just t f t' | otherwise = member t' returnClosure t = Closure t (Env [t] (compareTerm noMember t)) --------------------------- However, the member function doesn't need to be made so abstract: eventually this should only catch equivalences that are not captured by the the pointer equality. 
What about something like this: Env [Term a] [(Term a, Term a)] where the first list is the list of shared nodes, and the finite function caches other equivalences. Whenever a comparison doesn't match, it is added to the list of nodes. Ok, this seems to work: data Env a = Env [(Term a)] [(Term a, Term a)] data Closure a = Closure {closureTerm :: Term a, closureEnv :: Env a } instance Show a => Show (Closure a) where show (Closure t (Env ts eqs)) = show (t,ts,eqs) ptrEqual :: a -> a -> IO Bool ptrEqual a b = do a' <- makeStableName a b' <- makeStableName b return (a' == b') termRefEq :: (Eq a) => (Term a) -> (Term a) -> Bool termRefEq x y = unsafePerformIO $ ptrEqual x y firstWith p = f where f [] = Nothing f (x:xs) | p x = Just x f (_:xs) = f xs lookupWith p = f where f [] = Nothing f ((a,b):ps) | p a = Just b f (_:ps) = f ps -- To extend an environment with a Term is to either establish its -- membership and return a representative, or to add it to the node -- set or equivalence tables. extend :: Eq a => Term a -> Env a -> Closure a extend t e@(Env ts eqs) = find cache0 (find cache1 miss) where cache0 = firstWith (t `termRefEq`) ts -- try exact== node table cache1 = lookupWith (t `termRefEq`) eqs -- try equivalence== memo table miss = case firstWith (t ==) ts of -- compute real equivalence Just t' -> Closure t' (Env ts ((t,t'):eqs)) -- extend memo table Nothing -> Closure t (Env (t:ts) eqs) -- extend node table find cache next = case cache of Just t' -> Closure t' e; -- reuse from cache Nothing -> next -- try next search Entry: Merging environments Date: Sun Feb 14 11:50:43 CET 2010 Let's extend the join-term-to-environment to the more interesting operation of joinging environments. This could also benefit from some pointer-equality: merging two environments can be done up to the point where the memotables are equal. i.e. finding the common list tails, and concatenating the remainder. ( This has a lot in common with merging in source control! ) The incentive to accelerate join by finding common roots follows from these assumptions: - big environments probably share data i.e. through explicit memoization in the code that produced the expression. => rebuilding the cache tables would be expensive - non-shared environments tend to be small (independent expressions). => finding a common root fails, but won't take much time. CONTEXT: - 1st order: term equivalence (==) is cached using memotables embedded in an Env datastructure. - 2nd order: Env equivalence, or equivalence/sharing of the memotables themselves might also be accelerated. Entry: Merging lists: recovering tail sharing Date: Sun Feb 14 12:08:59 CET 2010 Take two lists, establish their common tail, add head of first to second list. Is there a Haskell function for this? A different datastrucure might help: length-annotated lists, as this can prune the search tree; i.e. two lists cannot be equal if their lengths are not equal. import Data.List import System.IO.Unsafe import System.Mem.StableName refEq a b = unsafePerformIO $ do a' <- makeStableName a b' <- makeStableName b return (a' == b') type TaggedList a = [(Int, a)] taggedCons = f where f x [] = [(0 :: Int, x)] f x txs@((n,_):_) = (n+1,x) : txs taggedList = foldr taggedCons [] taggedTail = f where f as [] = [] f [] bs = [] f as bs | as `refEq` bs = as -- found tail f a@((na,_):as) b@((nb,_):bs) | na == nb = f as bs -- same length => recurse | na > nb = f (drop (na - nb) as) bs -- diff length: drop heads | na < nb = f as (drop (nb - na) bs) Problem: this might rebase a lot though. 
Maybe sharing should be done in a functional representation anyway?

Entry: Environment is only necessary during threading
Date: Sun Feb 14 13:41:51 CET 2010

Let's rephrase... What is important? When flattening an expression tree/graph, it needs to be determined which nodes are the same. What should an environment do?

- provide an ANF/SSA view of the expression graph

Anything else is implementation. In first approximation this means: an environment should keep track of sharing based on some form of equivalence, i.e. object equivalence (as a consequence of data sharing at the meta level) or other equivalences. So basically, we never need to _merge_ environments, as there is only one instance that is threaded through the flattening. It looks like the problem is simpler this way.

Ok, this works pretty well.

- given terms, compute a fold over them to accumulate the sharing environment, and replace all terms with shared variants.
- provide conversion of the environment to a node-naming function (for obtaining register names)
- provide a fold over the nodes in the environment

This then gives a way to convert a term (with sharing and possibly other equivalence relations) to a flat ANF-style structure.

Entry: Cleanup: separate Ai.hs and Term.hs
Date: Sun Feb 14 15:32:34 CET 2010

Recovering sharing seems to work. What's next? Abstracting it such that different number systems can reuse the sharing + flattening code. It's probably easier to build the syntax tree on top of a shared data type than to abstract the recursion pattern. Or not? What do we really need more than an ordinary fold? Some guideline: datatypes and pattern matching functions are low-level and don't compose easily, operators are high-level and compose in straightforward ways. It's probably simpler to keep what is now Ai.hs as the language backbone, and separate it from the abstract interpretation.

Entry: Ai and nesting types
Date: Sun Feb 14 17:52:57 CET 2010

- How to perform AI (normal forms from ainum.hs) on the Term type?
- How to nest number types, i.e. construct complex math / matrix math in terms of real base types.

It's a royal pain to write down the algebraic transformation rules in terms of AST pattern matching. Without syntactic abstraction of binding forms (pattern matching) they cannot be abstracted over easily. Alternatives?

- Use richer datatypes + perform evaluation on these before compilation to Term.
- Find a way to express matching rules as combinators so they can be abstracted over.

I.e. the following is a concrete evaluator for performing addition on terms in normal form, producing a normal form.
termAdd :: (Num l) => (Term l) -> (Term l) -> (Term l) termAdd = add (add Add) where add k Zero a = a add k One Zero = One add k One a = add k (Lit 1) a add k (Lit a) (Lit b) = Lit (a + b) -- e add k (Lit a) (Negate (Lit b)) = Lit (a - b) add k (Lit a) (Add (Lit b) c) = Add (Lit (a + b)) c -- ae add k (Add (Lit a) b) (Add (Lit c) d) = Add (Lit (a + c)) (Add b d) -- aec add k a (Add b@(Lit _) c) = Add b (Add a c) -- an add k a@(Lit _) b = Add a b -- n add k a b = k b a -- c Using operators doesn't really make this more readable: termAdd :: (Num l) => (Term l) -> (Term l) -> (Term l) termAdd = try (try (:<+>:)) where try flipAdd = (<+>) where L 0 <+> a = a L a <+> L b = L (a + b) -- e L a <+> Negate (L b) = L (a - b) L a <+> ((L b) :<+>: c) = L (a + b) :<+>: c -- ae ((L a) :<+>: b) <+> ((L c) :<+>: d) = L (a + c) :<+>: (b :<+>: d) -- aec a <+> (b@(L _) :<+>: c) = b :<+>: (a :<+>: c) -- an a@(L _) <+> b = a :<+>: b -- n a <+> b = b `flipAdd` a So... let's convert to intermediate form, and perform simplificiation on that. EDIT: finaly it stabilized at this: -- Commuative reduction is able to propagate ``literal bubbles to the -- surface'' by keeping semi-literal terms in a (L a :<*>: b) normal form. data CT l t = U -- unit | Z -- zero | L l -- literal | T t -- opaque term | (CT l t) :<*>: (CT l t) -- op deriving Show ctOp :: (Num l) => (l -> l -> l) -> CT l t -> CT l t -> CT l t ctOp binop = (<*>) where (*) = binop -- unit and zero Z <*> a = Z ; a <*> Z = Z U <*> a = a ; a <*> U = a -- reduce literals/semi-literals: (L a) or (L a :<*>: _) L a <*> L b = L (a * b) L a <*> (L b :<*>: c) = L (a * b) :<*>: c (L a :<*>: b) <*> L c = L (a * c) :<*>: b (L a :<*>: b) <*> (L c :<*>: d) = L (a * c) :<*>: (b :<*>: d) -- no reduction -> enforce normalized semi-literal form a <*> (L b :<*>: c) = L b :<*>: (a :<*>: c) (L a :<*>: b) <*> c = L a :<*>: (b :<*>: c) -- catch-all a <*> b = a :<*>: b Entry: Commutative manipulations Date: Sun Feb 14 20:58:10 CET 2010 -- The idea is to locally unpack Term -> CTerm, perform a -- simplification, and convert back to Term. -- Commuative reduction is able to propagate ``literal bubbles to the -- surface'' by keeping terms in a (L a :<*>: b) form. data CTerm l t = U -- unit | Z -- zero | L l -- literal | T t -- opaque term | (CTerm l t) :<*>: (CTerm l t) -- op commOp :: (Num l) => (l -> l -> l) -> CTerm l t -> CTerm l t -> CTerm l t commOp op = try (try (:<*>:)) where (*) = op try next = (<*>) where Z <*> a = Z U <*> a = a L a <*> L b = L (a * b) -- e L a <*> (L b :<*>: c) = L (a * b) :<*>: c -- ae (L a :<*>: b) <*> (L c :<*>: d) = L (a * c) :<*>: (b :<*>: d) -- aec a <*> (L b :<*>: c) = L b :<*>: (a :<*>: c) -- an L a <*> b = L a :<*>: b -- n a <*> b = b `next` a Entry: Commutative manipulations cont.. Date: Mon Feb 15 09:44:06 CET 2010 Hmm.. writing a ``generic'' evaluator doesn't seem to be a well-defined problem. I think I'm starting to understand the main idea behind the MetaOCaml approach: sometimes it's best to limit the number of optimizations to those your generated program actually needs, to keep the generation process simple. Managing the algebraic manipulations as in Cas.hs can get complex. So, shift focus? It would probably be a good idea to have a concrete problem to solve. EDIT: it turned out to be not so difficult to make the code a bit more readable by removing the CPS recursion and writing the pattern match out in full. 
Currently Cas.hs contains: -- Commuative reduction is able to propagate ``literal bubbles to the -- surface'' by keeping semi-literal terms in a (L a :<*>: b) normal form. data CT l t = U -- unit | Z -- zero | L l -- literal | T t -- opaque term | (CT l t) :<*>: (CT l t) -- op deriving Show ctOp :: (Num l) => (l -> l -> l) -> CT l t -> CT l t -> CT l t ctOp binop = (<*>) where (*) = binop -- unit and zero Z <*> a = Z ; a <*> Z = Z U <*> a = a ; a <*> U = a -- reduce literals/semi-literals: (L a) or (L a :<*>: _) L a <*> L b = L (a * b) L a <*> (L b :<*>: c) = L (a * b) :<*>: c (L a :<*>: b) <*> L c = L (a * c) :<*>: b (L a :<*>: b) <*> (L c :<*>: d) = L (a * c) :<*>: (b :<*>: d) -- no reduction -> enforce normalized semi-literal form a <*> (L b :<*>: c) = L b :<*>: (a :<*>: c) (L a :<*>: b) <*> c = L a :<*>: (b :<*>: c) -- catch-all a <*> b = a :<*>: b Entry: Template Haskell Date: Mon Feb 15 09:59:00 CET 2010 If it's a pain to write reduction systems, why not compile them from a higher level description? Yes, Haskell is not Scheme (convenient syntactic abstraction), and it's also not MetaOCaml (templates might be ill-typed) but it's halfway in the middle. On the practical side: if both the data types and the pattern matching on these types are generated at the same place, maybe some constraints of the generated code can be captured in the types of the generator? Goal: find out how to generate an algebraic rewrite system from a specification as equalities, i.e. find out if Knuth-Bendix can be used to build the rules (unary vs. binary?). EDIT: This seems to be way to far-fetched. Cas.hs turned out to be quite straightforward. See prev post. Entry: Nesting the Term type Date: Mon Feb 15 10:31:12 CET 2010 Point: try complex or normal numbers on top of the Term type. The Haskell Complex type can't be used in a straightforward manner: Term (Complex Double) just gives complex versions of the operations, and Complex (Term Double) doesn't seem to work because the following instances are not defined: instance (Real a) => Real (Term a) instance (RealFrac a) => RealFrac (Term a) instance (RealFloat a) => RealFloat (Term a) Defining them makes it possible to use complex operations like this: test1 = compileNodes [realPart c, imagPart c] where c = a * b a = var "ar" :+ var "ai" b = var "br" :+ var "bi" This gives: ([r3,r6], [(r0,(mul ar br)), (r1,(mul ai bi)), (r2,(negate r1)), (r3,(add r0 r2)), (r4,(mul ar bi)), (r5,(mul ai br)), (r6,(add r4 r5))]) Reductions also work: test2 = compileNodes [realPart c, imagPart c] where c = a * b a = var "ar" :+ 0 b = var "br" :+ var "bi" gives: ([r0,r1], [(r0,(mul ar br)), (r1,(mul ar bi))]) Entry: Combining Cas.hs and Ai.hs Date: Mon Feb 15 11:38:27 CET 2010 I'm writing transformations from Term a <-> CT a (Term a) such that ctOp can be used to perform algebraic simplification for + and *. As a side note: the Haskell type system is incredibly useful for writing transformations like these. I think I'm hooked! I.e. this is just beautiful: ctLift :: Num a => CTtx a -> Binop a -> TBinop a ctLift (ct, ict) binop = f where (<*>) = ctOp binop -- lift Binop to CTBinop f a b = ict ((ct a) <*> (ct b)) -- pullback to yield TBinop Entry: Conclusions Date: Mon Feb 15 18:49:44 CET 2010 Milestones: - Shared.hs : shared DAG nodes based on (memoized) term equivalence - Cas.hs : abstract commutative+associative algebraic manipulations - Ai.hs : Complex (Term Double) as proof-of-concept for composite number types $ cat Term.hs Shared.hs Cas.hs Ai.hs | wc 333 1773 9693 It looks nice. 
Definitely a seed for something useful. Haskell is a fantastic tool. Entry: Control Structures or Network Combinators? Date: Thu Feb 18 16:43:52 CET 2010 Before this can host useful code, some control structures need to be introduced. However, it seems better to start on the right foot and not allow arbitrary loops or recursion and only provide network combinators. Essentially, the basic abstraction is MIMO functions. The combinators then should ``duplicate structure'' meaning they should build directed networks, not state machines. Is it possible to push feedback (i.e. for IIR filters) outside of a composite network? Networks should be pure, so state can be managed separately, and network composition is simplified. I.e. the working model is a state-space model, where computation is pure and state,input,output are explicit. This seems OK _if_ subsampling is handled appropriately. -> Hint: take a look at CUDA. Entry: State / Continuation monad Date: Fri Feb 19 13:16:04 CET 2010 Multi-stage programming with functors and monads[1] uses the following state-continuation monad, described in Fig.1 on page 11: type ('p,'v) monad = 's -> ('s -> 'v -> 'w) -> 'w constraint 'p = The monad is represented by a function taking two arguments: a state 's and a continuation ('s -> 'v -> 'w), and producing a value 'w. The continuation takes two arguments, a state 's and a value 'v and produces a value 'w. let ret (a :'v) : ('p,'v) monad = fun s k -> k s a Unit creates a function that when invoked, passes the value to the continuation. let bind (m : ('p,'v) monad) (f : 'v -> ('p,'u) monad) : ('p,'u) monad = fun s k -> m s (fun s' b -> f b s' k) Bind creates a function that when invoked, invokes the monadic value m with the provided state s and a constructed continuation. That continuation applies the function f to the value it receives to produce a new monadic value, which it then applies to its provided state s' and the original continuation. let fetch s k = k s s let store v _ k = k v () State access. let k0 _ v = v let runM m = fun s0 -> m s0 k0 Run a monad providing it an initial state and the initial continuation which returns the value. let retN (a : ('c,'v) code) : (, ('c,'v) code) monad = fun s k -> ..)>. Control effect. let ifL test th el = ret .< if .~test then .~th else .~el >. let ifM test th el = fun s k -> k s .< if .~test then .~(th s k0) else .~(el s k0) >. The monad is documented also in [2]. First, let's gain some more intuition about the state-continuation monad. (I'm still just reading the definition; not understanding it intuitively). What _is_ this monad? It is a computation producing a value, parameterized by an initial state s and a continuation k. Let's focus on both in isolation, and then combine them to see how they interact. The computation encoded in the monad is written in CPS, meaning that when it is done, it will pass its result to k. The simplest such computation is created by `ret' which passes state and value to the continuation, effectively "returning". To get a better idea of the CPS monad, it might help to switch to the continuation monad in Haskell, and explain its bind operator[3]: -- r is the final result type of the whole computation newtype Cont r a = Cont { runCont :: ((a -> r) -> r) } instance Monad (Cont r) where return a = Cont $ \k -> k a -- i.e. return a = \k -> k a (Cont c) >>= f = Cont $ \k -> c (\a -> runCont (f a) k) -- i.e. 
c >>= f = \k -> c (\a -> f a k) With tagging/untagging removed (also see Haskell comments above) and translated to OCaml this is: type ('a,'r) monadK = ('a -> 'r) -> 'r let retK a = fun k -> k a let bindK m f = fun k -> m (fun a -> f a k) The type is more clear now: the monad value is a CPS computation that takes a continuation ('a -> 'r) and produces a value. The continuation is a function that takes a value 'a and produces a result 'r. * retK creates a computation that takes a continuation k and passes it the value a * bindK takes a computation m and a function f and chains them together such that first m is performed, and its output is piped through f before continuing with k Similarly for the state monad[4]: newtype State s a = State { runState :: (s -> (a,s)) } instance Monad (State s) where return a = State $ \s -> (a,s) (State x) >>= f = State $ \s -> let (v,s') = x s in runState (f v) s' Converted to untagged OCaml type ('s, 'a) monadS = ('s -> ('a * 's)) let retS a = fun s -> (a,s) let bindS x f = fun s -> let (v,s') = x s in (f v) s' [1] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf [2] http://www.cs.rice.edu/~taha/publications/conference/pepm06.pdf [3] http://www.haskell.org/all_about_monads/html/contmonad.html [4] http://www.haskell.org/all_about_monads/html/statemonad.html Entry: Next Date: Sat Feb 20 09:19:03 CET 2010 - Fix Shared.hs such that the equality memoization operation is isolated (it's used twice). - Play with MetaOCaml code generation monad in raw (bind) notation. - Install MetaOCaml do-notation. - Think about DSP combinators Entry: Shared -> properly factorized memoization Date: Sat Feb 20 10:36:37 CET 2010 Essentially there are two operations: lookup and extend. The latter can probably reuse the former. Done Entry: Replaced check with mplus from Control.Monad Date: Sat Feb 20 12:27:58 CET 2010 I guess this is why the continuation monad documentation here[1] says that in haskell it's often not necessary to use contunations due to laziness. Short-circuiting the Maybe monad: *Main Control.Monad> Just [] `mplus` Just [0..] Just [] I'm still not used to not having to write a (lambda () ...) form to create an explicit thunk. It takes quite a while to move from a strict language to a lazy language; to think of _everything_ as a value completely free of effect. [1] http://www.haskell.org/all_about_monads/html/contmonad.html Entry: A project Date: Sat Feb 20 13:00:13 CET 2010 I need a project. Something to do to evaluate the two techniques currently under the loupe: - explicit staging in MetaOCaml - partial evaluation (using meta/haskell/*.hs) Entry: Embedding Faust in Haskell Date: Sat Feb 20 14:13:51 CET 2010 Let's look at Faust[1]. Faust's basic data type are data stream operators representing block diagrams. It supports the following operators (block diagrams) compositions (precedence): f ~ g recursive (4) f , g parallel (3) f : g sequential (2) f <: g split (1) f :> g merge (1) The combinators seem strange at first sight. It seems that wire-crossings are not so simple to express. The graphic representations in [3] are most illustrative. * The : operator is not the same as function composition: it operates on a stack of signals. (Faust is a stack language??) * The , combinator is like : with the stack replaced by a queue. * The <: and :> operators are fan-out and (summed) fan-in. * The ~ operator is the y combinator with a delay in the loop. * The _ and ! operators are identity and cut (sink). 
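To fix intuition about the operator list above before worrying about how to type bus widths, here is a deliberately naive toy encoding of these combinators. This is entirely my own strawman, not Faust's semantics: buses are plain Haskell lists of lazy sample streams, parB needs an explicit split point, and :> is simplified to a pairwise sum.

  -- toy block-diagram algebra; all names here are invented
  type Sig   = [Double]          -- a signal: lazy stream of samples
  type Block = [Sig] -> [Sig]    -- a block: bus in, bus out

  seqB :: Block -> Block -> Block          -- f : g
  seqB f g = g . f

  parB :: Int -> Block -> Block -> Block   -- f , g  (m = f's input width)
  parB m f g ins = f (take m ins) ++ g (drop m ins)

  splitB :: Block -> Block -> Block        -- f <: g  (duplicate f's outputs)
  splitB f g ins = g (out ++ out) where out = f ins

  mergeB :: Block -> Block -> Block        -- f :> g  (pairwise-summed fan-in)
  mergeB f g ins = g (zipWith (zipWith (+)) (take h out) (drop h out))
    where out = f ins
          h   = length out `div` 2

  recB :: Block -> Block -> Block          -- f ~ g  (single feedback wire)
  recB f g ins = out
    where out = f (fb : ins)
          fb  = 0 : head (g [head out])    -- unit delay in the loop

  idB, cutB :: Block                       -- _ and !
  idB ins = ins
  cutB _  = []

  -- example: leaky integrator  y = x + 0.9 * (delayed y)
  leaky :: Block
  leaky = recB (\(fb:x:_) -> [zipWith (+) x fb]) (\(y:_) -> [map (0.9 *) y])

  -- take 5 (head (leaky [repeat 1.0]))  ==>  [1.0,1.9,2.71,3.439,4.0951]

The bus-width bookkeeping that this sketch handles dynamically (and unsafely) is exactly the part the typing discussion below struggles with.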
To embed these combinators in the Ai.hs structure, one needs the concept of "delay", i.e. polynomials. I.e. a data flow network can contain outputs that are fed back as inputs. Then the Faust operators (the block diagram algebra defined in [3]) can probably be built on top of that by allowing a representation of "busses" and operations on them. Let's look closer at the algebra. The operators <: and :> seem ad-hoc. They could use some explicit fanout / fanin operations. Atm they seem to be fixed at duplicate / sum. (It would be more interesting if this were an equality constraint ;) I do like the stack and queue operators : and , but these could be augmented with some more general wire-shuffling routing combinators. The paper[3] mentions (2.9) the Stefanescu Algebra of Flownomials (AoF) to be related. AoF is an extension of Kleene's calculus of regular expressions. See pubs in [5]. * * * Apparently Claude is/was working on something similar[4]. Looks like he gave up after finding some type level tricks that lead to exponential compilation times. If I understand correctly, the problem seems to be that Faust's combinators can easily be embedded as operations on lists of streams (busses). However, to make it possible to determine bus size at compile time, some other machinery is necesary. This rings several faint bells related to stack languages and typing. I believe the Cat language's type system might be of use here. In Cat[6] the main trick seems to be that operations are polymorphic over the rest of the stack. [1] http://faust.grame.fr/ [2] http://www.grame.fr/Ressources/pub/faust-chapter.pdf [3] http://www.grame.fr/Ressources/pub/faust-soft-computing.pdf [4] http://claudiusmaximus.goto10.org/cm/2009-12-06_heist_busted.html [5] http://www.comp.nus.edu.sg/~gheorghe/papers/selPub.html [6] http://www.cat-language.com/Cat-TR-2008-001.pdf Entry: Operations on polynomials Date: Sat Feb 20 14:56:57 CET 2010 Intuitively it seems that SSA/ANF also has a nice effect on converting networks involving delays to unit delays. Let's make this precise. Once an ANF form is obtained from a graph expression, all Z operators can be replaced by state lookups. Essentially, whenever this form occurs: r1 <- (z r0) We can save r0 in the state memory to be used on the next stream tick, and recover the previously saved version in r1. For the implementation, since the delayed value will not be accessed any more after it is read, the mutation can be done in-place, or: r1 <- zr0 -- delayed version zr0 <- r0 -- save for next iteration Also see staapl[1] logs for this. That code will collect powers of z such that delay lines / offsets can be used. [1] entry://../staapl/20091011-095357 Entry: Type-level vector size specifications Date: Sat Feb 20 21:38:46 CET 2010 -- I tried the following, but it's not what I wanted: it can only -- construct size 2^N structures import Control.Applicative data Tensor a = a :* a deriving (Show, Eq) infixr 5 :* instance Functor Tensor where fmap f (x :* y) = (f x) :* (f y) instance Applicative Tensor where pure x = x :* x (fx :* fy) <*> ( x :* y) = (fx x) :* (fy y) instance (Eq a, Num a) => Num (Tensor a) where (+) = liftA2 (+) (*) = liftA2 (*) abs = fmap abs signum = fmap signum fromInteger i = let x = fromInteger i in x :* x -- For more info see [1] and [2]. -- [1] http://www.mail-archive.com/haskell@haskell.org/msg16605.html -- [2] http://www.eecs.usma.edu/webs/people/okasaki/icfp99.ps Entry: Scheme vs. Serious Static Types Date: Sun Feb 21 08:56:54 CET 2010 Types are hard. 
I'm not sure why. Maybe because some of the abstractions are simply alien to me? Maybe because programming with logic is different than programming with functions/objects/... From these two weeks of intense Haskell & OCaml study I think I can re-appreciate Scheme and macros. To some extent Scheme macros can give similar payoff as type systems: provide a means to encode high-level semantics that is interpreted at compile time. Currently I'm definitely more comfortable with having the same language at the meta level (i.e. Scheme's `syntax-case' macros). Maybe the truth is in the middle: Typed Scheme? Entry: Input / State / Output Date: Sun Feb 21 09:12:02 CET 2010 Combinators for DSP. Can the recursion operator be made explicit? I.e. in the spirit of 1-pass code generation, how to encode the delay operator? Let's think about the eventual C code to make this more concrete. Code eventually needs 3 dictionaries: input, state, output. In C this needs a concrete representation. It seems simplest to use structs for this, like: void process (struct ins *in, struct outs *out, struct states *state) { float r1 = in.bus0; float r2 = in.bus1; float r3 = r1 * r2; out.bus0 = r3; float r4 = state.z0; state.z0 = 0.9 * r3; ... } This interface is only for generated code. The (compile-time) high level objects should have a different composition mechanism though. However, this can be completely defined in Haskell. A lot is possible here; it can be kept modular. Todo: collect input, output, state variables during translation. Entry: Tagless interpreters Date: Sun Feb 21 09:33:36 CET 2010 I wonder if there's any benefit to using a tagless approach to representing the Ai.hs code. The main idea behind tagless interpreters is to replace algebraic data types with type classes; i.e. shifting one level. The benefit is that the object language types can be encoded as metalanguage types. For the dataflow language it is important to be able to distinguish different types. These are a minimum: bool, int, float|double. Let's have another look at [1][2]. I'm starting with the class definition in Fig. 3. I do not need higher order functions; only expressions. Let's worry about abstraction later. This means leave out lam & app and concentrate on a rep with two types (int & bool), primitive expressions and conditionals. class Symantics repr where int :: Int -> repr Int; bool :: Bool -> repr Bool; float :: Float -> repr Float; add :: repr Int -> repr Int -> repr Int mul :: repr Int -> repr Int -> repr Int leq :: repr Int -> repr Int -> repr Bool if_ :: repr Bool -> repr a -> repr a -> repr a So, what is `repr'? It's a type constructor taking one argument. Let's take an arbitrary one: []. instance Symantics [] where int = pure bool = pure add = liftA2 (+) mul = liftA2 (*) leq = liftA2 (<=) if_ = liftA3 if' if' :: Bool -> a -> a -> a if' b t e = if b then t else e This suggests that applicative functors can be used in general. Can we simplify and make Symantics inherit from the number typeclasses? I don't really see how to do that though. I tried several things but ended up with a lot of kind errors on one path and needing UndecidableInstances on another so I'm going to wait until I understand why. 
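Leaving the Num-inheritance question aside for a moment, the ``Symantics instance = interpreter'' idea can be made concrete with two instances for the class exactly as written above. This block is my own sketch (R and S are invented names): R evaluates, S pretty-prints, and the same term can be fed to either.

  newtype R a = R { unR :: a }        -- evaluator
  newtype S a = S { unS :: String }   -- pretty-printer

  instance Symantics R where
    int   = R
    bool  = R
    float = R
    add (R a) (R b) = R (a + b)
    mul (R a) (R b) = R (a * b)
    leq (R a) (R b) = R (a <= b)
    if_ (R c) t e   = if c then t else e

  instance Symantics S where
    int   n = S (show n)
    bool  b = S (show b)
    float f = S (show f)
    add (S a) (S b) = S ("(" ++ a ++ " + " ++ b ++ ")")
    mul (S a) (S b) = S ("(" ++ a ++ " * " ++ b ++ ")")
    leq (S a) (S b) = S ("(" ++ a ++ " <= " ++ b ++ ")")
    if_ (S c) (S t) (S e) = S ("if " ++ c ++ " then " ++ t ++ " else " ++ e)

  -- one term, two interpretations:
  term1 :: Symantics repr => repr Int
  term1 = if_ (leq (int 1) (int 2)) (add (int 3) (int 4)) (int 0)
  -- unR term1 == 7
  -- unS term1 == "if (1 <= 2) then (3 + 4) else 0"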
The simplest way to side-step the problem seems to be this: class Symantics repr where -- Lift constants into the domain int :: Int -> repr Int; bool :: Bool -> repr Bool; float :: Float -> repr Float; -- Primitive operations bop :: Num a => (a -> a -> a) -> repr a -> repr a -> repr a cop :: Ord a => (a -> a -> Bool) -> repr a -> repr a -> repr Bool if_ :: repr Bool -> repr a -> repr a -> repr a Here Symantics abstracts over the different functions and only requires that each interpreter (repr) has lifting functions. But that breaks what we inteded to create: it doesn't capture semantic differences between + and *. It looks like there is no way around somehow relating Symantics and the number typeclasses. This could be done as merely an interface, i.e. (+) from Num delegates to add from Symantics. Again: * Symantics class encodes object language + type. * Symantics instance encodes language interpreter. * It's possible to build the Ai.hs style interpretation on top of this by relating the numeric type classes to Symantics. Now try the latter in a concrete way, then try to generalize the layering or composition of different types. I wonder: why is it not possible to embed MetaOCaml in Haskell this way? In [3] line 163 it says "we never pattern-match on bytecode". So it seems that indeed this allows a lot of the features of MetaOCaml to carry over to Haskell, but probably not all. How are they different? The conversion to ByteCode in [3] instantiates to typed GADT. However, it seems it's useful to convert the (type-checked!) language constructs into an untyped form if the form is never inspected no type errors can be injected. Conclusion: it's possibly useful to write a typed layer _on top of_ the Term data type in Ai.hs; this layer would then ensure static type safety. [1] entry://../compsci/20090823-121637 [2] http://okmij.org/ftp/tagless-final/APLAS.pdf [3] http://okmij.org/ftp/tagless-final/Incope.hs Entry: Using tagless representations Date: Sun Feb 21 12:27:28 CET 2010 One problem though: the current implementation uses byte-code inspection for partial evaluation (i.e. the "bubbling" of constants). Maybe this can be solved by representing such values as an abstract domain. I need a rest to absorb all of this. I think I get the main idea, but the details of what happens at what level, and how different type systems can be used are quite complicated. Entry: The DSL for DSP Date: Sun Feb 21 22:21:06 CET 2010 Current ideas: - based on SSA/CFA dataflow networks & partial evaluation in Ai.hs - unit delays fit in this framework - build higher order combinators (ala Faust) on top of this, but prefer polynomials as base rep. What would be the next step? Build a tb-303 emulator that compiles down to dsPIC assembly. This should bring out some flukes in the general idea as it contains everything: * high level objects (filters with control input smoothing) * nonlinear distortions * compilation to concrete architecture (i.e. register allocation & peephole optimization) The last two are standard compilation techniques I'll probably not be able to avoid when making proper compilers. All the other optimizations can be built in closer to the language itself. ( How does Staapl fit into the picture? I guess that Staapl is the bottom-up approach: a beefed up macro assembler embedded in Scheme. Most of the tricks that come out of this experiment in Haskell/Ocaml can probably carry over to Scheme in some way. 
) Entry: Practical issues for DSP language Date: Wed Feb 24 08:19:56 CET 2010 Currently I can break down the primitive entities into: - unit delays First and foremost: how to translate a program with delays into a state space formulation. - variable delays with integer indexing Then, how to map domain values (i.e. floats) to delay offsets. This includes bounds checking. - conditionals: * piecewize functions * 'while' or 'for' loops Are general conditional statements allowed, or do we provide only ``local'' conditionals, i.e. to compute piecewize functions, and ``global'' conditionals, in the form of functional combinators (locally iterated state space systems). - subsampling & block processing How to incorporate combinators for subsampling, oversampling and block processing? Lifting/embedding operators are necessary to couple different iteration plans. - lookup tables Similar to block processing, variables delays: how to accomodate - buffering Remarks * Thinking about the nested processing approach, maybe its better to use an explicit (causal) stream model everywhere, and allow for translation to incremental form for the top level data stream. I.e. back to the previous approach (in Staapl): all variables are infinite streams by default, but can be optimized away. * Nested streams are only necessary for iterative algorithms that run multiple updates per real-world sample tick. * Lifting a sample-based stream to a block-based stream means creating a nested stream processor, as the sample based stream needs to perform its iteration for each real-world block tick. * The other way around: embedding a block processor in a sample-based stream processor requires buffering. * The main question seems to be: are delays represented by registers that are updated in-place (i.e. for state), or indexed offsets (when multiple input / output sample are available) or a combination of those that handle different border cases. Entry: Applicative functors Date: Wed Feb 24 22:04:00 CET 2010 So is a state space update function :: (s,i)->(s,o) an applicative functor? I wonder.. If this[1] (and the discussion about applicative functors vs. comonads) contains a solution to the problem of how to represent streams such that C dataflow loops can be easily extracted. In any case, it's probably essential to fully undersetand McBride & Paterson's applicative functor paper[2]. What about this: the effect is the threading of the state so it is hidden. The `pure' part is the conversion of input into output. The desired computation is, given a list of inputs (and a state), produce a list of outputs: s -> [i] -> [o] Now, what if the behaviour itself changes. i.e.: [(s,i)->(s,o)] -> [i] -> [o] Hmm.. What I'm looking for is really a curried state monad. The important part is not (s,i)->(s,o) but i->s->(s,o), where s->(s,o) is a standard (State o) monad. For example: > import Control.Monad.State > import Control.Applicative > f :: Int -> State Int Int > f i = do > s <- get > put (i + s) > return (s + 100) Now it's possible to map this function over a list of inputs to obtain a list of stateful computations, which can be chained together using sequence and instantiated using runState. > out = runState (sequence $ fmap f [1..10]) 0 I think I'm getting the hang of this. 
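Side note (my own, not in the original log): specialized to lists, this curried state monad is exactly Data.List.mapAccumL, which is also precisely the shape in which a state-space update gets run over a block of input samples. A one-pole smoother as a sketch:

  import Data.List (mapAccumL)

  -- a state-space update in the (s,i) -> (s,o) shape; the filter is just an example
  step :: Double -> Double -> (Double, Double)
  step s i = (s', o) where
    s' = 0.9 * s + 0.1 * i   -- next state
    o  = s'                  -- output

  -- threading the state over an input block:
  runBlock :: Double -> [Double] -> (Double, [Double])
  runBlock = mapAccumL step

  -- runBlock 0 [1,1,1,1]  ==>  (0.3439, [0.1,0.19,0.271,0.3439])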
[1] http://lambda-the-ultimate.org/node/988
[2] http://www.cs.nott.ac.uk/~ctm/IdiomLite.pdf

Entry: Simulink S-functions
Date: Thu Feb 25 11:32:09 CET 2010

Discrete functions are split into two callbacks:

- mdlOutputs(): compute next outputs (predict)
- mdlUpdate(): perform state update

From a control pov: use the predictor computed in the previous step to generate control outputs; this minimizes input -> output latency. Then spend the rest of the cycle updating the predictor using input measurements and generated outputs.

This rings a faint bell about symplectic forms: there is a ping/pong discrete update method that conserves the symplectic product between position and state (i.e. Kepler's law): even if the integration isn't exact, the conservation property is kept.

Anyways, the ping/pong update seems a good base model, as compared to the direct (unfactored) update. How to write this as a monad or AF? I.e. could this be something like:

  s -> i -> o
  s -> o -> s

Probably it needs to be

  s -> i -> o
  s -> i -> o -> s

Entry: Lucid and Streams in Haskell
Date: Fri Feb 26 10:07:29 CET 2010

Basic idea: what I want to do is:

- stream processing
- nested (truncated) stream processing
- add local function optimizations (i.e. lookup tables)

Maybe it's best to implement most of those in plain Haskell and see how they can be transformed to state-space models and iterative loops? I don't have enough intuition about what is already possible with the available tools.

From [2]:

  fby :: [a] -> [a] -> [a]
  x `fby` y = head x : y

Anyways. Given a rational function 1 / (1 + z), how can this be translated to a stream? Essentially, how do you represent a difference equation? That seems to be the problem to solve.

[1] http://www.haskell.org/haskellwiki/Lucid
[2] http://www.haskell.org/sitewiki/images/2/21/Lucid.lhs

Entry: Converting rational functions to difference equations
Date: Fri Feb 26 10:46:09 CET 2010

1. They need to be causal, which means they need to be in normal form: only positive powers of z.

2. We need an algebra of rational functions that supports normalization, multiplication, addition, and partial fraction expansion.

The trick seems to be to immediately introduce memoization when a rational function is converted into a stream. Look at the fibo memo here[1]. It seems to be quite tricky! What you want is to associate each node in the stream with a couple of delayed versions of itself to be able to compute the next entry; however, doing this in a way that doesn't keep a lot of garbage around doesn't seem to be trivial. I.e. if a stream refers back to itself, the recursion needs to stop at some point, so what you need is:

- the output/state stream
- a computation that refers back to previous versions of that stream

> y zy = let y' = 0.9 * zy in y' : y y'

*Main> take 5 $ y 1
[0.9,0.81,0.7290000000000001,0.6561000000000001,0.5904900000000002]

Can that be written in a style that doesn't use explicit state passing? Is that really necessary?

> y y1 = let y0 = 0.9 * y1 in y0 : y y0
> x x1 x2 = let x0 = 0.9 * x1 - 0.5 * x2 in x0 : x x0 x1

The idea is that in passing state down the tail, some of the memory values ``fall off the end''. Simply pushing an element to a list would work, but would retain a lot of state. Let's do it anyway; maybe it's possible to recover a kernel routine through abstract interpretation?

> z zs@(z1:_) = let z0 = 0.9 * z1 in z0 : z (z0 : zs)

Note this computes the forward stream, but passes the backward stream downward! I.e. it is a kind of zipper.
More general, with coefs: > series coefs izs = s izs where > s zs = let z0 = foldr (+) 0 $ zipWith (*) coefs zs > in z0 : s (z0 : zs) This makes it straightforward to translate a rational function to a series (currently ignoring nominator). > toList (RatFunc n (d0:ds)) = 1 : series coefs (1 : [0..]) where > coefs = map (\x -> (-1) * x / d0) ds Separating the iir combinator from series this becomes: > iir f izs = s izs where > s zs = let z0 = f zs in z0 : s (z0 : zs) > > series coefs = iir (fir coefs) The f taken by iir is has type :: [a] -> a. It filters the history of the output, i.e. the input array is the _reverse_ of the generated array. (Renamed iir to ar). Next: arma. Or better arx: AutoRegressive with eXogenous input. Arx seems to work as: > arx p = s where > s ys (x:xs) = > let y0 = (p ys) + x -- filter + add input > in y0 : s (y0:ys) xs -- build stream and pass reverse stream to s simply adding the input. Now ma is similar, but the naive implementation isn't what we want: > ma' p = s where > s xs = (p xs) : (s $ tail xs) This is an anticausal filter. We want a causal one. This is similar to the arx combinator, but it builds a history (reverse stream) of the input, not the output. > ma p = s where > s zs (x:xs) = > let zs' = x:zs > in (p zs') : s zs' xs This completes the combinators. Both ma and arx have the same type signature; except that ma isn't necessarily class Num because it doesn't use any arithmetic. Num a => ([a] -> a) -> [a] -> [a] -> [a] filter hist in out Alright. Code seems compliete: arx,ma,fir,applyRatFunc,ir Now the question is: how to translate this to time-varying filters? Instead of having a single predictor, this could be represented by a stream of predictors. Seems straightforward. Additionally, histories should be associated to the predictor, i.e. it consumes a history and an input value and produces a history and an output value. [1] http://www.haskell.org/haskellwiki/Memoization Entry: AwesomePrelude Date: Sat Feb 27 09:55:57 CET 2010 Getting rid of datatypes. From the tagless interpreter subject it was already clear that often data constructors of sum types can be replaced by class instances. However, then you loose pattern matching; which in the case of the tagless interpreter is exactly the point: no more run-time interpretation. Apparently it is also possible to fake pattern matching. See AwesomePrelude[1]. I assume it is quite related to the tagless approach by moving more information into the type signatures. [1] http://tom.lokhorst.eu/2010/02/awesomeprelude-presentation-video Entry: Next? Date: Sat Feb 27 11:54:56 CET 2010 Now what's next? Make RatFunc part of the num class? Entry: Functional analysis Date: Sat Feb 27 12:24:40 CET 2010 In the RatFunc.hs implementation, can we somehow use the fact that a history is necessarily finite (data, not codata like streams) since it is folded over? B.t.w is there a relation between data/codata and the Banach space concept of signal/filter (DSP-engineering speak)? I probably mean L^p spaces[2]. I.e. if I recall there are theorems about H1 and Hinf norms for Banach spaces that are used in control theory, and a symmetric version for H2 norm for Hilbert spaces. 
[1] http://en.wikipedia.org/wiki/Functional_analysis
[2] http://en.wikipedia.org/wiki/Lp_space
[3] http://donsbot.wordpress.com/2010/02/26/fusion-makes-functional-programming-fun/

Entry: RatFunc and Applicative
Date: Sat Feb 27 15:25:07 CET 2010

This separates the recursion pattern from the update equation, but doesn't really add much readability:

> causal p = s where
>     s hs (x:xs) =
>         let (hs',y) = p hs x
>             ys = s hs' xs
>         in y : ys

> arx' p = causal p'
>     where p' hs x = (hs',y)
>             where hs' = y:hs
>                   y = p hs + x

> ma' p = causal p'
>     where p' hs x = (hs',y)
>             where hs' = x:hs
>                   y = p hs'

These can be further normalized if the ma doesn't operate on the input sample directly, i.e. if 'now' is separate from 'past', and even more so if the coefficient of the direct input is 1. Does this signify anything? Only that an armax filter is quite regular: the input and output delay lines look similar. Is this useful? Probably not really.

Where am I going? This is all interesting but I'm losing integration.. Using rational functions gives a way to deal with signals in the frequency domain. But what I'm trying to accomplish is a time-domain representation with nonlinear operators (i.e. multiplication).

RatFunc.hs provides:

- Interface between RatFunc and stream processors. This is useful for simulation in Haskell.

Now, can this be used together with abstract interpretation to recover implementation structure? The key is probably in the ``history'' implementation. Frankly it seems a bit trivial: ratfunc _already_ gives a direct map to implementation as tapped delay lines, which can either be implemented as shift registers or as a sliding memory window. So, in the bigger picture, the question seems to be: how to implement delays. The answer to this is probably to keep them in a single big ring buffer.

Entry: Memoization and filter banks
Date: Sat Feb 27 16:35:30 CET 2010

For ordinary computations a lot can usually be gained by reusing intermediate results. The same is probably also true for IIR and FIR filters. I.e. when is it more useful to implement serial structures vs. parallel ones? This is about ladder and lattice filters.

So, it would probably be useful to think about all kinds of refactorings of rational functions, and especially the effect of transformations on the coefficients. I.e. what would be _really_ interesting is to do on-line filter design, i.e. adaptive filtering based on some optimization problem.

Ok, that's the essential insight: it's not about filters. It's about coefficients and how to update them. I.e. given a certain fixed filter topology, how do the coefficients influence a certain cost function. Kalman filters etc..

This brings us pretty close to one of Jacques Carette's papers[1][2]. Another interesting one is this [3].

[1] http://www.cas.mcmaster.ca/~carette/newtongen/
[2] http://www.cas.mcmaster.ca/~carette/newtongen/verif_gen.pdf
[3] http://www.cas.mcmaster.ca/~carette/publications/CaretteEtAl2008_AISC.pdf

Entry: Adaptive filtering
Date: Sat Feb 27 17:13:39 CET 2010

So. Let's experiment a bit. The basic structure I'm thinking of computes derivatives of filter coefficients to be used in the update of an on-line optimizer. The goal is to describe the filter structure at a very high level, and derive C code that implements a state-space update.

Entry: Next
Date: Sat Feb 27 18:05:56 CET 2010

- Classical sound synthesis engine. What's missing is a bunch of optimizations like piecewise linear functions and lookup tables.
However, the real problem is delay-management, and this should be trivial to do from a SSA/ANF form. Todo: convert z-parameterized ANF to practical stream transformations (haskell) and generation of in/state/out-parameterized C code. - Feedback control: convert an implicit model to an explicit state update function. It seems that the 2nd one can be built on top of the first one. Entry: Filters Date: Sat Feb 27 18:26:02 CET 2010 Now this has been a while.. It is of course possible to compute the ARX and MA from the same delay lines, by computing the MA after the ARX instead of before. So, again. Entry: MetaOCaml -> BER MetaOCaml Date: Tue Mar 2 13:20:34 CET 2010 From [1]: BER MetaOCaml is a conservative extension of OCaml with the primitive type of code values, and three basic multi-stage expression forms: Brackets, Escape, and Run. BER MetaOCaml implements the type system based on environment classifiers to type-check expressions that produce and run code values. BER MetaOCaml makes no other changes to the OCaml language, remaining fully compatible with the underlying OCaml system. BER MetaOCaml is current with the byte-code OCaml release 3.11.2. Looks like maturity is near! [1] http://groups.google.com/group/fa.caml/browse_thread/thread/098357ea26046912# Entry: Analyzing PowerPC programs Date: Wed Mar 10 09:59:06 CET 2010 Work by Tom Hawkins[1]. > Was this inspired by work at your current employer like with Atom > and some of the other stuff you've released? Yes, we had an immediate need to debug some machine code. I looked around, but all the emulators I found (PSIM, et al.) were too complicated. [1] http://groups.google.com/group/haskell-cafe/browse_thread/thread/29197b4f2345e3e8 Entry: Type checking is a stage Date: Wed Mar 10 10:12:42 CET 2010 Other things I need to check: * Relationship between staging and type checking[1]. * Relationshop between abstract interpretation and types[2]. * Ziggurat[3]: how to ``seal'' static semantics, i.e. give macros a static meaning without using ``specification by compiliation''. * Dave Herman's work on macros and binding[4]. * Occurence typing[5]: The typed Scheme type system. [1] http://lambda-the-ultimate.org/node/2575 [2] http://lambda-the-ultimate.org/node/2208 [3] http://lambda-the-ultimate.org/node/3179 [4] http://www.ccs.neu.edu/home/dherman/research/papers/esop08-hygiene.pdf [5] http://www.ccs.neu.edu/home/samth/dissertation.pdf Entry: Improving the Static Analysis of Embedded Programs via Partial Evaluation. Date: Wed Mar 10 10:25:03 CET 2010 David Herman and Philippe Meunier. International Conference on Functional Programming (ICFP), 2004. [bib, ps, pdf]. [1] The basic idea is to replace an interpreter for an embedded language with a compiler or partial compiler such that static invariants implicitly present in the embedded language structure are accessible at compile time. [1] http://www.ccs.neu.edu/home/dherman/research/papers/icfp04-dsl-analysis.pdf Entry: Closing the Stage Date: Wed Mar 10 13:11:45 CET 2010 Typed compilation[1] and tagless representations (as GADTs or modules/typeclasses). The idea is that type checking is a stage, i.e.: "To type-compile an untyped term to a higher-order abstract syntax, we need staging, or its emulation." I ran into this before in disguise. The step is to go from something ``dumb data'' to something with a binding structure and holes in it, i.e. code. The paper[2] essentially models MetaOCaml inside System F. So I wonder, does this mean it can also be embedded in Haskell? 
It also cites "SPIRAL, code generation for DSP platforms[3]" and "HUME, a domain specific language for real-time embedded systems[4]". [1] http://lambda-the-ultimate.org/node/2575 [2] http://okmij.org/ftp/Computation/staging/metafx.pdf [3] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.2985&rep=rep1&type=pdf [4] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.94.7250&rep=rep1&type=pdf Entry: Assembler snarfing Date: Thu Mar 11 16:55:03 CET 2010 # compiling C to assembler cd /tmp echo "int add(int a, int b) { return a + b; }" >test.c gcc -S test.c -O3 -o - # assembling to raw binary output: arm-linux-as test.S -o test.o arm-linux-objcopy -O binary test.o test.bin Entry: Logic simulation : the problem is state. Date: Sat Mar 13 11:12:37 CET 2010 I'm starting to think that the real problem is abstraction on the meta level. You need abstracted semantics and proofs (types) not just abstracted behaviour. The point of this post is to make that explicit somehow. Example: suppose F : (x,y) -> z implements addition by chaining together a bunch of NAND gates. What you want is a theorem about the high-level behaviour (addition) instead of the low-level behaviour (inverted conditional propagation of binary values). I start with a functional representation of state machines: we ignore state by only using networks. Later we introduce state by 'fixing' networks to given state. So, tattoo this on your forehead: (i,s) -> (o,s) A machine transition maps an input to an output while changing a state. So state itself is nothing special, we just have multi-in multi-out (MIMO) functions. Now, with that out of the way, the remaining problem is to take computations of the form s1 -> s2 that model state transition functions, and chain them together. More specifically, how to manage sub-states and derive static properties from compositions (clobbering, commutation, ...) The fundamental problem in composing MIMO functions is _adapting_ the inputs and outputs so they can be chained. I.e. lift the functions to an appropriate embedding in a larger space. To make the problem purely about connectivity, allow only a single primitive operation: the NAND (or NOR) gate. This will make representation language simpler at the expense of larger networks. The representation will be binary directed graphs. For NAND networks, the types are really just sets of nodes. Lifting types is nothing more than adding additional nodes to input and output (type). So what is necessary? * STAGING: Distinguish between functions (i.e. network macros, stage 1 entities) and instances (connections between nodes, stage 2 entities). * NAMING: Both nodes and template I/O are named. These are different entities. The latter are type names (stage 1) while the former are instance names (stage 2). Now the trick is: there are no instances! Cfr. Haskell: there is no state! Only transitions. I.e. an instance is a template for which some of the external (and internal!) nodes are named to turn a problem that analyses state (those nodes) into a problem that analyses state transitions (computational paths between nodes). Entry: Functions vs Macros Date: Sat Mar 13 15:23:12 CET 2010 So what is the great benefit of functional programming? Fan-out of functions. I.e. functions are values. A single function can be used multiple times. Why are macros and functions different? Macros get duplicated (instantiated) and loose their unique identity. I.e. they are functions used in a previous stage and are completely gone in the end stage. 
( This smells like objects: objects are functions connected to state such that the objects are the things with unique identities. ) So the deal is that you don't really want to ever expand a macro if you want to analyse its meaning. I believe this is the basic idea behind typed, staged programming: macros (code processing functions) keep their identity, i.e. they have a real meaning on the meta-level (are separately typable). I need to go back and have a look at OBJ[1]. It also deals with instantiation of templates. [1] http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=3538530B13621FE83E2532A38E7A8B47?doi=10.1.1.36.1030&rep=rep1&type=pdf Entry: Logic simulation and SSA form Date: Sat Mar 13 15:27:04 CET 2010 The problem I want to solve is to verify optimization rules of the PIC18 compiler. I.e. to prove (or lesser: test) the correctness of a transformation on object code. I'd like to use a functional (dataflow network) model because it makes manipulation of the object under test simpler, reducing the complexity of the verification code. So an essential part is the translation of imperative to functional code. The plan I have is to translate each PIC18 instruction into a MIMO logic function operating on bit vectors. To keep things simple I'd like to model instructions as functions between particular states of the machine. I.e. the "ADDLW k" instruction is (k0,...,k7) -> (W0,...,W7) -> (W0,...,W7,C,Z,DC) However, when composing these functions into a network, intermediate nodes need to be named. I.e. the input W0 is not the same as the output W0. This boils down to conversion of the imperative form to SSA form by associating bit register names to different node names. Practical: * I started working on a functional representation of networks using type classes, and a Nand representation. Conclusion: what is useful is a tagless representation as a typeclass Logic and operations defined on it. This can then be given multiple representations later by instantiating the classes. At this moment an explicit representation (i.e. the NAND networks) is not necessary. * A representation of dataflow networks is useful to _visualise_ the network. I.e. the ai/Shared.hs module can be used. * Problem to solve: construct SSA form from imperative program. Every assignment creates a new variable. Problem: how to represent and create variables? It seems that this kind of behaviour is easy to do with Scheme macros, using shadowing as in: (let-values (((x y z) (op1 ...))) (let-values (((x y z) (op2 ...))) ... )) This then leads to a binding structure for the composite operation that can be queried as a function, i.e. using abstract interpretation. How to do something like this in Haskell? Entry: What are macros? Date: Sat Mar 13 16:00:18 CET 2010 1. Untyped macros: Scheme deals well with hygienic modular macros. Advantage: generality. Disadvantages: 1. object code binding structure is not respected. 2. no static meaning. 2. Typed staged programming: not as general as a Scheme macro system, but does handle binding structure of object code explicitly. What I remember from [3] is that staging goes under lambda. The interpolation (escape) operation in staging is related to variable reference, but one needs to take care that the bindings in different stages do not clash. 3. Parameterized programming[1][2]. I read the first one a couple of years ago. Basic idea seems to be about equations and rewrites. 
[1] http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=3538530B13621FE83E2532A38E7A8B47?doi=10.1.1.36.1030&rep=rep1&type=pdf [2] http://maude.cs.uiuc.edu/ [3] http://okmij.org/ftp/Computation/staging/metafx.pdf Entry: Haskell and unicode Date: Sat Mar 13 20:40:34 CET 2010 To make the code a bit prettier and avoid Prelude name clases, I'd like to use proper math symbols for and, or, not, xor, ... Hmm.. this works fine for binary ops, but not unary ones. So let's just let it be. Entry: Manipulating binding structure in Haskell Date: Sat Mar 13 22:35:39 CET 2010 Problem: convert an imperative register transfer language to SSA form to construct a functional dependency network. Try to do this without low-level interpretation (tagless). Register allocation is SSA -> assignments (convert nodes to registers) We're interested in the reverse operation: assignments -> SSA So, this needs an environment that has a mapping of symbolic register names to their current instances. Whenever an assignment happens, the environment is updated with a new value and the old value becomes inaccessible. We're only interested in building functional representations. I.e. using a reader monad with (String -> a) mappings, where the a parameter is the representation of the emulator (i.e. true logic, multi-value logic, AST, ...) seems a straightforward excercise: addlw (k0,k1,...,k7) = ("W0","W1",...,"W7") -> (("W0","W1",...,"W7"),"C","DC","Z") where the strings are taked from the environment, added to ki, and placed back into the environment replacing the previous values. Can it also be encoded in the type system such that there is no dynamic String mapping? This seems a bit less straightforward. Yes it does seem possible to make get/set functions for each of the machine register bits, i.e. define the machine state as a state monad with accessor functions indexed by bit numbers. Conclusion: - Instead of manipulating bindings, just manipulate state. - The state of the machine is a monad. - The monad is parameterized by the representation type of the bits to allow analysis based on abstract evaluation. - The monad can probably be decomposed into a layering of several monads (try out monad transformers). - Programs (functional representation of imperative code) can be constructed in do notation. Entry: The Simulator Date: Sun Mar 14 10:26:07 CET 2010 3 architectual elements: - logic gates: Logic typeclass, parameterized over bit representation. - memory/register Mem monad, parameterized over bit representation. - opcodes written as Mem a -> Mem a' The move from a -> a' can represent the mutation. Mutation can be encoded in the type. I.e. it should be possible to maintain the whole memory history, but at the same time use a fast imperative implementation. So, how to encode mutation in the type? Essentially this needs a list (environment) in the type. Lists can be encoded as tuples. This is related to automatically "forwarding" monad transformers. Let's try first to understand the future stage environment representation in [1]. It seems to be useful for this problem, as we can get rid of the assignment by using nested-let (see the Scheme example earlier). The paper[1] refers to [2]: environment classifiers. It seems [1] is a bit too general; I don't need effects. However, [2] does use a multi-stage calculus. I'm confused... Let's stick to [1]. It doesn't seem that the approach in [1] is very useful, except for the environment passing idea. 
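To make the ``Mem monad'' plan from The Simulator entry above a bit more concrete, here is a minimal sketch. All names and representation choices here (association-list state, string register names, the addlw signature) are mine, not Staapl's: machine state as a state monad with register accessors, parameterized over the word representation r so the same instruction definition can be run concretely or interpreted abstractly.

  import Control.Monad.State

  type Machine r = State [(String, r)]   -- named registers, newest binding first

  getReg :: String -> Machine r r
  getReg name = gets (maybe (error ("unbound " ++ name)) id . lookup name)

  setReg :: String -> r -> Machine r ()
  setReg name v = modify ((name, v) :)   -- shadowing, i.e. SSA-style renaming

  -- "ADDLW k" written against the accessors, not against a fixed bit type
  addlw :: Num r => r -> Machine r ()
  addlw k = do
    w <- getReg "W"
    setReg "W" (w + k)

  -- concrete run:
  -- runState (addlw 3 >> getReg "W") [("W", 4 :: Int)]
  --   ==> (7, [("W",7),("W",4)])

Flag bits (C, DC, Z) would be further entries in the same environment, and an abstract interpretation is obtained by instantiating r with a term type instead of Int.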
[1] http://okmij.org/ftp/Computation/staging/metafx.pdf [2] http://www.cs.rice.edu/~taha/teaching/05S/511/papers/popl03.pdf Entry: Encoding `find' in types? Date: Sun Mar 14 11:36:26 CET 2010 I think the big question is really: should the environemnt representation be tagless or not? Is it ok to simply cons elements of a shared sum type onto a list and read them out later, or should the binding environment be in the type? Let's try the tagless approach. Suppose I have types R0, R1 that statically encode certain registers. Suppose I have a current enviroment (R1,(R0, ())). Suppose the type of the current context is R0. Is it possible to have a single `read' operation that fishes out the R0 from the whole type environment? I.e. can you encode `find' in types? Ok. I got up to this: class Ref e r where ref :: e -> r instance Ref (r,e') r where ref (x,y) = x instance Ref e' r' => Ref (r,e') r' where ref (x,y) = ref y However, the latter doesn't work as it needs the extra constraint r /= r'. How to add this constraint? Apperently[1] this type constraint should work: TypeEq t1 t2 False There's a package HList[2] that does most of the magic. However, I'd like to reconstruct it from the basics first. How to add that type constraint? Anyways, -XOverlappingInstances solves the problem, but I read somewhere that equality is not always well-defined. I don't understand. Let's stick to this solution until it becomes a problem. It seems that HList[2] avoids overlapping instances. [1] http://www.haskell.org/pipermail/haskell-cafe/2006-April/015372.html [2] http://homepages.cwi.nl/~ralf/HList/ Entry: Is this useful for testing optimization rules? Date: Sun Mar 14 13:36:01 CET 2010 So I guess most of the low-level representation techniques are clear now. But I'm still not sure how to represent optimization rules. The idea is that optimization rules themselves are really computations (reductions). They change representation but preserve semantics. Generally, how do you prove that two representations have the same semantics? How do you specify semantics? I feel like I'm working in reverse. Attempting to encode properties in the Haskell type system is surely an interesting excercise. However, I feel that what I actually want to accomplish is going to be clumsy to express. Maybe the answer is: what do we really loose trying to express this in Scheme where staging is a lot simpler? Entry: Dirty machine language Date: Sun Mar 14 13:53:49 CET 2010 It is about proving an embedding of 2 semantics. Let's start with an example. I want to prove this rule correct: (([qw a ] [qw b] -) ([qw (tv: a b -)])) (([addlw a] [qw b] -) ([addlw (tv: a b -)])) (([qw a] -) ([addlw (tv: a -1 *)])) This relates machine language elements (qw, addlw) source language elements (-) and meta language elements (-). A thorn in my side has always been the presence of real machine instructions. What you want is something like this: (([qw a ] [qw b] -) ([qw (tv: a b -)])) (([iw (a +)] [qw b] -) ([iw ((tv: a b -) +)])) (([qw a] -) ([iw ((tv: a -1 *) +])) I.e. where the addlw machine instruction is represented by a piece of source language code: map its semantics into the language that's being compiled. The problem is that the machine code is really messy, and more specific and stateful than the language to compile. I.e. flags are affected, which do not show up in the source language, so it might not be embedded directly. 
The problem needs to be solved in reverse: give a correct semantics to the machine language, and see if there is an embedding of the source language that preserves the original semantics constrained by extra clobbering. I.e. the main problem is that the machine language isn't clean. Otherwise mapping would be really trivial. Representing this dirtyness to verify that original semantics is preserved by adding extra constraints, like which registers are clobbered, is the real problem. What would help is to try to express this in Haskell, so at least some form of structure can fall out of the description when everything is made explicit. Entry: Compiler Date: Sun Mar 14 13:58:57 CET 2010 What about this: 1. Build the "specification by compiler" semantics as a result of: - Exactly specify the target machine semantics in terms of logic operations and registers as a tagless specification. - Provide multiple interpretations: compile, interpret, partially evaluate. - Map forth -> machine language directly. This mapping (specification by compiler) does _not_ perform any reductions by itself: all reductions are a consequence of the exact target semantics. 2. Relate this semantics to a simplified semantics of the forth in terms of an idealized machine. Then use testing or possibly proving to compare the two semantics. By only using the exact target semantics to perform optimizations, it should be possible to exactly identify each "fuzzy" translation. So the question is: is this useful? Partial evaluation of machine language? It seems that all the work is then transformed to the partial evaluation step: which reductions are possible. It probably becomes increasingly more difficult to specify correct transformations. I.e. in Forth it's easy to see that 1 + 2 + is the same as 1 2 + + but once the stack is not there explicitly (i.e. only as a pointer register), how do you express this condition? On the machine level there is a big difference between the footprint of spilling results to memory and performing the optimization at compile time, but on the source language semantics level we don't care. This needs a machine model with some abstract additions, such that additional rewrite rules can be proven to maintain a certain 'view' of the semantics. Squinting at machine code? Entry: Squinting at machine code Date: Sun Mar 14 15:21:52 CET 2010 Maybe I'm taking the wrong approach trying to capture all this in type systems.. The idea behind Staapl is bottom-up language construction. Machines are messy, and with a couple of hacks one can construct a thin layer on top that has a reasonable, composable semantics. Maybe the goal should be: help the assembly programmer? The whole problem is: there is no exact semantics. There are boundaries that are not checked, i.e. stack overflow. The question is then: how to build something out of this that you can trust? How to trust a bunch of rewrite rules from which a semantics emerges? How to make the implementation testable? I.e. given a direct, non-reducing map from stack operations to machine code. How do you prove that a reduction produces the same result? Can we do this statistically? The main conclusion is that: In the current view it is _not_ feasible to prove the transformation rules correct. They are simply too messy. However, by constructing a model of the machine (asm language + state) which can accomodate different abstract interpretations, and a simpler model of the language it is probably possible to make a decent set of tests. Entry: Haskell vs. 
Scheme Date: Sun Mar 14 21:47:37 CET 2010 One problem I haven't found a solution for in Haskell is the specification of pattern matching rules. In the Scheme implementation of Staapl there are rule templates that can be expanded into rule instances. Macros make this easy. (patterns-class (macro) ;;---------------------------------------- (word opcode) ;;---------------------------------------- ((1+ incf) (1- decf) (rot<>c rrcf) (rot<< rlncf) (rot>> rrncf) (swap-nibble swapf)) ;;---------------------------------------- (([movf f 0 0] word) ([opcode f 0 0])) ((word) ([opcode WREG 0 0]))) Above the identifiers `1+',`1-',... are defined by instantiating the two rules on the bottom, filling in `word' and `opcode' respectively. I'd say the problem with Haskell is that pattern matching is syntactic sugar, and syntactic abstraction is not easy (without resorting to template Haskell). So the question is, can pattern matching be solved in a different way as to restore compositionality? Maybe the answer is in expressing the redundancy differently? I.e. instead of having to resort to code that generates pattern matching rules for a data type, make the data type more general so a non-redundant set of rules can capture the ``for these do'' part. Entry: Feldspar Date: Tue Mar 16 11:53:13 CET 2010 Feldspar[1] looks like an interesting project to follow. Very close to things I've been trying lately. In [2] it is said they follow a ``deep embedding'' which means that the result of evaluation in the host language is a syntax tree instead of a computation. For Feldspar this is then compiled further, or interpreted. Fusion works by "Representing vectors by their index function". It takes the following form for the definition of map: map f (Indexed 1 ixf) = Indexed 1 (f . ixf) I.e. the function f is moved inside the loop. [2] : "... map ... results in a new vector ... with a different index function. The new index function is simply the composition of f and the previous index function ixf. Aside from fusion, Feldspar generates non-optimized imperative code that is postprocessed. They target TMS320C64xx. The paper also mentions SPIRAL and Silage as related work. The main conclusion I draw here is that it doesn't seem to be so difficult to map functional descriptions to imperative programs for well-structured programs. However, the devil is in the details: how to direct the compiler to generate efficient code. That part of compilation is really complex. [1] http://dsl4dsp.inf.elte.hu/ [2] http://dsl4dsp.inf.elte.hu/Feldspar-ODES8.pdf Entry: When is a stage a stage? Date: Tue Mar 16 14:27:03 CET 2010 I'm getting confused counting the number of stages/levels in Staapl. Essentially this is about the number of data -> code transition steps. But that seems arbitrary, as it is possible to encode staging in higher order functions. So where to draw the line? Another problem: (intuitively) complexity seems to be added mostly at the pattern matching stage. Maybe this is because that is the point where branching/alternatives are introduced? I.e. the combinatorial part of computation. Compared to computation by pattern matching, composition of higher order functions seems straightforward. Entry: Register allocation for image processing Date: Sat Mar 27 17:42:24 CET 2010 Another one of those non-obvious obvious things: register allocation can also be used to allocate larger buffers, i.e. for tile-based image processing. How does this relate to tiles? What does this have to do with the dimisishing returns of fusion; i.e. 
when the instruction cache fills up, or there is too much register spilling in an inner loop? ( I forgot that these memory access patterns are really complicated! They got me utterly confused before... Maybe the point in writing DSP metaprogramming code is to get _them_ under control in a formalism; i.e. to create an algebra of access patterns. ) Entry: Dataflow and loops Date: Sat Apr 3 15:43:25 CEST 2010 A hand-waving remark. As mentioned in the comments here[1], dataflow/reactive programming is hindered by the absence of loops. This is maybe the biggest roadblock I face in trying to write down some embedded language for DSP. However, for a synchronous dataflow language this doesn't matter; it might be solved by simple source transformations and combinators. * Almost all the interesting algoritms in numerical computing are iterative, computing a function by successive approximation, i.e. as the steady state of some iterative dynamical system; with or without state maintained from one evaluation to the next. * For time-based algorithms resampling and block-based functions are also a requirements. Both constraints add significant complexity to ordinary functional data-flow based representation. Any practically useful DSP language should solve those problems properly, probably using combinators that transform "basic" dataflow programs into iterated and or re-routed forms. [1] http://lambda-the-ultimate.org/node/2057 Entry: Embedding C in Haskell Date: Wed Apr 7 00:45:30 CEST 2010 I'm a bit too tired atm for the details, but this[1] seems interesting. Got to it from LtU[2]. See also [3] and my earlier comments[4]. How would one run such a system in practice? Is it possible to build a useful _real_ machine (i.e. a small microcontroller) without the notion of program? Maybe I'm at a good point here to think really hard about these issues. I want to build a language that lets me express the audiosynth algorithms I know to derive an implementation. I want to do this without imperative notions. Possibly in a total functional language. Looks like I must be careful to not re-invent continuation- and stream-based I/O[5]. [1] http://augustss.blogspot.com/2007/08/programming-in-c-ummm-haskell-heres.html [2] http://lambda-the-ultimate.org/node/3893#comment [3] http://conal.net/blog/posts/can-functional-programming-be-liberated-from-the-von-neumann-paradigm/ [4] entry://../compsci/20100115-080242 [5] http://research.microsoft.com/~simonpj/papers/history-of-haskell/ Entry: Total functional programming Date: Wed Apr 7 09:22:48 EDT 2010 Embedding of a total[1] language in ML[2]. Termination is guaranteed through a restricted form of recursion. [1] http://en.wikipedia.org/wiki/Total_functional_programming [2] http://lambda-the-ultimate.org/node/3893#comment-58317 Entry: Image processing Date: Sun Apr 11 10:19:40 EDT 2010 Maybe it's time to bite the bullet. I like to solve the following problems: * Create a video for melissa's performance on thursday * Perform the programming part of this in a decent programming language. Preferrably Haskell. * Make the DSP part "compilable" to a simpler language like C. Edit: this failed miserably; and I had to do it manually with command line tools, meanwhile pulling out my hair. There seems to be a significant interfacing complexity that I failed to appreciate. 
Entry: Classical Mechanics in Scheme and Abstract Interpretation
Date: Sat Apr 17 14:40:47 EDT 2010

I'm thinking of translating Structure and Interpretation of Classical
Mechanics[1] to Haskell as a tagless embedding: using type class
polymorphism to replace the Scheme generic functions.  I never really
made this distinction in previous attempts at symbolic computation in
Scheme: generic functions are exactly what is needed to represent
numeric/symbolic computations with the same mechanism.

SICM seems like a nice book on its own, but it's the right book at
the right moment for me.  Looks like I did learn something in these
last couple of months, judging by the way that abstract
interpretation viewed in terms of generic functions feels completely
natural now.  Apparently the detour through Haskell bondage was
essential to make things click.  Types do add some value, especially
for building intuition and understanding.

What shifted is the idea that computer algebra systems are "magic" to
the realization that they are essentially quite ad-hoc collections of
computations in the form of rewrite rules.  I.e. a generic "simplify"
method doesn't exist; who defines whether a formula is simple?
Symbolic computation on the other hand is very well defined in terms
of operations on algebraic data types.  Essentially there is little
difference between operations on primitive types such as numbers, and
composite types such as syntax trees representing algebraic
expressions.  Expressions are really not all that special; they just
allow a richer set of "values".

This is quite a milestone in a journey that started about 10 years
ago.  I guess Norvig is right[2].  Bridging numerical and symbolic
computation in that way (making the latter accessible in everyday
programming) is basically what I always wanted.  The funny thing is
that arriving at that point with appropriate abstractions in my head
_feels_ like an anti-climax, as if this understanding has always been
there and as if it isn't really so much of an accomplishment.

[1] http://mitpress.mit.edu/sicm/
[2] http://norvig.com/21-days.html

Entry: DSP language - combinators
Date: Sun Apr 18 09:53:48 EDT 2010

Maybe I should stop whining about the difficulty of implementing a
combinator language for DSP, and just go ahead and do it without
thinking too much.  Build another one to throw away.

Currently I see two major structural components that somehow need to
be expressed:

  * Iterative systems: functions defined as limit points of
    trajectories of dynamical systems.  Such dynamical systems are
    themselves the whole point of the language.

  * Generalized resampling, which "transposes" or "rotates" time and
    space information.  This is essentially a variant of the fixed
    point combinator and a "rotate" operation that can relate between
    time and space operations.

What I don't see yet: how to take the functional representations and
map them to an implementation with looping and buffering.  Maybe the
next step should just be to make these operations expressible in some
high level form such that it can be interpreted merely as functions,
and then perform abstract evaluation on those functions.

Let's resume an important trick from Feldspar[1].  The Indexed type
represents a vector or sequence as an index function :: Integer ->
value.  The `map' operator is then defined as a composition of this
function.  This can be understood as loop fusion.

  map f (Indexed 1 ixf) = Indexed 1 (f . ixf)

This is quite profound; it is so simple.

[1] entry://20100316-115313
[2] http://dsl4dsp.inf.elte.hu/
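To make the trick concrete, a minimal sketch in my own naming (not
Feldspar's actual API): a vector is a length plus an index function,
and map composes into that function, so chained maps never
materialize an intermediate vector.

  data Indexed a = Indexed Int (Int -> a)        -- length + index function

  mapI :: (a -> b) -> Indexed a -> Indexed b
  mapI f (Indexed n ixf) = Indexed n (f . ixf)   -- the fusion law

  toList :: Indexed a -> [a]
  toList (Indexed n ixf) = map ixf [0 .. n - 1]

  -- toList (mapI (*2) (mapI (+1) (Indexed 4 id)))  =>  [2,4,6,8]
  -- Both maps end up composed inside the single index function.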
Entry: Loop Fusion: Parametrization of Equivalence Classes
Date: Sun Apr 18 10:27:31 EDT 2010

What is the big idea?  You want to separate:

  1. declaration of functional dependencies
  2. iteration strategy

The formula

  map f (Indexed 1 ixf) = Indexed 1 (f . ixf)

can be interpreted in two ways.  As a rewrite rule (from left to
right) which picks a particular iteration strategy, or as an equation
that creates an ``equi-semantical'' subspace of terms.  Is it somehow
possible to construct a representation of this equi-semantical space
that is explicitly parameterized, or at least can be navigated
locally?

In the end, this is really about symmetry.  It should be possible to
somehow relate this to discrete groups.  Once that group is explicit,
optimizing the representation could be done according to some
well-defined measure, by defining a function that maps the group to a
(set of) benchmarks.

I.e. for the formula

  map f (map g v) = map (f . g) v

Essentially these are rotations of binary trees.

Maybe it's time to understand contemporary directed systems
(optimizers) first[1], and then move on to manual structure space
specifications.

Before that, let's try to understand why one would ever want to move
in the other direction.  Fusion eliminates intermediate data storage.
When can (some) intermediate storage be good?  Suppose that by using
fusion, we can replace buffer memory access with register memory
access.  This works well as long as we have enough registers.  Once
we need to start spilling registers to memory we might lose inner
loop speed.  Let's call this "spill pressure" and find a concrete
example of it actually occurring.

Anyways..  This subject is quite complicated as there are so many
specific rewrite rules based on heuristics.  The more I see this, the
more I want to put "knobs" on it.  Make a user interface to program
transformation.  Navigate the equi-semantic manifold.

[1] http://www.cse.unsw.edu.au/~dons/papers/CLS07.html

Entry: Simulating Mechanical Systems in Haskell
Date: Mon Apr 19 20:14:47 EDT 2010

Let's go through the whole exercise:

  U(x,y) = x^2 + y^2 + a / (1 + (x-1)^2 + y^2)

  - Compute derivative using autodiff
  - Compute Lagrangian + Hamiltonian equations
  - Simulate the equations using the leapfrog method
  - Draw them as animation from within Haskell
  - Generate a C program (i.e. OpenGL) that performs the simulation.

At the end of this (after realizing it is all a bit harder than I
thought) I should at least be a bit wiser.

Entry: Generating an audio synth
Date: Wed Apr 21 12:17:27 EDT 2010

Need a fast path to something that works with jack.  Since one of the
main points of writing a metaprogramming library is to arbitrarily
move nodes between compile and run time, we need both a Haskell-based
jack client and a VM that loads compiled C plugins.

# First idea is to use the `jack' package in hackage, but that fails
# to compile:

tom@del:~$ cabal install jack
Resolving dependencies...
[1 of 1] Compiling Main  ( /tmp/jack-0.57871/jack-0.5/Setup.hs, /tmp/jack-0.57871/jack-0.5/dist/setup/Main.o )

/tmp/jack-0.57871/jack-0.5/Setup.hs:2:0:
    Warning: In the use of `defaultUserHooks'
             (imported from Distribution.Simple):
             Deprecated: "Use simpleUserHooks or autoconfUserHooks,
             unless you need Cabal-1.2 compatibility in which case
             you must stick with defaultUserHooks"
Linking /tmp/jack-0.57871/jack-0.5/dist/setup/setup ...
Searching for jack/jack.h...found Searching for st.h...setup: user error (ERROR: st.h not found) cabal: Error: some packages failed to install: jack-0.5 failed during the configure step. The exception was: ExitFailure 1 # Let's try to see what's going on here. Unpack it: tom@del:/tmp$ tar vxf ~/.cabal/packages/hackage.haskell.org/jack/0.5/jack-0.5.tar.gz jack-0.5/ jack-0.5/src/ jack-0.5/src/Sound/ jack-0.5/src/Sound/JACK/ jack-0.5/src/Sound/JACK/FFI.hs jack-0.5/src/Sound/JACK.hs jack-0.5/LICENSE jack-0.5/README jack-0.5/Setup.hs jack-0.5/jack.cabal jack-0.5/INSTALL jack-0.5/examples/ jack-0.5/examples/Amplify.hs # In Setup.hs the header dependencies are specified as: neededHeaders = ["jack/jack.h", "st.h"] On the website[1] it is mentioned that: There is a version in Hackage, but the darcs repo is preferable at the moment. It also mentions: "Note that realtime audio processing done in Haskell is not really feasable. At least I didn't get it to work properly." Ok, abandoning this approach. Next? Maybe use Pd as a basic substrate, and use run-time plugin loading there. [1] http://open-projects.net/~shahn/index.cgi?seite=code Entry: Re-loadable C modules Date: Wed Apr 21 13:22:49 EDT 2010 Problem 1: make it possible to re-load object code in the background. It needs to be background for real-time issues. Also, it might be better to run in a different address space to limit the effect of crashes. So, what is the simplest way to perform a call into a different process? How does Jack handle this? Maybe using plain jack for this is good enough. I don't really need events that badly; streams should be enough. Btw. does jack allow for events? Ok: start with jack and reloads by reloading application, then later, add reloading mechanism into application. Entry: Jack simple-client.c Date: Fri Apr 23 10:34:27 EDT 2010 So, let's get started. Get the example[1] from jack SVN, and figure out how to plug in a pure state transition function. Ok, next: generate a trivial C program using the Haskell code from last couple of months. See following posts. [1] http://trac.jackaudio.org/browser/trunk/jack/example-clients/simple_client.c Entry: Haskell in Emacs: getting rid of :cd ~/.cabal on C-x C-l Date: Fri Apr 23 11:34:26 EDT 2010 In `inferior-haskell-load-file': annoying for debugging local modules: the Hasell mode in Emacs apparently switches the current directory to ~/.cabal before loading the module. How to avoid? If I recall, it somehow tries to guess the proper directory, and if it's not specified, it will set it to ".cabal". See inf-haskell.el: It uses `default-directory' which is a variable that automatically becomes buffer-local when set. The problem seems to be here: (inferior-haskell-find-project-root (get-buffer "Function.hs")) => "~/.cabal/" Why is that? (inferior-haskell-cabal-of-buf (get-buffer "Function.hs")) => # Hmm.. Killing that buffer then messes up some state. Killing+loading the .hs buffer fixes that (some local variable is bound to the buffer). Digging deeper, the intelligence seems to be in `haskell-cabal-find-file'. Solution seems to be here[1]: (require 'inf-haskell) ;; Do not change directory to ~/.cabal on load (setq inferior-haskell-find-project-root nil) [1] http://sites.google.com/site/haskell/notes/emacs Entry: Term.hs and Function.hs -> generate C code Date: Fri Apr 23 12:06:31 EDT 2010 Problem: given a pure function on a number typeclass (one that can be tested in Haskell), compile it to C using the Term and Function abstractions. Forget about reductions for now. 
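Before diving into Term -> Function, here is a self-contained toy
version of the underlying trick (hypothetical names, not the actual
Term.hs / Ai.hs definitions): a Num instance over a small syntax tree
lets one and the same polymorphic function either compute with Floats
or leave behind a term for the code generator.

  data T = V String | C Float | Op String [T] deriving (Eq, Show)

  instance Num T where
    a + b         = Op "add" [a, b]
    a * b         = Op "mul" [a, b]
    a - b         = Op "add" [a, Op "negate" [b]]
    negate a      = Op "negate" [a]
    abs a         = Op "abs" [a]
    signum a      = Op "signum" [a]
    fromInteger n = C (fromInteger n)

  -- One definition, two interpretations:
  f :: Num a => a -> a -> a
  f a b = (a + b) * a

  -- f (2 :: Float) 3    =>  10.0
  -- f (V "a") (V "b")   =>  Op "mul" [Op "add" [V "a",V "b"],V "a"]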
How to get from Term -> Function?

Problem: Term and Function are just representations.  The type class
mapping is done in Ai.

  *Main> let x = (var "b") + (var "a") in (x * x)
  (mul (add b a) (add b a))

The default show of Term is a parenthesized tree representation.
Compiling to Function introduces memoization.  Its default rep is SSA
assembly form.

  test1 = compile [realPart c, imagPart c] where
    c = a * b
    a = var "ar" :+ var "ai"
    b = var "br" :+ var "bi"

  *Main> :t test1
  test1 :: Function
  *Main> test1
  in:
  r0 <- mul ar br
  r1 <- mul ai bi
  r2 <- negate r1
  r3 <- add r0 r2
  r4 <- mul ar bi
  r5 <- mul ai br
  r6 <- add r4 r5
  out: r3 r6

The Function type can be converted to C syntax using the `printC'
function.

  *Main> printC test1
  float fun() {
      float r0 = mul(ar, br);
      float r1 = mul(ai, bi);
      float r2 = negate(r1);
      float r3 = add(r0, r2);
      float r4 = mul(ar, bi);
      float r5 = mul(ai, br);
      float r6 = add(r4, r5);
      return(r3, r6);
  }

Or more directly:

  *Main> printC $ compile [var "a" + var "b"]
  float fun() {
      float r0 = add(a, b);
      return(r0);
  }

Some practical problems that need to be solved:

  - C infix syntax.  Either in Haskell, or using C macros.
  - Convert function calls to loops for Jack block processing.

Infix syntax seems to be simplest to remove using C macros in the
current ad-hoc Function printing.

Entry: Z -> loop
Date: Sat Apr 24 13:16:03 EDT 2010

Last development: some algebraic manipulation of filters or state
space models.

  *Main> printC $ compile [(z (z $ var "a")) + (z $ var "b")]
  float fun() {
      float r0 = z(a);
      float r1 = z(r0);
      float r2 = z(b);
      float r3 = add(r1, r2);
      return(r3);
  }

The interesting part is that the `z' operator doesn't need special
treatment and will survive the SSA conversion.  After that it can be
given a special meaning as described in [1].

Todo: convert a MIMO function with unit delays into a C loop.

  -- Basic idea:
  -- * wrap the code in a "for (i=0; i<N; i++) { ... }" loop
  -- * ... -> C printer.

  data Let      = Let String String [String] deriving Show
  data Function = Function [String] [Let] [String]

Hmm.. another case of leaky abstractions..  What is necessary is a
well-defined translation from expression to dataflow form.

  1. Given a (Haskell) function that maps a bunch of inputs (either
     curried or as a tuple) to a bunch of outputs.

  2. Given an abstract description of I/O ports.

  3. Construct a C program that binds the computation to the ports.

This is a "LIFT" operation.  Find a good representation that can
guide the rest of the implementation.

Entry: Unified Embedded^2 DSL
Date: Fri Apr 30 20:13:41 EDT 2010

Maybe it's time for unifying embedded Haskell DSLs for embedded C
programming?  There are at least a couple, Atom[1] being the most
prominent atm, but also the DSP language Feldspar[2], Hume[3], which
is not an EDSL but an interpreter written in Haskell, and the work on
tagless interpreters[4].

[1] http://hackage.haskell.org/package/atom
[2] http://dsl4dsp.inf.elte.hu/
[3] http://www-fp.cs.st-andrews.ac.uk/hume/downloads/
[4] http://okmij.org/ftp/tagless-final/course/

Entry: Abstract Interpretation & Polymorphism
Date: Sat May 1 15:18:02 EDT 2010

Why is this such a great pair?  Probably about the "interpretation"
part: polymorphism is actually only about context-dependent
interpretations of the same form.

Maybe the slogan should be: type polymorphism is a perfect "vat" for
containing both direct computations and staged ones.  The staging is
really the big idea; polymorphism makes it easier to express, as
values can be related formally to programs using structure-preserving
maps.
I still wonder why this feels trivial and revolutionary at the same
time.

Entry: Compositional dataflow notation
Date: Sun May 2 12:33:37 EDT 2010

The main problem sketched in the previous posts is the representation
of code right before it goes into the C code generator.  More
specifically:

  * low-level machines have no "return" statement, only assignment to
    shared/global resources (registers + memory).

  * embedding a DSL in Haskell goes most smoothly if lambda
    abstractions and tuples are used for input and output
    respectively.

An impedance match is necessary here.  The main problem in my current
approach is that the `Function' datastructure does not reflect this.
In fact the current "result = op arg1 arg2" syntax for the SSA form
hides this pattern.

So, what is the real problem?  Maybe the intermediate SSA form
construction needs to be built on top of a dataflow (multi in - multi
out) notation, such that at least dataflow networks can be composed:
i.e. composite networks can be added as primitives to a dataflow
expression.

Main principle: caller allocates storage space (provides variables).
When translating expression form (with or without let-sharing) to
explicit dataflow, nodes are created.

Question: how to combine data-flow composition (instantiation of code
over nodes) with expr->SSA conversion + common subexpression
elimination?

What does `compileNodes' actually do?

  *Term> :t compileNodes
  compileNodes :: (Eq a) => [Term a] -> ([Term a], [(Term a1, Term a)])

It names intermediate nodes in a Term datastructure, reusing sharing
based on the `==' relation on the underlying type.

  *Ai> compileNodes [(1 + var "a") * 2]
  ([r1],[(r0,(add 1 a)),(r1,(mul r0 2))])

It looks like this is the main place to start inserting external
nodes.  I.e. instead of having compileNodes return a list with
generated nodes that serve as a result (above this is [r1]) it might
be better to add binding there, or add some level of indirection
which decouples variable names and nodes.

Entry: Pure Data & Abstract Interpretation
Date: Sun May 2 14:49:45 EDT 2010

It might be interesting to construct a subset of functions that have
an equivalent in Pd, at least for the non-feedback stuff.

Entry: DAG Representations
Date: Sun May 2 15:13:42 EDT 2010

                     flatten          schedule
    MEMO-EXPR (dag) ---------> SSA <---------- DFN (dag)

1. Starting from expressions and variables.

The memo-expr is the basic form of functional code: value reuse
through function abstraction or let-binding.  Adding variables allows
one to move from trees to directed acyclic graphs (value reuse).  SSA
is a MEMO-EXPR without expression nesting, i.e. all bindings are
primitive operations.

2. Starting from dataflow notation (input/output variables).

A dataflow network consists of:

  - a set of nodes
  - a set of input-output relations between nodes
  - a constraint: each node is the output node of exactly one relation

Note that the difference with expressions is that a DFN has
_explicit_ output node naming, while for expressions this is _mostly_
implicit, except for memoized expressions, and of course for SSA,
which can be interpreted as a DFN.

A memo-expression has _explicit_ dependency order (lexical scope),
but implicit data allocation.

                        MEX        DFN
      dependency order  explicit   implicit
      output alloc      implicit   explicit

An SSA expression is _both_ a MEX and a DFN.

Entry: SSA IN/OUT parameterization
Date: Sun May 2 15:47:20 EDT 2010

In the light of the previous post, what is the problem?
All code I'm using depends on SSA form, but what I want is an abstracted SSA/DSN form that allows me to bind network outputs to externally allocated nodes. So this essentially needs a functional representation. An SSA structure is a function that has sharing and internal node allocation figured out (i.e. more abstractly it can be parameterized by an internal node allocator), but can be patched into a larger node network. So, SSA is form is NodeAlloc -> Inputs -> Outputs -> Network. Actually, it's simple: nodes need to be parameterized, the rest of the behaviour can be fixed. Roadmap: - abstract node-alloc - parameterize output assign (default = node alloc) So, in Term.hs, `nameNodes' is where the intermediate nodes are named. This is mostly a wrapper around `nameShared' from Shared.hs while shared node construction is performed separately using the `extendShared' function. Approach: stick to numeric node tags, but replace the [0..] sequence by a parameter `regnums'. OK. Next to open up: Function. data Let = Let String String [String] deriving Show data Function = Function [String] [Let] [String] This is already fully serialized, and all node names are Strings. It might be simpler to keep this representation, and focus on performing substitutions on the symbol names. *Main> test1 in: dummy r0 <- mul ar br r1 <- mul ai bi r2 <- negate r1 r3 <- add r0 r2 r4 <- mul ar bi r5 <- mul ai br r6 <- add r4 r5 out: r3 r6 Maybe a guiding remark: do I want to keep an SSA-converted network around to re-use, or is it simpler to recompile from the memo-term form and use only Haskell function composition? What purpose does `Function' serve? To decouple the Term form from the C/ASM code generation? I'm making this too difficult. Let's start over. Maybe I need to pick one and stick with it? Compose only in Haskell, or also in the lower-level code? This might be important for re-using function composition in the low-level language instead of using only macro expansion. Entry: DAG Interface Date: Sun May 2 17:35:23 EDT 2010 This is about _interface_. It's the same DAG, but we want to use it in different ways: 1. Function: give it some inputs, let it produce output. 2. Structural: connect functionality to storage nodes (i.e. for the implementation of combinators). Basicly, I want to be able to ping-pong between those. The function view is needed to integrate in a general Haskell functional world, while the structural view is needed to express dataflow combinators and integration into an imperative world with explicit data allocation. SSA conversion and memoization can probably be completely delegated to the _implementation_ side. The important part is the abstract part: what do these 2 interfaces mean? Looks like I'm at the point of reinventing the Arrow[1] abstraction, where 1. is the function type constructor (->) and 2. is an arrow type constructor. [1] http://en.wikipedia.org/wiki/Arrow_(computer_science) Entry: More DAG interface Date: Tue May 4 10:32:20 EDT 2010 More milling of the previous post. Let's add some heuristics to that. The functional view is easiest to use, meaning, it _reads_ easier as it is based on _lexical scope_. The graph view is more interesting for program manipulation, and for graphical depiction. Question: how to convert function view to graph view? I.e. given a function: f (x,y) = (x/l, y/l) where l = sqrt (x*x + y*y) How do we bind it to some combinator that has explicit node references? Could the graph view be just an interpretation of the code? 
Essentially, what we want is to convert from function to Arrow
representation (with explicit outputs) and back.  Efficiency is not
an issue on the Haskell side, but the eventual C code generation
should eliminate all abstraction overhead.

So, let's find a way to compose two copies of f as an Arrow
representation.

So, Arrow is an interface.  What are we going to use it for?
Essentially, we want to make something that has its _composition
exposed_, i.e. a data structure.  Simply composing functions hides
the internals; we want to look inside.  This seems to be the essence:
whether to expose composition, or make it implicit.  I.e. it should
be possible to "flatten" a network into a function.

Representing networks: let's take the simplest approach: a collection
of functions, and connectivity information.

So, what is a DFN?  Let's use the simplest, concrete data
representation and worry about functional abstraction later.

  -- a set of nodes.
  -- a set of processor instances.  an instance is a relation from
     nodes to functionality (a pure function).

Since this represents a graph, some form of de-cycling is necessary.
Note that the same is necessary for flat files!  In essence: we are
building a data structure that describes a network as a flat
structure.  There are two options: a zipper, or a dictionary.  The
former might be interesting for traversal, but not for external
representation.  The latter seems the best no-surprise option.

Let's stick with the Pure Data patch format, ignoring the graphical
information.  The basic elements are:

  object [ ... ]
  connect

The identifiers are indices into sequences of objects, inputs and
outputs.  This is an imperative description; i.e. a network is
represented by how it is incrementally constructed.

So, the main question: how to evaluate this imperative description?
Can it be done incrementally?  Does it make sense to represent it in
a different way that can be easily serialized to/from this imperative
description?

I'm getting side-tracked.  The Pd format is not what I want, it's too
low-level.  What I want is: black boxes with inputs and outputs that
can be composed.  So, the elements again:

  - functions, probably :: [Float] -> [Float] as it will become
    difficult to hard-code the number of ins/outs generally

  - nodes, connected to a single generator, with multiple consumers.
    this is the source of non-directedness (2 ways of traveling the
    chain).

What seems to be clear is that the only simple representation is to
stick with functions, or any kind of textual representation, and find
a way to represent fanout.  Inputs you can just "pass in" anyway in
the functional representation.

What about this: convert a structure which has outputs represented
lexically, to one that doesn't (i.e. behaves as a function).
Essentially: ``introduce (single) assignment''.  Maybe that's the
real problem all along.  Some kind of functional representation of
assignment.

Entry: Representing (single) assignment
Date: Tue May 4 11:47:53 EDT 2010

See previous posts for prelude.  These ideas keep coming back:

  - Lexical scope is good (stick to lambda/let for inputs)
  - In addition, explicit outputs are necessary.

It seems I'm looking for some kind of abstraction of assignment,
i.e. the "output" operator.  The problem that comes up then is: what
is the return type of a function that performs bindings to outputs?

  f (in1, in2) out = out <-- in1 + in2

Ha!  It's the ``other side'' of the channel!  So the main abstraction
I'm looking for is a channel with one input and any number of
outputs.
Funny, this relates directly to the DFL / reactive work in Scheme[1]. The type of `f' above is ... -> [Channel], or even a tuple. This would then also bridge with the Arrow abstraction. The problem is: how do you tie the two ends together? How do you implement "connect". The flow of information is like: A "patcher" creates a node which has two ends, passes the input end as an output to a processor, and the output end as an input to any number of processors. The question is then: what is a channel? How to represent it? Modeling a channel as a mutable object seems to make this simple. Can it be modeled as a function? The important element is sharing. Let's write `liftA2' for binary operators, or maybe Functor is enough and simply use `mapf' for (x,x) -> x. It seems that this needs a push or pull approach, i.e. we need to "pick a side" of how to evaluate a network that's constructed by a patcher. Pull seems safest and closest to how we represent channels (model a network as its eventual output), so a channel is something that converts a "network context" into a number, triggering evaluation whenever a certain value isn't present yet. This is a State monad. It's not a Reader monad since the context (variable bindings) can and does change after an evaluation. What we want to abstract is the memoization. Full circle again: this is exactly the representation used in the Staapl/Scheme approach to the dataflow problem. So, let's restate: a channel is a State monad c->(v,c) where c is the evaluation context. [1] entry://../plt/20100327-160826 Entry: The essence of dataflow Date: Tue May 4 12:44:19 EDT 2010 (Read: the essence of dataflow for synchronous/clocked systems -- the essencec of "connect"). Abstractly: Lifting FUNCTIONS over CHANNELS, the result is a memoization structure -- output re-use -- a collection of nested `let' expressions -- SSA form -- ... The conclusion is that connections _implicitly_ introduce memoized structure. Making this explicit is what it is all about. The memoization structure seems then to be best represented by a State monad. Entry: Implementing DFN as a State monad. Date: Tue May 4 12:45:05 EDT 2010 1. Lifting functions to channels. Objective: Channels implement the Num class. This can be done in two steps: implement Functor and Applicative, then use `mapf' and `liftA2'. Note that a Channel is a MonadState, so `mapf' and `liftA2' are already defined. 2. Implement "connect". The two sides of a channel: - Construction (through the lifting operations) creates a computation that takes a context and produces a value and a context. - Consumption queries the context, and either reuse a value if it is already computed, or triggers computation. The whole representation is still functional (outputs are "created" by the invocation of a function). To connect this view to explicit output descriptions, a "sink" type is probably necessary. Making the output explicit requires an "arrow lift". (a -> b) -> (DFN in out) So, let's separate the two ideas: 1. Abstraction of memoization as a state monad. 2. The arrow lift: making outputs explicit such that "connect" becomes possible. The latter probably requires a level of indirection to connect intermediate nodes (a consequence of SSA form) to output nodes. It seems that this small sacrifice (the presence of explicit "sink" nodes like indexed arrays as lvalues in C) yields a structure that can be completely embedded in a lexical language. It seems that the lesson to learn is that a DFN is not symmetric wrt. input/output. 
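A hedged sketch of the channel-as-State-monad idea above (my own
minimal version, not the Staapl/Scheme code): reading a channel
either reuses a value already present in the evaluation context, or
runs the producer once and records the result.

  import qualified Data.Map as M
  import Control.Monad.State

  type Ctx     = M.Map String Double
  type Channel = State Ctx Double

  node :: String -> Channel -> Channel
  node name produce = do
    ctx <- get
    case M.lookup name ctx of
      Just v  -> return v             -- already computed: reuse
      Nothing -> do
        v <- produce                  -- first consumer triggers computation
        modify (M.insert name v)      -- and memoizes it in the context
        return v

  -- 'l' is consumed twice below, but its producer runs only once.
  demo :: Channel
  demo = do
    let l = node "l" (return 5.0)
    x <- l
    y <- l
    return (x + y)

  -- evalState demo M.empty  =>  10.0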
Turning the arrows around still means it is a directed acyclic graph, but there is a little bit more structure to capture. Lexical structure (one binding, multiple uses) neatly reflects this structure. Building the memostructure as a state monad seems straightfoward. I have examples. Just need to get used to Haskellisms. The arrow lifting seems more involved. I'm missing the "two ends of the arrow" thing. Write down "connect" first. connect o i = let o1 = mapf f i1 o2 = mapf f i2 in (connect o1 i2) What does "connect" do? It "reduces" a network, i.e. eliminates one _input_ while leaving all outputs available. So, start from this: * "connect" eliminates an input from a network. * outputs need to be "exported" explicitly, since each intermediate output node could be made public. Entry: Comment dumps Date: Tue May 4 13:10:24 EDT 2010 Basic idea: dealing with Data Flow Networks, one usually wants two different kinds of representation: * FUNCTION: tuples -> tuples * NETWORK: collection of primitives and compositions, i.e. nodes and connections The main operation for this is the conversion: NETWORK -> FUNCTION. The other way is only possible trivially, i.e. abstracting a single primitive. Function representation fits best for native embedding in Haskell, and for abstract interpretation techniques it is probably also best to use functional representation over specialized data types. However, at some point an explicit representation of networks will be necessary for both input (importing Pd patches) and output (generating C code that uses C function calls instead of fully inlined code). The main differences between function composition and a DFN are that a DFN has: * Explicit intermediate nodes * Implicit order of computations ------------ After thinking about this for a while, it seems that the needed abstraction is a `channel'. Something that can be bound _once_ to a value, and read from as an input. I.e. it behaves as a number box. Lifting pure function to channel functions seems straightforward: channels are a Functor. A dataflow representation of a function is just the Function "opened up" with output nodes made explicit. Entry: Connect consumes inputs, while outputs can observe internal nodes Date: Tue May 4 14:47:36 EDT 2010 DFN operations: composition: A dataflow network is a _set_ of processors, where each processor has a set of inputs and a set of outputs. Therefore, a dataflow network can be _abstracted_ as a processor. reduction: Binding an input (the "connect operator") removes the visibility of an internal processing node. output abstraction (internal node hiding): Since fanout is possible, all intermediate nodes could be outputs. This requires outputs to be declared explicitly. input abstraction (default / constant values): Similarly, inputs could be defaulted. So, the trick is to see CONNECT as _input elimination_, making binding well-defined, and taking the connect operation as a transformation of processors. The rest seems to be just representation of the input/output/processor sets. Note that this definition of CONNECT does allow cycles to be introduced. These could be eliminated by adding cycle checks, and inserting decoupling delays. Now, is it possible to make CONNECT build an SSA / lexical memo network directly? This seems to require some sorting operation. I'm trying to write this down in types. The main problem seems to be node equality. This seems to be central to Haskell-style programming. It's hard to create equality between abstract types. 
This seems to be because of the need of some external "naming"
entity.  In Scheme, this naming entity is just a memory location:
each object has an explicit name.  Haskell values don't seem to have
such an intrinsic identity.  So, a DFN always needs a node sequencer.

To solve this problem, node identity seems indeed to be the core
obstacle.  Is this because I'm ``thinking imperatively''?  Vaguely,
whenever this occurs, it seems that I'm trying to introduce the
concept of variable in a way that is only possible using staging:
generating code and interpreting it.  Now, in pure functional
programming it is usually possible to leave those nodes as variables,
allowing a functional representation where node identity is a
non-issue.

Entry: The Pure Data patch format
Date: Tue May 4 16:40:55 EDT 2010

Starting out with the Pd data format as approaching some platonic
ideal, let's move towards finding a functional representation of that
platonic ideal.  Let's stick to the format of the abstraction as what
it's all about (dac~ and adc~ are top level special cases of outlet~
and inlet~).

An abstraction is:

  * A set of sub-processors that define the available inputs and
    outputs.

  * A set of connections that constrain the available inputs.

  * Input abstraction: promote some open input nodes to the next
    level / set other open input nodes to default.  (Bind all inputs
    as either abstraction inputs or constant inputs.)

  * Output abstraction: promote some output nodes to the next level /
    ignore other output nodes.

Important observation: inputs and outputs are not symmetric.  Each
input (variable) can be bound/assigned-to only once, but can be used
multiple times.

Entry: Unification
Date: Tue May 4 16:57:49 EDT 2010

What I'm really looking for is unification: allow the same variable
to have different names on both sides of the abstraction.

Connect = unify.  End of story.

How to abstract this in Haskell?  This seems like one of those
`pearl' abstractions that keep people up at night, so it would
surprise me if there isn't already a highly abstract library for
this.

So this brings us back to the initial idea of this morning: the real
issue is assignment / binding / unification..  Suppose e is a single
assignment store.  How to express the assignment of two variables
(nodes)?

  connect a b e = e'

So, you can unify/connect an input and an output, two inputs, but not
two outputs.  This shows that Pd does not use unification but
something simpler, as it does not support in-in connections.
Allowing unification of outputs gives constraint programming, where
inputs/outputs are symmetric and processors are just constraints.

Entry: Much ado
Date: Tue May 4 17:43:23 EDT 2010

But I'm not really getting anywhere.  I've got a couple of somewhat
meaningful abstractions, but no way to glue them together.

So, practically, what is the problem I want to solve?  Given an
opaque pure function over the Num class, lift it to a structure that
accepts inputs and outputs.  These inputs and outputs are then
components of an array.

Why do I need this?  To combine numeric grids and (stateful)
iterative processes, the basic ingredients of numerical math,
abstracting the kernels as pure functions so they can be examined in
other settings.

Given:

  - function over Num
  - representation of input / output nodes
  - intermediate node generator

Compose these.  Problem (already solved): when the function is
opaque, you lose node sharing information.  This can be rebuilt (and
more) using a term equivalence relation.
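As an illustration of that last point, here is a sketch over a toy
term type (my own, not the actual Shared.hs code): structural
equality alone is enough to rediscover sharing in an opaque term, by
collecting each distinct compound subterm exactly once, children
before parents.

  data T = V String | Op String [T] deriving (Eq, Show)

  shared :: T -> [T] -> [T]
  shared t@(Op _ args) acc
    | t `elem` acc' = acc'          -- seen before: reuse
    | otherwise     = acc' ++ [t]   -- new compound subterm: one binding
    where acc' = foldr shared acc args
  shared (V _) acc = acc

  -- let s = Op "add" [V "a", V "b"]
  -- shared (Op "mul" [s, s]) []
  --   => [Op "add" [V "a",V "b"], Op "mul" [..., ...]]
  -- The duplicated (add a b) is collected once; numbering these
  -- entries then gives the SSA bindings.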
Starting from more explicit composition information, node sharing information can be recovered directly without the use of tricks. Since I need the opaque representation, it might be best to stick with node sharing recovery. So, today I went all over the place, but because I don't have full control over composition (as it is hidden behind opaque function objects) there is no real use for all these different representations, right? Or could such a more explicit representation be built as the result of abstract evaluation? Practically, it is probably best to model the C code generator in Arrow form with explicit "output inputs", and an explicit node generator. So "compile" yields :: type Input = String type Output = String Num a => ([a] -> [a]) -> [Input] -> [Output] -> CodeGen Where CodeGen takes care of internal node generation. Why do I get the feeling that there are too many arbitrary choices, and I'm not getting the real idea. Maybe this just needs a distinction between assign and define (alloc), where assignment to variables is allowed once (alloced by caller). Block [Input] [Output] [Temp] [Assigns] Morale: allocation needs to be made explicit (in the form of declaration). The basic unit will then become the block. The Assigns are essentially "block calls", i.e. macro nesting or procedure calls are possible there. float fun(float dummy) { float r0 = mul(ar, br); float r1 = mul(ai, bi); float r2 = negate(r1); float r3 = add(r0, r2); float r4 = mul(ar, bi); float r5 = mul(ai, br); float r6 = add(r4, r5); out0 = r3; out1 = r6; } -> float fun(float dummy) { float r0, r1, r2, r3, r4, r5, r6; r0 = mul(ar, br); r1 = mul(ai, bi); r2 = negate(r1); r3 = add(r0, r2); r4 = mul(ar, bi); r5 = mul(ai, br); r6 = add(r4, r5); out0 = r3; out1 = r6; } The main translation is then memoterm -> block, given input and output names. Actually, assignment is already a block (proc?) operation with multiple inputs and one output. Esentially, all primitive functions are trivially lifted to application + assignment, with allocation performed explicitly somewhere else. Entry: Assignments + applications are primitive procedures Date: Tue May 4 19:26:08 EDT 2010 It seems that separating node allocation (declaration) and procedure invocation is the essential idea: this frees up outputs for allocation by the caller. Again, look at Oz[1]. This way, functions+assignments are primitive procedures. Here ``procedure'' refers to the CTM[2] approach of building functions on top of procedures that allow single assignment. Such a structure (procedure instead of function based) translates directly to C/LLVM code with input/output and state vectors. Compiling between Term (+ possibly recovered sharing) and Procedure is the main problem to solve. Term ---> Shared ---> Procedure -- Node Names type Input = String type Output = String type Temp = String -- Procedure Abstraction data PAbs = PAbs [Input] [Output] [Temp] [PApp] -- Procedure application data PApp = PApp Proc [Input] [Output] data Proc = PComp PAbs | PPrim String This seems to be OK. It composes (it's a recursive type terminated by the PComp | PPrim sum). If the list of procedure applications [PApp] is sorted according to the dependencies this translates directly to C/LLVM code. Now, it would be nice to have an abstract (functional) encoding of this such that all symbols can be replaced with lexical variables. 
[1] http://en.wikipedia.org/wiki/Oz_(programming_language) [2] http://en.wikipedia.org/wiki/Concepts,_Techniques,_and_Models_of_Computer_Programming Entry: Summary Date: Tue May 4 21:05:17 EDT 2010 It was quite a trip today. A lot of fuss, but what are the remaining simple ideas? * The single assignment procedure (a procedure that receives variables that can be assigned to once) is a key concept. It allows decoupling of variable declaration (storage allocation) and definition (computation and result storage). Both are combined in `let' expressions. Separate allocation seems to be necessary for maintaining efficiency and real-time behaviour for DSP apps: it allows static allocation. This separate allocation mechanism works well for different levels in the memory hierarchy and works all the way down to the RISC instructions. At some point the static references (i.e. static addresses, registers) can be replaced with indirection (pointers) : this is where we can move from macros to run-time composition (procedure calls). * An arrow is something that behaves as a function (i.e. it has composition) but can have other structure in addition. I.e. it could be used to represent a data structure that represents a program (syntax). * It's possible to interpret the Pd patch format in a way that makes "connect" an _abstraction transformer_. An abstraction without connections starts out as a union of sub-abstractions and subsequent connections eliminate inputs. Independently of composition throuch connections, inputs and outputs can be exported, where un-exported inputs and outputs are respectively defaulted and ignored. * Memoization over channels: might need to see that first before I believe it. I had a vague optimistic picture in my head this afternoon, but not any more.. The problem is "merging" of environments. Been there. Failed. The right approach seems to be a fold over term tree structures to restore the "straigt line" (as in ai/Shared.hs) * Unification is what we really want in the end. But it seems that this can't be easily embedded in Haskell, mabye also due to the threading issue? Entry: Composition in the low-level rep Date: Wed May 5 14:04:30 EDT 2010 The Procedure.hs representation currently uses a recursive type for PAbs, i.e. a PApp can invoke a PAbs directly. This might not be a good idea as we probably just want to represent flat block structure (one level of abstraction) and use target language name abstractions (procedures or macros) for composition. If a generator that produces PAbs structures needs macro expansion, it should perform expansion in its higher level description. Next: fix compileNodes such that the register naming is done more explicitly. The problem is in compileNodes / envFold *Main> compileNodes [0..] [var "a", var "a" + 1] ([a,r0],[(r0,(add a 1))]) This probably needs to use the Procedure.hs representation directly, which would also hide the object equality unsafe property. Entry: Living without intrinsic object identity : Shared.hs Date: Thu May 6 10:20:41 EDT 2010 How this should work: intermediate node allocation (naming) is an integral part of Expression -> SSA conversion. This is what Shared.hs should do: construct a Term structure with intermediate _names_, i.e. Var. Currently in the code it is a bit unclear to me where exactly the names get generated. It is done in `nameNodes', separate from the Shared.hs functionality. Conclusion: constructing datastructures where object identity is a property is not a good idea. 
In Haskell, object identity _always_ needs to be made explicit. This means that Shared.hs is better rewritten as a simple SSA convertor directly, such that intrinisc object identity (which _is_ used in the _implementation_) is not observable outside the module. Goal: given a list of (Sym, Term) pairs, construct a list of (Sym, Term) pairs where each Term is simple, and additional variables are introduced from a list of Sym temp nodes. This would hide the Shared data structure from the outside world, and only return Term -> Term maps. It does look like the external behaviour of Shared.hs is based only on (==), so there really shouldn't be a problem. `nameShared' builds a dictionary of Terms to index types. However, what is needed is: - nestedTerm -> Shared - Shared -> uniqueTerms (or abstractly as fold) - a memoized equality relation Practically, this function can be reused: it just combines recursive Term structure with memoized (==) to build a list of representative terms and a dictionary of unique nodes. type Env a = Shared (Term a) envFold :: (Eq a) => [Term a] -> Env a -> ([Term a], Env a) That structure then needs to be operated on, by mapping the [Term a] list to provided output names, and the remaining nodes in Env a to intermediate nodes. OK. Difficult part is done: output names get properly woven into the register naming code. It was straightforward in hindsight ; haskell code does tend to become quite dense. Next: get rid of Function.hs by porting all C gen code to Procedure.hs Entry: Memoization of (==), the right way Date: Thu May 6 14:11:52 EDT 2010 What I should have done instead is to write the algorithms without memoization based purely on (==), then define (==) :: Memo a -> a -> a -> (Bool, Memo a) and thread this through the computations. This allows one to keep the memo tables hidden, but keep them around as long as they are needed. The probem is that the (.==) used in nameShared does not use the equvalence table used to build up the environment. Maybe it should just use extendShared followed by pointer equivalence? The problem is currently that the tmpNames passed into nameShared trigger recursive (==) to be invoked on misses, which is definitely not what we want to accomplish. It seems that the easiest way to do this is to wrap the whole computation in an IO monad such that `makeStableName' can be used, and then use `unsafePerformIO' to perform any kind of operation that might use pointer equality to produce a datastructure that no longer relies on sharing. The real, deeper problem seems to be that I'm trying to separate tree traversal and caching of (==), while what I really want to do is to "compact" a tree according to the equivalence relation, and also "decouple" it by creating _explicit_ indirections in the form of node names, without resorting to impicit pointer equalities in further processing. Conclusion: find a way to abstract "tree unpack", where a recursive structure is converted into a flat dictionary by adding node references (i.e. natural numbers). To implement this, the unique naming could be bootstrapped by using makeStableName inside an unsafePerformIO. Entry: Removing Function.hs Date: Fri May 7 10:10:22 EDT 2010 Basic code generation now uses Procedure.hs instead of Function.hs Next: node naming for `compile' in test-Ai.hs In the current bootstrapping code in test-Ai.hs all variable names are introduced manually. What we want to do is to take an opaque function, and generate variable names automatically. So, how to lift an opaque function? 
Problem: the `nameShared' function needs to keep track of the nodes
that are consumed from the temp node lazy list.  Alternatively, they
can be gathered from the not-declared inputs/outputs.

Step 1: lifting [Term a] -> [Term a]

  -- Lifting
  lift f = proc where
    outTerms = f ins
    proc = procedure [] $ compileNodes outs tmps outTerms

  *Main> lift (\(a:b:_) -> let x = (a+b) in [x*a,x*x])
  in:
  out: y0, y1
  temp:
  (r0) <- add (x0, x1)
  (y0) <- mul (r0, x0)
  (y1) <- mul (r0, r0)

Defining functions on lists however is annoying.  It's probably
possible to use tuple type info for this, and use a typeclass?
However, typeclasses do not allow type aliases..  They need tags.

Can this be wrapped in an Applicative Functor?  Maybe this should be
the point where a function is transformed into an Arrow?  Know your
abstractions...  I'm a bit lost.

This makes perfect sense:

  *Main> :t lift
  lift :: (Eq t, Show t) => ([Term a] -> [Term t]) -> PAbs

The question is, do we need specialized lift ops for tuples and
specialized functions?  One of the nice properties of applicative
functors is that this is all served by currying.  The PAbs data
structure however does not support higher order functions.

What about this:

  -- Lifting
  lift f (ins, outs) = proc where
    outTerms = f ins
    proc = procedure [] $ compileNodes outs tmps outTerms

  *Main> lift (\(a:b:_) -> let x = (a+b) in [x*a,x*x]) (ins,outs)
  in:
  out: out0, out1
  temp:
  (tmp0) <- add (in0, in1)
  (out0) <- mul (tmp0, in0)
  (out1) <- mul (tmp0, tmp0)

But this is not fmap, as it's not :: (a -> b) -> f a -> f b

This raises the question: is it better to represent compiled networks
as functions that abstract over their inputs/outputs (i.e. as Arrows)
or do we just want to compile ``only once'' at the toplevel?

Main idea here: is the Procedure output something we want to process,
or just a shorthand for the C/ASM/LLVM concrete syntax output?  I
guess the latter, so let's not lose time thinking about this.  All
Arrow abstraction is for higher levels.

One thing though: combinators.  You can't just convert combinators to
dataflow code by abstract interpretation: too much structure gets
lost.  Sharing is not too hard to recover, but iteration is an
entirely different story.  So yes, for combinators some kind of
morphism is probably necessary (to perform the same combinators on
Haskell vs. generated code).  The AI would be there to build kernel
functions.  The combinators are there to pipe data in / out.

Entry: Finalizing (-> Procedure)
Date: Fri May 7 13:39:07 EDT 2010

Next: fix In + Tmp names.  This should include a stage that
guarantees that all variables are used (inputs, tmp) and assigned to
(outputs, tmp).

This is difficult to modularize..

  Inputs : these are picked by the mapper function, so it should know
           the arity.
  Outputs: come from the output of the function.
  Temps:   allocated by the compileNodes step.  Maybe the
           datastructure returned should keep track of them?

Entry: Goals
Date: Sat May 8 10:35:23 EDT 2010

Time to restate goals:

  1. Gain intuition about how things fit together and how to use
     Haskell idioms: opaque functions, concrete data structure
     combinators, and write some DSP code with Haskell-based analysis
     and resulting C code for Pd / jack audio.

  2. Plug it into known technologies (i.e. Data Parallel Haskell).

Entry: Main problem
Date: Sat May 8 11:52:57 EDT 2010

Function vs. Procedure view.  It still isn't 100% clear to me how to
express single assignment in Haskell directly.  Probably need to play
a bit more with what I have now..
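One hedged answer (my own sketch, not project code) is to make the
single-assignment store explicit: a State monad over a map where a
second write to the same variable is an error, and where a
"procedure" in the CTM sense receives the names of its output
variables from the caller.

  import qualified Data.Map as M
  import Control.Monad.State

  type Store = M.Map String Double
  type SA a  = State Store a

  assign :: String -> Double -> SA ()
  assign v x = do
    s <- get
    if M.member v s
      then error ("multiple assignment to " ++ v)  -- write-once only
      else put (M.insert v x s)

  ref :: String -> SA Double
  ref v = gets (\s -> s M.! v)

  -- Caller allocates/names the output variable; the procedure fills it.
  addProc :: String -> String -> String -> SA ()
  addProc a b out = do { x <- ref a; y <- ref b; assign out (x + y) }

  -- execState (assign "a" 1 >> assign "b" 2 >> addProc "a" "b" "r") M.empty
  --   => fromList [("a",1.0),("b",2.0),("r",3.0)]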
Entry: Flattening Term graphs (SSA)
Date: Sat May 8 12:46:05 EDT 2010

I'm having difficulty modularizing the compilation step.  The problem
is keeping track of inputs and outputs.  Maybe this should all be
collapsed into one function that converts a Term structure into a
flat structure, introducing names.  Practically: separating sharing
and naming might not be a good idea, they are somewhat connected.
The main idea seems to be that the flattening fold should know about
variable names of the term structure.

Given:

  * Through abstract interpretation, a Tree structure parameterized
    by input nodes (i.e. the Term Var constructor), associated to
    output nodes.

  * an intermediate node allocation function (a lazy list)

Wanted:

  * flattened dictionary (SSA form)

  * List of intermediate nodes (for allocating intermediate storage)

Let's make this abstraction functional, so we don't need to deal with
generating input/output names.  This gives a family of functions.
Let's take the 2->2 one, and generalize later.

  f :: (a,a) -> (a,a)

  compile_2_2 :: ((a,a) -> (a,a)) -> (Var a, Var a) -> (Var a, Var a) -> Proc

So, what is the type of compilation?  Start with a function

  f :: a -> b

The compiler takes a function, two sets of variables, and produces a
code segment.

  compile :: (a->b) -> [Tmp] -> Out b -> In a -> Procedure

Important remarks: `compile' doesn't know how many temporaries are
generated, so tmp is a list tmp = [Node].  The output and inputs are
directly related to the base types:

  a                In a
  Float            Term Float
  (Float, Float)   (Term Float, Term Float)

Starting from functions like (a->b), I'm constantly being pulled into
not naming outputs, i.e. lifting to (Ins a -> Outs b).  There is
something wrong with my intuition here.  (a->b) and Proc (Ins a)
(Ins b) are different beasts.

Another question.  Given an opaque function ([a] -> [a]), can we
convert it to the following?

  :: [Node] -> Outs -> Ins -> Procedure

Where the Procedure is filled in properly, i.e. it contains a finite
list of temp nodes, and finite lists of input/output variables.

Entry: Just flattening + node allocation + input collection.
Date: Sat May 8 13:06:29 EDT 2010

Let's properly solve the problem of node flattening and input/output
gathering first, using a specialized data type that has the minimal
set of components.

  data Graph n v = Leaf v | Node n [Graph n v]

Where the n type identifies nodes (i.e. it could be an opcode) and
the v type identifies node names in a unique way.

This should then be massaged into a non-recursive data type:

  data Binding n v = Binding v n [v]
  data Dict n v = Dict [v] [v] [v] [Binding n v]

The flatten function takes a temp node generator and a list of
variable-to-output-term bindings to build a dictionary structure.

  flatten :: [v] -> [(v, Graph n v)] -> Dict n v

This consists of two steps:

  * fold over tree
  * fold over list

Just like before, the Fold is the most important part.  What is the
accumulator for the fold?  The dictionary, but in a form that can
represent pointer equality.

( I'm cursing now at the absence of side effects...  Accumulations
  like these are trivial in Scheme.  I've been able to avoid these in
  Haskell up to now.. )

So, what about writing Flatten.hs without memoization first, then try
to memoize it?  Even then, writing down the recursion pattern:

  dict -> tree -> (dict, tree)

is problematic.  In SSH.hs `envFold' I've written it out explicitly.

Looks like it's time to go to basics again.

Today's conclusions:

  * Making "flatten" explicit is probably a good idea.
It's a decent interface and it makes it possible to concentrate on the essence : remove recursion from (any?) datatype by introducing variable names. * I'm struggling with two basic abstractions: dealing with object equality (which can currently be ignored by using non-memoized == and later try to memoize it), and dealing with map + threaded state. Entry: Combining monads Date: Sat May 8 14:54:12 EDT 2010 So, the problem seems to be that I'll need the IO monad for stable names (pointers) and a State monad for building the dictionary list. I don't think I can build a "flat" monad that incorporates the IO monad as the latter is abstract.. So how do you combine the IO monad with a State monad? The answer is to use monad transformers[2]. [1] http://www.haskell.org/all_about_monads/html/transformers.html Entry: Just State Date: Sat May 8 15:07:24 EDT 2010 The core operation is: - Check if a node is already present in the dictionary. If not, allocate a new intermediate tag and update the dictionary. - Replace node with tag. So there are two state components that need to be threaded: currently visited nodes + node naming. type Scan n v = State ([v], Dict n v) v Hmm... too many loose ends. One thing at a time. Key elements seem to be: PASS1: 1. Convert tree into a list of nodes (abstract recursive decent) 2. Write dictExtend as n -> State ... 3. fmap (n -> State ..) over nodes to get [State ...] 4. sequence the [State ..] into State 5. runState PASS2: 6. flatten the dictionary elements. Ok, got full flattening + sharing without memo in Flatten.hs The remarkable thing is that the output is completely ignored. Meaning this is just a fold and the State monad isn't necessary. However, it might be easier to stick to the monad such that it can be combined with the IO monad for stable names. dictExtend is now factored into a foldable function, with the State monad version derived from it. Entry: Summary Date: Sat May 8 20:04:50 EDT 2010 So, today was again a trip. * I'm not quite used to many of the Haskellisms. Monads need some more work; i.e. monad transformers. * Explored the relationship between left folds and sequence + mapf for a State monad. (Syntax directed vs. semantics directed?) * TODO: IO monad transformer for stable names, together with State monad from fold. * TODO: gather inputs + tmps from current bindings list. Figure out how to work with lists as sets. * TODO: multiple outputs. Entry: Multiple outputs Date: Sat May 8 21:25:41 EDT 2010 I've changed the fold to a 2-pass fold. For each expression, the last node is renamed to be the output. Problem is that this isn't always possible, i.e. if output nodes are not unique, some name sharing needs to happen. Icredible how there is so much tension between explicit and implicit outputs. Both have advantages and disadvantages, and there are some corner cases in the explicit output case case. The current implementation for named outputs also feels quite ad-hoc. Think about this. Maybe outputs should be assigned explicitly to intermediate nodes. And symmetrically, the same should be done with inputs? I'm trying to find something that has a solid base. It looks like input/output _copy_ is the most general. Copy elimination can then be an optimization that works in some cases. I.e. for pure risc machines, if the ins and outs are memory, they need to be copied to registers anyway. Otoh, this is something that's easier to perform on the caller side. So what to do with copy nodes? 
Conclusion really seems to be that this is far from a sound representation.. Maye I should stop worrying about these details and pick one: let's make all nodes internal, and use explicit in/out assignment/unification. Solution: the Bindings type now has two constructors: Op and Assign, where Op is register to register, and Assign transports in/out to/from registers. Can it be made fully load/store? Ok, seems to work. Also added distinction between Op, Input, Output so all input, output and tmps can be straightforwardly identified. Maybe they don't even need to be part of the datastructure. Alright: full circle: combinint Ai, Term, Procedure, Flatten, Complex *DFL> compile f0 in: in0, in2, in1, in3 out: out0, out1 temp: tmp0, tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7, tmp8, tmp9, tmp10 [tmp0] <- [in0] [tmp1] <- [in2] [tmp2] <- mul[in0, in2] [tmp3] <- [in1] [tmp4] <- [in3] [tmp5] <- mul[in1, in3] [tmp6] <- negate[tmp5] [tmp7] <- add[tmp2, tmp6] [out0] <- [tmp7] [tmp8] <- mul[in0, in3] [tmp9] <- mul[in1, in2] [tmp10] <- add[tmp8, tmp9] [out1] <- [tmp10] Shuffling around a bit, the representation can be abstracted over inputs and outputs like this: -- Compile to linkable code. compileIO :: (Show a) => ([Term a] -> [Term a]) -> [String] -> [String] -> Procedure compileIO f is os = compileTIO tmps is os f This gives the functional representation as desired. I think from here things should go a bit more smoothly. Entry: Summary Date: Sun May 9 11:03:52 EDT 2010 * Ai, Term, Procedure, Flatten, Complex are linked together in DFL.hs; separation of concerns: Term : tree/graph representation Ai : algebraic operations for Term Procedure : low level SSA representation for cod gen Flatten : conversion from tree/graph to flat SSA form Complex : tests composite algebraic operations with Term * The Procedure generation does not use external output allocation : there is one level of indirection that uses Input / Ouput instructions and internal temp node allocation. This can be eliminated in a target-specific way; solving assignment to caller-allocated variables generally seems to contain too many special cases. * The Flatten.hs code does not use memoization yet. This can probably be postponed until it becomes a problem for large programs. Entry: I -> O -> C Date: Sun May 9 11:12:52 EDT 2010 About compileIO in DFL.hs - is it composable? Is the following type I -> O -> C an arrow? Here C is code. This type represents abstracted code where inputs and outputs can be filled in. I.e. linkable dataflow code. The key question is : can this be made to compose? Give two such objects, can we turn it into one of similar type? The problem is that we need some kind of composition operation on C. I guess this could work, but since C is already defined from a more general type (either opaque functions or concrete syntax trees) it is probably a lot easier to perform the composition at those levels. The composition of C objects is about _structure_, not functionality. I.e. what kind of looping strategy to use. Such transformations cannot be performed on the function space! So, it probably makes more sense to perform this kind of composition on functions that map [n] -> [n]. Entry: Stacks Date: Sun May 9 11:34:13 EDT 2010 Does it make sense to use stacks for the lifting? I.e. automatically plugging through unused IO? No.. then the automatic I/O name gen doesn't work : it depends on the function to be lifted accepting but ignoring unused inputs. Entry: Memo again Date: Sun May 9 15:58:15 EDT 2010 So, what about the memo. 
It seems simplest to use hash consing (memoized constructors) for the graph structure. It seems that hashing the Graph constructor Node should be enough. How to abstract this? The hashing table is exactly what I'm looking for as the output of the algorithm. So, if the constructors of Graph can be hidden behind monadic functions, this might come for free when converting between Term and Graph rep. Or am I missing something again? [1] http://www.haskell.org/haskellwiki/MemoisingCafs Entry: Next Date: Mon May 10 09:02:34 EDT 2010 * C code generation from Procedure. * Representation of looping combinators: lifting. Entry: Lifting and combinators Date: Mon May 10 15:54:54 EDT 2010 Summary remarks from previous posts: - Implementing loops is necessarily a syntax-oriented approach (i.e. working on the Procedure or Term data types), since there is no way to do this purely from semantics : the high-level combinator structure gets lost through code duplication, and just the atomic data dependencies are left over. - What should be possible is to write these combinators in such a way that they can be lifted to the original function space to apply them to the opaque functions, i.e. there's a functor that maps from combinators on syntax to combinators on functions. Now, can we turn this around by defining functions to operate on abstract representations of containers and using loop fusion to derive code? Currently (for the music DSP app) I'm only interested in 2 kinds of combinators: parallel vector map i->o lifted to [i]->[o], and sequential state machine execution (s,i)->(s,o) lifted to (s,[i])->(s,[o]). The former is simple to do (see Feldspar [1]). The latter requires some way to define the unit delay 'Z' on abstract containers, probably a modification of the fusion law in [1]. Conclusion: find a way to represent fusion. The abstract container type should be guided by the implementation of 'Z'. I.e. whenever Z occurs in a term, it has to be a sequence type, or a type associated with a past? So, let's start with defining Indexed as a functor with: fmap f (Indexed 1 ixf) = Indexed 1 (f . ixf) import Ai data Indexed a = Indexed (Int -> a) instance Functor Indexed where fmap f (Indexed ixf) = Indexed (f . ixf) instance Show a => Show (Indexed a) where show (Indexed ixf) = show $ map ixf [0..10] i0 :: Indexed Float i0 = Indexed (\n -> [0..] !! n) i1 = fmap sin i0 t0 :: Indexed (Term Float) t0 = Indexed (\n -> var ("a[" ++ show n ++ "]")) t1 = fmap sin t0 *Main List> t1 [(sin a[0]),(sin a[1]),(sin a[2]),(sin a[3])] The fusion law is the main trick to abstract vectors into element references. The insight that Indexed doesn't really need to be a function from Int but an abstract rep is the main bridge to AST output. Next: how to define Z on Indexed? Looks like Z needs to be a property of the data type - much deeper than simply feeding "-1" into the index function. [1] entry://20100316-115313 Entry: Defining Z Date: Tue May 11 19:59:36 EDT 2010 Since lifting over a simple array iteration is quite trivial, the only remaining real problem is dealing with the unit delay Z. Essentially there are two forms to handle: * Output delay, or delay of computed values. These are stored in local variables in the loop body, and in state vectors inbetween loop iterations. A special implementation might be necessary for long delay lines. * Input delay. 
These can be implemented using array indexing, except for the first iteration where they need to come from stored state, and the last iteration where they need to be saved as state. Implementing output delays can be done using the trick mentioned previously. Input delay needs a specialized code body. EDIT: Implementing 'Z' is going to be the main problem, as this is the only structure that allows alternative implementations that significantly change the memory access patterns of the resulting code. The goal should probably be to find a representation of the different degrees of freedom of implementation, and the construction of a cost function based on memory access patterns. Entry: Next Date: Wed Jun 23 10:14:45 CEST 2010 After 6 weeks off, what's next / what's important? There are two main hurdles: * Create a non-trivial example to convince prospective clients. As Carrette mentions in [1], the techniques useful for deriving implementation from model are quite straightforward, but their selection and combination _does_ require some cleverness. It's like cooking: ingredients are simple, the combination is where the magic is created. Compilers are usually heavily factored translators. The human mind doesn't seem to handle the big composite picture intuitively very well, while the steps are easy. * Once such an example is made for a certain domain, can this be transferred to ``stable knowledge'', or is ad-hoc intervention by a language designer necessary? (This is the ``Can you sell DSL?'' question. Is the DSL flexible enough?) [1] http://www.cas.mcmaster.ca/~carette/newtongen/ Entry: Picking up again Date: Thu Jun 24 10:27:11 CEST 2010 So, it's been 6 weeks, and I completely forgot the structure. This happened before, and seems to be a problem with most compiler-oriented programs: - components are simple, straightforward - the whole processing chain is not (there is a lot of leverage in orthogonal directions from the different components). Entry: Modeling: interpretation and compilation Date: Tue Jun 29 10:59:25 CEST 2010 Interpretation is the conversion of a model to a piece of information, i.e. does it terminate or does it implement a certain function? You build a model to _understand_ a problem. Compilation is the conversion from one language to another, allowing further interpretation (i.e. execution on a machine) or compilation. 1. Find a good model that allows the implementation of necessary model interpretations (i.e. verification) and model translations (implementation in low-level language). 2. If interpretation and/or compilation of the modeling language is not trivial, it makes sense to implement these features in a functional programming language or a proof assistant. Obviously, human understanding is an important factor in the design of a modeling language, but less straightforward to define. The modeling language needs to be well-adapted to describe the problem domain in an intuitive sense without getting overly complex (too flexible). 
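As a minimal sketch of this interpretation/compilation split (hypothetical names, not the project code), the same model can be written once against a type class and then given both meanings in Haskell:

  -- One "model language" ...
  class Ring r where
    lit :: Float -> r
    add :: r -> r -> r
    mul :: r -> r -> r

  -- ... with two interpretations.
  newtype Eval = Eval Float    -- interpretation: compute a value
  newtype Gen  = Gen  String   -- "compilation": emit expression text

  instance Ring Eval where
    lit                   = Eval
    add (Eval a) (Eval b) = Eval (a + b)
    mul (Eval a) (Eval b) = Eval (a * b)

  instance Ring Gen where
    lit x               = Gen (show x)
    add (Gen a) (Gen b) = Gen ("(" ++ a ++ " + " ++ b ++ ")")
    mul (Gen a) (Gen b) = Gen ("(" ++ a ++ " * " ++ b ++ ")")

  -- The model itself is written once, against the class.
  model :: Ring r => r -> r -> r
  model x y = add (mul x x) (mul y y)

Running model at Eval answers a question about the model; running it at Gen produces a (crude) piece of target code. The Num-class abstract interpretation discussed in the later entries relies on the same trick.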
Entry: Implementing Z Date: Sat Jul 3 11:04:25 CEST 2010 As mentioned before[1], the Z operator is the main source of complexity, as it can (and should) be implemented in different ways: - load/store of local variables (registers) inside loop + load/store of state memory outside - load/store of state memory inside loop - load/store of circular delay lines for Z^n, n >> 1 - load of indexed input + load/store of block-based state memory outside of loop The task is now to find a way to represent these different implementations, and parameterize the compilation step. The choice seems to be about the "type" of memory used: do we represent this explicitly or not? Inside the loop we use fast memory, while state data that bridges different loop iterations can be in slower memory. Should this kind of distinction be left over to the low-level compiler, or do we model it explicitly? Looks like explicit modeling has more benefits. Let's stick to the slogan: "don't trust automatic caches". This way we can compile to simpler hardware. Main idea: compiling Z is about managing memory. Each loop has two state buffers: fast and slow for sample and block-based access. Transfer between the two is explicit at the start and end of the loop. For the C implementation, fast memory are local variables and slow memory are pointer indexed structs. This solves the problem for output state. Input state has an alternative implementation through indexing. In practice this is going to occur a lot for block-based processing, which is essentially a whole different problem and might need a special approach to manage the required overlap. Maybe I should take a hint here. The structure of the implementation is mostly determined by the shape of the inner loop; the rest doesn't matter so much as the more flexible pointer-indexed approaches can be used. Essentially there are then only two kinds of loops: - Block based (FIR) - Sample recursive (IIR) For IIR loops we're kind of stuck: the recursive structure requires a certain approach (keeping state in fast memory + streaming input and output). For FIR loops many loop transformations are possible. Hierarchical combination of FIR and IIR such as the Kalman filter, where the inner loop is FIR (matrix ops), and the outer loops are IIR, have FIR-freedom in the "kernel" problem, and a fixed IIR structure on the higher level. So it seems this IIR/FIR problem is what I need to look at. Once in the FIR domain, the problem becomes more elegant and there is much literature available. My contribution should then be mostly in how to deal with closely-looped IIR structures, and how to transform between sample-based IIR and block-based IIR structures so it is simpler to combine a-symmetric IIR structures with more symmetric FIR structures. So, the simplest get-it-running compilation is then to use explicit slow/fast state memory with pre/post transfer and no input indexing (pure streaming). [1] entry://20100511-195936 Entry: The DSP problem Date: Sat Jul 3 11:52:25 CEST 2010 Simplified, implementing DSP algorithms is about optimizing usage patterns of memory and CPU: - Find locality of reference (memory hierarchy) Try to consume non-final intermediate data as soon as possible after they are generated, keeping them in fast memory. - Find parallellism (use all functional units) In current computer architectures, there seem to be 3 levels at which this can be done: * Instruction level (VLIW : usage of functional units on 1 CPU). * Shared memory thread level : multiple CPU, 1 memory pool. 
* Network : multiple CPU, only data comm. The problem structure that sets the stage are the data dependencies which are algorithm-specific and can be considered fixed. Only very high level mathematical understanding can perform algorithmic transformations that eliminate dependencies. The irony is that often the introduction of dependecies can make an algorithm faster (linear update). Entry: Linear versus Tree updates Date: Sat Jul 3 12:03:21 CEST 2010 Many DSP algorithms use updates (memoization) to reduce complexity. Often this is in the form of linear updates. How many of the traditional update algorithms can be updated in a more tree-like way, allowing better parallel decomposition? I.e, Kalman and FFT are essentially linear update algorithms: successive steps depend on previous ones, while the inner product is not (accumulation can be subdivided in arbitrary ways). Entry: Simple IIR compilation: slow + fast state memory Date: Sat Jul 3 12:55:55 CEST 2010 The no-brain solution is: - Load state slow -> fast - Iterate over input & output, updating fast state - Store state fast -> slow This is just a "subsampling" conversion of an IIR filter from sample to block based ops. If I recall, this is mostly there: query the SSA form for "Z" operator. Entry: Practically: an application + Current code structure reminders Date: Sat Jul 3 16:05:51 CEST 2010 So, it gets more clear every time I pick this up again: it needs to be more application driven; otherwise I'm starting to float. The "subsampling IIR" approach should work to build simple block-based Pd/LADSPA/Jack plugins. Atm the glue logic is more important than the code transformation & parameterizable compiler stuff: fix the BORING PLUMBING first! Next: - fix prettyprinter for the new Eq based flattener (non-memoized: fix that later) - obtain a Bindings struct with Z operators + write pre/post state fetch and update As a reminder: Term : Syntax for nested algebraic terms Ai : Abstract interpretation (algebraic manip + simplifcation) of Term Flatten : name intermediate nodes to convert Graph -> Bindings Procedure : Representation of SSA procedures (most concrete output of code generation: includes variable names) DFL : Combine Term, Procedure and Flatten The toplevel module seems to be DFL.hs which allows the expression of parameterized node networks. While DFL just combines the different components, some functions are curried in a way to keep the parameter names abstract, giving a syntax representation that avoids the use of concrete names, allowing for some composition mechanism on _syntax_ only, i.e. how to implement loops (completely separate from any _meaning_ i.e. interpretation of operations -- this distinction is important!! [1]). -- List based functions need to accept an infinite list. f0 :: (RealFloat t) => [Term t] -> [Term t] f0 (ar:ai:br:bi:_) = [realPart c, imagPart c] where c = a * b a = ar :+ ai b = br :+ bi *DFL> :t compile compile :: (Show a) => ([Term a] -> [Term a]) -> Procedure *DFL> compile f0 in: in0, in2, in1, in3 out: out0, out1 temp: r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10 [r0] <- [in0] [r1] <- [in2] [r2] <- mul[in0, in2] [r3] <- [in1] [r4] <- [in3] [r5] <- mul[in1, in3] [r6] <- negate[r5] [r7] <- add[r2, r6] [out0] <- [r7] [r8] <- mul[in0, in3] [r9] <- mul[in1, in2] [r10] <- add[r8, r9] [out1] <- [r10] So, I think I understand again. Derived from source, not the documentation in [2] (couldn't find it / didn't know it was there / whatever...). In [2] I don't understand the temp node allocation problem.. 
From what I gather I've side-stepped the problem by using a hardcoded temp-node sequence. Later when allocation becomes more clear, this could be solved using some state monad. So, DFL solves 2 problems: - Abstract interpretation / algebraic simplification - Serialization (graph -> SSA) + Eq-based CSE This bridges * DSL semantics: generic Haskell arithmetic expressions * DSL syntax: implementation details for bindings, loops, temp vars, ... [1] entry://20100509-111252 [2] entry://20100509-110352 Entry: Data Structures in Flash Date: Thu Aug 19 22:37:40 CEST 2010 I'm worrying a bit about the memory usage of an application I'm writing for a uC. It builds a large pointer-filled bookkeeping data structure at startup. The whole structure is built in RAM, but the result is completely determined by the (static) code. Since the Flash memory size is 8x bigger than the RAM, this whole data structure is better generated at compile time and put into Flash storage. The result will probably be smaller in Flash than in RAM, as the most compact reps can be chosen (i.e. arrays instead of lists). The problem is that this needs a code generator. I don't think the structure can be built just using the preprocessor alone. But then, thinking further, why not put all the static data structures in program Flash, i.e. most embedded apps do quite well with static nb of threads etc.. Does it make sense to move the constant bookkeeping data out of the RAM? How to implement something like that? Does it already exist? Actually, this might be the niche for Staapl: compilation of static structures into a base stack language. Once things are static, most data structures can be combined with their "interpreter" into code that executes them. Entry: Hash consing for static data structures: mixing Flash and RAM Date: Thu Aug 26 09:56:18 CEST 2010 As I mentioned before, sometimes an application builds an elaborate data structure that's constant once it's built. In embedded apps, such a structure can go in Flash. The problem is: ROM is "viral". By its very definition, you can't have static data depend on dynamic one, so all code that eventually produces a static structure needs to be lifted to the next stage. This is what the Racket[1] macro system is about. So it is a bootstrapping problem. Is it possible to somehow find the fixpoint in a simpler way? I.e. instead of having init code build a RAM structure, have it: - build _code_ for initializing the structure in Flash - verify the current structure in Flash. I.e. you compile and run the program at least twice. The first time the Flash verification fails (because there is no structure yet). The second time the generated Flash code is included and verified. This reminds me of hash consing[2] for data structure sharing. The problem however is to work around explicit mutation that's necessary to build cyclic data structures. It seems really straightforward to do this for directed structures though. What are the prerequisites? * Objects need to be constant. No mutation! That's really it. The rest is a representation problem. Data is shared when it is there, and not when it is not. The Flash is then a pre-constructed collection of data from which we can draw. If no cicular references are needed this is not a real limitation. However, when circular references are important, how would we solve it? What is really needed is a mechanism to create circular structures without mutation. (The Y combinator for data). Actually, the solution is quite simple. 
You just need indirection through names (just like the Y combinator) and somehow abstract name resolution. If each structure has a name (which is not its pointer!), you can refer to structures by name inside the constructor, but you construct fully resolved structures. Then you hash-cons based on name, and you manually resolve all non-hash-consed RAM structs. [1] http://en.wikipedia.org/wiki/Racket_%28programming_language%29 [2] http://en.wikipedia.org/wiki/Hash_consing Entry: A Theory of Typed Hygienic Macros (Dave Herman's PhD dissertation) Date: Thu Aug 26 13:52:34 CEST 2010 [1] http://calculist.blogspot.com/2010/05/theory-of-typed-hygienic-macros.html [2] http://lambda-the-ultimate.org/node/3986 Entry: Hash Consing with Names Date: Fri Aug 27 14:13:28 CEST 2010 to implement: Cn = (Cons An Bn) - Lookup Cn -> Cp - Is it equal to (Cons Ap Bp) where Ap -> An, and Bp -> Bn - If yes, reuse in Flash, otherwise create in RAM + register for resolve later The point: structures contain real pointers for efficiency, but the Hash consing is based on names. The context: the app has large, circular, constant data structures that could be represented in Flash. What is necessary: mutable nodes, in case where most of the data structure is constant, but a small fraction needs to be in RAM: use indirection but reserve fixed addresses for the mutable values. Entry: Backronym: ENUFC Date: Thu Sep 30 23:10:25 CEST 2010 ENOFC It has a funny ring to it, no? Embedded Next Operatingsystem For Controllers Entry: Writing languages in the C preprocessor Date: Fri Oct 8 09:00:00 CEST 2010 How difficult is it to encode a decent language in the C preprocessor? The problem is that there is no state, and that there is no real computation. Entry: Yegge Grok Date: Sat Oct 23 21:03:22 CEST 2010 [1] http://vimeo.com/16069687 Entry: Generalizing Staapl Date: Wed Nov 17 15:29:55 EST 2010 Just to get an idea of the abstract structure of what I'm trying to do with Staapl, here[1] is an attempt to express the concatenative macro type I :: m -> (t -> t) I' :: [m] -> (t -> t) in a Haskell coat. This is the first thing I came up with. Both m and t can be anything. The only structure that's there is that a concatenation operator, here expressed by list construction, has a meaning if there is a meaning for primitive types m. class Concat m t where iPrim :: m -> (t -> t) -- interpret primitive iConcat :: [m] -> (t -> t) -- interpret concatenation iConcat = compose . (map iPrim) where compose = foldr (.) (\x -> x) Generalizing this to any foldable data structure gives: [1] http://zwizwa.be/staapl/meta/staapl/staapl.hs Entry: Again: concatenative languages Date: Thu Nov 18 08:26:15 EST 2010 Fresh morning, not too much coffee, not too grumpy. Let's try again.. Where I got yesterday: there are 3 parts to a concatenative language: 1) a meaning function I :: m -> (t -> t) 2) a concatenation/composition function f m -> (t -> t) where f is a Foldable type 3) an empty program t0 Is it a good approach to use a Monad instead of a Foldable? Hmm.. I don't really know where I'm going apart from the very weak intuition that I'm going to find something. This is probably for later. Maybe I should just ask on the Haskell IRC channel? First I got to know, what is my question? This is a primitive machine language: prim :: c -> (s -> s) comp :: [c] -> (s -> s) nop :: s prim nop = \s -> s An opcode c has a meaning (s -> s) transforming a machine state s. A sequence [c] has a meaning (s -> s) obtained by running the primitive opcodes in sequence. 
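To make those signatures concrete, here is a throwaway two-opcode machine in Haskell (a made-up example, not Staapl code), with the state s instantiated to an Int stack:

  -- Sketch only: toy machine, not project code.
  data Op = Push Int | Add

  prim :: Op -> ([Int] -> [Int])
  prim (Push n) s           = n : s
  prim Add      (a : b : s) = (a + b) : s
  prim Add      s           = s            -- underflow: leave the state alone

  comp :: [Op] -> ([Int] -> [Int])
  comp = foldr (flip (.)) id . map prim    -- run opcodes left to right

  -- comp [Push 1, Push 2, Add] []  ==  [3]

The point is that comp only uses function composition; swapping in a different meaning for prim (e.g. emitting code instead of transforming a stack) leaves comp untouched.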
The `nop' is an empty program, and interpreted it leaves the machine state intact. I have one missing element: the compositon container of the target language! What I have is really this: prim :: c -> ([s] -> [s]), prim [] = \ss -> ss comp :: [c] -> ([s] -> [s]) I.e. there are both source and target containers (composition, concatenation) and primitives. Here we use lists and stacks-as-lists. So what about picking monads for the code and state composition: prim :: c -> (ms s -> ms s) comp :: mc c -> (ms s -> ms s) What if these are the same? prim :: c -> (m s -> m s) comp :: m c -> (m s -> m s) So far this is still meaningless. I also do not have a goal yet apart from finding something "really obvious". Let's try to formulate again: Given a function I :: m -> (t -> t) representing the meaning of a collection of Staapl macros (t -> t), is there a way to characterize the [m] -> t transformation which compiles a macro program [m] into a target program t? Enough.. This isn't going anywhere. A bit more thought: the real point is that each s->s in a composition has the _possibility_ to change the _whole_ state, but will probably not do so. What you want is several small _local_ changes to state. But really enough now. Entry: Copilot Date: Sat Jan 1 14:47:13 EST 2011 Copilot: A Hard Real-Time Runtime Monitor[1], built on top of Atom[2], a procedural language for C code generation. Summary: * It's hard to do embedded programming: no abstraction - Have to understand (computing) hardware - Have to understand peripherals (real-world interaction) - Probably no operating system - Timing * Paradoxically, we depend on them * How to get evidence it works? Currently: circumstantial. - Certification: process oriented - Probably not formally verified - Unanticipated faults and behaviour -> danger => Needs run-time monitor * Copilot implementation: - Haskell EDSL - Constant space / time dataflow: no side effects - Static schedule, ensuring sync between streams Remarks: - Fault-tolerance code is often the cause of failures! This needs separation of concerns. [1] http://vimeo.com/16676033 [2] http://hackage.haskell.org/package/atom Entry: Multiple interpretations in Scheme Date: Sun Jan 2 14:52:35 EST 2011 What I find so useful about Haskell is type classes: it allows you give multiple interpretations of the same type-level structure. This is perfect for model + implementation co-design. However, I have a vested interest in Racket (ex PLT Scheme). I think it's time to make a decision about what to do with that. To me it seems more valuable to go for Haskell as the main vehicle because of market share, which has a huge effect on: * Code reuse: more libraries for Haskell. * Finding developers: plenty of people want a Haskell job. * Experience as a side effect, useful for consulting. What would keep me in Scheme? * Hackable. Scheme is easier to abuse and I fear I'm going to run into less immediate trouble with my own bad design. Sometimes I just miss object identity. * Less type system distraction. This is really like candy: might be better to not have it in the house in the first place. The point being that if I stick to Racket, the tools will only be in-house, and I don't get enough Haskell experience to be able to take different contracts and get out of the C/C++ pit. Maybe it's time to revive an old thread: how to specify the compiler (optimizer) so it works both in Scheme and in Haskell. This way at least we have some extra redundancy in the place where it is necessary most. 
Really, the answer shouldn't be either / or, but _both_. Just need to fix the surface syntax problem. I suppose Haskell can deal with s-expressions just fine. Entry: Virtual analog Date: Mon Jan 10 10:07:39 EST 2011 It's starting to itch again.. Maybe it's time for me to start making the Haskell DSP modeling -> C code for Jack/Pd bridge. Since scope[1] I do have some working Jack code so it shouldn't be so difficult to set up the glue code. Question: should I use my own ad-hoc generator, or should I use Atom[2]? Some previous Galois presentation[3] about Copilot[4][5] I ran into also uses Atom for basic code generation. Watching the Galois video[6] makes me think that it's probably a bit overkill as I really just need fixed sample rate streams with input and output feedback. The only useful part for me seems to be the static scheduler. So it might be best to work on top of Atom directly. Let's restate the goal: Derive implementation and verification/validation from specificication. Implementation is C code that's inserted into some stream programming framework. Verification is any kind of (Haskell) operation that can be applied to the specification, i.e: - test filter transfer functions wrt. specs - compute sensitivities and numerical accuracy - perform "compile time" design of tables (i.e. filter coefs, lookup tables) - make plots for visual inspection of implicit requirements [1] http://zwizwa.be/darcs/scope [2] http://hackage.haskell.org/package/atom [3] http://vimeo.com/16676033 [4] http://corp.galois.com/blog/2010/9/22/copilot-a-dsl-for-monitoring-embedded-systems.html [5] http://hackage.haskell.org/package/copilot [6] entry://20110101-144713 Entry: Sawtooth generator Date: Mon Jan 10 12:42:02 EST 2011 I have an interesting driver case for some DSP code generation. Simple enough to understand, yet hard enough to implement in C due to manual memory management. What I want to do is to perform a proper model based design of a differentiated piecewize polynomial sawtooth generator. It has the following elements: - Ramp increment phasor (state) - Fractional part operator in phasor feedback loop - Nonlinear function evaluation (polynomial) - 2 cascaded differentiators (state + possible different implementations) I just started writing in C and what I miss is: - Inspection of my code. I.e. what is the actual transfer function graph, am I actually adding the right numbers, etc... Doing this kind of design in C and having to set up a test suite in Pd, requiring recompile and restart is a real drag. I want to get the algorithm right and want to only worry about some ad-hoc C glue when I do the Pd testing. - Automatic allocation of filter state variables and loop iteration. This always comes back and should be quite straightforward to automate. Entry: Representing difference equations Date: Tue Jan 11 11:43:04 EST 2011 Type: tex \section{Introduction} We are in need of a representation for DSP programs to serve as a basic substrate for specification, analysis and compilation to state machine. Such a representation can be expressed in terms of operations on sequences. For linear discrete update equations this leads to the well known Z--transform\footnote{ We have the relation that defines the Z--transform coincide with the usual mathematical definition of generating function or $Z(a_n,z) = \sum_0^\infty a_n z^n$. In engineering literature one sometimes finds the inverted representation $Z(a_n,z^{-1})$. 
The $s$--plane of the Laplace transform of functions on the real line can be related to the $z$--plane of the Z--transform of discrete sequences by converting functions of the real line to sequences through a \emph{sampling} operation, where the function is evaluated on integer points. This corresponds to the relation $z=e^s$.

Sometimes it makes sense to use linear \emph{difference} equations like $x_{k+1} - x_k = f(x_k)$ instead of linear \emph{update} equations $x_{k+1} = f(x_k)$, which leads to a shifted variant of the Z--transform. We will provide some means to transform between these two representations, but focus on the Z--transform as it is more direct. The $d$--plane corresponding to the shifted Z--transform is defined as $d = z-1$, which resembles the $s$--plane for small values of $s,d$.

One sometimes sees the sampling period $T$ show up in these formulations, but I've found it simpler to just stick to $T=1$ and transform coefficients and specifications accordingly.}.

The problem with these is that I need non-linear difference equations, so the s,z,d-plane representations alone do not cover the problem space. The question then becomes: how to mix paradigms? What is the substrate?

For the sawtooth oscillator we have the following equations.

\begin{enumerate}

\item A nonlinear difference equation for the oscillator. Here $f(\cdot)$ computes the fractional part $\in [0,1[$.
$$p_{k} = f(p_{k-1} + \delta)$$

\item A static nonlinear equation for the polynomial evaluation.
$$x_k = (p_k-1)p_k$$

\item A linear difference equation for the differentiator.
$$y_k = x_k - x_{k-1}$$

\end{enumerate}

Maybe the unifying point should be that these are represented as \emph{equations} and not as stateful programs, meaning the $=$ sign represents a relation and not an assignment statement. We want to be able to manipulate form and parameters.

So, are the nonlinear difference equations operations on sequences? That does seem to be the case: the element--wise $f(\cdot)$ operation can be \emph{lifted} to apply to whole sequences. This however leads to expressions where the input and output cannot be \emph{divided out}, which means that we cannot isolate a transfer function.

\section{Requirements}

From the remarks above we can try to distill the basic requirements for the representation:

\begin{itemize}

\item Expression of difference equations $x_{k+1} - x_k = f(x_k)$ and iterated functions $x_{k+1} = g(x_k)$.

\item Transformation between difference equations and other representations.

\end{itemize}

About notation. I would like to make it possible to use Haskell's ad--hoc polymorphism to express different forms of the equations. Unfortunately this becomes difficult because multiplication represents both operator application and component--wise multiplication.

Whether to pick difference equations or iterated functions as the basic representation form depends on the most likely forms of the $g$ in the equations above. For linear filters with slow dynamics compared to the sampling frequency, the difference equation is a better form. However, since those linear equations are easy to transform, it might be best to stick to iterated functions as these are more general.

Entry: Update equations
Date: Thu Jan 13 08:57:14 EST 2011

The basic functionality allows one to express difference/update equations in state space form. This should be general enough to allow for further symbolic manipulation.
z x = a * x From what I've learned in the past, it is usually better to express structure in terms of type classes, instead of algebraic data types. It seems always possible to write the ADT version as one implementation of the interface, and write i.e. a code generator as another one. So the problem is then, what do the different symbols mean? How to mesh the concept of equation (constraint) with Haskell function definitions? Will it be possible to use the function-definition-style above, or will it need ugly things like [1]. One problem with implicit definitions is that it is going to be hard to write a full set of interpretations. The conflict between on one hand linear operations and z-transforms, and the other hand nonlinear ones is already quite a limitation. Let's rephrase: When using a typeclass interface, will we need introspection to determine whether something falls in the linear or nonlinear class? Problems: - The '*' operator is typed (Num a) => a -> a -> a. This means it can't distinguish between the nonlinear variable multiplication and the linear scaling operation. Moreover, we want to be able to express "parameterized linear" expressions, i.e. a knob-controlled low-pass filter. One of the controls will have slow dynamics compared to the other one. Conclusion: when we write down an equation, we don't even know if it's linear or not. It depends on the interpretation of the variables. It seems that the simplest approach at least for the arithmetic operators is to use the standard Num classes and encode the parameter vs. stream information in an abstract type. So that really only leaves Z, which is necessarily defined on a stream. Can it be eliminated? Can we use lists of update functions instead? I.e. in the "z x = a * x" example, the representation could be (x, a * x). So, this leaves: * Variable declarations * Static Equations * Update equations What form should they take? Static equations can just be Haskell equations in let expresssions. Really, I see no other way than to use functions of the form (S,I)->(S,O) where S,I,O are vectors. Such a function then would be the base of analysis. If it is linear (which can be checked by evaluating it over a specific data type) one can make a certain analysis. What remains is to find combinators for the functions. This all seems relatively straightforward. So where is the problem? Combinators. All the crap from the other dataflow code shows up here again: * NOTATION. Picking nodes and connecting them is a nontrivial problem wrt. notation. * SHARING. An essential problem in dataflow is to manage the sharing of data. Using an implicit Num type class approach makes things more difficult as the sharing information is lost (it has a global, non-functional nature). I suppose the notation problem requires some insight that can only come through experimentation. The sharing problem however is really essential. The core of question seems to be: "Can sharing be combined with abstract Num classes?". I solve it in the Ai code through an equality relation (lacking object identity). Is there an other way? It is my impression that I am not seeing something that is right in front of my nose. From what I read (Kiselyov et al.) there seem to be two ways to do this: Monads in pure languages, and state & partial continuations in impure languages, but some form of state to keep track of the dependency graph and not just the values is essential. 
[1] http://overtond.blogspot.com/2008/07/pre.html

Entry: Combining code generation with memoization: Not without monads?
Date: Fri Jan 14 13:09:28 EST 2011

So from the previous post it seems that while it is possible to recover simple mathematical operations from _black box_ definitions, the sharing information is lost.

  data Expr = Value Integer
            | Signum Expr
            | Abs Expr
            | Add Expr Expr
            | Mul Expr Expr
            deriving (Show, Eq)

  instance Num Expr where
    a + b       = Add a b
    a * b       = Mul a b
    signum      = Signum
    abs         = Abs
    fromInteger = Value

  f1 x y = x * y
  f2 x y = x + y

  -- f1 1 2 :: Expr  =>  Mul (Value 1) (Value 2)
  -- f2 1 2 :: Expr  =>  Add (Value 1) (Value 2)

  f3 x y = a * a where a = x + y

  -- f3 1 2 :: Expr  =>  Mul (Add (Value 1) (Value 2)) (Add (Value 1) (Value 2))

The subexpression (Add (Value 1) (Value 2)) appears twice. What we would like to do is to make this sharing explicit by introducing named nodes, i.e. transforming the tree into a directed acyclic graph, which can be represented as a dictionary of nodes of primitive operations, where the operands of each node only point "backwards" in time.

It seems impossible to write a Num class instance that recovers this sharing structure. It is possible to work around this by solving a _different_ problem using the Eq relation on nodes. That effectively implements common subexpression elimination and is an expensive operation. "True" node equality that was present in the source program is lost forever. This is a _feature_ of the language called referential transparency[2] : expressions represent values and _nothing else_.

To contrast this, I've implemented a similar code generator in Scheme, and its impure features of strict specified evaluation order, global shared state and object identity make it trivial to translate expressions to dictionary form, preserving the sharing. The meaning of the generated output program (the specification of the exact operation sequence) mirrors the observable side effect of the Scheme code evaluation. In a pure language that simply does not work as all this execution machinery does not influence the computations, so you have to

* Introduce these side effects or evaluation order explicitly, i.e. through monads or combinators. This is the approach used in: ``Jacques Carette and Oleg Kiselyov. Multi-stage programming with functors and monads: eliminating abstraction overhead from generic code.'' (as accepted but unrevised) Science of Computer Programming [1]. Essentially this writes the `let' operation as a function.

* Work around them in other ways (i.e. forgetting about sharing and using common subexpression elimination to convert the exploded trees back to a DAG). This has the advantage of being able to recover expressions the programmer missed, but it completely ignores any sharing expressed in the source code. I've used this approach here[3].

So let's bite the bullet: implement the sharing Monad in Haskell.

[1] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf
[2] http://en.wikipedia.org/wiki/Referential_transparency_%28computer_science%29
[3] http://zwizwa.be/darcs/meta/ai
[4] https://agora.cs.illinois.edu/download/attachments/16883/emsoft04.pdf

Entry: DSP code gen
Date: Fri Jan 14 15:13:41 EST 2011

Conclusions up to now:

- Use instances of the Num class to perform code analysis. This has the huge advantage that pure functions in the problem remain pure in the modeling/specification, and we don't need to carry around a pile of scaffolding. It also allows maximum code reuse.
- For representing dynamics, update equations are the way to go. I am currently in dispute about using either:

  * the forward state-space form: x_{k+1} = f (x_k)

    This is conceptually more straightforward as it isolates shift/memory operators from pure functions. This facilitates analysis of the RHS update functions themselves.

  * backward time delays (old approach): x_k = f(x_{k-1}, x_{k-2}, ...)

    This needs a delay operator intermingled with the function specification. The only advantage seems to be that for implementation this is quite straightforward. Delays are just indexed state memory references.

  Putting it this way makes it fairly clear: choose state-space form to preserve purity. Note also that in many problems, the state space formulation is more natural than a multiplied-out multiple delay form. For lack of being able to make that more precise, I can mention that at least for low-frequency dynamics the state space formulation has superior numerical performance due to the large magnitude difference between coefficients in the multiplied-out form.

- Sharing. The expressions for the update equations f might contain shared subexpressions. It is important to respect the programmer's effort to write (keep!) a problem in memoized form. Due to referential transparency, implementation details like value sharing are not observable for Haskell functions, so this needs a special treatment. There seem to be two options, of which the first one seems most appropriate:

  * Sharing monad[1]. All code that uses data sharing needs to be written in monadic form.
  * Sharing recovery through common subexpression elimination.

[1] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf
[2] entry://20100219-131604

Entry: Reasoning About Staged Programs
Date: Sun Jan 16 11:28:00 EST 2011

In [1] the importance of value sharing is stressed: ``... the code duplication problem, which is among the most common pitfalls of staging.'' Also mentioned that ``Monadic staging essentially does Common Subexpression Elimination (CSE) on the generated code.''

I don't immediately see that, as that needs equality relations for code. In MetaOCaml this is not the obvious `==' and `=' :

  # .<1>. == .<1>. ;;
  - : bool = false
  # .<123>. = .<123>. ;;
  - : bool = false

So which one is it? Maybe what is mentioned in the paper is just based on pointer equality, but in that case it is not CSE as it only works for expressions that were common in the generator.

[1] http://www.owlnet.rice.edu/~ji2/papers/staging-proof-techreport.pdf

Entry: Monadic let (a failed attempt at State continuation monad in Haskell)
Date: Mon Jan 17 10:38:11 EST 2011

For practical code generation we need at least the following ingredients:

- Continuation monad for let insertion.
- State monad for variable names.

I'm lacking the ``muscle memory'' that comes with experience. How do we start this? Essentially we want to make it easy to use, employing standard `do' notation. Let's start with what that would look like:

  f x y = do
    a <- x + y
    return a * a

This gives an error message:

  Occurs check: cannot construct the infinite type: t = m t
    Expected type: m t
    Inferred type: t
  In the second argument of `(*)', namely `a'
  In the expression: return a * a

Ok, that is silly. This one typechecks:

  f4 x y = do
    a <- x + y
    return (a * a)

So what does this mean? It has the following type:

  f4 :: (Num (m b), Monad m, Num b) => m b -> m b -> m b

Both the underlying type b and the monadic type (m b) need to behave as a Num.
To summarize: Num b from a * a Num (m b) from x + y So how to define the monad? What confuses me is the type: m a -> (a -> m b) -> m b I'd expect a == b. Probably because I want to express a writer monad? I'm getting frustrated. It makes no sense... Some questions. * What is the thing that is bound by `bind'? Meaning the `a' type above? It is a code expression. * Does it make sense to actually add monadic values? Is it possible to only have 'let' be made explicit by using a do notation, and not any other form of combination? This means merging of dictonaries. Do the examples in the paper use this? * How do the two types fit in this? Why am I so conceptually stuck on "list append" ? Entry: Monadic multi-stage programming Date: Mon Jan 17 12:24:42 EST 2011 [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.pdf ``A monadic approach for avoiding code duplication when staging memoized functions.'' Proceedings of the 2006 ACM SIGPLAN Workshop on Partial Evaluation and Semantics-based Program Manipulation (PEPM), 2006, Charleston, South Carolina, USA, January 9-10, 2006, pp. 160-169. Entry: Folding Code : Giving Up Random Access Date: Mon Jan 17 12:29:42 EST 2011 I once glanced at a paper that was comparing monadic vs. "fold" approach, I believe saying that both are two sides of the same coin, one focusing on computations, the other on values. I believe it was this one[1]. So I wonder, is there a way to express the memoizing issue as a "fold" instead of a monad? This whole problem doesn't seem to "stick" in my mind. I find it all very non-intuitive. It seems that the "fold" approach should be based on concatenative code, using combinators for variable fan-out. EXPLICIT FANOUT. So it seems that this is quite a deep conflict: * Lexical scope is very handy as it provides _RANDOM ACCESS_ to nodes in a computation network. However, once reduced this structure disappears completely, which complicates specification of code generators through abstract interpretation. * (Flat?) combinators can encode the same computation network information, but are easier to stage because they express all sharing explicitly, i.e. through DUP. What is so tickling about this is that in the case of an applicative language, one introduces a sequential mechanism to tame the random access scope. On a high hand-wavy level this seems so obvious, but the nitty gritty details are difficult. But I guess it's safe to say the following: Internal variable bindings are a structural element of pure functional code that cannot be recovered from its I->O behaviour. This also makes the CPS approach so much easier to understand: every time a variable gets bound, we want to intercept it. This can be done by exposing it through a function binding. In a flat concatenative language this is trivial: the code is already in CPS and we can observe intermediate states without any problems: just modify the "interpreter" or "fold". [1] http://www.springerlink.com/content/768043006044675p/ Entry: State Continuation Monad Date: Mon Jan 17 14:16:22 EST 2011 In OCaml it's this: 'a monad = state -> (state ->'a -> answer) -> answer let ret a = fun s k -> k s a let bind a f = fun s k -> a s (fun s' x -> f x s' k) To Haskellify it we introduce some wrappers: -- State Continuation monad. The State and Answer types are dummy -- types for now.. 
type State = Integer type Answer = Integer -- See defs in -- [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.pdf data SC a = SC {runSC :: (State -> (State -> a -> Answer) -> Answer)} instance Monad SC where return a = SC (\s k -> k s a) (SC a) >>= f = SC(\s k -> a s (\s' x -> (runSC (f x)) s' k)) The thing to see here is that the monad structure does not depend on the State and Answer types. These are just passed around. Additionally, the SC type constructor takes one parameter, which will be the type implementing values/target_code or any other abstract domain. This is something I found hard to understand at first, and I'm still clumsy at expressing it: The monad interface shows only how certain wrapping structure can be composed. All the other typing context can be completely in the background. How to make those params abstract then? It's actually quite straightforward. This type-checks fine: data SC s r c = SC {runSC :: (s -> (s -> c -> r) -> r)} instance Monad (SC s r) where return a = SC (\s k -> k s a) (SC a) >>= f = SC (\s k -> a s (\s' x -> (runSC (f x)) s' k)) So, what does this do? A state-continuation object (SC) is a function that takes a state and a continuation and produces a result. The continuation of type (s -> c -> r) is _another_ function that takes a state and an input (c, code) and produces a result. In the following it is important to distinguish these two function types. The Monad operation `return' creates a SC that passes the state along with the value argument to the continuation. The Monad operation `bind' creates a new SC by composing two functions in CPS form with additional threaded state. (SC a) >>= f = SC (\s k -> a s (\s' x -> (runSC (f x)) s' k)) `a' is a state-continuation function (SC) `f' is a function that produces an SC given an object The updated state is produced by the `a' SC function and passed to the anonymous continuation function (\s' x -> (runSC (f x)) s' k). This latter continuation function invokes the SC created by `f' from the input argument `x' which is also passed to the anonymous continuation function by `a'. A more loose interpration: The `body code' that belongs to the SC `a' and the SC produced by `f' is performed in sequence. This sequential nature is a property of code written in continuation passing style. The state is simply "along for the ride" and is updated and passed on appropriately. So what does the SC do? It can be any operation that takes a state and a continuation object, performs some computation and passes the updated state and a result to a continuation. [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.pdf Entry: Using the SC monad Date: Tue Jan 18 18:49:06 EST 2011 The next step in [1] 3.4 is to specialize the monad so that let insertion can be used. First we need to figure out which type is exactly specialized, and define monad laws. Why are the value and answer types not always the same in the rules described in 3.4 [1]? As I understand it, the specialized returnN operation can't be used as a generic return, simply because it is not generic enough: returnN :: Expr -> SC Integer Expr Expr returnN x = SC f where f s k = Let n x (k s' (Var n)) where (n, s') = allocVar s The type of `x' is fixed. This should be possible to solve by generalizing the functionality of the Let from a concrete data type to an abstract operation. In [1] the operations are made ``smart'' to avoid duplication of simple expressions such as literals. This smartness should not be part of the definition of `return'. 
A straightforward rewrite is to just lift out the whole return function: maybeBind x s k = Let n x (k s' (Var n)) where (n, s') = allocVar s returnN = SC . maybeBind Questions: * Why are the expression and result types different in [1]? Maybe it was a different paper that explained this, but I do remember clearly that there was a specific reason. To simplify I'm going to take that out as I see no reason for it now apart from complicating the types. The explanation in [2] is a bit less dense and easier to follow. [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.pdf [2] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf Entry: Is Haskell just too difficult? Date: Wed Jan 19 18:04:50 EST 2011 I don't quite get it yet. Sometimes type inference doesn't seem to work and I have no clue as to why and no clue about how to make it explicit either. Seems I'm doing a bit too much at once: type class stuff and the basic idea of monadic sharing. I'm in the stage where Haskell still seems to suffer from a "move the problem elsewhere" issue. Of course, pure functional programs are really nice. Types that tell you things about those programs are also nice. Things that tell you things about types are also nice. Gluing all these things together and making them do what you want is a bit of a bumpy ride though. Granted, the result (when you manage to express yourself properly) is often very good, but preventing the programmer to write the program in the first place is maybe not such a good plan.. Anyway, to be fair, I probably need to first gain more basic understanding it before I give up. But it seems that, at least from a point of "less fuss, just functionality" an approach in Scheme using macros to solve the "input problem", and abstract interpretation implemented with the Racket unit system and _simply using plain old side effects_ would probably be a better fit for my mind. But hey, maybe I finally get it one of these days. For later reference: here's a paper that bridges "real" state and monads from a Scheme perspective[2]. [1] http://lambda-the-ultimate.org/node/92 [2] http://www.cs.indiana.edu/hyplan/jsobel/Parsing/explicit.pdf Entry: Simpler monad Date: Wed Jan 19 19:06:59 EST 2011 - Can we just use "join"? I suppose a 2-level monad wrapping would then just be joining dictionaries. I'm not so happy with the loss of generality when `bindN' makes code explicit. Maybe this is solved somewhere else with type classes and that's where I get the idea in the first place? I think I'm missing the point here: the retN operator is the let-insertion operator and intended to be different from return. It's explained better in [1]. - Explain this state/continuation business again. From the intuitive point the continuation is there to be able to patch into values passing from one computation to another. At those inspection points `let' bindings can be inserted, as part of the `bind' or `return' functions that make up the plumbing. - Is it really necessary to make a fully-serialized version? For monads it seems it is. This does away with the simple pure function approach I had before. A function that has sharing will always have a monadic return type. However, if just the behaviour is desired and not the sharing structure, it might be straightforward to flatten the result to a pure function and get to the same point. Anyway. The Monad/CPS approach does feel rather clumsy. 
[1] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf

Entry: Onward: sharing + type classes
Date: Wed Jan 19 20:09:27 EST 2011

Leaving the SC monad as it is:

  data SC s r c = SC {runSC :: (s -> (s -> c -> r) -> r)}

  instance Monad (SC s r) where
    return x     = SC (\s k -> k s x)
    (SC a) >>= f = SC (\s k -> a s (\s' x -> (runSC (f x)) s' k))

and defining explicit sharing ops:

  returnN = SC . varShare

  bindN a f = a >>= (\x -> ((returnN x) >>= f))

  varAlloc (VState n) = (n, VState (n+1))

  varShare x s k = Let n x x' where
    x'      = k s' (Var n)
    (n, s') = varAlloc s

we have essentially all that's necessary:

  test y = do
    x <- returnN y
    return (x * x)

  showSC $ test $ Lit 1
  => Let 0 (Lit 1) (Mul (Var 0) (Var 0))

Composition also works fine:

  showSC $ (test (Lit 1)) >>= test
  => Let 0 (Lit 1) (Let 1 (Mul (Var 0) (Var 0)) (Mul (Var 1) (Var 1)))

The next step is to make the returnN (return Named) operation more general so that the SC monad can also work on other values, especially Num classes. Maybe read the ``Finally, Tagless, ...'' paper again[1].

[1] http://www.cs.rutgers.edu/~ccshan/tagless/jfp.pdf

Entry: TODO
Date: Wed Jan 19 22:28:03 EST 2011

- Generalize returnN such that its type doesn't include Expr.
- Implement another instance that works on ordinary numbers.
- Figure out how to actually use this monadic composition. It seems so clumsy...

Entry: Monads, CPS: too serial?
Date: Wed Jan 19 22:44:52 EST 2011

The idea is that this strict serial character isn't really needed. The real problem as mentioned before seems to be the management of intermediate results. The conflict is about:

* Functions are simplest to define using standard expression syntax, possibly with named intermediates.
* DSP algorithms tend to work better in "constraint spec" form where inputs and outputs are explicitly named.

Conclusion: while I'm getting a bit annoyed by the overall feel of this, it's probably best to first work with it a little bit. My problem seems to be more with the unfamiliarity of using monads. I.e. what to do with multiple outputs?

Entry: Adding context to values in Haskell: not possible implicitly.
Date: Thu Jan 20 15:05:30 EST 2011

Hmm.. In Scheme it would be straightforward to take something that is purely functional, and define it over an abstract domain that has global context. This is how the abstract interpretation code in Staapl works. Can you do something similar in Haskell? What I mean is: can this be hidden inside the data type only?

I think the answer is a clear No. The reason is that Scheme's strictness and defined order of evaluation is what is actually used to make the global context evaluation work. That's exactly the part that's not accessible in Haskell, and thus needs to be coded explicitly.

Entry: Back to Scheme?
Date: Thu Jan 20 15:37:31 EST 2011

Maybe it's time to go back? My impression is that using the unit approach it will be possible to get what I want: keep the code specification simple (functionality) and move the implementation hints to the value domain implementation.

Entry: Racket (PLT) units
Date: Fri Jan 21 09:33:45 EST 2011

To create parameterized (open) code that can later be combined to create real code. All parameterized names are values, not macros. However, it is possible to define macros _as part of a signature_, in terms of the parameterized values.

To use:

* `define-signature' creates an interface that can be used in imports and exports. It is a collection of variable names and macros.
* `unit' creates a unit object, which is a first class object with "holes" as specified by the signatures mentioned in `import'. * Multiple units can be composed through "linking". * A unit with no imports can be "invoked" to execute its body. I managed to use `define-unit' and `define-compund-unit/infer', but not `unit' and `link'. There seems to be extra static information in the former case which then has to be provided to `link', while the former passes that static information automatically. [1] http://docs.racket-lang.org/reference/mzlib_unit.html Entry: Racket units and the type class trick Date: Fri Jan 21 10:43:09 EST 2011 Yesterday I was thinking that maybe a dynamic dispatch generic function approach would be useful. However, that doesn't seem to be necessary. Using units it is possible to get the necessary abstract interpretation by simply building different implementations of the num^ signature that defines the basic arithmetic operations. This is what I have: --------------------------------------------------------------------- #lang scheme ;; The interfaces. Basic language syntax and a single-function module. (define-signature num^ (add sub mul lit)) (define-signature fun^ (fun)) ;; The language interpretations. (define-unit num-eval@ (import) (export num^) (define add +) (define sub -) (define mul *) (define (lit x) x)) (define-unit num-gen@ (import) (export num^) (define (add a b) `(add ,a ,b)) (define (sub a b) `(sub ,a ,b)) (define (mul a b) `(mul ,a ,b)) (define (lit x) `(lit ,x))) ;; A test program (define-unit fun@ (import num^) (export fun^) (define (fun x y) (let ((a (add x y)) (b (sub x y))) (mul a b)))) ;; Test program linked to interpretations. (define-compound-unit/infer fun-eval@ (import) (export fun^ num^) (link fun@ num-eval@)) (define-compound-unit/infer fun-gen@ (import) (export fun^ num^) (link fun@ num-gen@)) --------------------------------------------------------------------- Now it's possible to evaluate the test function defined in the `fun@' unit using two different number models: box> (define-values/invoke-unit/infer fun-gen@) box> (fun (lit 1) (lit 2)) (mul (add (lit 1) (lit 2)) (sub (lit 1) (lit 2))) box> (define-values/invoke-unit/infer fun-eval@) box> (fun (lit 1) (lit 2)) -3 Next: access instantiated modules as values without relying on name bindings. It seems there is a very straightforward workaround. Just wrap it in a function body: box> ((lambda () (define-values/invoke-unit/infer fun-gen@) fun)) # Skipping the explicit fun2 + num-gen@ linking this can be done as: box> ((lambda () (define-values/invoke-unit/infer (link fun@ num-gen@)) fun)) # Nope, that doesn't work as expected.. No shortcut: figure out how to use the units as first class values. Entry: Racket-based DSP eDSL specs: simple abstract interpretation. Date: Fri Jan 21 11:36:03 EST 2011 Deliverables ------------ - Simple DSL, should really not be more than algebra. - API Code analysis through multiple interpretations (abstract interpretation) General Architecture -------------------- I'd like to take the best ideas from the Carette, Kyseljov, Chan in their typed approach (MetaOCaml and Haskell) and apply them to my more familiar Racket (PLT Scheme) environment: - Syntax as functions instead of data structures, implemented as module _interfaces_. ( I think this is the initial vs. final algebra idea -> FIXME: expand on this. ). The - Abstract domains interpretations implemented as module _instances_. In Racket it seems a good approach is the "sandwich" architecture. 
* Application components are implemented as units in terms of the num^ signature which implements the abstract domain on which the code operates. The programmer has complete freedom in naming and composing units. * Analysis and code generation are performed by instantiationg the num^ signature over different data types on one side, and providing different analysis methods on the other side for the exports defined by the application components. An example: to perform code generation, the monad approach can be side-stepped, and the num^ operations can be implemented in a stateful manner, i.e. using scheme parameters, with the top level defined by the driver side of the sandwich. This trades some risk for a simpler interface. This was the hardest decision, as it cuts the deepest. I intend to make up for the lack of static typing and referential transparency by providing many abstract domains for analysis. An advantage is that the recently proposed partial-continuation approach to code generation probably can be used without too much adaptation. Implementation and Sugar ------------------------ I would like to use a scheme #lang that has the following two elements: - Phase 0 bindings all defined as abstract operations in terms of the num^ interface, allowing different interpretations. - Macro language is scheme with all usual bindings available to construct non-functional compile time DSL language features. Entry: Using a single model ouput? I.e. a "main" function? Date: Fri Jan 21 13:04:23 EST 2011 It might be a good idea to standardize the naming convention for the sandwich implementation: a module is num^ -> main^ Entry: Signatures and syntax Date: Sat Jan 22 09:26:14 EST 2011 I currently have the num^ signature extended with macros +,-,* that simply rename to add,sub,mul respectively. However, andy kind of compile time behaviour can be inserted at this point. Right now that doesn't seem necessary since the point is to put all the interpretation behaviour in the abstract domain and its driver. EDIT: added arity checking Entry: First class signatures Date: Sat Jan 22 10:24:41 EST 2011 What I want is to combine a model@ unit and a num@ unit and run an analysis. This can be done by creating a compound unit that returns an expression, link it, and use `invoke-unit' to obtain the expression. (define test@ (unit (import) (export) 123)) # (invoke-unit test@) => 123 Linking a unit works as follows. For each unit in the composition, define a link-id for each export signatures (capital letters below). These link-id are used only during linking. I believe these are distinguished from signature names to allow for multiple instances of the same signature. (define analysis@ (compound-unit (import) (export DOMAIN MODEL) ;; exports unit imports ;; -------------------------------------- (link [((DOMAIN : num^)) num-anf@ MODEL] [((MODEL : example^)) example@ DOMAIN]))) Ok, it's working for staapl/absint. 
The following instantiates the `run' function defined in the example@ unit over 3 different domains: ((num-run num-anf@ example@) 1 2) ((num-run num-expr@ example@) 1 2) ((num-run num-eval@ example@) 1 2) Entry: Natural rise of partial continuations Date: Sat Jan 22 11:34:19 EST 2011 By introducing the `with-num' function that creates the dynamic context for the abstract domain implementing the `num^' interface, and introducing and the `num-snarf' macro that allows the evaluation of an arbitrary expression in the context fo an instantiated model + particular abstract num^ domain, it becomes clear that simple closures will not work any more: a closure created inside `with-num' cannot be exported outside. However, partial continuations would still work as they capture this context. So indeed, partial continuations seem to be a natural solution to context-dependent evaluation [1]. [1] http://okmij.org/ftp/Computation/Generative.html#circle-shift Entry: Sawtooth generator Date: Sat Jan 22 12:41:55 EST 2011 So, absint functionality is in place. Next: build saw tooth oscillator and figure out how to specify the main composition mechanism for recursive equations. To get this working we need to create abstract interfaces to combinators. The most important combinator is the one that wraps a state update function around a state and input/output streams. It would be interesting if we could use arbitrary functions for this, and use naming to perform the linking. (define (saw phase int frq pole) ... (values phase+ int+)) Something like: (in: (frq)) (out: (p)) (state: (-> (p i) (p+ i+))) (update: (-> (p i frq pole) (p+ i+)) Or as a macro: (define-syntax (state-space stx) (syntax-case stx () ;; Link in terms of node names. ((_ (s+ o s i) ;; state space form (f fi fo) ;; pure function body ) (let* ((-> syntax->list)) #`'(lambda (#,@(-> #'s) #,@(-> #'i)) (let-values ((fo (f . fi))) (values #,@(-> #'s+) #,@(-> #'o)))))))) Note that this does just patching based on names. In general it is more straightforward to use output node names to solve multi-out patching problems, as opposed to combinators. Entry: Stacking units Date: Sat Jan 22 16:19:00 EST 2011 One of the interesting features of Haskell is the ease at which composite algebraic operations like complex numbers and stream lifting can be implemented using the type class mechansim. Inference will then usually work, selecting the right operations. Doing this multiple times gives "stacking" of number structures. I'd like to get a similar behaviour in the Scheme metaprogramming. Entry: Different state space form Date: Sat Jan 22 16:51:58 EST 2011 I'm thinking that another form might be more convenient. Just write: (equations (s (... s ...)) (o (i ... s ...))) In this case, inputs only occur on the RHS, outputs only occur on the LHS, and state updates occur on both. That seems the most economical notation. Entry: State space form: draft logic Date: Sun Jan 23 12:16:28 EST 2011 (begin-for-syntax (define dict make-hash) ;; use mutable hashes throughout (define (false . _) #f) (define (keys d) (dict-map d (lambda (k . _) k))) (define (bool x) (if x #t #f)) (define (dict-op logic-op a b) (let ((r (dict))) (for (((k v) (in-dict a))) (when (logic-op (dict-ref b k false)) (dict-set! r k v))) r)) (define (dict-intersect a b) (dict-op bool a b)) (define (dict-difference a b) (dict-op not a b)) ) ;; Sate space model: automatic connect & sort. (define-syntax (state-space-eqs stx) ;; Add identifier to dictionary. Index on symbol, i.e. 
identifier ;; equality is implemented as symbol equality. Set identifier as ;; value to keep track of lexical context. (define (id-add! d id) (dict-set! d (syntax->datum id) id)) (define (id-add-rec! d top-expr) (let rec! ((expr top-expr)) (syntax-case expr () ((op . rands) (for ((a (syntax->list #'rands))) (rec! a))) (node (id-add! d #'node))))) (let ((state (dict)) (lhs (dict)) (rhs (dict))) (syntax-case stx () ((_ . eqs) (begin (for ((eq (syntax->list #'eqs))) (syntax-case eq () ((var expr) (begin (id-add! lhs #'var) (id-add-rec! rhs #'expr))))) (let* ((state (dict-intersect lhs rhs)) (in (dict-difference rhs state)) (out (dict-difference lhs state))) #`'((in #,(keys in)) (out #,(keys out)) (state #,(keys state))))))))) ;; test (state-space-eqs (y (+ a b)) (z (* y y))) => ((in (a b)) (out (z)) (state (y))) -------------------------------------------------------------------- The principle is simple: * Build 2 dictionaries of LHS + RHS variables (RHS requires tree traversal) * From these, build 3 dictionaries of state, in, out as: state = LHS ^ RHS in = RHS \ state out = LHS \ state * Find a way to impedance-match to ordinary Scheme expressions. Possibilities: - Explicit I/O spec - Keywords - Lexical order - Order of occurance The logic operations on the dicts might be reused from a library. Entry: State space models: representation? Date: Sun Jan 23 13:15:10 EST 2011 When using lambda abstractions to represent state space models with flat arguments and output value vector structure, some information gets lost. It would be better to find a common representation for the functional part and attach the meta information somehwere else, i.e. statically. What I mean is, a state update function can be: (lambda (s i) (let ((s1 ...) (s2 ...) (o1 ...) (o2 ...)) (values s o))) Where the input and output values are structs, and the equations are in unpacked form. Combinators that chain state space models together then could be written in terms of how they wire these structs. Context: - The `state-space-eqs' macro in the previous post doesn't do much, and it doesn't have a proper input/output spec: it would work better with names. Naming inputs can be done using keywords, but naming outputs seems to be only possible by wrapping values in a struct. - I don't really want a struct: I just want static information that keeps track of the names, so I can use the names in combinators. Put differently, at run time these can just be vectors, but at compile time there is name information. Let's start with binding the dictionaries to identifiers at compile time, and see where this leads us. Entry: Context: absint and state-space Date: Sun Jan 23 15:05:25 EST 2011 As usual I got distracted by something that seems more interesting, which is a specification syntax and associated compile time AST juggling for making the specification of state space models more straightforward. The main problem to solve is naming _outputs_ such that models can be composed. Entry: State space DSL: next Date: Sun Jan 23 17:12:00 EST 2011 It seems the state space form is quite natural, so let's make it the central abstraction. - The architecture is taylored according to the requirement for the output to be code. The state space model fits well here. - All behaviour is purely functional: state feedback is handled in the top level combinator. - We want composition of state space models. 
The reason is that different forms of models require different forms of analysis, so it makes sense to partition the solution into building blocks that can be individually tested and combine them into a single solution. - Apart from simple composition, we probably also need: - domain nesting, i.e. matrices and vectors - nested iteration, i.e. function evaluation & optimizatiobn Entry: Simpler state space syntax Date: Sun Jan 23 23:36:03 EST 2011 Maybe magic syntax isn't necessary. Simpler: (state-space [ x1 x2 ] ;; state [ i1 i2 ] ;; in [ o1 o2 ] ;; out ( ... ;; arbitrary lexical context (update (x1 ...) (x2 ...) (o1 ...) (o2 ...)))) Here `update' needs lexical context introduced by the `state-space' form to distinguish state updates from outputs. The arbitrary lexical context is necessary to be able to introduce sharing. Hmm.. The path I went on is tricky. Some notes: - expansion time parameters only work for all-at-once explicit transformers. Once the body of the transformer exits, the parameter is out of context. - syntax parameters work fine, but their generality makes it possible to mess up scoping quite easily. Entry: Syntax parameters Date: Mon Jan 24 10:12:05 EST 2011 To implement the `state-space' form above it's probably best to use syntax parameters. These are strange though, and remind me of my previous encounters with `let-syntax'. The idea is that the transformer phase code is _generated_, i.e. it doesn't now about the transformer code outside of #`(syntax-parameterize ...) It's working, but with a little bit of "foefelare". Look at the way the `state' and `out' pattern variables are propageted into the expansion-time part of the `syntax-parameterize' expression. This is a staged macro, and not a lexical substitution as I first expected! (define-syntax-parameter model-update #f) (define-syntax (model-lambda stx) (define (unpack vec ids) (for/list ((id (syntax->list ids)) (n (in-naturals))) #`(#,id (vector-ref #,vec #,n)))) (syntax-case stx () ((_ (state in out) . body) #`(syntax-parameterize ((model-update (lambda (stx) (define (context sym) (datum->syntax stx sym)) (syntax-case stx () ((_ bindings) #`(let bindings (values (vector #,@(map context 'state)) (vector #,@(map context 'out))))))))) (lambda (vstate vin) (let (#,@(unpack #'vstate #'state) #,@(unpack #'vin #'in)) . body) ))))) (model-lambda ((s1 s2) ;; state (i1 i2) ;; input (o1 o2)) ;; output (model-update ((s1 (+ s1 s2)) (s2 (- s1 s2)) (o1 (* s1 i1)) (o2 (* s2 i2))))) [1] http://macrologist.blogspot.com/2006/04/macros-parameters-binding-and.html Entry: State space composition Date: Mon Jan 24 14:31:35 EST 2011 Should be straightforward. This is a class/instance thing. A composition is: A collection of (named) instances, each from a particular class. A collection of connections. The basic one should be straightforward: only numeric i/o addresses. Compile time info could then be used to use names as added sugar. (composite-model (in) (out) ((int1 integrator) (int2 integrator)) ((int1 0 int2 0) (in 0 int1 0) (int2 0 out 0))) This is just a dataflow -> DAG expression compiler. The trouble is: what to do with state? Model it as just another output, then patch again? The most straightforward way is to just execute the program: build the graph, then traverse it from the outputs to inputs. So, the state is something handled behind the scenes. Trouble is, in order to compose, this information needs to be available. Maybe implementing a model as a struct is good enough. 
This would not allow the use of names, but does allow proper patching.

Entry: Abstract vector
Date: Mon Jan 24 15:10:03 EST 2011

Now, instead of using compile time information, it might also be
useful to use abstract interpretation on abstract vectors.  A model
can then be instantiated at compile time to be analyzed.

Entry: Runtime or compile-time?
Date: Mon Jan 24 15:33:02 EST 2011

It's difficult to think of which information needs to be run-time and
which compile-time for the compose operation.  While Scheme removes
the Haskell straitjacket tension, the freedom that is introduced with
arbitrary compile time computation and stage hierarchies is sometimes
confusing..

I think the problem can be solved with state/in/out dimensions
recorded at run time, but it requires that the dataflow sorting is
performed at run-time, which might not be a good idea.  In other
words: that would require an interpreter, and we need a compiler.

The thing is that if this meta-data is stored at compile time, then
the detour through vectors I implemented today isn't so useful: the
vector-bundled si->so interface doesn't solve composition, as the
vector dimensions still need to be known.

So, the next problem to solve is to store model meta data in
transformer bindings [1].  Using `syntax-local-value'[2] this
information can be retrieved.  The nice thing about this approach is
that names can be used in the `composite-model' form.

[1] http://docs.racket-lang.org/reference/syntax-model.html?q=struct-type&q=define-struct#%28tech._transformer._binding%29
[2] http://docs.racket-lang.org/reference/stxtrans.html#%28def._%28%28quote._~23~25kernel%29._syntax-local-value%29%29

Entry: Composite Models
Date: Mon Jan 24 19:22:33 EST 2011

Now, instead of working with the models + connections approach, we
can still use the lexical direct-dag approach as a basic device,
using the composition form only to perform the state threading, which
eventually becomes the most important bookkeeping part as state
accumulates, while inputs/outputs are reduced through connections.

The problem then is state initialization: it becomes hard to track
what corresponds to what..  Starting all states at zero solves this,
but it might be too restrictive.  Overall: it's a huge pain to handle
and there is little real gain when it's left flexible, and if initial
state is really a problem, state equations can be translated such
that they start at different initial conditions.  However, by
carefully naming the states, i.e. recursive prefixing, this can be
handled properly.

Entry: Automatic state patching
Date: Wed Jan 26 06:13:15 EST 2011

What I'd like to do is to create a lambda body that takes its state
input and output from expressions that occur in the body, without the
need for state patching.

  (composite-model (i o)
    (let-values (((o1 o2) (apply-model m i o)))
      ... )

Entry: Compile time vs. Run time
Date: Wed Jan 26 08:26:32 EST 2011

Thinking about this some more, I can see why it makes sense to have a
multi-stage language.  Emulating one in a single-stage language
requires trickery, or in Scheme, the jump to macros.

The point: it seems that the core of the DSP language is going to be
a composition mechanism with automatic state threading, keeping the
functional representation for analysis.  Keeping analysis might mean
keeping it in data-structure form and having an interpreter perform
the composition operation.

Additionally, it would be nice to use the rest of Scheme to build
composition networks: moving that part to macros will make things
uncomfortable.
It seems that these are at odds, at least in an untyped language.  An
example would be to take a vector which is the output of some
computation, and use it as filter coefficients for a parallel circuit
of stateful objects.  What seems to be the problem is the inability
to dispatch on the types of functions in such a flexible way
(i.e. on return types) as in Haskell.

So is it possible to stay away from macros for building the
composition mechanism?  The problem is the classical one for staging
and partial evaluation: when is a loop a loop, and when is it
expanded into multiple instances?  There needs to be a way to express
this.

Another conflict in Scheme: I have two very different meta levels:
the abstract interpretation part, and the macro part.  I really do
not want to complicate things by implementing composition in macros
such that it is no longer accessible on the abstract interpretation
level.

In Feldspar, array access is made abstract, such that the `map'
function can be made abstract also.  I feel this is really the way to
go.

Entry: 2 x n stages
Date: Wed Jan 26 09:09:34 EST 2011

Another pattern that seems strange is that when staging, types seem
to be really helpful to automate the plumbing.  I.e. even if you have
n stages, each stage is actually 2 stages: functions and types.  What
I don't understand is how this interplay can be made explicit.  So it
seems I have this ping-pong game going between typed and untyped
approaches.

Entry: Problems
Date: Wed Jan 26 09:11:06 EST 2011

- expression of parametric composition (i.e. combinators / loops)
- inline vs. loop

Entry: Composition at run time
Date: Wed Jan 26 09:29:35 EST 2011

What about this: a model is a struct which contains all available
data necessary to identify inputs/outputs, such that composition can
be performed at run time in a fully automated way.  This means back
to the vector-based approach.  It would also mean the construction of
permutation adapters.  Then some sugar can be constructed to allow
for a more convenient lexical scope approach.

Conclusion:

- Use vectors or lists for I/O and a struct with meta data to
  represent the expected sizes of these data structures.

- Function composition: start with a simple sequential approach using:
  - state composition, possibly nested state vectors.
  - i/o composition: use a default "fill inputs with outputs"
    approach, i.e. a stack machine.
  - permutations are simply stateless models

- Write sugar on top of this to be able to use lexical variables to
  eliminate the need for permutations.

Entry: The real problem is not C.
Date: Sat Apr 16 11:45:59 PDT 2011

It's the C preprocessor.  What I'd like to do is to make a C
preprocessor with a type system or any kind of static analysis.

Looking at my own programming style, I really can't give up on macros
when working on a C project.  Yet the CPP is exactly the root of the
problem for anything involving analysis of the "meaning" of C code,
even though it is still an essential part.  Moreover, it is
standardized, which means people use and abuse it.

[1] http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.2815
[2] http://www.ioccc.org/2001/herrmann1.hint

Entry: Opaque code in Scheme?
Date: Fri May 20 18:38:57 CEST 2011

I can never distinguish the intensional/extensional adjectives when
talking about the representation of code.  I prefer using
opaque/transparent:

  opaque:      code can be composed, but not decomposed (analyzed)
  transparent: composition + decomposition (code = concrete data)

In MetaOCaml, code is opaque.
Once constructed it cannot be modified to not violate constraints imposed by the type system. Note that in practical use, such code is often embedded in transparent data structures that do allow inspection, i.e. to implement partial evaluation. I prefer to use Scheme (Racket) for my metaprogramming experiments to make it easier to "break the rules" at the stage where I'm not sure yet how to make a program structure fully consistent. Is it useful to also only use only opaque code, or should everything be wide open? Entry: BER MetaOCaml Date: Fri May 20 18:49:44 CEST 2011 BER : Bracket, Escape and Run. Is cross-stage persistence somehow "ill-defined" or "less-well-defined-than-the-others"? [1] http://okmij.org/ftp/ML/index.html#ber-metaocaml Entry: Doing something wrong Date: Sat May 21 12:17:51 CEST 2011 Over the last year I've lost a lot of focus. Frankly I don't know what this is about any more. I found this out recently talking to someone: I just made up a new goal. It went something like this: In embedded programming there is a trade-off between abstraction (for managability) and special-casing (for resource efficiency). I am investigating ways to solve this problem better, and possibly commercialize a solution. However this is hard. It's already a paraphrase because I don't remember the exact explanation, but I'm hiding behind the fact that it's hard. It's definitely what I WANT to do, but my plan of getting there is not very realistic. Let's try to find some reasons: - I got overwhelmed. The literature I am (was) following on the subject of typed metaprogramming is not easy. - On the practical side I'm conflicted between two language models: Scheme + Macros and more restricting typed approaches (MetaOCaml or Haskel type classes). - The problems I'm trying to solve are too hard as a first attempt: * An ill-specified USB driver for Staapl/PIC18, where I don't know exactly how the thing is supposed to work and I am puzzled by how to tackle simple memory management problems, * A "generic audio DSP language". Also very much a trial-and-error thing that needs a lot more knowledge integration to get anywhere. * A virtual machine for small (64k RAM) 32-bit micros. - I'm very far removed from the status-quo. This means there is a lot of room for improvement, but also a lot of skepticism. - I don't have straight-line time; too many distractions. This is probably a motivation/efficiency issue instead of an effective lack of time. What I need is either a clear (difficult, long-term) goal and discipline and focus, or simpler goals that are self-motivating to gather more data and knowledge for the next move. Entry: HiTech C compiler Date: Mon May 23 13:30:14 CEST 2011 It seems to be quite sophisticated. Global program optimziation and use and allocation of global variables for automatic variables: PIC code is seldom recursive, and indirect access is expensive, so eliminating a call stack makes sense. [1] http://www.microchip.com/stellent/idcplg?IdcService=SS_GET_PAGE&nodeId=1406&dDocName=en535448 Entry: Separate meaning from mapping Date: Wed Jun 8 16:04:31 CEST 2011 It would be nice to find a good example to do the 2-part implementation: algorithm and abstract interpretation. One thing that is useful is a retargetable VM. It has a lot of implementation trade-offs, of which some might fit in the AI view. Entry: Macros vs. 
functions Date: Wed Jun 15 09:47:33 CEST 2011 There isn't always a real, conceptual distinction between what is a macro and what is a function if the generated language is substitution-like (referentially transparent?). I.e. in Scheme, the biggest difference is the evaluation order (strictness) and side effects. Entry: Tethered development Date: Sun Jun 12 12:41:39 CEST 2011 About that VM. Embedded software is debugging. What is nice for debugging? Introspection, meaning: everything should be observable. Entry: Partial Evaluation Date: Sun Jun 19 10:42:04 CEST 2011 This book[1] is from 1993. We're almost 20 years later and I don't really see any obvious solution. I.e. there's no GNU PE. It seems that the recent (+- 5 years) literature on staging is fueled by the impracticality of PE. In the book[1] there are 3 significant remaining problems mentioned: 1. Need for annotation to _not_ unfold loops/recursions to prevent large or infinite programs. 2. Need for the user to understand the output program. 3. Need for the user to understand the working of the PE. Looking specifically at compilers generated from interpreters, the following attributes have not yet been automated (in the book[1] from 1993) 1. Changing language style, i.e. expressions -> sequential machine code 2. Sequentializing naturally non-sequential processes, i.e. using a stack for evaluation. 3. Lowering the level of abstraction (embedding?) 4. Devising low-level data structures for implementing high-level features. 5. Value management. This list seems a bit too arbitrarily detailed to me. What it seems to say is that a PE can't "invent stuff". I.e. it can't create extra concreteness to embed non-implementable features into different, concrete code and data structures. In other words, to draw the analogy with manually solving algebraic equations: it can't insert an operation and its inverse "at the right spot". There is still some un-ecodable (un-encoded) human intelligence necessary to dream up effective structures. [1] http://www.dina.kvl.dk/~sestoft/pebook/ Entry: Getting over do-notation Date: Mon Jun 27 10:51:41 CEST 2011 Problem: I'd like to use standard Haskell Num class operations to build a code generator for DSP algorithms. The problem is that ordinary function composition does not allow the encoding of sharing information. I.e. there is no way to use abstract evaluation using syntax trees to see the difference between these two pieces of code: f1 a b = (a + b) * (a + b) f2 a b = c * c where c = a + b The 2 lines define functions, and nothing more; they have no other structure. I've tried before to work around this by adding some kind of sharing information in the data type. This +- works but is very clumsy, and I'm not sure if it can be done without leading to a computational explosion. The real problem is that joining 2 values that represent networks, somehow requires shared nodes to be re-discovered. This is difficult to do without some central registry. In Haskell this would require some form of threading or sequentialization like CPS or another Monad. Before I was thinking about dropping Haskell altogether because it would be possible to add this global context in Scheme without problems. The reason is that Scheme has an _implicit_ threading mechanism which is just the order of evaluation of the interpreter/compiler. Isn't it better to make this explicit? What I am fighting here is my fear of do-notation. Arithmetic in Haskell is so nice. Tying it down with a hyper-sequential do-notation seems overkill. 
Is there a middle ground? Entry: OCaml perform notation Date: Sat Jul 2 19:21:16 CEST 2011 On the Monad[1] WP page I found a reference to OCaml's perform-notation, (Haskell's do-notation). Maybe it made it into mainline? [1] http://en.wikipedia.org/wiki/monad Entry: Monad data flow analysis Date: Tue Jul 12 09:58:19 CEST 2011 Problem: I'm not used to thinking in monads. In fact, Bind is still very counterintuitive to me. What should bind look like for an operation that updates a data flow graph? What I find weird is the appearance of different types a and b in bind's type signature: M a -> (a -> M b) -> M b Not that much by itself, but in trying to map the "central registry" idea into an operation that produces this kind of parametetric type. M a : a value of type `a' with a context So let's say that M a is actually (Var -> a), a lookup function. These can be combined as long as it's possible to generate unique names. A candidate would be Var = Integer then M a = (Integer, Integer -> a). What I find strange is that representing this as a function seems to keep the type under control, but representing it as a "stacked tuple" lets the type signature grow. That's a new idea to me. Something is not right... very uncomfortable feeling of completely missing the point... Entry: comonads Date: Tue Jul 12 10:10:16 CEST 2011 I've also read somewhere that comonads are better for context. Maybe that's what I'm looking for really to represent SSA contexts. Entry: Monad intuition Date: Sat Jul 16 09:39:01 CEST 2011 Problem: I'm trying to express a data flow graph-constructing monad, but I find that simply "dreaming it up" doesn't work. I'm missing some key insight. It's entirely trivial to do the computation in Scheme, using a local mutable variable, and piggy-backing on the order of evaluation. I know from afar that it is possible to do it using a state-continuation monad[1]. This approach seems to be essential. I don't feel it. Maybe it has to do with piggy-backing on the order of evaluation? Because you can't do that in Haskell, it has to be made explicit somehow, and the CPS Monad does exactly that. So let's take that as the running assumption. The CPS monad is essential. Type and implementation of CPS monad's bind: bind :: Ma -> f -> Mb ((t -> r) ->r) -> (t -> (t' -> r) -> r) -> ((t'-> r) -> r) bind = \c f k -> c (\ t -> f t k) In words this is straightforward. A composition of a CPS computation c and a function f that produces a CPS computation is obtained by running c, passing it a new continuation. That continuation is created by running the CPS computation f t with the upstream continuation k, where t is the result from computation c. The funny thing here is that it is really straightforward to explain the definition, but somehow I did not make the click yet to internalize it. [1] entry://../meta/20110117-141622 Entry: Continuation monad : two return types? Date: Sat Jul 16 10:20:01 CEST 2011 The central type in the CPS monad is data CPS r t = CPS {runCPS :: (t -> r) -> r} or without wrapping (t -> r) -> r where k = t -> r is the type of the continuation function. t is the type of the value that is "returned" from a CPS computation, i.e. passed from one CPS computation to the next, and r is the overall result type of the computation. Why are there 2 return types? The mind set where this question comes from has confused me for a long time. The answer is the following: t is the result type of a CPS computation, which is passed to a continuation function (t -> r). This t is the Monad parameter. 
r is the result type of a Haskell expression that is written in CPS and does not take part in monadic computation. The key for me was to understand that r doesn't really matter: i.e. it does not take part in the bind method. It is an "impedance match" between the CPS abstraction - where the result of every computation is passed to an explicit continuation function - and the Haskell language in which it is embedded where every function needs to have a return type. Stretching the analogy a bit, a CPS function never returns, so its return type can just as well not exist. The r type is only necessary at the boundary of a CPS computation and other Haskell code. Entry: Using the CPS monad Date: Sat Jul 16 10:43:38 CEST 2011 I'm trying to record my aha here. Starting from understanding the basic idea behind the CPS monad (see previous post) I started writing down some computations. Forgive my half-assed terminology but this is an important relaxation in my head.. I started with the sine function (sin). The really strange thing here is that instead of a (CPS a) type popping up, what happens is that an (a -> CPS a) type pops up very naturally. I.e. (CPS a) is only the _result_ of a computation, i.e. a thunk already bound to an input. What happens is that I first wrote down the wrong type, thinking "taking the sine of something is a computation" (WRONG) sinCPS :: CPS Floating But that only specifies the _output_: (WRONG) sinCPS = CPS(\k -> k (sin )) The solution to that is of course glaringly obvious: sinCPS :: Floating -> CPS Floating sinCPS t = CPS(\k -> k (sin t)) note that here CPS is already partially aplied: the return value parameter is not important for the monadic mojo. Entry: Type of bind Date: Sat Jul 16 10:53:12 CEST 2011 The next question that pops to mind is: why doesn't bind have the following type? (a -> Mb) -> (b -> Mc) -> (a -> Mc) The reason is that here we _fix the input_ of each computation. I.e. this would work fine for simple pipes, but there is no room "cross product". I.e. in this type the computation (b -> Mc) is independent of a. What would make sense is this: Ma -> Mb -> (a -> b -> Mc) -> Mc which using currying of the (a -> b -> Mc) operation can be written as 2 applications of bind. Moral of the story: bind doesn't care where the first arrow comes from, only that the end of the first arrow is compatible with the start of the next. But in some cases one does want to just chang multiple (a -> M b) together. How to do that? composeM :: (Monad m) => (a -> m b) -> (b -> m c) -> (a -> m c) composeM f g x = f x >>= g Ok, found it: called "left-to-right Kleisli composition" in [1] and is denoted by (>=>). [1] http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html Entry: Learning Haskell: exploring abstractions Date: Sat Jul 16 11:59:11 CEST 2011 Abstraction is nice and powerful and a good long-term investment, but in order to learn to work with it you need to break it down completely. This takes time. Haskell abstractions tend to be so general that for a mere mortal it is quite a stretch to load it all up. The thing to do is to actually use it. See which patterns fit which abstractions. Build a new taste for looking at things differently. Entry: Lifting to CPS computations Date: Sat Jul 16 12:20:13 CEST 2011 Why do i need to lift to (a -> M b) and not (M a -> M b)? cps1 f a = CPS (\k -> k (f a)) cps2 f a b = CPS (\k -> k (f a b)) The thing is that these are really just return. What I find strange is that this seems so "easy". I.e. 
instead of f1 a b = do sa <- sinCPS a sb <- sinCPS b mulCPS sa sb why not do f1 a b = return ((sin a) * (sin b)) The thing is that this monad does not allow room for tracking the data flow. I have a CPS monad but I'm not using it yet for what I intend = to construct a SSA node list. Then of course (different monad!) these two forms are not equivalent. The rule should be: 1. only use "blessed" lifted ops 2. do not pass result of nested computations to return Entry: CPS + memoization Date: Sat Jul 16 15:29:03 CEST 2011 What needs to happen is this: - intercept "return value" == value passed to continuation by modifying the bind method of the CPS monad. - create a new variable in the store and associate it to this binding. - pass the variable reference to the real continuation. What does bind look like in that case? The problem is types.. Time for looking at the cheat sheet. Lets first try to pass an opaque state object along with the CPS control flow. This requires a modification from the original pure CPS: bindSC (SC c) f = SC c' where c' k = c (\v -> cpsComp (f v) k) which in non-tagged form looks like bindSC c f = c' where c' k = c (\v -> (f v) k) What we need to do here is to allow a state s to travel along side the value v. I.e. to complete the following such that s is properly passed: (!) bindSC c f = c' where c' k = c (\v s -> (f v) k) The only way to pass this correctly is to pass it to the computation, thus: bindSC c f = c' where c' k s = c k' s where k' v s' = c'' k s' where c'' = f v Which changes the type from (v -> r) -> r for the CPS monad to k -> s -> r k = (v -> s -> r) for the SC monad. The operation of bind is as follows: a composition of two SC computations c and one generated by f is a computation c' which takes two arguments: the overall return point and the input state. The composite computation c' first invokes the computation c, passing it the input state s and a new contuation k' which determines what happens after c is finished. The continuation k' takes the value produced by computation c and the updated state s'. It uses the value to construct a new computation c'' from f. This computation c'' is then passed the state as it was after c', and the overall continuation = return point of the value and state produced by c''. In lambda notation without intermediate names: bindSC c f = c' where c' k s = c (\v s' -> (f v) k s') s It seems to be a bit more readable if the state is passed as the first parameter for both the CPS computation c,c' as the continuation. bindSC c f = c' where c' s k = c s (\s' v -> (f v) s' k) Entry: Allocating nodes Date: Sat Jul 16 17:21:36 CEST 2011 Bon. I have threading, now it's time to find out how to do the memoization without getting into type trouble. It seems simplest to do this in "return", because that's where we go from the world of nested expressions to the world of CPS. The following doesn't work though, it's not generic enough. returnSC v = c where c s k = k s v returnSC' v = c where c s k = k l (length l) where l = v:s Couldn't match expected type `a' against inferred type `Int' `a' is a rigid type variable bound by the type signature for `return' at Expected type: Cont r s a Inferred type: [a] -> Int -> r The reason for this is that the monad parameter a is fixed to Int, the return type of length. Look like it's to to really go to the cheat sheet. If I recall, this problem is mentioned in the paper[1]. Hmmm.. 
at first glance I can't find it, but what happens in the code is that the let that's inserted does seem to have the proper type for the variable. Maybe the monad should be kept, but memoization should be made explicit. I wonder if [2] has better info.. [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.pdf [2] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf Entry: Separate memo function Date: Sat Jul 16 18:39:40 CEST 2011 What I want to do is to allow for the insertion of a function that intercepts the value, stores it in the state, and passes the state and reference on. What should this be? It should be a SC computation that has a specialized type. Following [1][2] this is really just a specialized form of return: tmpvar n = var ("R" ++ (show n)) returnN val = SC c where c table k = k table' ref where table' = (ref,val):table ref = tmpvar $ length table This is it. It all just works now. Wicked. [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.pdf [2] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf Entry: Code gen and eval? Date: Sat Jul 16 20:38:41 CEST 2011 How to do both SSA code gen and normal numeric evaluation? The returnN macro seems to limit the options quite a bit, i.e. it fixes the term type. Can we keep it general? Entry: Mixing expressions (trees) and sharing (directed, acyclic graphs) Date: Sun Jul 17 10:36:56 CEST 2011 ( So much effort for something seemingly so trivial. Granted, the effort was largely due to ignorance. I understand the state-cont monad better now, and playing with it a bit the structure seems quite powerful. ) Can we now write such monadic functions as (returnN . sin) to obey a type class such as Num? No. The reason for this is that Num allows nesting s.a. sin(sin(sin x)) and a + (b + c). Such tree-structured expressions with intermediate nodes are exactly what we're trying to avoid. It is the bind operation (the arrow in do-notation) that introduces order of evaluation so sorely needed to incrementally build the data flow graph incrementally as memoized values. So, what do we have here? A do-notation that resembles ordinary SSA form and a collection of functions with types: a -> M a a -> a -> M a a -> a -> a -> M a ... Now, this notation sucks. Mostly because it's a pain in the ass to use at points where it would be straightforward to infer from simple expressions. What about using a middle ground here? Allow nested expressions (simply num class) but use the monad only when sharing needs to be expressed? Let's try this. The prototypical function would be square. I've noticed a quaint difference here between sharing the input or output. See the functions below and their output. square x = do vx <- returnN x return $ vx * vx f x = do x2 <- square (1 + x) x4 <- square (1 + x2) square x4 *Main Control.Monad> toSSA $ f $ var "a" R0 = (add 1 a) R1 = (add 1 (mul R0 R0)) R2 = (mul R1 R1) return (mul R2 R2) square' x = returnN $ x * x f' x = do x2 <- square' (1 + x) x4 <- square' (1 + x2) square x4 *Main Control.Monad> toSSA $ f' $ var "a" R0 = (mul (add 1 a) (add 1 a)) R1 = (mul (add 1 R0) (add 1 R0)) R2 = R1 return (mul R2 R2) I did not realize this subtlety, but in retrospect it is obvious: returnN introduces the variable, so calling it on the input of square or the output of square' is quite a difference. In square' duplication happens because the input is used twice, and the input might be an expression. 
Simply judging from use, it seems best to let functions themselves decide whether they should create a shared node for their inputs or not: whenever it's duplicated in the function body you share, otherwise not. It seems also best to introduce a different name for returnN, i.e. "node" or "name" or "var" or a shorter variant of those. Btw, is that the same trick as the "i" in [1] which embeds any monad in tge CPS monad? Moral: If you DUP, name it. The drawback in Haskell is that you have to do this explicitly: it's easy to use a variable more than once. So yes, it's possible to express this but it still feels a bit clumsy. The interesting thing about a concatenative language is that the DUP is always explicit, so you don't need to invent separate notation. [1] http://blog.sigfpe.com/2009/01/rewriting-monadic-expressions-with.html Entry: Can I have let? Date: Tue Jul 19 17:58:38 CEST 2011 So is it just annoying, or is there something else going on? - not enforced, might miss a duplication - ugly, i'm not used to it - no Num class ("true" monad) + distinguish between share & dup (macro semantics) Looks like I need to accept it really is a monad, and requires threading. The other approach (merging graphs) was very convenient this way, because the expression structure is intact. But is that structure really necessary? Most of the magic I want to use is Applicative, so should work just fine. Probably I just need to start using it to fill in the familiarity gaps.. Entry: Sharing Date: Tue Jul 19 18:14:35 CEST 2011 In this: f x = do x2 <- square (1 + x) x4 <- square (1 + x2) square x4 it's impossible to do: square (square (1 + square (1 + x))) because of types! So sharing propagates; it is mandatory. A computatin with sharing information will never be able to take the place of one without sharing, but conversely it's straightforward to inject pure computations into the sharing monad. So what makes this a sharing monad? Internally, it's a much more powerful state-continuation monad, but the limited amount of operations made available limit the effects. If the implementation of the monad is kept hidden (no constructors) then there are only 2 abstract constructors: the default "return" and the "returnN" sharing operator. Because the monad type "protects" the output value in such a way that only the bind operator can access it, all composition logic can be incorporated in bind itself, and there is no way the high-level sharing behaviour can be violated once it is attached, meaning once a type is wrapped in a monad. A monadic value is completely opaque, with the only promise that it has burried inside of it a certain type that can be fished out by bind. Entry: Next Date: Sat Jul 23 11:09:25 CEST 2011 I'd like to move forward. This means effective code generation for a code module. Let's use Pd first. Let's use the example of [1], which has some elements that are usually hard to mix: - integer math - conditionals - nonlinear functions - IIR filters Some questions to answer: - Can we use a type-directed approach to use both 1. typed non-shared evaluation-only interpretation and 2. untyped shared dataflow graph construction. - How to keep a typed dataflow graph? [1] entry://20110122-124155 Entry: Keeping types Date: Sat Jul 23 11:15:52 CEST 2011 Currently the accumulation of the dataflow graph happens for a data type representing untyped terms. I don't see a way to retain typing information. Does this matter? Is it possible to keep typing information for a different class, i.e. 
define a dummy returnN for Num classes, such that a stricter discipline is followed in the non-shared case, but flattening happens for the shared case? Is it possible to record types dynamically? Is it necessary? (Maybe typed primitives is enough.) Entry: Term type Date: Sat Jul 23 11:26:36 CEST 2011 The problem might be that my term type is not not generic enough, and should at least have integers and floats. Booleans could be represented as ints. I'm following the road to distinguish a term type Code a from a typed term type: -- Code with type annotation and type conversions. data TCode = TCfloat (Code Double) | TCint (Code Integer) | TCbool (Code Bool) deriving Show Conditional jumps and ->Bool projection operations are still missing. It looks like we're going to need a way to incorporate control flow graphs at some point. However, let's first solve comparisons and try to keep control flow in the form of operators which would allow for a higher level of representation. Entry: General -> concrete type? Date: Sat Jul 23 15:08:45 CEST 2011 I'm trying to express something that doesn't seem to be possible without special tricks. Questions: 1. what exactly goes wrong and 2. do I need this? -- Expr with type annotation and type conversions. data TExpr = Tfloat {teFloat :: Expr Double} | Tint {teInt :: Expr Integer} | Tbool {teBool :: Expr Bool} deriving Show teLift1 :: (Expr a -> Expr a) -> (TExpr -> TExpr) teLift1 f = l where l :: TExpr -> TExpr l (Tfloat x) = Tfloat (f x) l (Tint x) = Tint (f x) l (Tbool x) = Tbool (f x) l _ = error "teLift1: Type not supported" The error is: Couldn't match expected type `Double' against inferred type `Integer' Expected type: Expr a Inferred type: Expr Integer In the first argument of `f', namely `x' In the first argument of `Tint', namely `(f x)' Failed, modules loaded: none. I think I'm making a conceptual error. My assumption is that while `f' has a generic type, at compile time it has a clear and definite implementation and can't be specialized to the 3 uses of types. It seems that the approach is flawed. Is there another way to trick the compiler into specializing this into 3 different versions? The following typechecks, but just moves the problem upwards. Maybe it is enough though to make this work for instance declarations. teLift1 ff fi fb = l where l (Tfloat x) = Tfloat (ff x) l (Tint x) = Tint (fi x) l (Tbool x) = Tbool (fb x) teLift2 ff fi fb = l where l (Tfloat x) (Tfloat y) = Tfloat (ff x y) l (Tint x) (Tint y) = Tint (fi x y) l (Tbool x) (Tbool y) = Tbool (fb x y) instance Num TExpr where (+) = teLift2 (+) (+) (+) (*) = teLift2 (*) (*) (*) abs = teLift1 abs abs abs signum = teLift1 signum signum signum fromInteger = Tint . lit It doesn't seem to be such a problem though. Just a bit of duplication to "punish me" because I want to do automatic type conversion ;) Entry: Types for DSP language Date: Sat Jul 23 16:06:17 CEST 2011 To simplify type conversions, let's stick to two types: machine integers and floating point numbers. Integers can represent bools, but floats and ints are fundamentally different. I don't like the Pure Data approach of encoding ints in floats. Many algorithms need exact counting and using floats for that is too dangerous and unnecessarily inefficient for low-level rep. Now it's important to distinguish two factors: - embedding of constants - being able to qualify operations or values as typed The embedding isn't such a big deal. 
I don't think I ever ran into integer constants that can't be embedded in a double apart from raw memory addresses. The real issue is typed computations. I run into this problem with (Expr a) as (Expr Double) or (Expr Integer). The type only refers to constants, not necessarily to the type of variables (which are essentially untyped). Entry: Typed syntax Date: Sat Jul 23 16:29:38 CEST 2011 I think I'm running into the limitation that is the central idea of [1]. It is possible to embed a typed language into Haskell, OCaml without fancy types if a "final" representation is used (generic functions in Haskell's type classes or OCaml's modules) instead of a "intial" one (straight up data types). I'm using the coalgebraic representation, but I'm trying now to work around some type limitations that hint at the impossibility of being able to pull that off. What I don't understand is how this is seemingly sidestepped in MetaOCaml. I'm confused.. I think I'm missing one layer of abstraction: my representation of code (Expr, TExpr) should be final, not initial. Is it possible to glue that to ordinary Num classes and my sharing monad? How far am I getting off-track here? :) [1] http://www.cs.rutgers.edu/~ccshan/tagless/jfp.pdf Entry: Very simple type system -> inference is trivial. Can we ignore? Date: Sat Jul 23 16:55:04 CEST 2011 What I'm trying to do is almost trivial. There is no difficult type inference. In fact, it's quite simple: types promote from int -> float if necessary, and need to be explicitly cast down from float -> int. This isn't mathematically correct but will probably be the simplest thing to do. Every expression is annotated with its type, and this is "inferred" through straight dataflow. Let's try that first. The idea is this: Expr a -> Expr t a with t representing the type annotation. Put all lifting and converting in a type class (TypeTag t => ...) Entry: Tagless.. Date: Sat Jul 23 17:08:37 CEST 2011 I need to distinguish two features: - representation of typed terms - terms implementing Num class Will that work, or am I trying to stack incompatible parts? I.e. a Num instance "blesses" a datatype. Can the final representation be a data type, or is it a class? Let's just try. Fantastic. It's still possible to use Num classes for the abstract syntax! type Tfloat = Double type Tint = Integer -- Class represents syntax, instances represent semantics. class Symantics repr where int :: Tint -> repr Tint float :: Tfloat -> repr Tfloat addi :: repr Tint -> repr Tint -> repr Tint addf :: repr Tfloat -> repr Tfloat -> repr Tfloat -- Interpretation 1: terms data Term a = Int a | Float a | Addi (Term a) (Term a) | Addf (Term a) (Term a) deriving (Show, Eq) instance Symantics Term where int = Int float = Float addi = Addi addf = Addf -- Allow (Symantics repr) => Num (repr Tint) instance (Show (repr Tint), Eq (repr Tint), Symantics repr) => Num (repr Tint) where (+) = addi fromInteger = int -- Left out dummies f a = a + a This gives: > :t f f :: (Num a) => a -> a > f (int 1) :: Term Tint Addi (Int 1) (Int 1) Entry: Symantics => Num, or the other way around? Date: Sat Jul 23 19:14:32 CEST 2011 So I wonder maybe it's simpler to work the other way around? Make (Symantics repr) depend on (Num (repr Tint), Num (repr Tfloat)) This could be a mistake though. The point is not to implement Num, the point is to have a good description of the syntax. It might be necessary to use just Symantics but not Num. I.e. we want to have a simple language def and if possible also be able to lift Num -> Symantics. 
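For contrast with the Term interpretation above, the same Symantics class
admits a direct evaluator, which is what makes the multiple-interpretation
story work.  A minimal sketch, assuming only the small class from the
previous entry (int, float, addi, addf) and the Num instance given there;
the newtype name EvalR is my own and does not necessarily match the Eval
type used further down:

  newtype EvalR a = EvalR a deriving (Show, Eq)

  instance Symantics EvalR where
    int   = EvalR                              -- literals evaluate to themselves
    float = EvalR
    addi (EvalR a) (EvalR b) = EvalR (a + b)   -- integer addition
    addf (EvalR a) (EvalR b) = EvalR (a + b)   -- float addition

  -- The polymorphic f from above then runs in both interpretations:
  --   f (int 1) :: Term Tint    =>  Addi (Int 1) (Int 1)
  --   f (int 1) :: EvalR Tint   =>  EvalR 2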
Entry: Type conversions
Date: Sat Jul 23 19:32:29 CEST 2011

How to do these:

  -- Type conversions
  i2f :: repr Tint   -> repr Tfloat
  f2i :: repr Tfloat -> repr Tint

The trouble seems to be that 2 types don't fit well in (Term a) or
(Eval a) when they are mixed.  How to work around that?

For Eval it seems straightforward:

  i2f = Efloat . fromInteger . eInt
  f2i = Eint . floor . eFloat

For Term I don't see it:

  i2f :: Term Tint   -> Term Tfloat
  f2i :: Term Tfloat -> Term Tint

Due to the recursive relationship, this doesn't seem to be possible
because you can't graft both term types together.  Should this become
(Term i f) ?

I need to look at the cheat sheet[3].  This makes it clear that the
repr type itself doesn't need to encode the float/integer
distinction: the type parameter already does.  So what about casts?
The code in [3] either interprets, or compiles to a GADT.  The
MetaOCaml code doesn't need that.  Is the code type in MetaOCaml
somehow special, i.e. GADT-like?

So, the problem: interpretation of i2f and f2i is not a problem: it
is trivial to lift :: Double -> Integer to :: Eval Double -> Eval
Integer.  However, using simple recursive types it seems impossible
to construct something with type :: Term Double -> Term Integer.

But, is this really a problem?  If I use the same approach as in
MetaOCaml - to generate code and never post-process it - then it
might be simpler to skip this step and dump out a textual
representation straight from the syntax class.

[1] http://www.cs.rutgers.edu/~ccshan/tagless/jfp.pdf
[2] http://www.cs.rutgers.edu/~ccshan/tagless/talk.pdf
[3] http://okmij.org/ftp/tagless-final/Incope.hs

Entry: GADTs for typed assembly language
Date: Sat Jul 23 21:27:22 CEST 2011

In [1] GADTs are used for representing a typed assembly language.
The reason seems to be that it's not possible to construct something
of the following type using just data types:

  Term a -> Term b

which was necessary for the type conversion functions:

  i2f :: repr Tint   -> repr Tfloat
  f2i :: repr Tfloat -> repr Tint

[1] http://okmij.org/ftp/tagless-final/Incope.hs

Entry: Textual representation: no intermediate data type
Date: Sat Jul 23 23:13:42 CEST 2011

I guess this is a similar question as: can we have a writer monad
implementation?  Anyway, the string concatenation code is really
straightforward:

  data Print a = Print String deriving (Show, Eq)

  prt x = Print $ show x
  prt1 op (Print x)           = Print $ concat [op,"(",x,")"]
  prt2 op (Print x) (Print y) = Print $ concat [op,"(",x,",",y,")"]

  instance Symantics Print where
    int   = prt
    float = prt
    isign = prt1 "isign"
    iabs  = prt1 "iabs"
    iadd  = prt2 "iadd"
    imul  = prt2 "imul"
    fsign = prt1 "fsign"
    fabs  = prt1 "fabs"
    fadd  = prt2 "fadd"
    fmul  = prt2 "fmul"
    i2f   = prt1 "i2f"
    f2i   = prt1 "f2i"

This essentially ignores the type parameter, so it seems that the
same technique can be used for terms.

Entry: Next
Date: Sun Jul 24 01:17:40 CEST 2011

The good news is that the final representation is possible together
with the Num class.  Does sharing mix like that also?

Entry: Higher-order abstract syntax
Date: Sun Jul 24 10:43:55 CEST 2011

It's quite neat really: instead of using a concrete type +
operations, use a type class that can abstract over many
representation types.  The downside is that when you want to make the
representation into a data type, something more powerful than
ordinary algebraic types is necessary.  However, functional
representation isn't an issue, and since terms are necessarily
well-typed, an untyped rep might also be enough to generate assembly
code.
type Tfloat = Double type Tint = Integer type Tbool = Bool type Op2 repr t = repr t -> repr t -> repr t type Op1 repr t = repr t -> repr t class (Show (repr Tint), Eq (repr Tint), Show (repr Tfloat), Eq (repr Tfloat)) => Semantics repr where -- Literals int :: Tint -> repr Tint float :: Tfloat -> repr Tfloat bool :: Tbool -> repr Tbool -- Integer primitives isign :: Op1 repr Tint iabs :: Op1 repr Tint iadd :: Op2 repr Tint isub :: Op2 repr Tint imul :: Op2 repr Tint idiv :: Op2 repr Tint imod :: Op2 repr Tint -- Float primitives fsign :: Op1 repr Tfloat fabs :: Op1 repr Tfloat fadd :: Op2 repr Tfloat fsub :: Op2 repr Tfloat fmul :: Op2 repr Tfloat fdiv :: Op2 repr Tfloat fsin :: Op1 repr Tfloat fcos :: Op1 repr Tfloat fexp :: Op1 repr Tfloat -- Conditional if_ :: repr Tbool -> repr a -> repr a -> repr a -- Type converssions i2f :: repr Tint -> repr Tfloat f2i :: repr Tfloat -> repr Tint Entry: Merging HOAS & Sharing monad Date: Sun Jul 24 11:10:36 CEST 2011 Now for the real problem: how to merge the state-continuation monad and the higher order syntax approach? They seem to live in different worlds. Probably writing the sharing code more abstractly as a class would help a bit. Looking at the type of returnN :: TExpr -> SC r Bindings TExpr it seems that making TExpr generic here would work. This would then probably be (repr Tfloat) and (repr Tint). Is this the trick? : Represent a generated variable as a "hole", i.e. a function. Anyways, this compiles: data Bindings t = Bindings [(t,t)] deriving (Show,Eq) class SSA t where tmpvar :: Int -> t -> t -- Note that returnN is not as generic as return. returnN :: SSA t => t -> SC r (Bindings t) t returnN val = SC c where c (Bindings table) k = k (Bindings table') ref where table' = (ref,val):table ref = tmpvar (length table) val Then I tried to abstract this into a monad but ran into a problem. Apparently that constraint on the type of returnN makes it impossible to have generic monad behaviour. -- Abstract SSA as a monad data SSA r t = SSA { unSSA :: SC r (Bindings t) t } instance Monad (SSA r) where return = SSA . returnN (SSA sc) >>= f = SSA (sc >>= (unSSA . f)) So this needs a different type class. Maybe just add the sharing return to the letN type class (renamed from SSA / tmpvar) -- The Let class abstracts term naming. Given a unique number and a -- term, it returns a term unique to that number that can be used as a -- name. class LetN term where letN :: Int -> term -> term returnN :: term -> SC r (Bindings term) term -- Default implementation inserts into the table. returnN val = SC c where c (Bindings table) k = k (Bindings table') ref where table' = (ref,val):table ref = letN (length table) val The problem is now how to combine this with Symantics. I don't feel any understanding bubbling up here.. A problem I ran into before is the type of the bindings list that is threaded through. This type has to be constant if it's to serve as a state parameter for the SC monad. Is there really a problem? Essentially there are only 2 types the store needs to take. This could go into two lists. Maybe the store should also be a typeclass then? Entry: Representing "let" in the final language Date: Sun Jul 24 13:44:11 CEST 2011 It seems that in order to use the sharing approach, it might be best to define it directly in the language itself, or as an extension to Symantics. Following the lam amd app examples in the tagless paper it seems that there need to be 2 elements: a term to bind, and a body to bind it in. The result is another term. 
class Symantics r => SymanticsLet r where -- term -> body -> result letS :: r a -> r (a -> b) -> r b Let's see if this can be implemented. Indeed it can be. I came to this: class Symantics r => SymanticsLet r where -- name -> body -> value letS :: r a -> r (a -> b) -> r b body :: (a -> b) -> r (a -> b) instance SymanticsLet Eval where body = Eval letS (Eval term) (Eval body) = Eval $ body term expr6 = letS (int 3) (body $ \x -> x * x) The "body" is probably not necessary by building it into the let. The "lam" in the paper maps (r a -> r b) -> r (a -> b). The input is a body, the output is the representation of a function. So maybe I should use (r a -> r b). So it seems to be even simpler then: class Symantics r => SymanticsLet r where letS :: r a -> (r a -> r b) -> r b -- Default implementation just substitutes letS term body = body term instance SymanticsLet Eval instance SymanticsLet Code An implementation that needs to capture the sharing structure then needs to fill out this operation differently. Entry: Let : summary Date: Sun Jul 24 17:22:02 CEST 2011 class Symantics r => SymanticsLet r where letS :: r a -> (r a -> r b) -> r b letS term body = body term instance SymanticsLet Eval instance SymanticsLet Code expr6 = letS (int 3) (\x -> x * x) The cool thing is that this "just works" for Eval. It doesn't need name generation: embedded language uses the binding structure of Haskell. However, the code implementation doesn't do what it's supposed to do, which is to generate names instead of duplicating terms. This requires some extra effort. The problem however is that this brings me back to an old foe: if the dictionary is encoded in the representation type, what to do when 2 branches with different context are merged? Maybe this simply can't happen? I'm confused so let's just start building it and see where it blocks. I got to this, which does the right thing for expr7 below but feels horribly wrong: data Code a = Code { dict :: [(String,String)], code :: String } deriving (Eq, Show) code' x = Code [] $ show x code1 op (Code d x) = Code d (codeApp [op,x]) code2 op (Code d x) (Code d' y) = Code d (codeApp [op,x,y]) code3 op (Code d x) (Code d' y) (Code d'' z) = Code d (codeApp [op,x,y,z]) codeLet (Code dict term) body = body (Code dict' var) where var = "R" ++ (show $ length dict) dict' = (var, term):dict instance Symantics Code where ... let_ = codeLet expr7 = let_ (int 3) (\x -> let_ (x*x) (\xx -> (xx * xx))) *Final> expr7 :: Code Tint Code {dict = [("R1","(imul R0 R0)"),("R0","3")], code = "(imul R1 R1)"} It seems that the dictionary should go somewhere completely different. But where? Let's first try to construct something that breaks. Since contexts are nested, the only case in which they can be different is if one is contained in the other. Is that enough? Hmm... i'm going down the drain here. There has to be a much simpler way. There seems to be really no way around making a linear traversal over the code to replace all intermediate nodes with name and collect the original node contents. However, it seems impossible to do this for the code1, code2, code3 operations above. Or is it? Why can't repr be a Monad? Maybe there's no reason at all why it cannot be one. code2 op (Code x) (Code y) = Code $ codeApp [op,x,y] can this be replaced by something like: code2 op x y = do vx <- x vy <- y return $ print op vx vy I.e. for the 2-ops the type could be M a -> M a -> M a Wait a minute. Can't this be implemented in terms of let_ directly? 
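As a sanity check on that last question, here is a minimal sketch of what a monadic representation of the 2-operand case could look like, using the standard State monad from Control.Monad.State (mtl). This is not the code above and not the approach the following entries settle on; the names fresh, app, op2 and lit are illustrative only.

  import Control.Monad.State

  type Dict = [(String, String)]

  -- Allocate a fresh name for a term and record the binding.
  fresh :: String -> State Dict String
  fresh term = do
    d <- get
    let v = "R" ++ show (length d)
    put ((v, term) : d)
    return v

  app :: [String] -> String
  app xs = "(" ++ unwords xs ++ ")"

  -- The 2-operand case: run both operand computations, then bind the result.
  op2 :: String -> State Dict String -> State Dict String -> State Dict String
  op2 o mx my = do
    x <- mx
    y <- my
    fresh (app [o, x, y])

  lit :: Show a => a -> State Dict String
  lit = return . show

  -- runState (op2 "imul" (lit 3) (op2 "iadd" (lit 1) (lit 2))) []
  --   ==> ("R1", [("R1","(imul 3 R0)"),("R0","(iadd 1 2)")])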
Entry: Intuition
Date: Mon Aug 1 09:05:55 CEST 2011

The problem seems to really be that I don't understand it enough. I understand the bind operation, I understand lexical scope, but I don't understand how far-reaching their interaction can be. Why is this so hard?

Entry: Joining dictionaries
Date: Mon Aug 1 09:30:47 CEST 2011

Is the following correct? The `codeLet' operation is relatively trivial: create a binding and propagate the binding and augmented dictionary. Since let_ is always strictly nested there is no confusion about how to propagate the dictionary.

The confusion starts with multi-input operations. Since the dictionary travels with the variable instead of being stored in a global state object, it is necessary to reconstruct that state object. Because we know that dictionaries have to be nested, reconstruction can be done by just inspecting dictionary size:

  dictJoin d1 d2 = if (length d1) > (length d2) then d1 else d2

  code' x = Code (show x) []
  code1 op (Code x d) = Code (codeApp [op,x]) d
  code2 op (Code x d) (Code y d') = Code (codeApp [op,x,y]) d''
    where d'' = dictJoin d d'
  code3 op (Code x d) (Code y d') (Code z d'') = Code (codeApp [op,x,y,z]) d'''
    where d''' = dictJoin (dictJoin d d') d''

  codeLet (Code term dict) body = body (Code var dict') where
    var   = "R" ++ (show $ length dict)
    dict' = (var,term):dict

So what happens when dictionaries are not nested, i.e. when scopes fan out? I.e. is that assumption actually correct? Nope, it's not. In the following, one of the dictionaries is lost:

  expr9  = let_ (int 1) (\x -> (x * x))
  expr10 = let_ (int 2) (\x -> (x * x))

  *Final> expr9 :: Code Tint
  Code {code = "(imul R0 R0)", dict = [("R0","1")]}
  *Final> expr10 :: Code Tint
  Code {code = "(imul R0 R0)", dict = [("R0","2")]}
  *Final> (expr9 * expr10) :: Code Tint
  Code {code = "(imul (imul R0 R0) (imul R0 R0))", dict = [("R0","2")]}

There is no way around threading a single state == CPS conversion. No free lunch. So it shouldn't be that hard, but the "data rep" should be a state continuation monad.

Entry: Data vs. Code
Date: Mon Aug 1 11:11:55 CEST 2011

I've expressed this before[1]: monads are not about data structures, they are about computations. Of course, you "hook" the computations onto a data structure (a parametric type that wraps one end of a computation) but the structure of the computation itself is quite arbitrary.

[1] entry://../compsci/20110723-141330

Entry: Struggling with SC monad
Date: Mon Aug 1 11:41:51 CEST 2011

Trouble is that the Show, Eq, Integral, ... type classes are required. I'm doing something wrong because it seems that the semantics need to be more general. Can these be CPS-ified also?

I don't know what I'm doing.. It's too abstract. The funny thing is that some form of "local-consistency" reasoning can help to get the code right, and then when it's finally there you can "interpret" the composition. This is a _strange_ way of working!

So what is the core problem? It's juggling the types. I got this to work, which I find surprising:

  type Dict = [(String,String)]
  type Res  = (String,Dict)

  returnN :: String -> SC Res Dict String
  returnN x = SC c where
    c d k = k d' v where
      v  = "R" ++ (show $ length d)
      d' = (v,x):d

  data SSA t = SSA (SC Res Dict String)

  ssa2 fn (SSA x) (SSA y) = SSA $ do
    vx <- x
    vy <- y
    returnN $ concat ["(",fn," ",vx," ",vy,")"]

The awesome bit here is that the state-continuation monad is completely behind the scenes, meaning the monadic parameter (which is a String!) is not the same as the t parameter of SSA: t is fake!
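For reference, the SC (state-continuation) monad used above is not shown in these entries. A minimal definition consistent with how returnN and ssa2 use it could look like the following sketch; the field name runSC and the exact parameter order (result, state, value) are assumptions.

  newtype SC r s a = SC { runSC :: s -> (s -> a -> r) -> r }

  instance Functor (SC r s) where
    fmap f m = SC $ \s k -> runSC m s (\s' a -> k s' (f a))

  instance Applicative (SC r s) where
    pure a    = SC $ \s k -> k s a
    mf <*> ma = SC $ \s k ->
      runSC mf s (\s' f -> runSC ma s' (\s'' a -> k s'' (f a)))

  instance Monad (SC r s) where
    return  = pure
    m >>= f = SC $ \s k -> runSC m s (\s' a -> runSC (f a) s' k)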
The threading performed by the SC monad is at the heart of things: the function that implements 2-op composition. Wow.. This blows my mind. *FinalSSA> 1 + 2 + 3 + 4 :: SSA Tfloat ("R2",[("R2","(fadd R1 4.0)"), ("R1","(fadd R0 3.0)"), ("R0","(fadd 1.0 2.0)")]) Entry: So... instead of strings, what about different data types? Date: Mon Aug 1 14:05:52 CEST 2011 Indeed, works like a charm. TODO: fix let_ :: too much sharing: don't copy variables. Entry: Next? Date: Mon Aug 1 17:23:19 CEST 2011 I think I get the big picture, but there's some magic for partial eval I don't understand yet. One remark was intriguing: ".. to squeeze more invariants out of a type system as simple as Hindley-Milner, we shift the burden of representation and computation from consumers to producers .." This is a bit similar to the "printing is simpler than parsing" mantra. So what's next? Fix the "let" duplication problem and start using it; it looks like it's ready. [1] http://www.cs.rutgers.edu/~ccshan/tagless/aplas.pdf Entry: Sharing Date: Mon Aug 1 18:38:14 CEST 2011 f3 x = let xx = x * x in xx * xx *Main> f3 1 :: SSA Tfloat R0 = fmul 1.0 1.0 R1 = fmul 1.0 1.0 R2 = fmul R0 R1 R2 Why isn't "fmul 1.0 1.0" shared? Or, a bit more obvious: f4 x = let x1 = x * x in let x2 = x1 * x1 in let x3 = x2 * x2 in x3 * x3 *Main> f4 1 :: SSA Tfloat R0 = fmul 1.0 1.0 R1 = fmul 1.0 1.0 R2 = fmul R0 R1 R3 = fmul 1.0 1.0 R4 = fmul 1.0 1.0 R5 = fmul R3 R4 R6 = fmul R2 R5 R7 = fmul 1.0 1.0 R8 = fmul 1.0 1.0 R9 = fmul R7 R8 R10 = fmul 1.0 1.0 R11 = fmul 1.0 1.0 R12 = fmul R10 R11 R13 = fmul R9 R12 R14 = fmul R6 R13 R14 Why doesn't this share anything, while there is threading going on? f5 x = let_ (x * x) $ \x1 -> let_ (x1 * x1) $ \x2 -> let_ (x2 * x2) $ \x3 -> x3 * x3 *Main> f5 1 :: SSA Tfloat R0 = fmul 1.0 1.0 R1 = let R0 R2 = let R0 R3 = fmul R1 R2 R4 = let R3 R5 = let R3 R6 = fmul R4 R5 R7 = let R6 R8 = let R6 R9 = fmul R7 R8 This is correct apart from the copying of variables. The right implementation of let_ seems to be: let' (SSA x) body = SSA $ do vx <- x lowerSSA body $ return vx Or with wrapping expanded: let' (SSA x) body = SSA $ do vx <- x unSSA $ body $ SSA $ return vx Is all that wrapping/unwrapping necessary? It might be simpler to define SSA as a monad. I came to the following explanation, which could be worded better but I believe does hint at what's going on: The idea is that in let' the monadic variable x is only "dereferenced" once, which would not be the case for an ordinary let expression which would run a computation multiple times. Entry: Better syntax Date: Mon Aug 1 21:39:42 CEST 2011 Is there a better way to write this? f5 x = let_ (x * x) $ \x1 -> let_ (x1 * x1) $ \x2 -> let_ (x2 * x2) $ \x3 -> x3 * x3 I don't think the wrapping used is compatible with do notation. I.e. something like this: f6 x = do x1 <- x * x x2 <- x1 * x1 x3 <- x2 * x2 x3 * x3 Which has the not-so-useful type: f6 :: (Num (m (m (m (m b)))), Monad m, Num (m (m (m b))), Num (m (m b)), Num (m b)) => m (m (m (m b))) -> m b The closest I get to something useful is this: v = return f7 x = do x1 <- v $ x * x x2 <- v $ x1 * x1 x3 <- v $ x2 * x2 v $ x3 * x3 The type of v can probably be restricted, but it still looks ugly. 
It seems best to just use template haskell to generate a new binding form that expands to let_ What would help is to be able to lift all the num monops and binops to: a -> m b and a -> b -> m c Too bad that won't work, since the prototypes aren't compatible: *Main> :t (+) (+) :: (Num a) => a -> a -> a Conclusion: do notion would work if we make the monad explicit. However, the approach taken leaves the monad behind the scenes, leaving only the language syntax as the main player. Entry: Conclusion: Num can be sequential Date: Tue Aug 2 09:13:37 CEST 2011 The big revelation of last couple of days is that it is possible to compile a Num class expression into a dataflow graph (SSA expression) by using the CPS monad. My earlier attempt had a fundamental misconception; it represented numbers as data structures. The fix is to implement numbers as a state-continuation monad, i.e. as functions. Here the Num operations don't need to "merge" data structures; they simply compose functions using bind from the monad. Still there is the problem of sharing, which seems to necessitate the use of a special purpose `let' expression. This is a bit subtle and has to do with running a CPS computation multiple times vs. using its result. See previous posts. Entry: Atom Date: Tue Aug 2 09:26:29 CEST 2011 Maybe I'm ready now to look at Atom? Some observations: - The monad's values are variable names. - These can be dereferenced using `value', and set using `<==' Entry: Phantom types Date: Tue Aug 2 10:23:10 CEST 2011 The parameter of the (repr t) type in (Symantics repr) is used as what is called a phantom type in my compilation to String. [1] http://www.haskell.org/haskellwiki/Phantom_type Entry: SymExpr: compiling to type-stripped recursive data structure Date: Tue Aug 2 10:53:41 CEST 2011 The problem is that the language doesn't fit into a recursive data structure like: data Expr t = Lit String | Op String [Expr t] deriving (Eq, Show) Because of functions like i2f :: r Tint -> r Tfloat If t would be a phantom type this works fine. Wait, maybe a level of indirection would help here: data Term = Lit String | Op String [Term] deriving (Eq, Show) data Expr t = Expr Term Yep. This works. EDIT: this is essentially the example in [1] "The use of a type system to guarantee well-formedness". [1] http://www.haskell.org/haskellwiki/Phantom_type Entry: Type magic Date: Tue Aug 2 11:16:54 CEST 2011 It's still quite magic to me. I can't predict what would work, and in what kind of weird way it is possible to combine all that type-level trickery. Cool though. Entry: Predicates and Control flow Date: Tue Aug 2 12:22:32 CEST 2011 It's straightforward to write Eval for if_ but what about code? Another interesting problem is the types of Eq and Ord: a -> a -> Bool. This is only useful for partially evaluated conditiond. Maybe it's time to switch to AwesomePrelude[1], though there are a lot of problems that sort of negate its usefulness. Seems to me that writing a parser that replaces part of the Haskell syntax is a simpler approach. I'm starting to miss Scheme now. Makes me wonder, how hard is it to do this in (typed) Scheme? The problem of Eq and Ord is a non-issue. Either define some different symbols like .== .> .< or redefine the Prelude symbols in modules that perform computations. Representing control flow otoh is a whole other story. The main questions to answer are: - variable context: it is absolutely essential that a C-style (local) context is maintained for loops. 
tail recursion is nice but the simultaneous assignment is something i don't know how to handle. maybe it's a non-issue? - function inlining: at some point a decision needs to be made about function inlining. currently in SymAsm everything is inlined. [1] entry://../compsci/20110802-141549 Entry: Next: representing basic blocks Date: Tue Aug 2 16:26:11 CEST 2011 To find out: - how do bindings interact with jumps - named bindings vs. intermediate results - serial or parallel assignment CPS has well-defined binding structure and parallel assignment. In SSA[1] this seems to be somewhat looser. Is there a real difference here? Remember, the point is to make _really fast code_ that goes straight onto a DSP or FPGA. [1] entry://../compsci/20110802-173135 Entry: Closure Conversion / Lambda Lifting Date: Tue Aug 2 16:41:01 CEST 2011 A brief rehash. According to [1] both mean the same. The basic principle is to close a lambda that captures a lexical variable by adding that variable as a formal parameter, and then supplying this parameter at all the call sites. The "lifting" refers to bringing the lambda into a more general context, i.e. the toplevel. The point is to _avoid_ having to create closures. The catch is that you need to know all the call sites of the function to provide the extra arguments. When a function escapes, it is still necessary to represent it as a closure, which is a (toplevel) function together with an environment that defines the captures variables. How's this relative for the DSP language? The DSP lang is used to specify programs that process data, i.e. they are in some sense "fully bound" meaning there will not be any escaping lambdas. However, lambdas might still be useful because they allow for instantiation of iteration patterns through higher order functions. [1] http://en.wikipedia.org/wiki/Lambda_lifting Entry: Control flow graphs, is it really a problem? Date: Wed Aug 3 20:57:09 CEST 2011 1. scopes There is only a single scope for the whole function. All intermediates and variables are "global" and only assigned to once. 2. goto with arguments Doesn't exist because all function calls are either true function calls and not relevant, or inlined and part of the whole expression. So it doesn't look like there is a real problem. Let's try it by adding a "label" line in the generated program. Hmm.. trying this I do run into impedance match issues. Label and a binding are different things. And while it would be possible to "pile a label onto the genering code", doing so seems to be a bit problematic. Some issues: - C doesn't allow mixing labels and variable definitions. Does LLVM? Point is: something needs to come out that uses assignments and jumps in the proper way. Plain SSA does allow this. - What with back-jumps. Currently I'm thinking only of forward jumps to implement if_ which doesn't pose the problem of multiple assignments. I'm mixing up things; too much handwaving. Entry: Next Date: Sat Aug 6 13:44:50 CEST 2011 - control flow graphs (CPS/ANF or SSA?) - learn SSA formalism, see difference with CPS/ANF (-> http://delicious.com/doelie/ssa) side tracks: - types: GADT, dependent - category theory pierce Entry: LLVM's SSA Date: Mon Aug 8 10:05:25 CEST 2011 It might be a good idea to make sure the low level representation can express the form needed by LLVM. One of the differences with the current approach is that all arguments are type-tagged, instead of the return value, which is probably inferred. 
Or maybe I get this wrong: could be that the op is tagged, and the literals are tagged, but not the registers, as their type is known. Here it might really be better to not start reinventing things. There are LLVM bindings in Haskell already[1]. It uses template haskell and type-level[2]. It seems to have bidings for the full LLVM API, and generates bitcode, not assembler text. Is this useful for me? From a question on the Racket list I got the answer that for just generating code, it's not worth it to use the API calls directly, and much simpler to generate the asm. Since I'm probably going to generate LLVM from different programming systems (Haskell + Racket) it might be best to get to get to know the textual syntax. Otoh, using the programmatic interface might be interesting to perform other kinds of tests on the code, i.e. meta-optimizations that measure execution speed. It might go straight into the interpreter too (if the monadic type isn't an issue as mentioned before). mAddMul :: CodeGenModule (Function (Int32 -> Int32 -> Int32 -> IO Int32)) mAddMul = createFunction ExternalLinkage $ \ x y z -> do t <- add x y r <- mul t z ret r [1] http://hackage.haskell.org/package/llvm [2] http://augustss.blogspot.com/2009/01/llvm-llvm-low-level-virtual-machine-is.html Entry: synthesizer-llvm Date: Mon Aug 8 11:04:10 CEST 2011 Of course[1], I'm not the first one ;) It points to Numeric Prelude[3], something to look at later. See also Haskore[4] with lots of music-related links. Hmm, from [1] I read that "You also have to write the core signal functions using LLVM assembly language." This probably means I have something to contribute still ;) Looking at the code it seems still to contain a lot of boilerplate. Let's keep the existence of this in mind but not look at it too much. I think it can be done simpler using a more abstract "final" DSL approach, focussing on the specification of the math, instead of doing things more manually. [1] http://hackage.haskell.org/package/synthesizer-llvm [2] http://www.haskell.org/haskellwiki/Synthesizer [3] http://www.haskell.org/haskellwiki/Numeric_Prelude [4] http://www.haskell.org/haskellwiki/Haskore Entry: Numeric Prelude Date: Mon Aug 8 11:35:52 CEST 2011 [1] http://www.haskell.org/haskellwiki/Numeric_Prelude Entry: LLVM Haskell bindings Date: Mon Aug 8 12:47:13 CEST 2011 It's probably more educational to look at the programmatic LLVM bindings in Haskell instead of trial-and-error programming on the textual syntax. I might learn a thing or two about Haskell... A starting point adapted from [1] and [2]: import LLVM.Core import LLVM.ExecutionEngine import Data.Word mAddMul :: CodeGenModule (Function (Word32 -> Word32 -> Word32 -> IO Word32)) mAddMul = createFunction ExternalLinkage $ \ x y z -> do t <- add x y r <- mul t z ret r main = do initializeNativeTarget -- Otherwise: error: Interpreter not linked in. addMul <- simpleFunction mAddMul a <- addMul 2 3 4 print a To integrate this with Symantics, the point to focus on is the type: *Main> :t add add :: (ABinOp a b (v c), IsArithmetic c) => a -> b -> CodeGenFunction r (v c) How to make this fit? It should be mostly the same as the SymAsm code, which also uses a monad. See [5]: data CodeGenFunction r a Monad (CodeGenFunction r) The type parameters are fixed in the mAddMul function. 
ExternalLinkage :: Linkage
createFunction :: (FunctionArgs f g r, IsFunction f) =>
                  Linkage -> g -> CodeGenModule (Function f)

With CodeGenFunction r = M, the type can be written more simply as:

  add :: (ABinOp a b (v c), IsArithmetic c) => a -> b -> M (v c)

What is `v'? The do notation used above only implements the monadic chaining, but how can (v c) be fed back into a second application, i.e. the result of `add' is taken out of the monad, bound to `t' then passed to `mul', so it looks like types will accumulate. Taking out the body gives:

  _AddMul x y z = do
    t <- add x y
    r <- mul t z
    ret r

  _AddMul :: (ABinOp a b (v c), ABinOp (v c) b1 (v1 c1), Ret (v1 c1) r,
              IsArithmetic c, IsArithmetic c1) =>
             a -> b -> b1 -> CodeGenFunction r Terminate

Hmm.. what does this (v c) and (v1 c1) business actually represent? Wild guess: `v' stands for "value-ify". It somehow guarantees that there is a parameterized type involved? Looking at the instance declarations for ABinOp[6] all instances have Value or ConstValue type constructor wrappers. Is this `v' there to somehow type-pattern-match on the type constructor?

Hmm.. trying to understand. The type classes are used to express type checking constraints. It doesn't seem that they have instances. I've seen this pattern before.. See next post.

Anyways, moving forward to matching this monadic representation with the Symantics instance. The following factorization makes more sense, and is still composable, by removing `ret'.

  _AddMul x y z = do
    t <- add x y
    mul t z

  mAddMul :: CodeGenModule (Function (Word32 -> Word32 -> Word32 -> IO Word32))
  mAddMul = createFunction ExternalLinkage $ \x y z -> do
    r <- _AddMul x y z
    ret r

Let's move to unary functions to make the composition simpler.

[1] http://augustss.blogspot.com/2009/01/llvm-llvm-low-level-virtual-machine-is.html
[2] https://github.com/bos/llvm/blob/master/examples/Arith.hs
[3] http://stackoverflow.com/questions/6050721/type-problem-with-codegenfunction-codegenmodule-with-llvm-haskell
[4] http://hackage.haskell.org/packages/archive/llvm/latest/doc/html/LLVM-Core.html
[5] http://hackage.haskell.org/packages/archive/llvm/latest/doc/html/LLVM-Core.html#t:CodeGenFunction
[6] http://hackage.haskell.org/packages/archive/llvm/latest/doc/html/LLVM-Core.html#t:ABinOp

Entry: Type classes as constraints
Date: Mon Aug 8 15:02:13 CEST 2011

In the LLVM bindings[1] for Haskell, type classes are used to restrict operations/combinations. On some level it makes sense to me, but I'm missing a bit of mentoring..

[1] http://hackage.haskell.org/packages/archive/llvm/latest/doc/html/LLVM-Core.html#t:ABinOp

Entry: All this fuss over liftA2 ?
Date: Mon Aug 8 15:59:40 CEST 2011

For a monad, is liftA2 the same as l1 or l2?

  l1,l2 :: Monad m => (a -> b -> c) -> m a -> m b -> m c
  l1 fn = \ma mb -> do
    a <- ma
    b <- mb
    return (fn a b)
  l2 fn = \ma mb -> do
    b <- mb
    a <- ma
    return (fn a b)

It has to be one of the two, but which one? This depends on the definition of <*> for a monad. In [1] the liftM2 operation is defined with the same type. Is liftM2 the same as liftA2? It has to be..

[1] http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html#v:ap

Entry: Interface question: a -> b -> M c or M a -> M b -> M c
Date: Mon Aug 8 16:12:11 CEST 2011

The Haskell LLVM library uses (something like) a -> b -> M c, which enables the use of the do notation to map to SSA variable binding. The Symantics language however uses expressions, not SSA form, which means the representation needs to be M a -> M b -> M c.
That already answers the question: Why doesn't Symantics have the former type? ( Because it supports espressions. ) LLVM is already SSA, while Symantics can be converted to SSA by fixing an order in the 2-op lift function. This is exactly what is done in SymAsm: op2 t fn (Asm x) (Asm y) = Asm $ do vx <- x vy <- y op t fn [vx, vy] where `op' creates a new value in the State Continuation monad. For LLVM I'm trying this, which fails due to too general type restrictions: _iadd (SSA ma) (SSA mb) = SSA $ do a <- ma b <- mb add a b _iadd :: (ABinOp a b (v c), IsArithmetic c) => SSA r a t -> SSA r b t1 -> SSA r (v c) t2 Can these r & a parameters just be dropped? I tried the following[1]: data SSA t = forall r a. SSA { unSSA :: CodeGenFunction r a } But then the instance declaration gives problems. This is for another time; the type mojo is going over my head again.. Ha, a solution to the problem is explained in [2]. What I take from this is that it's probably simpler to turn everything upside down anyway: start with a monadic interface, i.e. make Symantics monadic so do notation can be used for let, and then define num classes for that monad. Hmm... this defines Num classes for all variants of types that get built up, but doesn't try to force things into a simpler type altogether, which is what I'm trying to do. Pff.. The solution seems to be to fix the type: _iadd ::(ABinOp a a a, IsArithmetic a) => SSA r a t -> SSA r a t -> SSA r a t _iadd (SSA ma) (SSA mb) = SSA $ do a <- ma b <- mb add a b [1] http://en.wikibooks.org/wiki/Haskell/Existentially_quantified_types [2] http://augustss.blogspot.com/2009/01/llvm-arithmetic-so-we-want-to-compute-x.html Entry: do and binding Date: Mon Aug 8 18:15:07 CEST 2011 From [1] (sometimes you really need to see things spelled out for you): When you define a DSEL you usually use the do-notation for the binding construct in the embedded language. This is the only programmable binding construct (by overloading (>>=)), so there is little choice. [1] http://augustss.blogspot.com/2011/07/impredicative-polymorphism-use-case-in.html Entry: Struggling with LLVM type classes Date: Sun Aug 14 12:57:02 CEST 2011 The way types are used in the LLVM Haskell wrapper is a bit over my head. My guess is that it uses phantom types to represent the different basic types of LLVM, but this seems to interact badly with the phantom types in my Symantics class. Maybe I should just read about this a bit more. Or read the LLVM wrapper code. Entry: Re-focus Date: Sun Aug 14 13:48:40 CEST 2011 Now that tools are accessible, where do I need to go next? The main point is about "manually guided partial evaluation", which means that from specification -> implementation the following should happen: - expensive run-time binding removed, i.e.: * don't implement higher order functions (that abstract iteration / mapping) as function calls, but specialize them to the function(s) they are applied to. * manage duplication, i.e. unrolling to expose software pipelining. - put the code into a form that allows lower level compiler (i.e. LLVM) to optimize it further. * pick a serialization for an inherently multi-dimensional problem. tune this to the memory hierarchy. Since the Symantics approach is now +- clear to me, it might be better to leave it alone for a while and work directly on top of the LLVM.Util.Arithmetic abstractions. It looks like these LLVM bindings are a good way to learn more Haskell by reading code. 
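As a tiny illustration of the first point in the list above (specialize higher-order combinators at generation time rather than paying for run-time binding), a hedged sketch; `unrollN` is an illustrative helper, not project code. With a Symantics-style representation the type `a` would be something like `repr Tfloat`, so the loop disappears from the generated code.

  -- Iterate a step function a fixed number of times at generation time;
  -- applied to a code representation this inlines n copies of the step
  -- instead of emitting a loop with an indirect call.
  unrollN :: Int -> (a -> a) -> a -> a
  unrollN 0 _ x = x
  unrollN n f x = unrollN (n - 1) f (f x)

  -- e.g. four inlined copies of one update step:
  --   unrollN 4 step x0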
Entry: LLVM command line tools Date: Sun Aug 14 14:06:37 CEST 2011 llvm-dis : disassemble .bc files llc : compile .bc to i386 .s --march=arm -> arm .s --march=c -> C code Entry: full circle Date: Sun Aug 14 14:21:46 CEST 2011 Hmm.. actually I really don't need more to make code AND wrappers to run DSP modules in Pd. Problem: LLVM really doesn't work in GHCi so I need to split the work in 2 steps: - Find a common ground for the LLVM.Utils.Arithmetic syntax such that functions can be tested in Haskell without requiring LLVM in-image. - Perform code generation through GHC only. It looks like most ops are defined as plain functions, not typeclass functions. Making a wrapper for these should be enough. So why am I grinding to a halt? Motivation? Confusion? Realization I need to go back to state machines? Entry: Sawtooth Date: Sun Aug 14 15:36:40 CEST 2011 Back to the sawtooth example. It's a nice one because it has: - different data types (int, float) - time dependency - nonlinearity The mix of time-dependency and nonlinearity means that the only simple representation (I can think of) is as a state-space model. Actually, the int is not really an int, it's used as a phasor (angle). \ (th, x2, freq) -> (th', x2', out) where th' = th + freq -- update phasor x = ph2i th -- convert phasor to [0,1[ x2' = x * x -- square out = x2 - x2' -- filter (differentiate) ( very wrong, but the idea seems ok. ) Entry: State machines and applicative. Date: Sun Aug 14 15:53:24 CEST 2011 Again, what's an applicative Functor? A Functor is something that has fmap :: Functor f => (x -> y) -> f x -> f y An Applicative functor is something that has <*> (map-apply?) (<*>) :: Applicative a => a (x -> y) -> a x -> a y So a functor is an abstraction that supports a lift from a function to a function over abstractions. An applicative functor is something that has a lift from an abstraction of a function to a function over abstractions. A state machine can be represented by an applicative functor. The state threads behind the scenes, but the observable i/o behaviour is what we can expose. Smach s t = Smach s -> (s, t) A state machine is a function that produces a value of type t (which can be a function) and a next state s. The iteration of this is also Applicative. ( construct! ) Why is this trivial? Because it is also a Monad (the state monad), which is more specific than Applicative. Is it really that simple? Pd is a state monad. Is there a useful modification that can be made to the state monad that will no longer make it monadic? Maybe causality? Entry: Audio DSP is not Image DSP Date: Sun Aug 14 16:13:33 CEST 2011 I came to a realization today that audio DSP and image DSP are fundamentally different. The reason is causality - the directionality of the time variable. What this means is that for Audio, the principle abstraction is the difference equation, a recursive relation between current and past values. When this recursive model is not imposed or "strongly pulled" by the problem domain, other "symmetric" models are possible that allow for different divide and conquer approaches. With "strongly pulled" I mean that a lot of algorithms have more bang for the buck if recursion is allowed: reusing past solutions in current ones. What this means practically for my DSP language is that I probably don't get far without acknowledging the direction of time, and using the state monad as a basic model. Entry: Monad transformers Date: Sun Aug 14 16:23:59 CEST 2011 Looks like what I need is a mixture of State and List. 
[1] http://www.haskell.org/haskellwiki/Applicative_functor Entry: The State Monad and State Space Models Date: Sun Aug 14 17:09:02 CEST 2011 The State Monad is s -> (t, s) What I always wondered is, shouldn't this be: s -> (i -> o, s) In its basic form when t is a primitive type like Integer, an instance of the State Monad can be used to generate a stream of values. The second type generates a stream if i -> o functions. Is this enough? I have trouble writing this down. It seems that it's not possible to capture the behaviour fully because the output s depends on the i input to i->o which is one of the outputs. So it seems that the type needs to be a permutation of: i -> s -> (o, s) (i, s) -> (o, s) The first one has the same type as the 2nd argument of bind. This illustrates the point that monads are "output oriented" instead of "input oriented"[1]. Now, the problem is the following. When composing two state monad functions using bind, the state is changed through all functions in the composition. This is *not* what we want: we want to tuple the state: make the product of the state of the two composed entities. I.e. the type of composition is chainSSM :: (a -> s1 -> (b, s1)) -> (b -> s2 -> (c, s2)) -> a -> (s1,s2) -> (c, (s1,s2)) with a straightforward implemenation: chainSSM f g = fg where fg a (s1,s2) = (c, (s1',s2')) where (b, s1') = f a s1 (c, s2') = g b s2 In fact, the normal State monad "threading" operation is never needed. The state is always private to the modules. How can this kind of composition be abstracted? Is this an Arrow, or can it be Applicative? What about Category? Haven't tried that yet.. In fact, this doesn't work: can't make composeSSM be the (.) from Category because the type parameter that represents the state is not constant: it grows "bigger". The fact that the types get bigger seems to encode the property that SSM composition cannot form loops, i.e. there is a strict order. Is it somehow possible to automatically lift the states of the functions such that the state is the same? Hmm.. doesn't seem to work. Any other way to hide it? Look here[2] for a "type hider", an existentially qualified type. The following seems to compile with -XExistentialQuantification : data SSM i o = forall s. SSM (i -> s -> (o, s)) The thing is that you can't really do anything with these (because the type is unknown!) until you limit it to some type class, then it's possible to use the type class ops. data SSM i o = forall s. SSMstate s => SSM (i -> s -> (o, s)) So, setting the limit to SSMstate which has an initState operation, it looks that I can at least define this function: unfoldSSM :: (SSM i o) -> [i] -> [o] unfoldSSM (SSM tick) = f initState where f _ [] = [] f s (i:is) = (o:os) where (o,s') = tick i s os = f s' is Now the Category instance works also: instance Category SSM where (.) (SSM f) (SSM g) = SSM $ g `chainSSM` f id = SSM $ \i _ -> (i,()) The discrete integral then becomes: int = SSM $ \i s -> let o = i+s in (o,o) Define show to print the impulse response: dirac = (1:[0,0..]) instance (Enum i, Num i, Show o) => Show (SSM i o) where show f = show $ take 10 $ unfoldSSM f dirac Which prints int as: *Main> int [1,1,1,1,1,1,1,1,1,1] *Main> int . int [1,2,3,4,5,6,7,8,9,10] *Main> int . int . int [1,3,6,10,15,21,28,36,45,55] Cool huh ;) Now, it seems to be possible to get rid of the SSMstate altogether by storing the initial state in the SSM object: data SSM i o = forall s. SSM s (i -> s -> (o, s)) instance Category SSM where (.) 
(SSM f0 f) (SSM g0 g) = SSM (g0,f0) $ g `chainSSM` f id = SSM () $ \i s -> (i,s) int = SSM 0 $ \i s -> let o = i+s in (o,o) unfoldSSM :: (SSM i o) -> [i] -> [o] unfoldSSM (SSM init tick) = f init where f _ [] = [] f s (i:is) = (o:os) where (o,s') = tick i s os = f s' is This is even cooler. State is now completely hidden. Moral: initial state and update function had to be packaged together to make it easier to hide the state type in all the other operations. [1] http://research.microsoft.com/en-us/um/people/emeijer/publications/meijer95merging.pdf [2] http://en.wikibooks.org/wiki/Haskell/Existentially_quantified_types Entry: Num SSA ? Date: Mon Aug 15 00:19:37 CEST 2011 Looks like it's time now to start regarding the SSM types as functionals, and define operations over them. It looks like this will eventually lead to "d" and "z" operators in a straightforward way. After defining: lift2 :: (a -> b -> r) -> (SSM a' a) -> (SSM b' b) -> (SSM (a',b') r) lift2 op a b = fmap (uncurry op) $ par a b I have trouble instantiating Num because of: Expected type: SSM i o -> SSM i o -> SSM i o Actual type: SSM i o -> SSM i o -> SSM (i, i) o This is another one of those cases where the presence of an input (or state) parameter gives trouble when we're only interested in the output behavior. Summarized in a hand-waving way: Working with things that are not "applicative" in Haskell is a pain because it is value-oriented. It's fine to work in point-free style, but working in expression style makes it quite hard to do. Entry: Applicative? Date: Mon Aug 15 12:49:11 CEST 2011 So.. I have the following: series / parallel composition and pure functions. serSSM :: ((s1, b) -> (s1, c)) -> ((s2, a) -> (s2, b)) -> ((s1,s2), a) -> ((s1,s2), c) serSSM f g = fg where fg ((s1,s2), a) = ((s1',s2'), c) where (s2', b) = g (s2, a) (s1', c) = f (s1, b) parSSM :: ((s1, a) -> (s1, c)) -> ((s2, b) -> (s2, d)) -> ((s1,s2), (a,b)) -> ((s1,s2), (c,d)) parSSM f g = fg where fg ((s1,s2), (a,b)) = ((s1',s2'), (c,d)) where (s1',c) = f (s1, a) (s2',d) = g (s2, b) pureSSM :: (a -> b) -> ((), a) -> ((), b) pureSSM f (_,a) = ((), f a) I wonder if parallel composition is really necessary since it seems to be mostly useful for doing things like: a -> b -> (a,b) -> (a+b) I.e. replacing the tuple by an arithmetic reduction. Maybe it's enough to just allow for input/output/state to contain curried functions. Then Applicative can be used. This doesn't work because my abstraction is Arrow-like (explicit input/output, kind * -> * -> *) and Applicative is Functor-like (* -> *). See also [1]. It seems Applicative needs a "generator" approach instead of a "processor" approach. I'm first trying for Functor: instance Functor (SSM i) where fmap f a = (SSM () $ pureSSM f) . a This works! *Main> fmap (1+) int [2,2,2,2,2,2,2,2,2,2] The case for mapping over inputs should be completely symmetric, though I don't know how to "flip" the arguments of the type constructor SSM. As mentioned here[2] it's possible to use a Bifunctor instead. So the trick seems to be to focus on the output. Let's try Applicative. There it seems to go wrong. Paraphrased: x is a value while pureSSM expects a function. instance Applicative (SSM i) where pure x = SSM () $ pureSSM x So.. is it really necessary? It doesn't look like it. Ser/par + Functor seem to be really enough. 
[1] http://cdsmith.wordpress.com/2011/07/30/arrow-category-applicative-part-i/ [2] http://stackoverflow.com/questions/2335967/manipulating-the-order-of-arguments-to-type-constructors [3] http://hackage.haskell.org/packages/archive/category-extras/0.53.5/doc/html/Control-Functor.html Entry: Applicative approach Date: Mon Aug 15 14:47:14 CEST 2011 It looks like my current abstraction doesn't work well in an Applicative, Num world. Let's try to do it differently, working in a pointwise style. What I read in [1] -- focusing on a single line in a big article as sigfpe suggests -- is that some quantification is necessary to put this into Applicative: "Arrow is equivalent to a Category instance, plus a universally quantified applicative instance for any fixed domain" What this means is that you can only combine the outputs. So let's scrap it and start over: build everything from the start as a sequence abstraction, where operators are sequences of (time-varying) functions. ssmApp :: (s1 -> (s1, (a -> b))) -> -- fi (s2 -> (s2, a)) -> -- ai (s1, s2) -> ((s1, s2), b) -- bi ssmApp fi ai = bi where bi (s1, s2) = ((s1',s2'), b) where (s1', f) = fi s1 (s2', a) = ai s2 b = f a The data type and Applicative instance then become: data SSM a = forall s. SSM s (s -> (s, a)) instance Functor SSM => Applicative SSM where pure v = SSM () $ \_ -> ((), v) (<*>) (SSM sf f) (SSM sa a) = SSM (sf, sa) (ssmApp f a) That seems straightforward. Let's go down the path and define the other operations. Trouble starts when I'm trying to make a state-dependent i -> o function. Why is this? It seems to be the problem I ran into before [2]. The trouble seems to be in converting (s,i) -> (s,o) into s -> (s, i->o) because the output state can depend on the input. Can this be solved by making the type of the composition more general? I.e. : signalApp :: (s1 -> (s1, (s1 -> a -> b))) -> -- fi (s2 -> (s2, a)) -> -- ai (s1, s2) -> ((s1, s2), b) -- bi Nope.. this messes up the whole Applicative type because it leaks through. Maybe the type needs to be: data Signal a = forall s. Signal s (s -> (s, s -> a)) This still doesn't work because s can depend on i. Why not write that explicitly, separating state update and output equation? fun s -> i -> s s -> i -> o val s -> s s -> o Wait. For something carrying a function, make the state also depend on the input. val s -> s, o fun (i -> s) -> (i -> s, i -> o) Can we make that work? I'm loosing the point... It can't be that hard if it is at all possible. Dead end. Backtracking to the previous version, using Applicative it does become very simple to make Num work: instance (Eq (Signal a), Num a) => Num (Signal a) where (+) = liftA2 (+) (*) = liftA2 (*) abs = fmap abs signum = fmap signum fromInteger = pure . fromInteger So there is definitely some utility in having an Applicative interface. The trouble is that I currently can't express this. To make it more explicit, why can't this work? currySSM :: ((s,i) -> (s,o)) -> (s -> (s, i -> o)) currySSM ssm = f where f s = (s', io) where io i = o where (s', o) = ssm (s, i) The trouble here is that s' depends on the input of io. Is there a way to re-arrange this such that order of evaluation is enforced, i.e. that s' is never evaluated before the input is applied to io? I don't think it's possble to write a function like that, except when it passes the state without updating. Let's investigate in another article how to maybe do this with separate state and output equations. 
[1] http://cdsmith.wordpress.com/2011/07/30/arrow-category-applicative-part-i/ [2] entry://20110814-170902 Entry: Tying the knot with Applicative Date: Mon Aug 15 17:16:31 CEST 2011 The following type doesn't seem to work because state can depend on the input to the function: currySSM :: ((s,i) -> (s,o)) -> (s -> (s, i -> o)) Is there a way to do this with currySSM :: ((s,i) -> s) -> ((s,i) -> o) -> (s -> (s, i -> o)) Seems to have the same problem. The only thing that works is to take the input out of the tuple: currySSM :: ((s,i) -> (s,o)) -> (s -> i -> (s, o)) currySSM ssm = f where f s i = (s', o) where (s', o) = ssm (s, i) What about requiring the state of a function to be of the form (s -> i -> (s, o)) ? Hmm.. seems like another ill-informed idea.. The secret really has to be in signalApp. I'm doing something that fundamentally prevents Applicative to work. Maybe try to answer some more generic questions. - Look at the type of <*>. This seems to have "naked" functions and values. These need to be parameters of some type, so where do they appear? As inputs, outputs, middles? <*> :: Applicative a => a (x -> y) -> a x -> a y In some sense the function is definitely some output of something. The problem is that I want to depend on what goes into a function like that to make state updates. This seems to be not allowed. The only way I can see this work is if somehow an i->o function can parameterize a state update. Meaning *not* the input, but the function itself. Can the state be a function? (s1 -> (s1, i)) -> ((i -> s2) -> (i -> s2, i -> o)) -> ((s1,s2) -> ((s1,s2), o) The thing is that in app the input is available. Can we use this to chain it? The following compiles and seems to be correct from inspection. _app :: (s1 -> (i -> s1, i -> o)) -> -- fx (s2 -> (s2, i)) -> -- ix ((s1, s2) -> ((s1, s2), o)) -- ox _app fx ix = ox where ox (s1, s2) = ((s1', s2'), o) where (is1', f) = fx s1 (s2', i) = ix s2 s1' = is1' i o = f i But the types are iregular and don't fit (s -> (s,v)). Can they be cast into a different form? _app :: ((i -> s1) -> (i -> s1, i -> o)) -> -- fx (s2 -> (s2, i)) -> -- ix ((i -> s1, s2) -> ((i -> s1, s2), o)) -- ox _app fx ix = ox where ox (s1, s2) = ((s1', s2'), o) where (s1', f) = fx s1 (s2', i) = ix s2 o = f i Hmm... the input doesn't go into the state update. This can't be right. Let's just be done with it and make the input an existential type also. _app :: (a -> s1 -> (s1, a -> b)) -> (() -> s2 -> (s2, () -> a)) -> (() -> (s1,s2) -> ((s1, s2), () -> b) Yeah well, I can keep permuting this but it doesn't look like it's going to fall out. Why doesn't this work? Maybe because it's really a Monad? Entry: Is this SSM threading a Monad? Date: Mon Aug 15 18:16:12 CEST 2011 It's not the state monad, but... The "purity" of Applicative (being parameterized by pure functions at some point) doesn't seem to allow the kind of threading I want to perform. Let's try to cast it in a Monad instance. _bind :: ( sa -> (a, sa)) -> (a -> sb -> (b, sb)) -> ( (sa,sb) -> (b, (sa,sb))) _bind ma f = mb where mb (sa, sb) = (b, (sa',sb')) where (a, sa') = ma sa (b, sb') = f a sb _return :: a -> (() -> (a, ())) _return x = \_ -> (x, ()) But (being a bit tired?) I can't seem to cast this into a Monad instance for the data type: data Signal a = forall s . Signal s (s -> (s, a)) It seems to be just wrapping issue. 
Using the non-wrapped _bind I managed to perform a composition: ones = \s -> (1, s) int = \i s -> let o = i + s in (o,o) _run seq init = f init where f s = (v : f s') where (v, s') = seq s *Main> take 10 $ _run (ones `_bind` int) (0,0) [1,2,3,4,5,6,7,8,9,10] *Main> take 10 $ _run ((ones `_bind` int) `_bind` int) ((0,0), 0) [1,3,6,10,15,21,28,36,45,55] This illustrates neatly why I really want to propagate initial states :) Anyways, since Monad m => Applicative m I wonder why I can't write an Applicative instance directly since it exists. I found the following (ugly) implementation which includes initial state passing. -- in state0 state out state+ --------------------------------------------------- __bind :: ( (sa, sa -> (a, sa))) -> (a -> (sb, sb -> (b, sb))) -> ( ((sa,sb), (sa,sb) -> (b, (sa,sb)))) __bind (a0,ma) f = ((a0,b0),mb) where -- Get to b0 through a0 -> ma -> f since init state does not depend -- on the input. There seems to be no other way to get at it. (a, _) = ma a0 (b0, _) = f a mb (sa, sb) = (b, (sa',sb')) where (a, sa') = ma sa (_, f') = f a (b, sb') = f' sb __return x = ((), \() -> (x, ())) __ones = __return 1 __int = \i -> (0, \s -> let o = i + s in (o,o)) __run (init,seq) = f init where f s = (v : f s') where (v, s') = seq s This also works, but is a bit contrived, especially with that state hiding on the inside of the __int. Maybe this can be two monads, one for state chaining, and one for passing the state around. *Main Control.Monad> take 10 $ __run __ones [1,1,1,1,1,1,1,1,1,1] *Main Control.Monad> take 10 $ __run (__ones `__bind` __int) [1,2,3,4,5,6,7,8,9,10] *Main Control.Monad> take 10 $ __run ((__ones `__bind` __int) `__bind` __int) [1,3,6,10,15,21,28,36,45,55] However.. Trying to wrap this in Signal doesn't seem to work very well. I can't get it out of the wrapper! instance Monad Signal where return = Signal . __return (>>=) (Signal ma) f = Signal $ __bind ma f' where f' a = case (f a) of (Signal mb) -> mb This will give a generic, unknown type t to mb, not (s, s -> (v, s)) Is there a detour throug fmap and join? Why won't unpacking work? Maybe the "unpacking" is essential to monad structure? I think I need to stop. Thoroughly confused now. Hmm it seems I can define Applicative: __ap mf ma = mf `__bind` \f -> ma `__bind` \a -> __return $ f a instance Functor Signal => Applicative Signal where pure = Signal . __return (<*>) (Signal f) (Signal a) = Signal (__ap f a) Now I don't understand anything any more. I thought I needed bind to use processors of the form: m a -> (a -> m b) -> m b But it seems that given that bind, it's not possible to create things that could evaluate: m (a -> b) -> m a -> m b How to create that m (a -> b) value? That seems to be the catch. Such values are not the same a (a -> m b) because they cannot implement input-dependent state transitions. Man this is confusing. Entry: Conclusion Date: Mon Aug 15 22:18:18 CEST 2011 ( EDITED ) I made a couple of implementations that are all quite similar. The big conclusion seems to be that signal processors are Kleisli arrows of the monad that represents signals. Due to problems with the "growing" state types I had to resort to existentials to be able to hide that state and implement some class instances. However, this doesn't seem to work for Monad. The files in ssm/ * StateSpace.hs: simple composition, abstracted as Category with state hidden. ( Almost an Arrow. ) * SigApp.hs: applicative built from the ground up. I could not express input-dependent state. 
This seems to be normal since I later found that signal operators are Kleisli arrows.

* SigJoin.hs: monad in terms of fmap & join which works a lot better. This has instances for Functor, Applicative, Category, Arrow (the Kleisli arrow of the monad) but misses Monad itself due to a problem with typing existentials. ( There's also SigBind.hs which is similar but more clumsy. Starting from fmap and join seems to work better. )

About the need for Kleisli arrows; I don't know how to make it precise but here it goes: The impossibility to combine input-dependent state as <*> seems to be simply a property of the purity of the applicative interface: it is parameterized by pure functions (i -> o) which do not mesh well with s -> i -> (s, o). I read[1] that Applicative provides pure expressions and sequencing but no binding. Maybe that's the point? Maybe what I need to express needs binding, or, put differently, "joining".

Actually, in retrospect, that I need Kleisli arrows isn't such a surprise. In general, types like (a -> M b) are used for "representing a computation like a -> b, but with hidden effects", and influencing state sure looks like an effect. The monad output M b represents the state update function, and the fact that it depends on the input a encodes that a can influence the state transition function, and so indirectly the state when the SSM is unrolled.

[1] http://haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Control-Applicative.html

Entry: Join
Date: Mon Aug 15 22:36:50 CEST 2011

So.. a Monad can be built from fmap and join. Since fmap is conceptually simple to understand, what is join in the case of the SSM monad?

  M (M a) = M ((s, s -> (a, s)))
          = (s', s' -> ((s, s -> (a, s)), s'))

Let's try to implement that directly. First in a form without initial state passing:

  (s' -> ((s -> (a, s)), s'))

It looks like this is really just a straightforward translation to "parallelize" the states:

  ((s',s), (s',s) -> (a, (s',s)))

I was able to write down join and map without too much trouble for doing the 2 functions (init state pass + state chaining) simultaneously.

  type Sig s a = (s, s -> (a, s))

  _join :: (Sig s1 (Sig s2 a)) -> Sig (s1,s2) a
  _join (i1, u1) = ((i1, i2), u12) where
    ((i2, _), _) = u1 i1
    u12 (s1, s2) = (a, (s1', s2')) where
      ((_, u2), s1') = u1 s1
      (a, s2')       = u2 s2

  _fmap :: (a -> b) -> Sig s a -> Sig s b
  _fmap f (s0, u) = (s0, u') where
    u' s = (f a, s') where
      (a, s') = u s
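A quick check of the pair above, as a sketch in the same style, reusing the Sig type, _join and _fmap just defined (runSig, ones and integ are illustrative names): bind falls out as join-after-fmap, and chaining the constant-1 signal through an integrator reproduces the sequences from the earlier experiments.

  _bind :: Sig s1 a -> (a -> Sig s2 b) -> Sig (s1, s2) b
  m `_bind` g = _join (_fmap g m)

  runSig :: Sig s a -> [a]
  runSig (s0, u) = go s0 where
    go s = let (a, s') = u s in a : go s'

  ones :: Sig () Integer
  ones = ((), \s -> (1, s))

  integ :: Integer -> Sig Integer Integer
  integ i = (0, \s -> let o = s + i in (o, o))

  -- take 10 $ runSig (ones `_bind` integ)
  --   ==> [1,2,3,4,5,6,7,8,9,10]
  -- take 10 $ runSig ((ones `_bind` integ) `_bind` integ)
  --   ==> [1,3,6,10,15,21,28,36,45,55]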
> data Signal a = forall s. Signal (Sig s a) This approach works for defining Functor and Applicative instances, but I can't seem to find a way to obtain an unwrapped version of f that can be passed to _bind to implement the Monad instance's (>>=). The following returns: Couldn't match type `t' with `Sig s1 b' `t' is a rigid type variable bound by the inferred type of f' :: a -> t at /home/tom/meta/ssm/SigHC.lhs:52:7 In the expression: mb In a case alternative: Signal mb -> mb In the expression: case (f a) of { Signal mb -> mb } > instance Monad Signal where > return = Signal . _return > (>>=) (Signal ma) f = Signal $ ma `_bind` f' where > f' a = case (f a) of > Signal mb -> mb Any tips on how to work around this? Cheers, Tom Entry: Kleisli arrows Date: Tue Aug 16 11:56:21 CEST 2011 Let's assume that the Monad trouble from the previous entries can be solved. The resulting structure is quite neat. The basic idea is that the monad values are signals, and processors are represented by the corresponding Kleisli arrows. {- Signal operators are Kleisli arrows of the monad. -} type Opr s i o = i -> Sig s o _kleisli :: Opr s1 a b -> Opr s2 b c -> Opr (s1,s2) a c _kleisli f g = \x -> (f x) `_bind` g I still think the way the type looks is a bit ugly, so I'm using a translation from something that is easier to work with: {- Convert a convenient state space model sigature to a Kleisli arrow. -} _signal :: s -> ((s,i) -> (s,o)) -> Opr s i o _signal s0 f = f' where f' i = Sig s0 u where u s = (o, s') where (s',o) = f (s,i) Anwyays. Enter blog post of Dan Piponi[1], who seems to be going into an entirely different way implementing causal stream functions. However, his stream functions do not seem to be recursive (FIR but no IIR). So how to use kleisli arrows in do notation? Creating a signal that is twice the integral of the constant 1: do one <- return 1 ramp <- int one square <- int ramp There doesn't seem to be much of an issue. Though this only works if int is not wrapped in a class, but is a "naked" arrow a -> M b. The moral of the story seems to be: if binding (variable names, applicative programming, value-oriented programming) is required, then it's best to just use do notation. For "operator algebra" it's best to use the arrows as objects themselves. [1] http://blog.sigfpe.com/2006/06/monads-kleisli-arrows-comonads-and.html Entry: Arrow class Date: Tue Aug 16 15:34:48 CEST 2011 Let's try to define an Arrow instance. It turnes out to be quite straightforward. Note that is already has (.) and id from Category. _arr :: (a -> b) -> Opr () a b _arr f a = _return $ f a _first :: Opr s a b -> Opr s (a, z) (b, z) _first opr = opr' where opr' (a,z) = (s0, u') where (s0, u) = opr a u' s = (s', (b, z)) where (s', b) = u s instance Arrow Operator where arr = Operator . _arr first (Operator a) = Operator $ _first a So this is the Kleisli Arrow of the Monad I can't implement because of type inference problems. The _first can probably be implemented in terms of _bind. _first arr (a, z) = do b <- arr a return (b, z) Entry: TODO Date: Tue Aug 16 18:22:37 CEST 2011 I have a feeling I came to something of a conclusion on representing state space models. Monads seem to be best, so I'd like to find a way around the existential typing problem. So what's next? - Implement Arrows for StateProc.hs - Tinker with that existential unpacking. Maybe find a way to define a Monad from an Arrow? 
Entry: StateSpace as Arrow
Date: Tue Aug 16 18:41:09 CEST 2011

This was really straightforward:

  instance Arrow SSM where
    arr f = SSM () $ ssmPure f
    (***) (SSM i1 u1) (SSM i2 u2) = SSM (i1,i2) $ ssmPar u1 u2
    first  a = a *** (arr id)
    second a = (arr id) *** a

Entry: Arrow to Monad
Date: Tue Aug 16 19:36:19 CEST 2011

Now, can this be used to go from Arrow to Monad?

  ssmKleisli :: (a -> SSM () b) -> SSM a b

  instance Monad (SSM ()) where
    return x = arr $ \() -> x
    (>>=) ma f = (ssmKleisli f) . ma

So the answer seems to be yes, if the same "unpacking" problem can be solved. So it looks like I can't work around it, only attack it at the root..

Entry: The existential unpacking problem
Date: Tue Aug 16 20:24:29 CEST 2011

I'm thinking, to all great problems the nuts & bolts solution is to introduce state or sequentiality. What about writing this in CPS? Something to keep in the back of the head..

Anyway, the problem is that I can't use something like this:

  f' a = case (f a) of
           (SSM i0 u0) -> (i0, u0)

because it doesn't infer properly. I switched to CPS because I somehow have this hunch that this would work. However, the following doesn't infer properly:

  ssmUnpack (SSM s0 u) k = k s0 u

Using -XRank2Types I was able to construct a type like this:

  ssmUnpack :: SSM i o -> (forall s. s -> ((s,i) -> (s,o)) -> r) -> r

So it looks like we're a bit closer. Can the same trick be used in non-CPS form? Doesn't look like it, at least I can't make this work:

  ssmUnpack' :: SSM i o -> (forall s. (s, ((s,i) -> (s,o))))
  ssmUnpack' (SSM s0 u) = (s0, u)

So let's try the CPS version. It looks like I have to dig very deep because that value is only exposed on the inside of bind (if primitive) or join.

Plan: write the _join in SigJoin.hs using a CPS unpack function.

Detour: the above seems not so simple, so I'm trying first to implement the Kleisli isomorphism which I've used in the definition of Monad in terms of Arrow.

  ssmKleisli :: (i -> SSM () o) -> SSM i o

However, writing out this function with non-CPS unpack already gives rise to a seemingly serious problem: the initial state is not accessible without having the output, and the SSM representation requires it to be naked.

  ssmKleisli :: (i -> SSM () o) -> SSM i o
  ssmKleisli f =
    let u (s,i) = let (s0, u) = unpack (f i)
                      (s', o) = u (s, ())
                  in (s', o)
    in SSM u

In short: the initial state is hidden behind the input, and I have to invert that order when creating an SSM. This looks like a dead end. Can't go from Kleisli to Monad for this particular case.

So the only way out seems to be to make the _join use a CPS unpack function. First, write _join in direct let form, then perform partial CPS conversion. I get to something, but I get stuck at a typing issue which I think has to do with multiple versions of unpack that are used with different types.
-- PLAIN _join' (i1, u1) = ((i1, i2), u12) where (_, (i2, _)) = u1 i1 -- (**) u12 (s1, s2) = ((s1', s2'), a) where (s1', (_, u2)) = u1 s1 (s2', a) = u2 s2 -- LET _join mm = let unpack = \x -> x (i1, u1) = unpack mm -- unpack the double wrapper (_, m) = u1 i1 -- use initial input to expose single wrapper (i2, _) = unpack m -- unpack it to get at initial state u12 (s1, s2) = let (s1', m) = u1 s1 -- use current input to expose single wrapper (_, u2) = unpack m -- unpack it to get at update function (s2', a) = u2 s2 -- run update in ((s1', s2'), a) in ((i1, i2), u12) -- CPS _joinCPS unpack mm = unpack mm (\ (i1, u1) -> let (_, m) = u1 i1 in unpack m (\(i2, _) -> ((i1, i2), (\(s1, s2) -> let (s1', m) = u1 s1 in unpack m (\(_, u2) -> let (s2',a) = u2 s2 in ((s1, s2'), a)))))) type Unpack w s a r = w -> ((Sig s a) -> r) -> r _joinCPS :: (Unpack w' s a' r) -> w -> Sig (s1,s2) a unpackTuple (i,u) k = k (i,u) Bottom line: I can't seem to find a single spot to modify without creating a huge unreadable expression that I can't type... I'm done. Entry: Arrows to Monads: different SSM representation. Date: Wed Aug 17 09:37:45 CEST 2011 The "hiding behind input" problem I ran into in the arrow -> monad conversion might be solved by representing the state machines as i -> (s, s -> (s, o)) instead of (s, (s, i) -> (s, o)) This way the isomorphism to i -> () -> (s, s -> (s, o)) can probably be expressed. Entry: Writing down the Kleisli isomorphism with existential types. Date: Thu Aug 18 17:48:23 CEST 2011 Oleg's reply to [1]: It is possible to do this by moving the quantification inside: type Kl i o = i -> Kl1 o data Kl1 o = forall s. Kl1 (s -> (s,o)) iso :: (i -> Kl () o) -> Kl i o iso f = \i -> f i () Adding a data wrapper and an unpack function written as a pattern match as before (to avoid other GHC issues) gives: data Kl i o = Kl (i -> Kl1 o) data Kl1 o = forall s. Kl1 (s -> (s,o)) iso :: (i -> Kl () o) -> Kl i o iso f = Kl $ \i -> (\(Kl kl) -> kl ()) (f i) And lo and behold, the extra initial state parameter can just be added! data Kl i o = Kl (i -> M o) data M o = forall s. M s (s -> (s,o)) iso :: (i -> Kl () o) -> Kl i o iso f = Kl $ \i -> (\(Kl kl) -> kl ()) (f i) I wonder if now it will break at another point. Maybe it's no longer possible to write composition of Kl? Let's give it a try. But first, let's try bind again since the monad is naked anyway. Hmm.. can't do it just like that. I was thinking though, since we're not chaining states, just inputs and outputs, that it might be possible to pull it off anyway: all states are self-contained. So, trying to write compose for signature (i -> (s, s -> (s,o))). This seems to get stuck because the initial state is hidden behind the input, so it can't be bubbled up. In other words, the following isomorphism doesn't seem to exist: (i -> (s, s -> (s, o))) -> (s, i -> s -> (s, o)) Or does it? What if we make i part of a class that has somehow a "default" value. Because we do know that once unpacked, the value we find does not depend on i. What I wonder is why these tricks are necessary. The idea does seem to be sound though. Let's try it out. 
Ok, I found a trick to unlock the initial state and write a composition function: class Unlock a where key :: a compose :: (Unlock a, Unlock b) => (a -> (s1, (s1 -> (s1, b)))) -> (b -> (s2, (s2 -> (s2, c)))) -> (a -> ((s1,s2), ((s1,s2) -> ((s1,s2), c)))) compose f1 f2 = f12 where (i1, _) = f1 key {- u -} (i2, _) = f2 key {- u -} f12 a = ((i1,i2), u12) where u12 (s1, s2) = ((s1', s2'), c) {- p -} where (_, u1) = f1 a {- u -} (s1', b) = u1 s1 (_, u2) = f2 b {- u -} (s2', c) = u2 s2 Can this be used to implement compose for the wrapped type? There are 4 unpacks that cross the hiding barrier, and one repack. Let's convert to CPS: (.>) v f = f v compose' :: (Unlock a, Unlock b) => (a -> M b) -> (b -> M c) -> (a -> M c) compose' f1 f2 = (f1 key) .> (\(M (i1, _)) -> (f2 key) .> (\(M (i2, _)) -> \a -> M $ ((i1, i2), (\(s1, s2) -> (f1 a) .> (\(M (_, u1)) -> (u1 s1) .> (\(s1', b) -> (f2 b) .> (\(M (_, u2)) -> (u2 s2) .> (\(s2', c) -> ((s1',s2'), c) )))))))) It looks like it's really either/or. If you put the quantification as Oleg suggested, it doesn't seem possible to write the compose function. If you put it like I had it originally, you can't write the isomorphism. Stuck again. Going in circles. Reply : - real point is extra s - to get the s behind the i, use a constant argument - write compose in his example nomenclature - the contrived syntax to avoid pattern binding errors. EDIT: the "key" problem already indicates that something isn't right: there is a depedency in the types that I simply ignore here.. [1] http://www.haskell.org/pipermail/haskell-cafe/2011-August/094718.html Entry: Answer to Oleg's post Date: Sat Aug 20 20:11:59 CEST 2011 ( Not sent. Trying to answer it myself first. ) On 08/18/2011 07:27 AM, oleg@okmij.org wrote: >> -- Is there a way to make this one work also? >> data Kl i o = forall s. Kl (i -> s -> (s, o)) >> iso :: (i -> Kl () o) -> Kl i o >> iso f = Kl $ \i s -> (\(Kl kl) -> kl () s) (f i) > > Yes, if you move the quantifier: > > type Kl i o = i -> Kl1 o > data Kl1 o = forall s. Kl1 (s -> (s,o)) > iso :: (i -> Kl () o) -> Kl i o > iso f = \i -> f i () > > iso1 :: Kl i o -> (i -> Kl () o) > iso1 f = \i -> (\() -> f i) > > > I'm not sure if it helps in the long run: the original Kl and mine Kl1 > are useless. Suppose we have the value kl1 :: Kl Int Bool > with the original declaration of Kl: > > data Kl i o = forall s. Kl (i -> s -> (s, o)) > > Now, what we can do with kl1? We can feed it an integer, say 1, and > obtain function f of the type s -> (s,Bool) for an _unknown_ type s. > Informally, that type is different from any concrete type. We can > never find the Bool result produced by that function since we can > never have any concrete value s. The only applications of f that will > type check are > \s -> f s > f undefined > both of which are useless to obtain f's result. Thanks, Oleg. The real data type I intend to use is this one: data Kl1 o = forall s. Kl1 s (s -> (s,o)) Kl1 then represents infinite sequences of values of type o, and functions of type i -> s -> (s, o) can then represent sequence operators. (Monad and Kleisli arrow). The reason I left it out is that it makes the plumbing more awkward. The original problem I'm trying to solve is that I can't seem to write down composition of the Kleisli arrows, or the bind or join operation in the monad, without running into typing problems. 
Everything works fine for the parameterized type: type Kl' s i o = i -> Kl1' s o data Kl1' s o = Kl1' (s -> (s,o)) compose' :: (Kl' s1 a b) -> (Kl' s2 b c) -> (Kl' (s1,s2) a c) But not with the quantified one: type Kl i o = i -> Kl1 o data Kl1 o = forall s. Kl1 (s -> (s,o)) compose :: (Kl a b) -> (Kl b c) -> (Kl a c) compose f1 f2 = \a -> Kl1 $ \(s1,s2) -> (f1 a) .> (\(Kl1 u1) -> (u1 s1) .> (\(s1', b) -> (f2 b) .> (\(Kl1 u2) -> (u2 s2) .> (\(s2', c) -> ((s1',s2'), c) )))) I guess this is the same question as before where the answer would be "move the quantifier back where it was", i.e. : data Kl i o = forall s. Kl1 (i -> s -> (s,o)) So maybe I should ask a different question: is there any reason why you can't make a Monad which has some state type that's hidden? Any way I try to write down the bind or join operations of a Monad instance for the type you suggested data Kl1 o = forall s. Kl1 (s -> (s,o)) I run into typing problems of the kind in compose above. I.e. for bind, the problem is of course almost exactly the same: bind :: (Kl1 i) -> (i -> Kl1 o) -> (Kl1 o) bind mi f = Kl1 $ \(s1,s2) -> mi .> (\(Kl1 u1) -> (u1 s1) .> (\(s1', i) -> (f i) .> (\(Kl1 u2) -> (u2 s2) .> (\(s2', o) -> ((s1',s2'),o))))) Entry: Moving on Date: Sun Aug 21 13:46:06 CEST 2011 So, no Monad. Let's embrace the Arrow. How to work with Arrows? There is a new notation. First thing to do is to use Functor and Applicative to lift unary and binary arithmetic. Entry: Towards 1 / 1 - z Date: Sun Aug 21 17:18:40 CEST 2011 It doesn't seem too hard now to define Num class for the Arrows. Would it be possible to also define rational functions? That would be really cool. Maybe the Num class is simpler to do through defining Applicative instances? Indeed: instance Functor (SigOp i) where fmap f op = (arr f) . op instance Applicative (SigOp i) where pure f = arr $ \_ -> f (<*>) f a = (arr $ \(f,a) -> f a) . (f &&& a) instance (Num o, Show (SigOp i o), Eq (SigOp i o)) => Num (SigOp i o) where (+) = liftA2 (+) (*) = liftA2 (*) abs = fmap abs signum = fmap signum fromInteger = pure . fromInteger But this is not really what I want. I want (z + 1) to mean (z + id). and (z * z) (z . z) The real problem is that I'm confusing generic state space models (i.e. non-linear ones) with linear ones that might support a different notation in terms of z operator polynomials. Entry: Next Date: Sun Aug 21 20:25:42 CEST 2011 So what's next? Looks like this is a fairly complete system. I'm a bit tired ATM; I don't really see what's next very clearly. Maybe this should just be merged with some abstract eval or LLVM compilation? Entry: AudioProc Date: Sun Aug 21 22:51:38 CEST 2011 Had a quick look at AudioProc[1] and it doesn't seem to work for me since it's list-based. I need to explicit equation-rep for abstract interpretation and code gen. It uses loop to introduce feedback (with delay). I'm not using that at all: feedback is built-in as a primitive in my rep so it is easily recoverable. Meaning: An open recursion relation is easier to convert back to an open recursion relation in a different form than is a closed one. However, it might be possible to use ArrowLoop to perform the binding "interpretatively" only, leaving the open representation intact but just recording the binding information separately. [1] http://cs.yale.edu/c2/images/uploads/AudioProc-TR.pdf Entry: Code gen & let/if Date: Tue Aug 23 16:46:47 CEST 2011 It's not only sharing, it's also conditionals that can make code generation through abstract evaluation a bit problematic. 
Note that a "strict" if is not a problem, as long as both legs are computed the code can be generated, and the dispatch only selects a value. The problem is with "lazy" if. Entry: C/C++ is good enough Date: Wed Aug 24 17:59:06 CEST 2011 Sometimes things are just too broken, and all you can do is quickly hack & slash, cutting abstraction boundaries. The real question is not as much whether more abstract methods work, it's whether it's useful when the problem really is more about glue and broken abstractions. Sometimes it's not *practically feasible* to properly abstract. If you have a language that doesn't allow side channels this is a real problem: you can't easily make-it-work-for-the-demo and cleanup-later. Entry: The problem with hardware is state Date: Thu Aug 25 12:11:52 CEST 2011 State creates complexity explosion which makes any kind of formal approach practically impossible. The trouble is: most of this state seems to be an optimization. Cache & buffers. Sequentialized combinatorial processes. Is there a way around this? Entry: Recent HC stuff Date: Thu Aug 25 19:14:29 CEST 2011 I'm confused. I understand: - Why my original formulation can't be a monad. - How it can be an arrow, and why the restricted control flow in the arrow is useful. I don't understand: - How "bad" the suggestion is to use dynamic state variables - Oleg's comments on streams with extra inputs - How the stream monad could be useful for my state update problem Entry: Next: make abstract interpretation work. Date: Fri Aug 26 10:10:50 CEST 2011 Given a SigOp, we need to observe: - initial state - representation of recurrence relation So, for each domain there should be an unfold or eval function. TODO: print int data Expr = I String | Add Expr Expr deriving (Eq) instance Show Expr where show (I v) = v show (Add a b) = concat ["(",show a," + ",show b,")"] instance Num Expr where (+) = Add (*) = undefined abs = undefined signum = undefined fromInteger = I . show *Main> cg (I "in") int "(((in + 0),(in + 0)),0)" *Main> cg (I "in") (int . int) "(((((in + 0) + 0),(in + 0)),((in + 0) + 0)),(0,0))" *Main> cg (I "in") (int . int . int) "((((((in + 0) + 0) + 0),(((in + 0) + 0),(in + 0))),(((in + 0) + 0) + 0)),(0,(0,0)))" This is almost correct, except that I don't have a way to name: this isn't the generic update equation, this is an expression for the first output from the initial state, "parameterized" over the input. I.e. (cg int) works for: cg (SigOp u s0) = show (u (s0, I "i"), s0) but not for cg (SigOp u s0) = show (u ("s", I "i"), s0) We need some kind of test function that goes between the initial state and what is actually fed into the function, or how that function is unrolled or otherwise transformed for that matter.. This comes back to placing the unroll function in a type class. Hmm.. Using some horrible hack I can do this: cg (SigOp u s0) = show (u (sigOpInit s0, I "i"), s0) instance SigOpInit Expr where sigOpInit s = I "s" -- Stream unfolding. class Show s => (SigOpInit s) where sigOpInit :: s -> s sigOpInit x = x -- Distribute over pairs, lists and void instance (SigOpInit a, SigOpInit b) => SigOpInit(a,b) where sigOpInit (a,b) = (sigOpInit a, sigOpInit b) instance SigOpInit a => SigOpInit [a] sigOpInit = map sigOpInit instance SigOpInit () *Main> cg int "(((i + s),(i + s)),0)" Trouble with that of course is uniqueness, i.e. variable name generation: *Main> cg $ int . int "(((((i + s) + s),(i + s)),((i + s) + s)),(0,0))" Not simple.. Where to put all that functionality? Start again. What's the main problem? 
The following seem incompatible. 1. Bundling initial state with processors. 2. Getting a representation of an opened-up function. Instead of using some trickery, is it possible to build that intensional representation together with everything else? The trouble seems to be that at some point the content of tuples needs to be named. Yes. s needs to support some kind of labeling function that needs to be threaded at composition time. Entry: I want code not just behaviour Date: Fri Aug 26 13:26:33 CEST 2011 Trouble is that I want an "open" description, I want the intension[1] of the function that generates the sequence and not just the extension (its interface with the world). [1] entry://../compsci/20110616-115059 Entry: State access Date: Fri Aug 26 14:49:03 CEST 2011 I need to find a way to label the primitive state nodes. Eventually they need to be translated to a sequence like this: [["float", "S0", "0"], ["int", "S1", "1"], ..] Or even lower level, untyped: address + bit pattern. Trouble is, there is no way to access the state directly, so this needs to be done explicitly in the form of some function. Instead of using the initial state value to seed some computation, I want to seed it by something that has an extra input: the label of the state. So an initial state :: Double with labels indexed by :: Nat could be implemented as Nat -> (Double, Nat) where the output Nat represents the number of "primitive" objects allocated. Composing the tupled version is straightforward. Maybe the whole "state composition" operation could be abstracted instead? Let's go for the more general operation first. Hmm.. maybe this will give type problems.. This is what I got to work: I did not have to change anything about the state composition: pairs are used. What I changed was just a restriction on the state itself: data SigOp i o = forall g s. SigOpState s => SigOp ((s, i) -> (s, o)) s Then implemented an interface to iterate over state. The key entry here is the instance for pairs which will "bind" the init functions. class Show s => (SigOpState s) where nameState :: Int -> s -> (Int, s) -- By default, states are primitive and unchanged. nameState n s = (n+1, s) -- For just running, this is all we need. initState i = snd $ nameState 0 i instance (SigOpState a, SigOpState b) => SigOpState (a,b) where nameState n (a,b) = (n'', (sa, sb)) where (n', sa) = nameState n a (n'', sb) = nameState n' b instance SigOpState () where nameState n () = (n, ()) instance SigOpState Int instance SigOpState Integer instance SigOpState Double -- Other queries. stateSize (SigOp _ s0) = n where (n, _) = nameState 0 s0 Then the expression type can use named variables: instance SigOpState Expr where nameState n s = (n+1, I ("s" ++ show n)) cg (SigOp u s0) = show (u (initState s0, I "i"), s0, initState s0) *Main> cg $ int . int "(((((i + s1) + s0),(i + s1)),((i + s1) + s0)),(0,0),(s0,s1))" *Main> cg $ int . int . int "((((((i + s2) + s1) + s0),(((i + s2) + s1),(i + s2))),(((i + s2) + s1) + s0)),(0,(0,0)),(s0,(s1,s2)))" Seems to work. What about doing the same thing with inputs and outputs? Entry: Tuples to lists Date: Fri Aug 26 16:38:39 EDT 2011 The tuples are good for creating heterogeneous collections that can still be composed without fear of run time errors. However, once type checked the compilation to output syntax can be allowed to strip a lot of information. One of those is the tuple structure for the states. We just need lists in the end because all the types are removed and only syntax remains.
Once we have lists we can use simpler run-time iteration. Something like the following, but then working.. class Show s => TupleShow s where tupleShow :: s -> [String] tupleShow x = [show x] instance TupleShow () where tupleShow () = [] instance (TupleShow a, TupleShow b) => TupleShow (a,b) where tupleShow (a,b) = tupleShow a ++ tupleShow b Ah, the trick is to not let TupleShow depend on Show since that gives conflicting instances. Still I have some trouble with the existentials.. Ok, I've moved to a different composite state type StateProd to keep general tuples and composite states separate. Also moved the stateShow operation into SigState: class Show s => (SigState s) where stateIndex :: Int -> s -> (Int, s) stateShow :: s -> [String] stateIndex n s = (n+1, s) stateShow s = [show s] instance (SigState a, SigState b) => SigState (StateProd a b) where stateIndex n0 (StateProd (a, b)) = (n2, (StateProd (a', b'))) where (n1, b') = stateIndex n0 b (n2, a') = stateIndex n1 a stateShow (StateProd (a,b)) = stateShow a ++ stateShow b instance SigState () where stateIndex n () = (n, ()) stateShow () = [] instance SigState Int instance SigState Integer instance SigState Double If renaming is required the stateIndex method needs to be overridden. It looks like I'm done with the hard work: *Main> test3 ([["s2","s1","s0"],["0","0","0"],["(((i + s0) + s1) + s2)","((i + s0) + s1)","(i + s0)"]],["(((i + s0) + s1) + s2)"]) *Main> map head $ fst test3 ["s2","0","(((i + s0) + s1) + s2)"] Next: find a way to combine with sharing. It's not clear to me where exactly that can be plugged. Entry: Sharing revisited Date: Sat Aug 27 08:53:24 CEST 2011 It's clear from the previous example that sharing (observable through abstract evaluation) is necessary. This means that when defining the integrator int we already need to add a constraint on the data types that allows the use of an expression like: let_ expr body The trouble here is syntax. It's going to be ugly. When designing the do notation, the main constraint is that we want expressions like m t -> m t -> m t instead of "machine code" like t -> t -> m t Previously I thought it better to not use do notation at all, and just use the let_ construct. Let's see what it takes to make this work. Anyways, trying to bridge SymAsm.hs with SigOp.hs For now, just use an explicit DFL class: class DFL v r where var :: v -> (v -> r) -> r -- Default is no sharing. var v b = b v Trouble is that tupling doesn't go well with Sym. So maybe this needs automatic distribution? /home/tom/meta/dspm/test2.hs:46:26: No instance for (DFL (Asm Tint) (Asm Tint, Asm Tint)) Ok, got some wrap/unwrap problem. Maybe it's simpler to catch it at the root and define a Monad instance for Asm. What I really want is Asm (Tint, Tint). The reason seems to be that tupling is not embedded in the language. I'm using the metalanguage tuple. I thought this would be no problem but apparently it doesn't work for let. Let's remove that approach and use the let_ from Sym. This then needs a different int. Same problem using the let_ from Sym: int' = SigOp (\(s, i) -> let_ (i `iadd` s) (\o -> (o, o))) 0 This returns (SymAsm a, SymAsm b) and not (SymAsm (a,b)). How to fix that? It seems like a serious problem. What it looks like is that when I DSL-ify, lots of native constructs can't be used: if let (,) The problem is that I don't have a good mental model of what is object language, and what is meta language. In my mind, the tuples should just be "unrolled" to form some kind of data dependency structure.
Does it help to allow let_ to be more general? Doesn't seem so.. This is a problem. I don't understand it. It seems the only way around this is to allow let_ to have generic return type. Or, to just mix let and let_ like this: int' = SigOp (\(s, i) -> let o = let_ (i `iadd` s) id in (o,o)) 0 Which is more like the approach in the sharing monad in [1]. Essentially the let_ can be replaced with "share" as share x = let_ x id This looks like it's a better approach. Nope. It's not the same. This is still not CPS because the body of the bind is not executed in the correct context. This approach is too difficult. I need something simpler. Maybe it's best to express the language as a monad, and use the do form for the binding. That way there is no room for obscure sharing problems. Expressions can be built on top of that using the same trick as for LLVM. The question then remains: would it still be possible to use SigOp ? [1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.144.7880&rep=rep1&type=pdf Entry: Sharing monad Date: Sat Aug 27 12:54:16 CEST 2011 I keep making mistakes wrt. the sharing monad so the idea is to build a DSL that is monadic from the start so we have a proper binding mechanism that allows the implementation of sharing and maybe other side effects. class Monad m => DSPM r m where add :: r t -> r t -> m (r t) mul :: r t -> r t -> m (r t) lit :: t -> r t Here r is the representation type and m is an explicit monad wrapper to allow expressions such as: f a b = do c <- mul a a d <- mul b b add c d Is the monad fixed? Can it be part of r? The first problem I run into is managing the types in a shared context. Maybe now's a good time to use a different approach to storing collections of different types. Step 1: got something working without sharing on top of the SC monad: data Asm t = Var String | Op String [Asm t] instance Show t => Show (Asm t) where show (Var v) = v show (Op opc as) = opc ++ (show as) op2 opc a b = return (Op opc [a,b]) instance (DSPM Asm (SC r e)) where add = op2 "add" mul = op2 "mul" ivar = return . Var f1 a b = do c <- mul a a d <- mul b b add c d f2 = do a <- ivar "a" b <- ivar "b" f1 a b asm (SC c) = putStrLn $ c () $ \e t -> "env: " ++ show e ++ ", term: " ++ show t type Rv = SC String () (Asm Tint) asm (f2 :: Rv) env: (), term: add[mul[a,a],mul[b,b]] Let's fix that first so the type annotation is kept. Hmm this is not so simple. Also running into another such problem after implementing returnN :: (Asm t) -> SC r Env (Asm t) returnN t = SC c where c e k = k e' (Var v) where v = "R" ++ (show $ length e) e' = (v, "") : e Can't seem to propagate the show constraint into returnN. Maybe best to solve another problem first: the Env type needs to be heterogenous. Existentials or dynamics? Was quite straightforward data Bind = forall t. 
Show t => Bind String t instance (Show Bind) where show (Bind v t) = v ++ " <- " ++ (show t) ++ "\n" newtype Env = Env [Bind] instance (Show Env) where show (Env e) = concat $ map show (reverse e) I was able to work around the typing issue by reducing generality: op2 opc a b = returnN (Op opc [a,b]) type TypedOp2 r t = String -> (Term t) -> (Term t) -> SC r Env (Term t) op2i = op2 :: TypedOp2 r Tint op2f = op2 :: TypedOp2 r Tfloat and making the DSPM methods more specific, like the Sym before: class Monad m => DSPM r m where iadd :: r Tint -> r Tint -> m (r Tint) instance DSPM Term (SC r Env) where iadd = op2i "add" Entry: Connecting sharing monad to SigOp Date: Sat Aug 27 16:00:18 CEST 2011 So it seems to work, after some struggle. Now the tuple problem. What does it mean to do something like this, creating two monadic values (m,m) instead of m(,) ? f2 a b = do aa <- a `imul` a bb <- b `imul` b (aa `iadd` bb, aa `isub` bb) The inferred type sheds some light: f2 :: (DSPM r m1, DSPM r ((,) (m (r Tint))), DSPM r m) => r Tint -> r Tint -> (m (r Tint), m1 (r Tint)) The problem is that the tuple operator does not produce a single monadic value. I.e. it doesn't "live in the monad". So how do I shoehorn functions that look like a -> m b into (s,i)->(s,o) ? Wait a minute. What about this: f3 a b = do aa <- a `imul` a bb <- b `imul` b ap <- aa `iadd` bb am <- aa `isub` bb return $ (ap, am) At least this has a proper type: f3 :: DSPM r m => r Tint -> r Tint -> m (r Tint, r Tint) Is it possible to "evaluate" the monad into a simple function so it can still serve as a pure update (s,i) -> (s,o) ? Just thinking about it: of course there should be no problem to convert the CPS computation back into a pure function. It *is* a pure function. But this can only be done if the monad instance is known. Hmm... maybe that's not really true. The point is to build one huge function *inside the monad* such that compositions made by SigOp can use sharing. That's what was wrong in the other approach. So I really need to be able to compose (s,i) -> m (s,o) for some Monad m which is part of the language in which the siso's are written. Is that a problem? No, it was surprisingly straightforward. The only difference is that I don't have the num class for examples. Or maybe I do? Ok, I just define everything in terms of the monadic DSPM language, that's what it's made for. Here we are: int = opr $ \(s, i) -> do { o <- i `iadd` s; return (o, o) } *Main> :t int int :: (SigState (r Tint), DSPM r m, Num (r Tint)) => SigOpM m (r Tint) (r Tint) *Main> :t int . int . int int . int . int :: (SigState (r Tint), DSPM r m, Num (r Tint)) => SigOpM m (r Tint) (r Tint) Looks like it's working. Now, how to convert to expression? Need to take a step back. Damn it's easy to abstract away common sense! I'm having trouble getting this to behave. Somehow the monad types are not identified when I insert such an ilit in an expression ilit :: Tint -> (r Tint) this was the previous definition, but it is inconvenient as it has to be unpacked. ilit :: Tint -> m (r Tint) Pff.. I'm doing shotgun programming. Maybe the initial value should be monadic? Probably not a good idea since that also requires binding when composing. Anyways. Try later.. What should work with this approach though is stream unrolling!
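To make the payoff concrete, here is a stripped-down sketch of the kind of sharing monad at work here (a toy reconstruction of my own, not the actual SC/DSPM code): every bind names its intermediate result exactly once in a bindings list, so abstract interpretation cannot duplicate work.

import Control.Monad.State

type Bindings = [(String, String)]      -- (variable name, expression)
type Gen = State (Int, Bindings)

-- Name an expression and record the binding.
share :: String -> Gen String
share expr = do
  (n, env) <- get
  let v = "t" ++ show n
  put (n + 1, (v, expr) : env)
  return v

add, mul :: String -> String -> Gen String
add a b = share (a ++ " + " ++ b)
mul a b = share (a ++ " * " ++ b)

-- f a b = a*a + b*b, with each product computed once.
f :: String -> String -> Gen String
f a b = do
  c <- mul a a
  d <- mul b b
  add c d

-- runState (f "a" "b") (0, [])
--   ==> ("t2", (3, [("t2","t0 + t1"), ("t1","b * b"), ("t0","a * a")]))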
Entry: Functional dependencies Date: Sat Aug 27 21:19:44 CEST 2011 Trouble is that when using ilit :: Tint -> (r Tint) it is inferred that this operation comes from DSPM r m where r can be matched with other surrounding types, but m is left unspecified. What we do know is that given r, the m is fixed. Can this be solved with functional dependencies? Yep. class Monad m => DSPM r m | r -> m Entry: 2 x existentials Date: Sun Aug 28 14:51:11 CEST 2011 Trouble is that I have both state as an existential, and the terms in the environment. I've used a "show" trick before to avoid knowing the state type, but it seems to interfere badly with the monad that's used for the Environment generation. How to fix? Why does it have to be a problem? Well, it is. stateShow returns M [String]. I have (s, i) -> m (s, o) Where the type of s is not known, but an initial value can be generated to provide something of type m (s, o) where the s component still implements stateShow. The question: - how to get s out of the monad, and collect a "compiled" version? The monad will produce Asm when run, which is Env and (Term t). This term is a Variable with some type attached. But we can't observe the Term t because we don't know the state. Will it be possible to return an Env + (show term)? What I have now is a bit contrived. It's probably better to simplify a bit. First simplification: does Asm need to have a Term t? Why not just a string to accommodate both Var and Res which are multiple vars in a list. Term t should just be a front for UntypedTerm. Entry: Disentangling the Arrow and Monad Date: Tue Aug 30 18:02:41 CEST 2011 Two components interact in a way I don't understand: * existentials for state, to be able to implement stream operator composition as Arrow. * monad to implement the underlying machine language. Some ponderings.. Why is the state existential in SigOpM? To keep the type abstract, but at the same time guarantee that types of initial values and state transformers match. Can this also be accomplished using dynamics? Probably yes. Then there can be just one type which can be a parameter for Arrow (and a "broken" Monad) instances, and would be trivially accessible using dynamic type casting. Are there any other pitfalls? Probably the broken monad is good enough (problem is that a conditional can introduce computations with different state - I'm not too happy with that..) The monad to implement the data sharing and possibly other machine language constructs seems essential. However, at this time I'm using existentials to hide term types, which I then later only convert back into strings. Problem is I don't see what the problem is. If the state inspection code of SigOpM is moved inside the sharing monad, the monad can be executed to produce a result type that is compatible with stateShow. Let's try this: stateShow will produce a Term instance, where for ordinary "numbers" it is enough to produce Lit String. StateShow can then be renamed to Compilable or something. Let's revisit the point. I want to create structures that can be: - evaluated on numbers without any issues. - exported as syntax after abstract interpretation Because of the "inspection" it's not possible to abstract everything. The first requirement alone wouldn't need much, and can be very abstract (i.e. infinite streams where the state is only implicit in function compositions). The problem I run into is that I need to *name* the state variables. They can't just be a bunch of hidden variables.
This is possible by imposing an interface on the hidden variables, but that somehow feels like a bit of a hack. It would be nice to be able to hide all the details of the representation somewhere else. Wait. It's not just that I have to hide the state type to make it an Arrow instance. It's also that in general it is completely unknown, so I *do* need an abstract interface, which is currently: - label / count the nodes (generate variables) - collect nodes in a list (collect results) So, the fact that it is hidden is beside the (functional) point. It's just to enable some generic Arrow sugar. Could the state be parameterized on the "observer" type? I.e. instead of Show a, have something like Compilable a t, where t is an observer result? Let's try that out. The change doesn't seem trivial as I run into trouble parameterizing this one: instance (SigState a t, SigState b t) => SigState (StateProd a b) t where stateIndex n0 (StateProd (a, b)) = (n2, (StateProd (a', b'))) where (n1, b') = stateIndex n0 b -- Align numbering with causality through composition (n2, a') = stateIndex n1 a statePeek (StateProd (a,b)) = statePeek a ++ statePeek b t is apparently ambiguous. Entry: Ha! Date: Wed Aug 31 20:30:30 CEST 2011 The trick is the following: the only thing I don't know (and don't care about) in practice is the form of the container (nested pairs) because it will be flattened into a list. In practice, when compiling I do know what the base terms are, so compiling will probably not effectively lose any information, just the static types which are no longer necessary because the output data (language syntax) is already in the required form. So let's start out with a class that expresses this: a state object is composite, and has an interface that allows enumeration and gathering into lists of a certain type. Is this a standard interface? So the idea is to convert: stateShow :: s -> [String] to something more like stateShow :: s -> [p] where p is another parameter that represents the primitive state type. The default implementation could then be p = String. The trouble I had before was to get the following to typecheck (with stateShow renamed to statePeek): instance (SigState a p, SigState b p) => SigState (StateProd a b) p where stateIndex n0 (StateProd (a, b)) = (n2, (StateProd (a', b'))) where (n1, b') = stateIndex n0 b -- Align numbering with causality through composition (n2, a') = stateIndex n1 a statePeek (StateProd (a,b)) = statePeek a ++ statePeek b I got it to work, but I needed UndecidableInstances because of: the Coverage Condition fails for one of the functional dependencies; Use -XUndecidableInstances to permit this Entry: Commuting monad & state observe Date: Tue Sep 6 14:34:41 CEST 2011 The point is to make the projection from state to list of observed primitive states only structural, meaning that just the original structural product is lost, but not the primitive contents. I.e. with pi primitive state components we have: (s0,(s1,s2)) -> [p0,p1,p2] () -> [] This way there is no worry that any projection in the evaluation of the monad (for construction of binding environment) loses any information. Remarks: 1. Why keep original product in composition? -> It serves a purpose: easier to take apart. 2. Is it necessary to have the following path-independence? m (s,o) -> (e,s,o) -> (e,[p],o) m (s,o) -> m ([p],o) -> (e,s,o) 3. Phantom types get stripped also, so the typed primitives and compiled-to primitives cannot be the same.
Actually, the notion of state primitive does not need to be known in the SigOpM primitive functions. I.e. one could lift a SigOpM over a complex number class. 4. Is it possible to treat i and o the same, or does their structure always need to be known? Same argument about primitives. Projection to primitive probably needs to be more general than only state. Conclusion: * Focus on structural projection to list of primitives. * As long as the primitive base type has dynamic type annotation, order of operations should not be important: project from structured to unstructured type at any convenient moment. Entry: Primitive data Date: Tue Sep 6 14:53:09 CEST 2011 The SigState class needs to be more general, and should also apply to inputs and outputs. It should capture the relation between abstract data and "flat" data for the compilation target. Renamed to PrimData. Remarks: - Can have only one primitive type due to s -> p fundep. This makes sense in that states really can be general: we can compile anything to [p], so p needs to be fixed to one single type. Can it be done with a quantification instead of fundeps? I don't know. I don't understand why this doesn't make sense: instance forall p. ((PrimData p a, PrimData p b) => PrimData p (ProdData a b)) However, only supporting a single concrete type is not a problem: it looks like class instances can be module local, so if multiple primitives need to be combined, they can at least be hidden behind module interfaces. Entry: NEXT Date: Tue Sep 6 15:58:05 CEST 2011 - Combine PrimData, DSPM, SigOpM + concrete expression type Entry: Status Date: Sat Sep 10 08:57:46 EDT 2011 Life happening lately, so where are we with the meta project? - Trying out Haskell DSL using some standard methods: * DSPM: A first order, statically typed Monadic DSL implemented as a type class where primitives are a -> M b, a -> b -> M c, ... to allow do-notation to provide binding structure. * SigOpM: composition of state-space models (causal, finite-state signal operators) written in a monadic DSL using the Arrow interface. * PrimData: representation of composite state to support abstract interpretation (compilation). - Problems: * Getting to know the power and limitations of the Haskell type system. The main difficulty is seeing why some "intuitively obvious" constructions are not structured properly to be expressed using the constructs I know. Entry: Combinatory Logic and State Machines Date: Sat Sep 10 09:18:54 EDT 2011 Rename the classes? DSPM : combinatorial logic behaviour = stateless (expresses a function); however, the _structure_ is also specified SigOpM : state machine (Mealy[1] machine) built from combinatorial logic building blocks. What I have in mind is a numeric state space model (SSM), but the interface is probably generic enough to accommodate a generic Mealy machine. One could say that a Mealy machine is an SSM where the rule expression is a case statement. [1] http://en.wikipedia.org/wiki/Mealy_machine Entry: Interfacing Pd with LLVM objects Date: Thu Sep 22 12:56:08 EDT 2011 Notes: - Pd is frame-based. This means it's possible to use the frames for something other than sequential signals, i.e. sequences of vectors like STFT In the bridging from primitive processors -> frames, the last step is an unrolling or a for loop that takes care of the adaptation, and in the translation to LLVM primitives some types are stripped (no distinction between sequential / parallel frames). How to systematize this?
- The interface will take: - input/output frames (could be sequential or vectors). - flat state vector - There is a lot of red tape to formalize. It should be possible to tackle this all at once, find some kind of formalism to put this patching and rolling to rest. Entry: Commutation with staging Date: Thu Sep 22 13:17:59 EDT 2011 One of the things that struck me in the Feldspar paper is the definition of "map" for array indexes. Let's read that again and look at it in terms of Functor and how it _commutes dereference and staging_. This seems to be a basic problem: how do certain operations commute with staging? This is also something hinted at in Staapl. Entry: Giving up on Haskell's LLVM bindings Date: Fri Oct 14 11:54:31 EDT 2011 After some experiments[1] I give up trying to use the Haskell LLVM bindings because they are "too strongly typed" for me as a beginning Haskell programmer. I can definitely see the usefulness of the approach, but practically it just gets in my way at this time. I need to concentrate on getting it to work first using simple algebraic data types, and then maybe encode more stuff as type classes later on. [1] entry://../nodes/20111014-113441 Entry: DSPM Date: Fri Oct 14 11:58:17 EDT 2011 What is necessary? - Focus on the monadic base language and how to map it to an algebraic data type which preserves type information, so it can generate typed intermediate code (either LLVM or C in character form). - Reuse later: the Arrow composition and state abstraction can follow after that. The representation is important because the hidden state in the Arrow implementation needs to be representable. Entry: Pure Monad? Date: Fri Oct 14 12:00:51 EDT 2011 I run into an interesting problem where I want to capture the syntax of a pure function using a monadic language. The monad is necessary to capture binding information. However, is it possible (or necessary) to constrain the expression language such that it can only express pure functions? I.e. make it so that the identity monad is part of the class? Entry: DSPM : interpretation Date: Fri Oct 14 12:04:20 EDT 2011 Main problem: bind a function in SSA form to memory references: load/store. Do this as part of 1 interpretation, not for the language (syntax): it should remain pure. Some simplifications to get it to run as quickly as possible: - Use type annotations, but use float only in ExprM. - Ints will just be part of the representation of one of the interpretations of ExprM (as loops over IO buffers). Entry: Type annotations in Asm interpretation of ExprM Date: Fri Oct 14 13:07:43 EDT 2011 It seems that these instances were missing in a previous attempt: instance TypeOf (Term Tint) where typeOf _ = aInt instance TypeOf (Term Tfloat) where typeOf _ = aFloat Now all type info seems to propagate properly. I've added type annotation to everything, so that even the really "dumb" LLVM annotation can be generated: data Term t = Var TypeName VarName | Lit TypeName LitRep | Op TypeName OpName [Term t] | Res [Term t] deriving (Eq) Entry: Remove existential type Date: Fri Oct 14 13:08:59 EDT 2011 The Env doesn't need static types I think... Can the existential be removed by using typeOf earlier? data Bind = forall term. Show term => Bind TypeName VarName term instance (Show Bind) where show (Bind typ var term) = (typed typ var) ++ " <- " ++ (show term) ++ "\n" newtype Env = Env [Bind] instance (Show Env) where show (Env e) = concat $ map show (reverse e) This probably means adding an indirection for the phantom type. I.e.
the type below can't represent casts.. data Term t = Var TypeName VarName | Lit TypeName LitRep | Op TypeName OpName [Term t] | Res [Term t] deriving (Eq) Got it. Replaced by: data STerm = Var TypeName VarName | Lit TypeName LitRep | Op TypeName OpName [STerm] | Res STerm deriving (Eq) data Term t = Term STerm Entry: TermApp Date: Fri Oct 14 17:21:26 EDT 2011 So what about this one: class TermApp f where termApp :: f -> [VarName] -> ([STerm], m [STerm]) Starting from a function and a variable name supply, produce a list of input/output variables, with the latter wrapped in a binding monad. They have to be STerm and not Term because we can't put the latter in a list due to typing constraints. The first problem to solve is how to create a Term t object from an STerm? Maybe that's the real deal here: how to create properly typed variables. Maybe that's it. Am I looking at this the wrong way? Map a function to its inputs and outputs and be done.. Entry: Getting unstuck with recursive type class definitions. Date: Sat Oct 15 10:15:49 EDT 2011 It really shouldn't be such a big deal. This is just pattern matching, but done on types instead of data structures. What do I want to do? I have a type like this: (i1 :& i2 :& ...) -> m (o1 :& o2 :& ...) And I want to map this to a type like this: ([i], m [o]) Let's do it in 2 steps: one for the input, and one for the output. I have something like (Term t :& ...) [String] And I need to create a value of Term t such that it can be passed to the function. Amazing.. I got it to work with fundeps and undecidable instances: class TermApp m f | f -> m where termApp :: [VarName] -> f -> ([STerm], m [STerm]) instance (MakeVar t, TermApp m f) => TermApp m ((Term t) -> f) where termApp (n:ns) f = ((i:is), mos) where term :: Term t = makeVar n (is, mos) = termApp ns (f term) Term i = term instance (Monad m) => TermApp m (m (Term t)) where termApp _ mt = ([], ms) where ms = do (Term s) <- mt return [s] Then with a bit of effort the output tuple recursion also works. I had to use liftM snd and liftM fst to make it type correctly without annotations that I did not know how to write. instance (Monad m, TermApp m (m os)) => TermApp m (m (Term t, os)) where termApp _ m_t_ts = ([], mss) where mss = do (Term s) <- liftM fst m_t_ts ss <- snd $ termApp [] (liftM snd m_t_ts) return (s : ss) The rest should be straightforward. Inputs can already be printed: varnames = map (\n -> "in" ++ (show n)) [0..] compile f = (is, os) where (is, mos) = termApp varnames f os = () For outputs the monad needs to be executed. The trick was to change the Asm type such that it uses STerm in the result as opposed to Term. This allows the Asm constructor to be used as a final continuation of the state-continuation monad, grabbing state (environment) and the argument passed to the continuation (list of result terms).
data Asm = Asm Env [STerm] asm :: SC Asm Env [STerm] -> Asm asm (SC c) = c env0 Asm Entry: Bug: computations run twice Date: Sat Oct 15 12:18:10 EDT 2011 test3 :: Term Tint -> Term Tfloat -> Term Tfloat -> MTerm (Term Tfloat, Term Tfloat) test3 _ a b = do s <- fadd a b p <- fmul a b return (a, b) *ExprM> compile test3 in: [i.in0,f.in1,f.in2] out: [f.in1,f.in2] env: f.%0 <- f.add f.in1 f.in2 f.%1 <- f.mul f.in1 f.in2 f.%2 <- f.add f.in1 f.in2 f.%3 <- f.mul f.in1 f.in2 Looks like this is because of the double use of m_t_ts in: instance (Monad m, TermApp m (m os)) => TermApp m (m (Term t, os)) where termApp _ m_t_ts = ([], mss) where mss = do (Term s) <- liftM fst m_t_ts ss <- snd $ termApp [] (liftM snd m_t_ts) return (s : ss) Indeed. The following solved it: instance (Monad m, TermApp m (m os)) => TermApp m (m (Term t, os)) where termApp _ m_t_ts = ([], mss) where mss = do ((Term s), ts) <- m_t_ts ss <- snd $ termApp [] (return ts :: m os) return (s : ss) I'm starting to get this "monad is a computation" idea. In the bug above it was quite obvious that m_t_ts got "executed" twice. Entry: Uncurried input Date: Sat Oct 15 17:03:36 EDT 2011 Let's add one more instance to TermApp so it can also unpack input binary tuples. Now this was ridiculously elegant: -- Allow tuple inputs. instance CodeApp m (i1 -> i2 -> f) => CodeApp m ((i1,i2) -> f) where codeApp is f = codeApp is $ \i1 i2 -> f (i1, i2) This means that I now have a way to treat functions like (s,i) -> (s,o). Because the input state types are the same, this only needs to know the size of that first tuple (== s_n) to take the first s_n elements from a flattened i/o list as returned by genFun. Entry: TML + SysM Date: Sun Oct 16 09:54:38 EDT 2011 Time to integrate the insights from TML into SysM. Most important problem: how to deal with the existentials. How does genFun work on something that's been hidden behind SysM with an existential state qualifier? Let's see how to go about this. First, the genFun (which needs a different name btw; I'm aliasing it to `compile'.) needs to be changed to support a more generic interface. Instead of only working on Code deconstruction (:: Code t -> Term) that operation should be made generic as "compilable" much like the PrimData class is expressing. Let's try: Compile.hs It seems to at least typecheck, so now I need some better names to reflect the meaning behind this: class CompData r s | r -> s where compile :: r -> s var :: VarName -> r class CompFun s m f | f -> m, f -> s where codeApp :: [VarName] -> f -> ([s], m [s]) CompData means compilable data: convert representation r to syntax (data) s. CompFun means compilable function: convert function and variable name pool to a pair of inputs and outputs parameterized by some monad. Entry: Full circle? Date: Sun Oct 16 12:20:06 EDT 2011 Milestone: SysM composes with TML! f :: (Code Tfloat, Code Tfloat) -> MCode r (Code Tfloat, Code Tfloat) f (s, i) = do s' <- add s i return (s', s') fs = SysM f (lit 0) sysFun (SysM f f0) = genFun f *Main> sysFun (fs . fs . fs) in: [f.in0,f.in1,f.in2,f.in3] out: [f.2,f.1,f.0,f.2] env: f.0 <- f.add f.in2 f.in3 f.1 <- f.add f.in1 f.0 f.2 <- f.add f.in0 f.1 Let's see if that's correct by changing var names: *Main> sysFun (fs . fs . fs) in: [s2, s1, s0, i] out: [s2', s1', s0', s2'] env: s0' <- f.add s0 i s1' <- f.add s1 s0' s2' <- f.add s2 s1' Yep seems to work! Entry: Generating C code Date: Sun Oct 16 13:44:40 EDT 2011 Basic stuff is done (bindings). Now we need some abstraction over loops.
This might be best handled by writing "map" instances of Fun. Entry: Combinators Date: Mon Oct 17 12:27:37 EDT 2011 Next: combinators. It's one thing to make the arrow combinator of in/out chaining work, but there are definitely other things that need some kind of "map" or for loop. Currently I don't really know how to do that. Maybe it's best to focus on the simple thing: just lift a Sys to a Pd/SM object. This entails: - state initialization - creation of for loop over inputs + state variables - float to int conversion and back Automatic float<->int conversion can be postponed until later. the main question is: should this "map" be simply generation of C syntax, or should it go first into a data structure to make the implementation a bit more abstract? Maybe I should start with the interface. What is it that should be mapped? And then write it as a functor. Let's clean up the code a bit more to prepare for this. Entry: Adding load/store Date: Mon Oct 17 13:47:32 EDT 2011 So, I have a syntactic representation TermFun which has inputs, outputs and internal bindings (outputs are varrefs into internal bindings)., Next step: add load/store. These seem to be the issues: - Binding is not dereference: make load/store explicit. - Loop variables are special. They resemble store but are also read from. This is phi in SSA. Since I'm generating C it might not be necessary to use SSA until there are arbitrary control flow paths. For now it's really quite limited: iterate a function, feed back its state. Entry: SSA Date: Mon Oct 17 13:52:42 EDT 2011 Maybe it's best to use SSA directly. However, for my current needs SSA might be overkill, meaning too low level Entry: lifting over streams Date: Tue Oct 18 12:44:22 EDT 2011 I made a small lift operation that maps a function over a stream for syntax only. Now it would be good to implement this lifting operation also in a generic way, so it can be tested in Haskell also. This brings me back to one of the earlier ideas that a lot of what I try to do is about commutation between compilation (function <-> syntax) and lifting over data structures (repetition / loops). Having 2 representations at this point might make it simpler to give names to all legs of the diagram. SM (Haskell function) ---> function lifted over Haskell list | | | | ??? v v SM (target syntax) ---> syntax lifted over target memory arrays The ??? probably doesn't exist directly (can't be implemented) because the domain is unrolled while the target is not (it's a representation of loop code). The left arrow actually also doesn't exist directly: both ends are semantics of the TML class. So the arrows are related only in that they are instances of the same thing. Essentially it would be possible to interpret the target language to reconstruct the code rep. So what I'm saying is that it might be a good idea to not only abstract the base langauge, but the lifting operation and the data containers since there is clearly some kind of structure to capture here. Entry: Lifting SMs Date: Tue Oct 18 15:30:00 EDT 2011 What is there to say, really? How many ways can you take a sequence and fold it? I really only have nested signals, and in-place iteration. What's the big deal? The problem is that I'm doing more than that. - Need float conversion for IO - dereference of input arrays - state feedback The float conversion is specific to SM. It is also that feature that enables the representation of the IO as an array of pointers instead of a struct of pointers (implementation would be the same though.). 
It seems to be reasonable to generalize to struct, but implement it untyped, so we can have an array of void pointers, or an array of unions with the different types, which makes little difference. Let's forget about the casting itself, but use a void pointer interface. int sm_tick(void *state, void **ins, void **outs); Dereference then is simply float *in0 = ins[0]; int *in1 = ins[1]; ... Start with adding "level" to Var in Term. Entry: Memory read is just another binding Date: Tue Oct 18 17:57:42 EDT 2011 Things seem to go smoother if array dereference is treated as just another op which has a result bound to a variable name. The only special op is store. Entry: Applicative vs. dataflow Date: Tue Oct 18 21:29:47 EDT 2011 Still, really, why is this such a big deal.. Why can't it be applicative until the very end, when the last return value is "allocated". I don't know why I keep feeling resistance to the idea of just storing it and being done.. Since the conversion from applicative to dataflow (output binding by unification of caller-provided bindings) has to happen at some point, the real question is: what is the most natural place for this? Should we do any operations on code that is in this form, or should it be just the final step, i.e. part of the "printing". Maybe that's the real problem though. Unification.. Is there a simple way to express this in Haskell? [1] Anyway, sleeping on it a bit, it seems best to separate memory references from local variables, but reuse the "Bind" object to print them. [1] http://web.engr.oregonstate.edu/~erwig/HaskellRules/HaskellRules_June06.pdf Entry: Nesting Date: Wed Oct 19 09:33:22 EDT 2011 The next hurdle is block nesting. Doing this only inside the printer is too messy. This needs an intermediate syntax that can represent variable scope. I'm not sure if the SSA dominator approach is necessary though. General principle: - Syntax can be simple, meaning it might be able to represent illegal constructs. Correctness is guaranteed through properties of the domain of the transformation, not the image.. - Printing is messy. Try to limit it to straightforward mappings. Maybe this just needs 2 languages. Embed Term into something else that can represent all this in a straightforward way. Hmm.. do subsets need separate representations (and operations..). Why not just use subsets on the same type, trying to restrict operations maybe.. The thing is, nested scopes are a really useful feature of C that probably completely covers my use case. I think they are quite straightforward to translate to the more low-level LLVM representation, so let's just make it work with C in mind. I'm thinking of separating pointer computation and pointer dereference. Entry: Stuck - control flow / nested scopes. Date: Wed Oct 19 10:00:21 EDT 2011 The problem seems to be that I don't have a good mental picture of SSA. I'm trying to use a C model of nested scopes, but I keep running into arbitrary not-quite-right data representations. Maybe it is better to bite the bullet and go for a proper SSA rep with basic blocks and dominator structure. This is a big choice though because I don't think that C can represent arbitrary SSA code, though it can express some nesting like these: void foo1(void) { a: { int x = 0; int y = 0; goto b; } b: { int x = 0; int y = 0; goto a; } } void foo2(void) { a: { int x = 0; int y = 0; goto b; b: { int x1 = x; int y1 = y; goto a; } } } Looks like I'm not going to pull this off before having some deeper understanding of different nesting forms.
Maybe I have to embrace PHI and GOTO and their relation to CPS anyway [1]. Reading [1] I'm tempted to just stick to recursion and local function definitions, replacing the function calls with assignments and jumps and just dumping out C code. I have everything in the correct form already; I don't need dominator analysis. Also sticking to recursion makes it possible to formulate combinators in a more abstract way. I.e. to implement fold // Initial arguments. int arg1 = ..; int arg2 = ..; fun: int out1 = op1(arg1, arg2, ...); int out2 = op2(arg1, arg2, ...); // recursive call arg1 = out1; arg2 = out2; goto fun; Using a left fold as a basic building block, it seems possible to limit the recursion to only tail recursion, where it can be replaced with assignment and goto (no other stack frames necessary). In a similar vein, output storage could be seen as a function call also. Really, that "fun" function could take some extra "inputs" which are only visible at the call site, and are represented as output pointer storage abstracted in the fold. So really, this is just fold of (s,i) -> (s,o) over some containers C i, C o. Let's work on that first. Morale: don't use "for", use "fold". Or actually not fold but some combination of cata[2] + anamorphism[3], hylomorphism[4]... Find the correct way to express this. See next post. [1] http://www.cs.princeton.edu/~appel/papers/ssafun.ps [2] http://en.wikipedia.org/wiki/Catamorphism [3] http://en.wikipedia.org/wiki/Anamorphism [4] http://en.wikipedia.org/wiki/Hylomorphism_(computer_science) Entry: SigFold Date: Wed Oct 19 11:34:07 EDT 2011 Implemented the signal fold in an abstract way. See for list instance below. The idea is to also implement this on 1. Concrete syntax elements and 2. maybe on a higher level like TML directly? Currently it's a bit too abstract so I probably need some playing first.. -- a: arrow (stream abstraction as an arrow between i and o) -- s: recursion state -- i: input stream element -- o: output stream element class Arrow a => SigFold a s i o where sigFold :: s -> ((s,i) -> (s,o)) -> a i o -- Represent as functions on infinite lists. data ListOp i o = ListOp ([i] -> [o]) instance Category ListOp where id = ListOp $ id (ListOp a) . (ListOp b) = ListOp $ a . b instance Arrow ListOp where arr = ListOp . map first = undefined -- ?? instance SigFold ListOp s i o where sigFold s0 f = ListOp $ fa s0 where fa s (i:is) = (o:os) where (s',o) = f (s, i) os = fa s' is intgr (s, i) = (s', s') where s' = s + i intgr' = sigFold 0 intgr Maybe the arrow instance isn't necessary.. Category might be enough. Actually, I don't even use Category. This is really just a representation thing. Entry: Abstracting foldable code Date: Wed Oct 19 11:54:35 EDT 2011 I just tried some more aimless poking around, making both the function body and the result more abstract, but I'm loosing the point of the exercise. What I want to do really is to have a representation of a "bound" fold in both a functional and list form. The reason I want this is that it might not be trivial to express manually (separately) when there are multiple nestings. In short, I have TML which has no support for recursion, and I can map it to 2 representations: TMLf -> Code -> SSA (Code is an intermediate form, the real result is SSA). TMLf -> Value Now TMLf can be used to represent a code body which can be folded. 
I would like to keep this fold operation abstract, so I can have the same expression (a body + fold) be evaluated to a Haskell function on (nested) lists, or some output machine code. This requires two objects: - function rep (i.e. Sys or TermLS) - IO rep These could be combined in multiple ways. I.e. the function body could be translated to a representation of just the body, but it might be simpler to do it all together by evaluating the body over a specific IO rep. The real flexibility then comes when these loops can be nested. Conclusion: I don't just need a loop body representation, I also need a way to represent nested loop bodies. Let's let this sink in a bit.. Some elements: - key are folds of (s,i) -> (s,o) bodies. Loops with state. - I can't represent nesting yet Entry: Reflection / summary. Date: Wed Oct 19 21:22:33 EDT 2011 DONE: - TML + Value/Code interpretations. This just needs additional operations. - Convenient generic compilation of functions over nested binary tuples. - Arrow composition of state machine iteration functions, as a representation of their trivial fold: stream operations. TODO: - Make fold part of the generic language stack (type classes) instead of an ad-hoc "print". This is not so important now (direct correspondence between state space model and its trivial fold over 1D i/o streams) but might be a problem later when folds are nested. Think about it now. Also, it is a not elegant if there are 2 folds (one for a code interpretation, one for a syntax interpretation) and they are not related. - Make the commutations clear: interpretations and folds/maps and their relations + find the proper names for the morphisms / find some references. Entry: Fold is good Date: Thu Oct 20 10:34:36 EDT 2011 So how to express nesting of fold? I need an example for this to be meaningful at all. Maybe it's time to wait wait a bit to try to do this in a generic style, and write the two cases explicitly first. Value => Sys folds (trivially) over streams. Code => Sys == TermFun folds over C input prototypes Here "==" is a morphism, not a strict identity: a Code => Sys can be translated to a TermFun. Note that I really don't care about TermFun : it is an intermediate representation. The starting point is Sys and TML. So fold should only reference those two, right? First, let's find a name for this recursion pattern mapfold :: (s -> i -> (s, o)) -> s -> [i] -> [o] mapfold f s (i:is) = (o:os) where (s', o) = f s i os = mapfold f s' is Maybe I'm looking at the wrong type. Looking at PD, what I want to do is more something like this: ((s,i) -> (s,o)) -> ((s, A i) -> (s, A o)) I'm getting confused again.. Too abstract... Need examples. Entry: Tail recursion Date: Thu Oct 20 11:18:48 EDT 2011 Maybe it's time to look at a representation in terms of mutually recursive functions. If there is no recursion, only tail recursion, then all function parameters can be allocated globally, and a function call is assignment to those variables followed by a jump. What about something like this. It's not very general, but it can probably represent all loops I need. { // loop context, constants const int a = ... ; const int b = ... ; // fallthrough: first call defines arguments // variables are assigned on subsequent function calls. int fun_a1 = ... ; int fun_a2 = ... ; fun: ... // recursive tail call fun_a1 = ...; fun_a2 = ...; const int c = ...; const int d = ...; goto fun; } Corresponds to a named let in a lexical context. (let ((a ...) (b ...)) (letrec fun ((a1 ...) (a2 ...)) (let ((c ...) 
(d ...)) (fun ... ...)))) I think I finally got it. Looks like this is one of the most important "obvious" insights I had in quite a while. It was there all along but had to jump in front of my face somehow. It's really described almost exactly in [1] and [2]. See next post for cleaned up version. [1] http://www.cs.princeton.edu/~appel/papers/ssafun.ps [2] http://wingolog.org/archives/2011/07/12/static-single-assignment-for-functional-programmers Entry: Compiling letrec Date: Thu Oct 20 12:09:27 EDT 2011 /* The following Scheme code illustrates what we want to do: allow for mutually recursive procedures that live in a shared lexical context, and that have private lexical content of their own. All variable bindings are final; no set! in the source language representation. Assignment is used to bind function parameters. (let ((a 0) (b 0)) (letrec ((fun1 (lambda (x y) (let ((c x) (d y)) (fun2 c d)))) (fun2 (lambda (x y) (let ((c x) (d y)) (fun1 c d))))) (fun1 a b))) */ void test(void) { /* Top-level lexical env in which fun is defined. These are visible from the fun1 and fun2 code bodies. */ const int a = 0; const int b = 0; /* Function arguments for all functions defined on this level. Note that on the C level, all variables are visible to all functions defined in this basic block, which is an artifact of the compilation. The original form of the code of course does not have such a lexical structure. For this reason, the names are prefixed with the function name they belong to. */ int fun1_x, fun1_y; int fun2_x, fun2_y; /* First call. */ fun1_x = a; fun1_y = b; goto fun1; /* Function bodies. */ fun1: { // fun's internal lexical environment. const int c = fun1_x; const int d = fun1_y; // Recursive tail call fun2_x = c; fun2_y = d; goto fun2; } fun2: { // fun's internal lexical environment. const int c = fun2_x; const int d = fun2_y; // Recursive tail call fun1_x = c; fun1_y = d; goto fun1; } } Entry: Stackless recursion Date: Thu Oct 20 12:34:21 EDT 2011 So, what distinguishes a "state-machine" letrec form from a more general recursive one? All bindings contain primitive operations. I.e. this is ANF form, but without "app". Each block ends in a tail call, and no continuation stack is necessary for evaluating the bindings because they are all primitive functions or literals. Touble here is nesting though. How to represent "return"? It doesn't seem possible without allowing nesting in app. What is possible though is nesting several letrec and let bindings in the tail position. So essentially, there is no return (ha!). Only tail calls. Is that enough? Entry: Summary Date: Thu Oct 20 13:50:41 EDT 2011 - Representing functions in nested let / letrec form is straightforward if the following limitations are kept in mind: * let can only bind primitive function calls (no call stack) * tail calls need to be calls to functions defined by letrec. - Question: is absence of "return" a problem? At first sight it doesn't seem so, as it can be implemented by a call to a function higher up the lexical ladder. - Question: while this is enough to represent loops, there is still an open problem about how to represent array loads and stores. Entry: Nested let syntax Date: Thu Oct 20 14:03:24 EDT 2011 expr = let | letrec | if | tailcall let = [value binding] expr letrec = [function binding] expr if = var expr expr Entry: Representing I/O as consumer functions. Date: Thu Oct 20 14:11:24 EDT 2011 What about making I/O a kind of function call. 
If we're folding (s,i) -> (s,o) then the recursive call that feeds the state back could be something like (s,o) -> (s,i), a dual "consumer" that takes the output and produces a new input. So instead of just: state1 = ...; state2 = ...; goto tick; we do something like: *oport1 = out1; *oport2 = out2; in1 = *iport1; in2 = *iport2; state1 = ...; state2 = ...; goto tick; where the first block stores the output, and the second call is actually passes input and state arguments. Maybe not.. this doesn't have the correct symmetry. Where does the first input come from? It's probably best i stick to the stream analogy. While in Haskell we might use a pair that is passed around as a parameter, it's probably best to do something simpler but use some virtual representation of a pair, i.e. a pointer or a pointer+index combo. Trouble here is that we can't use the return value, so the fold would have to be a left fold, where the "cons" can be represented as a write to an array. Entry: Assignments suck Date: Thu Oct 20 21:11:49 EDT 2011 Also, there's a problem: the assignment trick doesn't work in some cases, i.e. the "swap" function (define (swap a b) (swap b a)) Is there a way to avoid this, or maybe it can be simply ignored in first iteration and patched up later? A fix is simple: just add more bindings before the assignment.. Entry: Summary Date: Thu Oct 20 21:13:01 EDT 2011 Still quite some floating around, but at least there are two useful advancements. See also [1]. - Syntax ADT for modified ANF/CPS form for state-machine language. Basic idea: if there are only tail-recursive calls, then function calls are set+goto, so loops can be functions. This means that there is no need for special imperative representation which should keep it simpler. - Simple C correspondence. The interesting part is that lexical scope still works for variable bindings, i.e. ANF form can be implemented directly in C if there are only primitive operations (which is the whole idea: everything is inlined for speed). type Opc = String type Val = String data Fun = Fun String deriving (Show, Eq) data Var = Var String deriving (Show, Eq) data DefVar = DefVar Var Prim deriving (Show, Eq) data DefFun = DefFun Fun [Var] Expr deriving (Show, Eq) data Prim = Lit Val | Op Opc [Var] | Ref Var deriving (Show, Eq) data Expr = Let [DefVar] Expr | LetRec [DefFun] Expr | If Var Expr Expr | Call Fun [Var] deriving (Show, Eq) [1] entry://20111020-135041 Entry: Fold or Map Date: Fri Oct 21 09:49:14 EDT 2011 I've been trying to see for a while where the fold belongs. The one I'm using to fold (s -> i -> (s,o)) over [i] to produce [o]. The general form of this would be: (s -> i -> (s, o)) -> (s -> f i -> (s, f o)) where f is a container type, i.e. lists. In case f is infinite we can't have the state output, so we get: -> (f i -> f o) Note that for the processing code, f cannot be infinite and I really do need the state output. So.. what is this lift operation? Entry: Expression form Date: Sat Oct 22 09:42:14 EDT 2011 So, with the insight that letrec is a possible way of representing state machines with lexical data context, it's best to adopt this simpler tree representation. The only remaining problem is representation of output storage. I made a fix distinguishing a variable representation from a reference: type VarName = String type LitRep = String type OpName = String data TypeName = AFloat | AInt | AVoid deriving (Eq) type Order = Int -- Language syntax terms. 
data Var = Var TypeName VarName Order deriving (Eq) data Term = Ref Var -- variable reference | Lit TypeName LitRep -- literal | Op TypeName OpName [Term] -- primitive operation deriving (Eq) On top of this the flowchart language becomes: type FunName = String data Fun = Fun FunName -- function reference data FBind = FBind Fun [Var] Expr -- function definition data Expr = Let Bind Expr | LetRec [FBind] Expr | If Term Expr Expr | Apply Fun [Var] Entry: Compiling to expression form Date: Sat Oct 22 11:47:03 EDT 2011 It looks like this Expr form is better served with a language counterpart that mirrors it exactly, but includes typing information. If this can be pulled off in a generic way, i suppose the rest would be really straightforward. So let's see if it breaks down somewhere. Now it gets confusing because this requires encoding of functions, working around the fact that we don't know the arity. Anywyays, let's proceed. class TML m r => TMLP m r where Simple, but wrong (not monadic): _let :: (r a) -> (r (a -> b)) -> (r b) Is this correct? It compiles.. _let :: (r a) -> (r (a -> m b)) -> (r (m b)) _if :: (r a) -> (r (a -> m b -> m b -> m b)) -> (r (m b)) Now, for recursive function bindings it probably becomes quite a problem to break the cycle.. It needs some kind of multi-argument Y-combinator. Or, it might be possible to construct a cyclic data structure and flatten it later? There's another problem to solve first. There is no return value! This is CPS. Let's try to see if that makes sense, setting return value to () for convenience. _let :: r a -> r (a -> m ()) -> r (m ()) _if :: r a -> r (a -> m () -> m () -> m ()) -> r (m ()) _app :: r (as -> m x) -> r as -> r (m x) It's weird that _app has practically the same type as _let. This can not be the case: some more annotation is necessary. class TMLvar r a class TMLfun r f class TML m r => TMLP x m r where _let :: TMLvar r a => r a -> r (a -> m x) -> r (m x) _if :: TMLvar r a => r a -> r (a -> m x -> m x -> m x) -> r (m x) _app :: (TMLvar r a, TMLfun r f) => r (a -> m x) -> r as -> r (m x) Just string from _let I come to: _letrec :: (TMLfun r fs) => r fs -> r (fs -> m x) -> r (m x) Where fs can be multiple functions. I still don't understand how the cyclic binding would work here. Hmm.. looks like there's an error. This doesn't support forms like: _let .. $ \v -> ... This should be it: class TML m r => TMLP x m r | r -> m where _let :: TMLvar m r a => r a -> (a -> r (m x)) -> r (m x) _if :: TMLvar m r a => (r a -> r (m x) -> r (m x) -> r (m x)) -> r (m x) _app :: TMLvars m r as => r (as -> m x) -> r as -> r (m x) _letrec :: (TMLfuns m r fs) => r fs -> (fs -> r (m x)) -> r (m x) _lambda :: TMLfun m r (as -> m x) => (r as -> r (m x)) -> r (as -> m x) I also needed a _lambda form to actually create function bodies. The _letrec just binds names with a type constraint (that they are functions). After this I have trouble expressing tuples, so maybe (since it's almost the lambda calculus) it's better to use currying anyway? Actually, no. The monad makes it impossible (or hard beyond my current abilities). I wonder... What about limiting the monad to the serialization of the expressions? Maybe the control flow part doesn't actually need it? Can it be expressed without it? It might be best to experiment with this in isolation. 1. Make a lambda language. 2. Make it monadic 3. Restrict functions to first class. Does it help to realize that a representation of a lambda in the first class language needs to be a reference? Prolly not. 
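As a throwaway version of steps 1 and 2 above (make a lambda language, make it monadic), a minimal tagless-final sketch with a host-evaluation instance; the class and names are illustrative and are not the TML/Loop classes.

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- Toy monadic HOAS lambda language in tagless-final style.
class Monad m => Lam m r where
  lam :: (r a -> m (r b)) -> m (r (a -> m b))
  app :: r (a -> m b) -> r a -> m (r b)
  int :: Int -> m (r Int)
  add :: r Int -> r Int -> m (r Int)

-- Host evaluation: values are just wrapped Haskell values.
newtype V a = V { unV :: a }

instance Monad m => Lam m V where
  lam f           = return $ V (\a -> fmap unV (f (V a)))
  app (V f) (V a) = fmap V (f a)
  int             = return . V
  add (V a) (V b) = return $ V (a + b)

-- > do { f <- lam (\x -> add x x); n <- int 21; app f n }
--   evaluates (with m = Identity) to V 42.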
I get in trouble when I want to represent curried monadic functions. Maybe such things make no sense? I don't know.. Let's take a break. I lost the point. I just need lambda and application, and a way to distinguish between function and variable references. Maybe curried functions are just not such a good idea.. So.. Back to non-curried functions. The problem was the representation of the tuple if I recall. The trouble is that _lambda expects a type (r (a,b)) but we're giving it (r a, r b). _lambda :: TMLfun m r (as -> m x) => (r as -> m (r x)) -> r (as -> m x) h1 = _lambda $ \(a,b) -> _ret a That seems to be a problem. Maybe it needs a separate unpack operation? Yep, that was it: _uncons :: r (a, b) -> (r a, r b) _cons :: (r a, r b) -> r (a, b) -- single g1 = _lambda lam where lam l = _app g1 l' where (a, b) = _uncons l l' = _cons (b, a) -- mutual g2 = _lambda lam where lam l = _app g3 l' where (a, b) = _uncons l l' = _cons (b, a) g3 = _lambda $ \l -> _app g2 l Then in terms of these _cons and _uncons it's probably possible to write some generic pack / unpack operations that can dispatch on the type. Ok. Getting closer. Now some type errors. I've added explicit _nil :: r () to avoid overlapping instances. Now I can't type g1 manually. Its inferred type looks quite horrible. g1 = _lambda f where f l = _app g1 l''' where (a, l') = _uncons l (b, _) = _uncons l' l'' = _cons (a, _nil) l''' = _cons (b, l'') *TML> :t g1 g1 :: (TMLvar m r a, TMLP x4 m r, TMLP x3 m r, TMLP x m r, TMLP x5 m r, TMLP x1 m r, TMLP x2 m r) => r ((a, (a, ())) -> m x3) Adding a functional dependency r -> x helped, but I'm not sure if that's not a bit too limited.. Anyway, can always work around this in separate modules. So, trying to implement pack / unpack doesn't seem to work well. It looks like it's too generic.. Things infer but types look weird. It's probably possible but I'm getting too tired for this madness. Summary? I don't know.. I got a bit lost before I could see if it's actually going the right direction. The cool thing is that if this works: if it's possible to use the same generic code to construct target code and host evaluation/test code, then that is a big plus. I'm not 100% convinced yet it will work, or I will be able to actually pull it off. Entry: Overlapping Instances Date: Sun Oct 23 09:27:48 EDT 2011 When using type classes to represent compile-time recursive data structures, one often runs into "Overlapping instances" errors. While overlapping instances in data structure pattern matching are not an issue (order is important!), for type classes it is a problem. I don't exactly understand why, but it seems it's possible to avoid this. I had to use the following extension in my case. What does it mean? IncoherentInstances Entry: One return type? Date: Sun Oct 23 10:05:35 EDT 2011 Maybe it's best to relax that constraint as it makes it difficult to write tests that have different return values. Or maybe I should make the tests more clear. Anwyays.. Too much hassle. Removed for now. Next: write a simple loop. Entry: Encoding of loop Date: Sun Oct 23 10:55:49 EDT 2011 type I = Identity (Value Tint) type B = Identity (Value Tbool) t16 = f where f = _lambda $ \n -> do cond <- lt n (lit 123) n' <- mul n (lit 2) _if cond (_app f n') (_ret n) t17 = _app t16 (lit 2) :: I *Main> t17 Identity (Value 128) Now of course this uses host-language code graphs, meaning that if _app would build a data structure, and _lambda would too, this structure would be cyclic. So what is needed is a way to break the loop. 
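One standard way to break such a cycle is open recursion: write the body against an explicit self parameter and tie the knot separately, which is what a _letrec form can do symbolically by inserting a named function reference. A plain-Haskell sketch of the idea (not the Loop class itself):

-- Body with open recursion: the recursive call goes through the
-- explicitly passed 'self' instead of a host-language cycle.
loopBody :: (Int -> Int) -> (Int -> Int)
loopBody self n
  | n < 123   = self (n * 2)
  | otherwise = n

-- Tying the knot in the host language; a code generator would
-- instead bind 'self' to a named function reference.
tie :: ((a -> b) -> (a -> b)) -> (a -> b)
tie f = let self = f self in self

-- > tie loopBody 2 == 128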
This needs a Y combinator which can probably be disguised inside the letrec binding form by inserting references. Off the top of my head I come to this:

t18 = _letrec
        (\f -> _lambda $ \n -> do
            cond <- lt n (lit 123)
            n'   <- mul n (lit 2)
            _if cond (_app f n') (_ret n))
        (\f -> _app f (lit 2))

It separates the definition of the functions with open recursion from the creation of a context in which the functions can be used.

_letrec :: (LoopPrim t, LoopFun m r f) =>
     (r f -> r f)       -- body with open recursion
  -> (r f -> m (r t))   -- scope with functions defined
  -> m (r t)

Cool. That was the next milestone. Looks like this part is done except for handling tuples better, both for arguments and parallel letrec bindings. I'm curious though how this will interact with the node naming monad..

Entry: Handling tuples
Date: Sun Oct 23 14:29:08 EDT 2011

It seems to be a pain to do this because a separate type class is necessary.

Entry: Primitive type stuff
Date: Sun Oct 23 15:37:44 EDT 2011

class TMLprim t where
  primType :: t -> TypeName
instance TMLprim Tbool  where primType _ = ABool
instance TMLprim Tint   where primType _ = AInt
instance TMLprim Tfloat where primType _ = AFloat

class TypeOf t where
  typeOf :: t -> TypeName
instance TMLprim t => TypeOf (Code t) where
  typeOf (Code t) = primType t

Of course that last instance won't work, but this does:

class TypeOf t where
  typeOf :: t -> TypeName
instance TMLprim t => TypeOf (Code t) where
  typeOf _ = primType (undefined :: t)

Entry: _if
Date: Sun Oct 23 15:51:52 EDT 2011

Trying to implement _if I'm wondering if I'm not doing something wrong here. It seems that the computation is split. Does this correspond to forking the variable generation? Something isn't right with my intuition here.. Or my intuition's right that it's wrong :) Maybe I should start with writing a different returnN and let it build Expr forms instead of Env? Env can be fully captured with the Let form only. LetRec, If, App then do flow control. Does it matter that If only references procedure calls? Maybe not a good idea since it's quite OK to represent If in C. This needs a zipper. At any point it needs to be clear what the lexical context is, and for that we need to turn the Expr inside out. The monad is then the zipper + variable naming state. ( I already had a zipper: the Env was "reversed" before. ) So there are 2 parts here: connecting up the monad data structure, and evaluating it using a zipper. It probably takes a form like this:

do c' <- nameVar c
   y' <- nameBlock y
   n' <- nameBlock n
   return $ If c' y' n'

I'm not sure if nameBlock actually needs to do naming.. It might just pass the expression, but do some other bookkeeping on the side, like updating the variable generation store. And nameVar might have been done earlier also.. Actually, it's probably more like this:

_if c y n = do
  y' <- y
  n' <- n
  return $ If c (return y') (return n')

The two bind forms ensure that the CPS monad has "passed through" the subexpressions. Actually.. why do I think that If takes a monadic argument? It doesn't. Once in syntax land (the If constructor) it's really just values:

_if c y n = do
  y' <- y
  n' <- n
  return $ If c y' n'

Actually, that's the case after taking into account the wrapper that implements the phantom type:

_if (Code c) y n = do
  (Code y') <- y
  (Code n') <- n
  return $ Code $ If c y' n'

Entry: Zipper for Expr?
Date: Mon Oct 24 19:15:38 EDT 2011

Maybe that's not even necessary.
There is a direct correspondence between the functions from the type class rep and the constructors so we don't really need to turn trees inside out, or do we? Yes we do. At least for "let". Anyways, a more important change is to abstract the monad in TML, so that we can implement TML in terms of Term only (just an environemnt) but also Expr. Or maybe just switch completely to Expr? ( It's crazy how this is so little code, but so much work to write it! Because of the seemingly unlimited compositional freedom in haskell of course things that look almost the same are not quite. ) Entry: 2 levels: dataflow and control flow Date: Tue Oct 25 12:19:55 EDT 2011 I wanted to make the "makeVar" operation generic so we can have two representations: one for Env and one for Expr. Though it seems that I have to bind Code representation to Env directly, so it might make more sense to just get rid of Env and go for Expr directly. It looks like 2 changes are needed: - Replace Env with Expr, working with Let constructors. - Make the variable naming state more explicit. Probably easiest to do the latter first. Indeed was straightforward, now the next one. Does it need to be a zipper? If I recall the MetaOCaml code I'm trying to recreate did not have a zipper structure. Why is that? It did some CPS magic to make sure that the eventual return value of the expression had the let form, but the current "hole" was passed the reference. How did that work? This is what needs to be changed: makeVar :: TypeOf (Code t) => (Code t) -> MCode r (Code t) makeVar term@(Code sterm) = SC c where c (State bs n) k = k (State bs' $ n+1) (Code $ Ref $ var) where var = Var typ v 0 v = regPrefix ++ (show $ n) bs' = (Bind var sterm) : bs typ = typeOf term Let's see. The real point is the following: the stuff above works well descending into an expression. The context is indeed represented as a zipper. What needs to be done is to find out when we get to the inner node. At that point we need to provide a return value for the enclosing contuation, and at that point we can invert the zipper. These inner points are function calls, or maybe a "return" which is a tail call to the toplevel continuation. So the idea is to simply keep the makeVar code above, but extend it with a zipper. But first, find out if the original paper[1] uses a zipper. No, it does not. But it manipulates the continuations in a way that looks like delimited continuations, and a dcont has a call stack that takes the form of a zipper data structure. [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.ps Entry: Zipper for Exp? Date: Tue Oct 25 13:18:04 EDT 2011 See [1] for a derivation. Doing this straight from the polynomial isn't really possible since it looses too much information. Though it's useful for verifying consistency. One question that bubbles up though is how do I know what the current toplevel expression is? Suppose we have: (Let a 123 (Let b 123 (Let c 123 (Call f a b c)))) Here the continuation of the Call node needs to be made so that the inverted Let node can be reconstructed. The solution of course is to insert a continuation that does this. So we need to modify makeVar after all. 
makeVar :: TypeOf (Code t) => (Code t) -> MCode r (Code t) makeVar term@(Code sterm) = SC c where c (State bs n) k = k (State bs' $ n+1) (Code $ Ref $ var) where var = Var typ v 0 v = regPrefix ++ (show $ n) bs' = (Bind var sterm) : bs typ = typeOf term So instead of just calling k directly as in the definition above, we install a new k that performs the inversion. It seems simpler to then construct the data structure on an as-needed basis, defining these continuations. [1] entry://../compsci/20111025-132211 Entry: Smart Generators Date: Tue Oct 25 13:57:19 EDT 2011 [1] http://smart-generators.org/do/view/Main/WebHome Entry: TML Code in terms of Expr Date: Tue Oct 25 15:30:23 EDT 2011 This needs a cleanup of Expr. I currently do not have a way to represent a variable reference as a term of Expr. Then I'm not getting the CPS version of makeVar. Time to take a sneak peak[1]. It defines the following form: let retN a = fun s k -> ..)>. The trick is that this manipulates the return value of the final CPS expression. With this change I'm having trouble trying to separate Expr and Term, so let's merge them. These work: makeVar :: TypeOf (Code t) => (Code t) -> MCode (Code t) makeVar term@(Code sterm) = SC c where c (State n) k = Let var sterm (k state' ref) where state' = State $ n+1 ref = Code $ Ref $ var var = Var typ nam 0 typ = typeOf term nam = regPrefix ++ (show $ n) runMCode :: MCode rt -> (State -> rt -> Term) -> Term runMCode (SC c) = c (State 0) termExpr m = runMCode m k where k _ (Code (Ref v)) = Ret v -- toplevel continuation l = lit 123 l :: TML m r => r Tint > termExpr $ add l l Let (Var i "t0" 0) (Op i "add" [Lit i "123", Lit i "123"]) (Ret (Var i "t0" 0)) However, the termFun compiler seems to need some tinkering. I don't see immediately how it is different. It seems most of it can be reused though. The result will be a FunDef node. Maybe it should be a Lambda node. [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.ps Entry: (->) (Code Tint) as monad Date: Tue Oct 25 22:13:54 EDT 2011 I see that indeed ((->) t) is a monad, and it matches 2 instances. I don't see why this is a problem only now; this same code worked before. Overlapping instances for CompileFun s0 ((->) (Code Tint)) (Code Tint -> Code Tfloat -> Code Tfloat -> MCode (Code Tfloat, Code Tfloat)) arising from a use of `termFun' Matching instances: instance (CompileVar r, CompileData r s, CompileFun s m f) => CompileFun s m (r -> f) -- Defined at Compile.hs:(37,10)-(40,35) instance (CompileData r s, Monad m) => CompileFun s m (m r) -- Defined at Compile.hs:48:10-59 In the expression: termFun test3' In an equation for `it': it = termFun test3' So the culprit is this one: instance (CompileData r s, Monad m) => CompileFun s m (m r) where compileFun _ mt = ([], liftM compileData mt) I tried some shuffling but didn't get anywhere. I've added IncoherentInstances and that seems to shut it up. See [1] for more info. A little more tinkering made termFun work: termFun :: CompileFun Term (SC TML.State Term) f => f -> Term termFun f = Lambda args body where (targs, mresult) = compileFun varnames f -- Lambda arguments are variable terms generated from names in varnames args = vars targs -- Body is a nested expression that ends in a Ret expression. body = runMCode mresult k k _ cs = Ret $ vars cs -- Both input and output terms are expected to be variable references. 
unRef (Ref v) = v
vars = map unRef

test a b = do
  s <- add a b
  p <- mul a b
  return (s, p)

test' = test :: Code Tfloat -> Code Tfloat -> MCode (Code Tfloat, Code Tfloat)
t = termFun test'

Lambda [Var f "i0" 0, Var f "i1" 0]
  (Let (Var f "t0" 0) (Op f "add" [Ref (Var f "i0" 0), Ref (Var f "i1" 0)])
  (Let (Var f "t1" 0) (Op f "mul" [Ref (Var f "i0" 0), Ref (Var f "i1" 0)])
  (Ret [Var f "t0" 0, Var f "t1" 0])))

This is amazing, isn't it? f -> Term ;) Another milestone. The rest seems a bit more straightforward.

[1] http://www.haskell.org/ghc/docs/6.6/html/users_guide/type-extensions.html

Entry: Fixing some of the overlapping instances cruft
Date: Wed Oct 26 16:40:01 EDT 2011

The trouble is with two ways of specifying primitives:
- CompileVar: for Compile.hs to implement primitive data representation.
- TMLprim / LoopVar / LoopVars: for limiting inputs to certain syntax elements of Loop to primitive data.
These should be merged somehow, or derived from the same root. Let's just delete LoopVar / LoopFun and start over. The problem seems to be that for termFun to work I need to have an instance CompileVar (Code t). But the language's constraints are about collections of (r t). Something isn't quite right there.. This should go better:

instance (TMLprim t) => TypeOf (Code t)
instance (TMLprim t, TypeOf (Code t)) => CompileVar (Code t)

That cleared up some errors, though now there is the problem of representing multiple-arity functions. termFun uses both currying and tupling.

Entry: Representing functions
Date: Thu Oct 27 09:13:44 EDT 2011

The main problem with doing it the standard way is that the "r" and the "m" seem to be in the way: The source type is one of these:

r a -> r b -> r c -> m (r t)
(r a, (r b, r c)) -> m (r t)

which is what compileFun accepts and turns into ([i], m [o]), which is further converted by termFun into a Lambda data structure. The question then seems to be, how to perform the conversion to:

(r ((a, (b,c)) -> m t))

or any other type that can be understood by _app, the inverse of _lambda. The important parts seem to be:
- _app and _lambda are inverses: as long as the representation is understood by both we're fine.
- for proper embedding we really need the types above and not the "wrapped tuple" approach where _cons and _uncons are used in the meta language. That is way too inconvenient.
So... How to express this? Essentially this is an abstraction of the unpack / pack that already happens in syntax elements like:

_if (Code (Ref v)) mx my = do
  (Code x) <- mx
  (Code y) <- my
  return $ Code $ If v x y

in a way that is very similar to what happens in compileFun. To reiterate: it really doesn't matter what the form of the representation is, as long as there is some form of type checking that doesn't allow operation of _app and _lambda on function types that do not match. Note that for a Code embedding, the type parameter (Code (a -> m b)) is a phantom type and only used as a constraint. Plan: first try to make it work for explicit in-language tupling. Then see if this can somehow be automated using a recursive instance definition.

Could not deduce (TMLprim a) arising from a use of `termFun'
from the context (TMLprim t, LoopArgs a)
  bound by the type signature for
    _lambda :: (TMLprim t, LoopArgs a) => (Code a -> MCode' (Code t)) -> Code (a -> MCode' t)

What you want is (CompileVar (Code a)) not necessarily (TMLprim a). Looks like this is the (r a, r b) vs r (a, b) issue again.
The former is handled by the recursive instance of CompileVar defined in Compile.hs, but the latter is new. Maybe we need to represent "primitive data tree" on the TML level and derive CompileVar from that? Let's take the error message literally: termFun knows only one way to compile (Code a) and that is when (TMLprim a). How can we make termFun smarter? I.e how to define CompileVar for (Code (LoopArgs a))? I don't immediately see how to do this in terms of the data interface of Compile. The unpacking might need to happen somewhere else. Hmm.. no good overview. Let's fork and try to get rid of the _cons _uncons stuff. It's just phantom types so maybe it's possible to do it differently. This is hard... Why? I'm making some assumptions that don't hold up. I'm trying this approach: make functions and function reps abstract. Solve the representation of functions in a separate class, then use this class' ops in the Loop language. So I have something compiling, but that usually doesn't mean much with classes when they are not used. Basically, I've separated LoopFun into a separate class that defines application and abstraction for a set of types. This should be enough to make it work. class LoopFun m r t as ras where _lambda :: (ras -> m (r t)) -> r (as -> m t) _app :: r (as -> m t) -> (ras -> m (r t)) I have to stop for a while.. Conclusion: because there are going to be an infinite number of possible function types for which this "unpacking" has to be done, it's probably indeed better to put this in a class. TODO: how to implement that class correctly? Does the recursion need to be implemented for each representation r separately? Entry: Fixing type infrerence Date: Fri Oct 28 09:35:28 EDT 2011 No instance for (LoopFun Identity Value Tint Tint (r1 t1)) t17 = _app t16 (lit 2) :: I Maybe the class def is a bit too general. Here with fundeps removed as they seem to not work. class LoopFun m r t as ras where _lambda :: (ras -> m (r t)) -> r (as -> m t) _app :: r (as -> m t) -> (ras -> m (r t)) What I know is that ras has the forms: () (r a, ()) (r a, (r b, ())) Which brings me straight back again to implementing tuples, which I was trying to avoid because I couldn't figure it out. Going in circles. At this point however it does seem to be better to just focus on argument lists. The functions are unary, so really the only thing that changes is those lists. Now it's time to focus. This is wrong: _app :: LoopArgs r as ras => r (as -> m t) -> ras -> m (r t) _app (Value f) (Value v) = liftM Value $ f v /home/tom/meta/dspm/Loop.hs:152:19: Could not deduce (ras ~ Value as) from the context (LoopArgs Value as ras) bound by the type signature for _app :: LoopArgs Value as ras => Value (as -> Identity t) -> ras -> Identity (Value t) at /home/tom/meta/dspm/Loop.hs:(152,3)-(153,21) `ras' is a rigid type variable bound by the type signature for _app :: LoopArgs Value as ras => Value (as -> Identity t) -> ras -> Identity (Value t) at /home/tom/meta/dspm/Loop.hs:152:3 In the pattern: Value v In an equation for `_app': _app (Value f) (Value v) = liftM Value $ f v In the instance declaration for `Loop Identity Value' Why? f :: as -> m t but ras van't be unified with Value as This is because we're not usng the LoopArgs op that does the conversion. So we need to make sure that ras becomes r as using argsPack :: ras -> r as What about this: _app f v = app f (argsPack v) where app (Value f) (Value as) = liftM Value $ f as It seems to compile. But the type is probably wrong (flipped). I need some terminology. 
Let's use: rf :: r (as -> m t) representation of function hf :: ras -> m (r t) metalanguage HOS embedding Funny how getting this commutation right is so confusing, but really the whole point as it links 2 worlds: syntax objects and their embedding as functions in a host language. argsPack :: ras -> r as argsUnpack :: r as -> ras In this way, _app will always act on (ra, (rb, ...)) == ras and not on r (a,(b,...)) == r as so the types are correct: _lambda :: LoopArgs r as ras => (ras -> m (r t)) -> r (as -> m t) _app :: LoopArgs r as ras => r (as -> m t) -> ras -> m (r t) So the following typechecks, _lambda f = Value rf where rf as = do Value as <- f $ argsUnpack $ Value as return as but I don't get why the (argsUnpack . Value) is necessary. It seems to be to make sure the representation type of functions is r (as -> m t) and not r (ras -> r (m t)). This seems rather arbitrary, though it makes sense. Next problems: - how to map ras to a list of variables? This is needed in _app / App for Code. - test if Value works now: make some recursive classes. Entry: LoopArgs Date: Fri Oct 28 11:34:27 EDT 2011 {- Encodes the relation between a metalanguage tree of represented values and a representation of a tree value. I.e. (r a, (r b, r c)) <-> r (a,(b,c)) -} class LoopArgs r as ras | ras -> r, ras -> as where argsPack :: ras -> r as argsUnpack :: r as -> ras TODO: write the recursive instance. Blank stare.. Something doesn't add up... Maybe it's too simple? What I need is a way to distribute a simple constructor over tuples. That really can't be too hard, but to do it in a generic way the constructor and destructor need to be wrapped. Maybe this is the basic idea: {- Constructor that commutes with binary tupling operator. -} class Pair c where cons :: (c a, c b) -> c (a, b) uncons :: c (a, b) -> (c a, c b) How bout this as final solution? {- Encodes the relation between a metalanguage tree of represented values and a representation of a tree value. I.e. (r a, (r b, r c)) <-> r (a,(b,c)) -} class Args r as ras | ras -> r, ras -> as where pack :: ras -> r as unpack :: r as -> ras {- Constructor supporting commute with binary tupling. -} class Pair c where cons :: (c a, c b) -> c (a, b) uncons :: c (a, b) -> (c a, c b) instance (Pair r, Args r as r_as, Args r bs r_bs) => Args r (as,bs) (r_as, r_bs) where pack (r_as, r_bs) = cons (pack r_as, pack r_bs) unpack r_as_bs = (unpack r_as, unpack r_bs) where (r_as, r_bs) = uncons r_as_bs instance (Pair r, TMLprim t) => Args r t (r t) where pack = id unpack = id instance Pair Value where cons (Value a, Value b) = Value (a,b) uncons (Value (a, b)) = (Value a, Value b) Then trouble: f1 :: (Args r as ras, Loop m r) => r (as -> m t) f1' :: Value (Tint -> Identity Tint) f1' = f1 No instance for (Args Value Tint ras0) arising from a use of `f1' Possible fix: add an instance declaration for (Args Value Tint ras0) In the expression: f1 In an equation for `f1'': f1' = f1 Failed, modules loaded: StateCont, Term, TML, Compile, Loop. It seems that the ras parameter is not fixed when specifying the type of f1'. Indeed, it does not appear in r (as -> m t). So this needs some functional dependency maybe? This is called an "ambiguous type". See [1] What I want to express is that r and as together imply ras. How to do that? What I need to do is to try to really understand functional dependencies, and see how (or why not) I can have multi-arg fundeps. 
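For reference, a toy version of the multi-parameter functional dependency in question (not the real Args class): the dependency "r as -> ras" states that the representation and the argument type together determine the unpacked tuple type, which removes the ambiguity seen in f1'. The recursive pair instance needs UndecidableInstances because ras is only determined through the context's own fundeps.

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, FlexibleContexts, UndecidableInstances #-}

newtype Value a = Value a

-- Toy Args relation: r and as together determine ras.
class Args r as ras | r as -> ras where
  pack   :: ras -> r as
  unpack :: r as -> ras

-- Base case: a primitive argument is represented directly.
instance Args Value Int (Value Int) where
  pack   = id
  unpack = id

-- Pairs: (Value a, Value b) <-> Value (a, b).
instance (Args Value a ra, Args Value b rb) =>
         Args Value (a, b) (ra, rb) where
  pack (ra, rb) = case (pack ra, pack rb) of
                    (Value a, Value b) -> Value (a, b)
  unpack (Value (a, b)) = (unpack (Value a), unpack (Value b))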
Anyways, in the mean time I removed the fundeps from Args, and used an explicit type signature: _app rf ras = app rf (pack ras) where app :: Code (as -> MCode t) -> Code as -> MCode (Code t) app (Code (FRef f)) (Code as) = return $ Code $ App f [] -- FIXME: compile variables After reading part of [1]: multi-arg deps are simple: a b -> c Does that solve it? Almost.. I had to add some type annotation. Especially the "forall as ras t" seems to be significant. To introduce type scoping? I thought it was implicit. _lambda = lambda where lambda :: forall as ras t. Args Value as ras => (ras -> Identity (Value t)) -> Value (as -> Identity t) lambda f = Value rf where rf as = do Value t <- f $ unpack $ Value (as :: as) return t [1] http://www.reddit.com/r/haskell/comments/7oyg5/ask_haskellreddit_can_someone_explain_the_use_of/ Entry: Pfff... Date: Fri Oct 28 15:55:25 EDT 2011 This local reasoning isn't getting me very far. Suddenly I run into the need for a "Cons" in the Term language. Or maybe it is. Makes sense at least since it solves another problem: how to get to a list of Var types for App? Like this: instance Pair Code where cons (Code a, Code b) = Code $ Cons a b uncons (Code (Cons a b)) = (Code a, Code b) instance Loop MCode' Code where _app rf ras = app rf (pack ras) where app :: Code (as -> MCode t) -> Code as -> MCode (Code t) app (Code (FRef f)) (Code as) = return $ Code $ App f (toList as) where -- Convert the "Cons" structure to a flat list of variables. toList (Cons a b) = toList a ++ toList b toList (Ref v) = [v] Next: Lambda. The whole point is that termFun will take functions like this: (Code a, Code b) -> MCode (Code c, Code d) so why can't I feed a host embedding straight into termFun? At this point we changed completely from r (a,b) to (r a, r b). It seems to be that this is just what is needed: _lambda f = Code $ termFun f With some extra instances: instance Args Code as ras => CompileData ras Term instance Args Code as ras => CompileVar ras Entry: Term = (type-annotated) Scheme? Date: Fri Oct 28 16:47:19 EDT 2011 It seems simpler to continue to relax the representation Term and rely on the type system to actually limit the possible combinations. Especially the use of variable references instead of terms in App seem to be a problem for testing.. Maybe it's even possible to first fully encode Scheme and then build a restricted interface on top of this? Anyway we can find that out later. The only reason I raise the issue is because the following doesn't work: t7 = _app (f5 :: C1) $ lit 1 Because there is no direct application of Lambda terms, only function references. So let's try Lambda separately. This one asks for CompileData instance. -- test Lambda t9 = unCode (f5 :: C1) That seems straightforward: instance Args Code as ras => CompileData ras Term where compileData r = ts where (Code t) = pack r vs = codeList t ts = map Ref vs Though for CompileVar I don't know. My first attempt doesn't seem to work, probably because there is no terminating instance.. instance Args Code as ras => CompileVar ras where compileVar r = compileVar t where (Code t) = pack r Next: find out why this pops up here. This is really for base types, not for composites. Entry: Simply typed functional language? Date: Fri Oct 28 17:02:37 EDT 2011 Is it useful to first implement a simply typed language with tuple arguments (which is easier) and then use type restriction on top of that to convert it into a first order language? It might even be useful to have a compiler for such a language. 
I have a simple GC written in C already..

Entry: Close, no cigar
Date: Fri Oct 28 18:39:23 EDT 2011

I'm closer to getting it to work. I'm getting a bit fed up though. The reason this takes so long is probably just my lack of understanding (i.e. reading error messages is still very hard for me). However, after yesterday's Sussman talk I'm wondering if it isn't time to switch back to Scheme for a while, with the insights gained, and work test-driven instead of type-driven..

Entry: CompileVar
Date: Fri Oct 28 18:40:42 EDT 2011

Probably need to rethink Compile to work with Loop / Lambda. Trouble is that this will mess up the Sys existential too.. Real trouble is that I lost oversight. This (CompileVar ras) constraint shouldn't actually be there. Somehow it doesn't know that this is a tuple and needs to be unpacked..

instance (CompileVar r, CompileVar rs) => CompileVar (r,rs) where
  compileVar ns = ((t1,t2), ns'') where
    (t1, ns')  = compileVar ns
    (t2, ns'') = compileVar ns'

What is the problem I'm trying to solve actually? I have something which I know is a function from nested tuples of primitives to nested tuples of primitives, monadically wrapped. And I want to generate a bunch of variables in the right structure, apply them to the function and get the result. This, by itself, isn't really rocket science. The only thing that complicates it is to map it into the typeclass relational setting which needs a certain dependency structure. So, essentially, Args and CompileVar / CompileData are the same thing, no? I've added CompileVar as a requirement to Args. Now it seems to compile. The trouble of course is that now I also need CompileVar (Value Tint).

instance CompileVar (Value t) where
  compileVar = undefined

So I'm getting closer. Though now I run into problems with embedding. funExpr adds a Ret, but an embedded function will already do this. Need to distinguish between ordinary functions, and those that embed the Loop control structure.

Entry: Primitive data
Date: Fri Oct 28 20:17:10 EDT 2011

Let's try to flesh this out a bit. It's getting too complicated and can probably be made a bit simpler. The root of it all is class (Prim t) which groups all primitive types. On top of primitive types there are representations, such as

data Code t
data Value t

which can parameterize the TML and Loop languages, and tupling, such as (a, b). The tupling needs to commute with the representation to make embedding of lambda abstraction with tuple arguments simpler (possible?). Compilation needs variable name generation and type projection (for embedding in a single algebraic data type). To simplify the current approach, I could try to get rid of the list gathering, but assume that the underlying expression type can represent binary trees.

Entry: Strategy
Date: Fri Oct 28 20:28:56 EDT 2011

The approach is too alien to me to fit into my head. The abstraction doesn't work yet in that I don't know yet how to pull it off, so I need to keep track of this whole mountain of details, and that simply doesn't work. It seems I really need to stick to a local, incremental approach. Try not to break anything, and once it works, clean it up.

Entry: Pure functions vs. embedded syntax
Date: Fri Oct 28 20:46:19 EDT 2011

Hmm... looks like we're at a dead end. termFun is made for compiling pure functions, not HOS functions. It looks like overall it's a better strategy to stick to compilation of HOS functions only, and write wrappers for pure functions. This means the whole Compile.hs needs to be adjusted for this too.
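A minimal sketch of the wrapper direction (the helper names wrapPure and wrapPure2 are hypothetical): lift a pure function on code values into the monadic HOS shape, so that only HOS functions need a compilation path.

-- Hypothetical helpers: turn pure functions over representations into
-- the monadic (HOS) form that the compiler works on.
wrapPure :: Monad m => (r a -> r b) -> (r a -> m (r b))
wrapPure f = return . f

wrapPure2 :: Monad m => (r a -> r b -> r c) -> (r a -> r b -> m (r c))
wrapPure2 f a b = return (f a b)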
Let's dupe it and make a variant. ( Tried, gave up.. Can't do it now. ) Oh man, this isn't easy... So what about doing this again. Maybe it's even simpler than the Compile.hs approach. Really, we only need to provide an input, apply it to the body of the function and we have an expression. Take it out of the monad and wrap it. In order for this to work, we probably need to stay in the monad. _lambda can't be a pure function, maybe that's the trick? WRONG: _lambda :: Args r as ras => (ras -> m (r t)) -> r (as -> m t) RIGHT: _lambda :: Args r as ras => (ras -> m (r t)) -> m (r (as -> m t)) This indeed seems to be necessary. So, doing the eval purely for the type checking gives this: _lambda = lambda where lambda :: forall as ras t. Args Code as ras => (ras -> MCode (Code t)) -> MCode (Code (as -> MCode t)) lambda f = term where inTermTree :: ras inVarList :: [Var] inTermTree = undefined -- FIXME inVarList = undefined -- FIXME: from inTermTree term = do (Code bodyTerm) <- f inTermTree return $ Code $ Lambda inVarList bodyTerm Now I have to fill in that "undefined" with a bunch of variables that have the right form, and translate them to a representation as a list. One remark: the generated variable names need to come from the monad, to avoid any possibility of capture, and also to make later implementation simpler; meaning we can use global variables for implementation without fear of clashing. Maybe that's actually a better way of programming. Write down the types, use undefined for everything, and then fill in the blanks. Forget about name generation for now, this can easily be added later. Let's focus on the construction of proper variable trees. class TreeVars ras where treeVars :: ras -> ras treeVarNames :: ras -> [Var] instance Args r as ras => TreeVars ras where treeVars = undefined treeVarNames = undefined instance Loop MCode' Code where _lambda = lambda where lambda :: forall as ras t. Args Code as ras => (ras -> MCode (Code t)) -> MCode (Code (as -> MCode t)) lambda f = term where inTermTree = treeVars (undefined :: ras) inVarList = treeVarNames inTermTree term = do (Code bodyTerm) <- f inTermTree return $ Code $ Lambda inVarList bodyTerm Now, how to fill in the blanks? The fact that there is no base case for Args is probably still not a good thing.. Little tweaking, with proper varList interface: class TreeVars ras where treeSize :: ras -> Int treeVars :: [Var] -> ras -> ras instance Args r as ras => TreeVars ras where treeVars = undefined treeSize = undefined instance Loop MCode' Code where _lambda = lambda where lambda :: forall as ras t. Args Code as ras => (ras -> MCode (Code t)) -> MCode (Code (as -> MCode t)) lambda f = term where n = treeSize (undefined :: ras) term = do inVarList <- varList n let inTermTree = treeVars inVarList (undefined :: ras) in do (Code bodyTerm) <- f inTermTree return $ Code $ Lambda inVarList bodyTerm So, what about putting this TreeVars straight into Args interface? This means a little leakage to the Value implementation, but would make things more straightforward. Or.. can it go in the type signature for lambda above? If it goes into Args, then Args needs an extra type, which is the compilation result type. This is a bit ad-hoc.. Trouble is that the ability to call these methods on ras really needs to be encoded in the context of _lambda, which means Args. This bleeds through to the Loop class, which needs a mention of the type it will eventually be compiled to. Let's just do it like that. Will probably solve the problem. 
Indeed: termCompile m = runMCode m k where k _ (Code t) = t f5 = _lambda $ \a -> _ret a type C1 = MCode (Code (Tint -> MCode Tint)) > termCompile (f5 :: C1) Lambda [Var i "t0" 0] (Ret [Var i "t0" 0]) But this one's no good: f6 = _lambda $ \a -> do square <- mul a a _ret square > termCompile (f6 :: C1) Let (Var i "t1" 0) (Op i "mul" [Ref (Var i "t0" 0),Ref (Var i "t0" 0)]) (Lambda [Var i "t0" 0] (Ret [Var i "t1" 0])) The Lambda needs to insert itself higher up. Needs some monad support, like makeVar. Entry: Inserting Lambda binding Date: Sat Oct 29 09:25:38 EDT 2011 I'm pretty sure that this is the way to insert an outer binding: insertContext :: (Term -> Term) -> Code t -> MCode (Code t) insertContext outer inner = SC c where c s k = outer (k s inner) but I'm confused as how to use it. Esentially (insertContext context) has the same type as return, so it should be possible to use it in place of return. I get complaints about return type: Expected type: MCode (Code (as -> MCode t)) Actual type: MCode (Code t) - in do - (Code bodyTerm) <- f argTerms - return $ Code $ Lambda argVars bodyTerm + binder = Lambda argVars + body = f argTerms + in do + body' <- body + insertContext binder body' + + Maybe need to unpack/pack to change the type? Now this has still the same problem because of the order of the bind operations: _lambda = lambda where lambda :: forall as ras t. Args Term Code as ras => (ras -> MCode (Code t)) -> MCode (Code (as -> MCode t)) lambda f = term where n = treeSize (undefined :: ras) term :: MCode (Code (as -> MCode t)) term = do varNames <- varList n let (argTerms, []) = treeTerms varNames (undefined :: ras) argVars = map (\(Ref v) -> v) (treeVars argTerms) binder = Lambda argVars body = f argTerms in do body' <- body -- Need to unpack, repack the Code wrapping to change the -- phantom type from (Code t) to (Code (as -> MCode t)). (Code inner) <- insertContext binder body' return $ Code $ inner Maybe insert context needs to be just a side effect? That seems to do it: -- insertContext :: (Term -> Term) -> MCode () insertContext outer = SC c where c s k = outer (k s ()) instance Loop Term MCode' Code where _lambda = lambda where lambda :: forall as ras t. Args Term Code as ras => (ras -> MCode (Code t)) -> MCode (Code (as -> MCode t)) lambda f = term where n = treeSize (undefined :: ras) term :: MCode (Code (as -> MCode t)) term = do varNames <- varList n let (argTerms, []) = treeTerms varNames (undefined :: ras) argVars = map (\(Ref v) -> v) (treeVars argTerms) binder = Lambda argVars body = f argTerms in do insertContext binder -- Need to unpack, repack the Code wrapping to change the -- phantom type from (Code t) to (Code (as -> MCode t)). (Code body') <- body return $ Code $ body' f6 = _lambda $ \a -> do square <- mul a a _ret square t10 = termCompile (f6 :: C1) > t10 Lambda [Var i "t0" 0] (Let (Var i "t1" 0) (Op i "mul" [Ref (Var i "t0" 0), Ref (Var i "t0" 0)]) (Ret [Var i "t1" 0])) Entry: Next Date: Sat Oct 29 10:21:03 EDT 2011 Looking good. Next: - multi-arg functions don't infer correctly. - _letrec for Code Entry: Infer multi-arg functions Date: Sat Oct 29 10:41:53 EDT 2011 Inference doesn't work properly, but manual annoation seems to work: f2 :: (StructPair (,) r, StructPrim s (r a), Monad m, Loop s m r) => m (r ((a,a) -> m t)) f2 = _lambda $ \(a,b) -> do f <- f2 _app f (b,a) Can we compile it? 
One of the (undefined :: ras) variables got evaluated in the tuple pattern matching, so I changed the following implementation to ignore the inputs, and reconstruct new undefined values: hunk ./dspm/Loop.hs 112 - structSize (r_as, r_bs) = structSize r_as + structSize r_bs [_$_] + structSize _ = structSize (undefined :: r_as) + + structSize (undefined :: r_bs) hunk ./dspm/Loop.hs 115 - structVariables ns (r_as, r_bs) = ((a,b), ns'') where - (a, ns') = structVariables ns r_as - (b, ns'') = structVariables ns' r_bs + structVariables ns _ = ((a,b), ns'') where + (a, ns') = structVariables ns (undefined :: r_as) + (b, ns'') = structVariables ns' (undefined :: r_bs) Next problem is that without letrec, this produces an infinite data type: t11 = termCompile (f2 :: C2) So let's try tupling first. Seems to work. I did have to annotate f7 to make it type check. f7 :: (StructPair (,) r, StructPrim s (r a), Monad m, Loop s m r) => m (r ((a,a) -> m a)) f7 = _lambda $ \(a,b) -> _ret a type C2 = MCode (Code ((Tint, Tint) -> MCode Tint)) t13 = termCompile (f7 :: C2) > t13 Lambda [Var i "t0" 0,Var i "t1" 0] (Ret [Var i "t0" 0]) I noticed that pack/unpack and thus cons/uncons are not used for Code instance, but this seems not correct: _app probably needs it. Check this. Actually, _app uses structPack while it can probably use structCompile. ( Are these cons/uncons and structXXX actually special cases of the same interface? They do seem to behave in a similar way.. ) So _app for Code indeed works without _cons / _uncons. For the test to run it makes sense to have more general application. I'm also changing Term in this way: - | App Fun [Var] -- function application. - | If Var Term Term -- conditional branching - | Ret [Var] -- invoke toplevel continuation + | App Term [Term] -- function application. + | If Term Term Term -- conditional branching + | Ret [Term] -- invoke toplevel continuation Still getting botched lambda insertion.. t14 = termCompile $ do f <- (f7 :: C2) l1 <- insertVar $ lit 1 l2 <- insertVar $ lit 2 _app f $ (l1, l2) > t14 Lambda [Var i "t0" 0, Var i "t1" 0] (Let (Var i "t2" 0) (Lit i "1") (Let (Var i "t3" 0) (Lit i "2") (App (Ret [Var i "t0" 0]) [Ref (Var i "t2" 0), Ref (Var i "t3" 0)]))) Entry: Next Date: Sat Oct 29 12:01:15 EDT 2011 - fix toplevel lambda problem - letrec Entry: Lambda problem Date: Sat Oct 29 12:38:18 EDT 2011 It seems that I'm confusing the operation types. insertLet :: r t -> m (r t) but what I have has an extra computation argument that needs to be properly sequenced. insertLambda :: [t] -> m (r t) -> m (r t) the 2nd argument is the body, while the first is the arguments. Can this sequencing be done with bind? I don't see how, so let's do it manually. I end up with this as first attempt: insertLambda vars (SC cBody) = SC cTerm where cTerm s k = k s' $ Lambda vars body where (s', (Code body)) = cBody s (,) To obtain the body term. the whole cBody computation has to be executed. The state output of that computation is then reused; body probably did some variable allocation. What I don't undersetand is the type: insertLambda :: [Var] -> SC s (s, Code t) (Code t) -> SC s r Term This might be problematic, as the return type of the computation is (s, Code t) instead of Term. Let's see. Indeed: Couldn't match expected type `(TML.CompState, Code t0)' with actual type `Term' So is there a way to avoid this by not running the computation directly, but by chaining it into the other using CPS? 
Something like this: insertLambda vars (SC cBody) = SC cTerm where cTerm s k = cBody s k' where k' s' (Code body) = k s' $ Code $ Lambda vars body Yep it looks like that was it: > t14 (Let (Var i "t2" 0) (Lit i "1") (Let (Var i "t3" 0) (Lit i "2") (App (Lambda [Var i "t0" 0, Var i "t1" 0]) (Ret (Ref (Var i "t0" 0)))) [Ref (Var i "t2" 0), Ref (Var i "t3" 0)]))) Entry: Cleanup Date: Sat Oct 29 14:03:43 EDT 2011 Maybe I just throw away Compile.hs and CompileTerm.hs This kind of behaviour, when needed, can probably best be built on top the Code / Term compilation. Entry: Next Date: Sat Oct 29 18:02:59 EDT 2011 _letrec Similar problem: collection of references. Probably can just reuse the Struct mechanism, though this requires to use Var for function bindings. Trouble with that is type annotation for Var. That's not exactly free.. To some extent a more liberal syntax is simpler to implement, though type annotation for higher order functions is extra work. Why did I change to more liberal syntax? To be able to App (Lambda ...) [...] Yeah, it's messy.. Maybe let it sit for a while. Maybe I can actually not implement this for now as it's not really necessary for making the Pd/SM code run; that just needs single recursion for loops. The _letrec has the same problem as _lambda earlier: it's not monadic. Got it working, but it requires some heavy type annotation: t17def :: Code (Tint -> MCode Tint) -> MCode (Code (Tint -> MCode Tint)) t17def = \f -> _lambda $ (\x -> do x2 <- mul x x _app f x2) t17exp = \f -> _app f (lit 123) t17 = term $ _letrec t17def t17exp > t17 Let (Var i "r2" 0) (Op i "mul" [Ref (Var i "r1" 0), Ref (Var i "r1" 0)]) (LetRec [(Fun "f\"r0\"", Lambda [Var i "r1" 0] (App (FRef (Fun "f\"r0\"")) [Ref (Var i "r2" 0)]))] (App (FRef (Fun "f\"r0\"")) [Lit i "123"])) All kinds of things wrong with this.. Let's first fix the FRef printing though. Then make a proper insert for the LetRec. insertLetRec ref (SC cDef) (SC cExpr) = SC cTerm where cTerm s k = cDef s k' where -- compile definition, then goto k' k' s' (Code def) = cExpr s' k'' where -- compile body, then goto k'' k'' s (Code expr) = k s' $ Code $ LetRec [(ref, def)] expr -- return wrapped form instance Loop Term MCode' Code where _letrec open body = do [fName] <- nameList "fun" 1 let fRef = Fun $ fName fCode = Code $ FRef fRef -- How to correctly type this? in do insertLetRec fRef (open fCode) (body fCode) Doesn't seem to work... > t17 Let (Var i "r2" 0) (Op i "mul" [Ref (Var i "a1" 0),Ref (Var i "a1" 0)]) (LetRec [(Fun "fun0",Lambda [Var i "a1" 0] (App (FRef (Fun "fun0")) [Ref (Var i "r2" 0)]))] (App (FRef (Fun "fun0")) [Lit i "123"])) I don't know what I'm doing. It seems to be different from both the Let and Lambda continuation juggling. Maybe the problem is elsewhere. Maybe Lambda isn't right.. All the evaluation juggling should really be done already at this point. This is just using bind to obtain results in the monad and wrap them in LetRec: instance Loop Term MCode' Code where _letrec def expr = do [fName] <- nameList "fun" 1 let ref = Fun $ fName fref = Code $ FRef ref in do (Code def') <- def fref (Code expr') <- expr fref return $ Code $ LetRec [(ref, def')] expr' So what's wrong with Lambda? Trouble is that evaluation of the body really inserts things at the toplevel, so how to delimit that toplevel? In some way, that Lambda expression is really a separate world, so maybe it should be run as a separate computation? I had some trouble with the return type doing that.. How to fix? Can we keep the monad's result open? 
I don't think we can because of this naked Let.. in insertLet. This implementation seems to work: -- Run the cBody computation separate from our current context by -- calling cBody directly. This is necessary because insertLet would -- mess with our continuation. -- Note that because the toplevel continuation gets lost due to Let -- insertion, we can't keep track of the compilation state that -- threads through the lambda expression. This means the variable -- naming scheme will fork. This doesn't seem to be a problem since -- all variables used inside the Lambda are in an isolated context. insertLambda vars (SC cBody) = SC cTerm where cTerm s k = k s $ Code $ Lambda vars body where body = cBody s $ \s (Code b) -> b > t17 LetRec [(Fun "fun0",Lambda [Var i "a1" 0] (Let (Var i "r2" 0) (Op i "mul" [Ref (Var i "a1" 0), Ref (Var i "a1" 0)]) (App (FRef (Fun "fun0")) [Ref (Var i "r2" 0)])))] (App (FRef (Fun "fun0")) [Lit i "123"]) Interesting stuff! t18 = term $ _letrec t17def $ \f -> do x <- mul (lit 3) (lit 4) y <- mul x x _app f y > t18 Let (Var i "r2" 0) (Op i "mul" [Lit i "3",Lit i "4"]) (Let (Var i "r3" 0) (Op i "mul" [Ref (Var i "r2" 0),Ref (Var i "r2" 0)]) (LetRec [(Fun "fun0",Lambda [Var i "a1" 0] (Let (Var i "r2" 0) (Op i "mul" [Ref (Var i "a1" 0),Ref (Var i "a1" 0)]) (App (FRef (Fun "fun0")) [Ref (Var i "r2" 0)])))] (App (FRef (Fun "fun0")) [Ref (Var i "r3" 0)]))) And wrong again.. Looks like the evaluation of the second expression needs to happen in a delimited context also. EDIT: see mBlock next post. t17def :: Code (Tint -> MCode Tint) -> MCode (Code (Tint -> MCode Tint)) t17def = \f -> _lambda $ (\x -> do x2 <- mul x x _app f x2) t18 = term $ _letrec t17def $ \f -> do x <- mul (lit 3) (lit 4) y <- mul x x _app f y > t18 LetRec [(Fun "fun0",Lambda [Var i "a1" 0] (Let (Var i "r2" 0) (Op i "mul" [Ref (Var i "a1" 0), Ref (Var i "a1" 0)]) (App (FRef (Fun "fun0")) [Ref (Var i "r2" 0)])))] (Let (Var i "r2" 0) (Op i "mul" [Lit i "3",Lit i "4"]) (Let (Var i "r3" 0) (Op i "mul" [Ref (Var i "r2" 0),Ref (Var i "r2" 0)]) (App (FRef (Fun "fun0")) [Ref (Var i "r3" 0)]))) Entry: Delimiting Let Date: Sun Oct 30 10:49:52 EDT 2011 Shuffled things around a bit. The main idea is to run a code generation computation MCode (Code t) in an isolated context to avoid Let insertion to grab beyond the current expression point, as it manipulates the top continuation. The interface: subTerm s (SC c) = c s $ \s (Code t) -> t term = subTerm (CompState 0) mBlock :: MCode (Code t') -> MCode (Code t) mBlock sub = SC main where main s k = k s $ Code $ (subTerm s sub) Running an isolated compilation is then: (Code term) <- mBlock mCode Entry: Language.C Date: Sun Oct 30 11:16:41 EDT 2011 Seems that the toughest job is done. Code generator is working. Interesting thing to do next might be C code generation. Is there an easy to use C prettyprinter for Haskell? There is one[2] in Language.C[1]. In theory this just requires translation of (a subset of Term) to the Language.C.Syntax.AST type[3]. To find out how to use this, it might be simplest to first parse some C and see what it looks like, then try to recreate the syntax. Starting from the docs/source itself is a bit too much.. 
[1] http://hackage.haskell.org/package/language-c [2] http://haskell.org/packages/archive/language-c/0.4.2/doc/html/Language-C-Pretty.html [3] http://hackage.haskell.org/packages/archive/language-c/0.4.2/doc/html/Language-C-Syntax-AST.html [4] http://trac.sivity.net/language_c/ [5] http://trac.sivity.net/language_c/wiki/GettingStarted Entry: Reusing haskell parser Date: Wed Nov 2 14:51:57 EDT 2011 I changed upon this[1] today. Does it actually mention compilation from haskell syntax to some functional representation? This would be neat. How to do that? Looking up Core[2] in the Haskell wiki, which is from YHC (not GHC). [1] http://www.haskell.org/pipermail/haskell-cafe/2011-October/095843.html [2] http://www.haskell.org/haskellwiki/Yhc/API/Core Entry: Parsing with Language.C Date: Fri Nov 4 21:34:24 EDT 2011 A first attempt to make it do something, starting from parseC and following types on hackage[1]. > import Language.C > import Data.ByteString.Char8 > import Language.C.Data.Position > cCode = "int fun(void *state, void **ins, void **outs, int n) { }" > c = parseC (pack cCode) $ initPos "/tmp/foo.c" This gives: Right (CTranslUnit [CFDefExt (CFunDef [CTypeSpec (CIntType (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),3) (Name {nameId = 1})))] (CDeclr (Just "fun") [CFunDeclr (Right ([CDecl [CTypeSpec (CVoidType (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),4) (Name {nameId = 3})))] [(Just (CDeclr (Just "state") [CPtrDeclr [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),5) (Name {nameId = 6}))] Nothing [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),5) (Name {nameId = 5}))),Nothing,Nothing)] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),5) (Name {nameId = 7})),CDecl [CTypeSpec (CVoidType (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),4) (Name {nameId = 8})))] [(Just (CDeclr (Just "ins") [CPtrDeclr [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),3) (Name {nameId = 11})),CPtrDeclr [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),3) (Name {nameId = 12}))] Nothing [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),3) (Name {nameId = 10}))),Nothing,Nothing)] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),3) (Name {nameId = 13})),CDecl [CTypeSpec (CVoidType (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),4) (Name {nameId = 14})))] [(Just (CDeclr (Just "outs") [CPtrDeclr [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),4) (Name {nameId = 17})),CPtrDeclr [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),4) (Name {nameId = 18}))] Nothing [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),4) (Name {nameId = 16}))),Nothing,Nothing)] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),4) (Name {nameId = 19})),CDecl [CTypeSpec (CIntType (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),3) (Name {nameId = 21})))] [(Just (CDeclr (Just "n") [] Nothing [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),1) (Name {nameId = 22}))),Nothing,Nothing)] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),1) (Name {nameId = 23}))],False)) [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),1) (Name {nameId = 24}))] Nothing [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),3) (Name {nameId = 2}))) [] (CCompound [] [] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),1) (Name {nameId = 25}))) (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": line 1),1) (Name {nameId = 26})))] (NodeInfo ("/tmp/foo.c": line 1) (("/tmp/foo.c": 
line 1),1) (Name {nameId = 27}))) I tried to use the prettyprinter: -- import Text.Show.Pretty -- .. pretty-show But this doesn't seem to work.. After removing the explicit Show instances for NodeInfo and Position (botha are in directory language-c-0.4.2/src/Language/C/Data/) it does prettyprint properly, though the size of this tree makes it rather difficult to handle. See next post. Let's just make a list of the constructors. CTranslUnit CFDefExt CFunDef CTypeSpec CIntType NodeInfo Position Name CDeclr CFunDeclr CDecl CVoidType CDeclr CPtrDeclr Might not be such a big deal to make it work from the inside out. Man this is too much. C syntax isn't really so simple as it looks. Entry: Pretty-printed C AST Date: Fri Nov 4 22:39:04 EDT 2011 *Main> pc c CTranslUnit [ CFDefExt ( CFunDef [ CTypeSpec ( CIntType ( NodeInfo Position { posOffset = 0 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 1 } ( Position { posOffset = 0 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 1 } , 3 ) Name { nameId = 1 ) ) ) ] ( CDeclr ( Just "fun" ) [ CFunDeclr ( Right ( [ CDecl [ CTypeSpec ( CVoidType ( NodeInfo Position { posOffset = 8 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 9 } ( Position { posOffset = 8 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 9 } , 4 ) Name { nameId = 3 } ) ) ] [ ( Just ( CDeclr ( Just "state" ) [ CPtrDeclr [] ( NodeInfo Position { posOffset = 13 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 14 } ( Position { posOffset = 14 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 15 } , 5 ) Name { nameId = 6 } ) ] Nothing [] ( NodeInfo Position { posOffset = 14 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 15 } ( Position { posOffset = 14 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 15 } , 5 ) Name { nameId = 5 } ) ) , Nothing , Nothing ) ] ( NodeInfo Position { posOffset = 8 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 9 } ( Position { posOffset = 14 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 15 } , 5 ) Name { nameId = 7 } ) , CDecl [ CTypeSpec ( CVoidType ( NodeInfo Position { posOffset = 21 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 22 } ( Position { posOffset = 21 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 22 } , 4 ) Name { nameId = 8 } ) ) ] [ ( Just ( CDeclr ( Just "ins" ) [ CPtrDeclr [] ( NodeInfo Position { posOffset = 27 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 28 } ( Position { posOffset = 28 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 29 } , 3 ) Name { nameId = 11 } ) , CPtrDeclr [] ( NodeInfo Position { posOffset = 26 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 27 } ( Position { posOffset = 28 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 29 } , 3 ) Name { nameId = 12 } ) ] Nothing [] ( NodeInfo Position { posOffset = 28 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 29 } ( Position { posOffset = 28 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 29 } , 3 ) Name { nameId = 10 } ) ) , Nothing , Nothing ) ] ( NodeInfo Position { posOffset = 21 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 22 } ( Position { posOffset = 28 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 29 } , 3 ) Name { nameId = 13 } ) , CDecl [ CTypeSpec ( CVoidType ( NodeInfo Position { posOffset = 33 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 34 } ( Position { posOffset = 33 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 34 } , 4 ) Name { nameId = 14 } ) ) ] [ ( Just ( CDeclr ( Just "outs" ) [ CPtrDeclr [] ( NodeInfo Position { posOffset = 39 , posFile = "/tmp/foo.c" , 
posRow = 1 , posColumn = 40 } ( Position { posOffset = 40 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 41 } , 4 ) Name { nameId = 17 } ) , CPtrDeclr [] ( NodeInfo Position { posOffset = 38 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 39 } ( Position { posOffset = 40 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 41 } , 4 ) Name { nameId = 18 } ) ] Nothing [] ( NodeInfo Position { posOffset = 40 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 41 } ( Position { posOffset = 40 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 41 } , 4 ) Name { nameId = 16 } ) ) , Nothing , Nothing ) ] ( NodeInfo Position { posOffset = 33 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 34 } ( Position { posOffset = 40 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 41 } , 4 ) Name { nameId = 19 } ) , CDecl [ CTypeSpec ( CIntType ( NodeInfo Position { posOffset = 46 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 47 } ( Position { posOffset = 46 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 47 } , 3 ) Name { nameId = 21 } ) ) ] [ ( Just ( CDeclr ( Just "n" ) [] Nothing [] ( NodeInfo Position { posOffset = 50 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 51 } ( Position { posOffset = 50 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 51 } , 1 ) Name { nameId = 22 } ) ) , Nothing , Nothing ) ] ( NodeInfo Position { posOffset = 46 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 47 } ( Position { posOffset = 50 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 51 } , 1 ) Name { nameId = 23 } ) ] , False ) ) [] ( NodeInfo Position { posOffset = 7 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 8 } ( Position { posOffset = 51 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 52 } , 1 ) Name { nameId = 24 } ) ] Nothing [] ( NodeInfo Position { posOffset = 4 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 5 } ( Position { posOffset = 4 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 5 } , 3 ) Name { nameId = 2 } ) ) [] ( CCompound [] [] ( NodeInfo Position { posOffset = 53 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 54 } ( Position { posOffset = 55 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 56 } , 1 ) Name { nameId = 25 } ) ) ( NodeInfo Position { posOffset = 0 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 1 } ( Position { posOffset = 55 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 56 } , 1 ) Name { nameId = 26 } ) ) ] ( NodeInfo Position { posOffset = 0 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 1 } ( Position { posOffset = 55 , posFile = "/tmp/foo.c" , posRow = 1 , posColumn = 56 } , 1 ) Name { nameId = 27 } ) *Main> Replacing all the NodeInfo nodes by ni this gives: [1] http://hackage.haskell.org/packages/archive/language-c/0.3.2/doc/html/Language-C-Parser.html Entry: Language.C Functor Date: Mon Nov 7 14:44:43 EST 2011 The light went on.. Each node in the AST has a hole in it. For parsing/printing this seems to be filled with a NodeInfo type. However, there is a Functor instance for each type in the tree, which is for mapping over an entire tree, changing those holes. So to remove all the NodeInfo nodes for the purpose of prettyprinting the syntax tree only, use this: pc = Prelude.putStrLn . Pretty.ppShow . (fmap $ \_ -> ()) This then doesn't need any code patching as described in a previous post. 
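As an aside, the scattered snippets above can be collected into one small program. This is a minimal sketch, assuming language-c 0.4.x (as used here, with ByteString input) and the pretty-show package; the source string and dummy file name are just the running example:

   import Language.C                  -- parseC, AST types
   import Language.C.Data.Position    -- initPos
   import Data.ByteString.Char8 (pack)
   import qualified Text.Show.Pretty as Pretty

   -- Parse a C snippet and pretty-print its AST with all NodeInfo
   -- annotations erased via the Functor instance.
   main :: IO ()
   main = case parseC (pack src) (initPos "/tmp/foo.c") of
            Left err   -> print err
            Right unit -> putStrLn $ Pretty.ppShow $ fmap (const ()) unit
     where
       src = "int fun(void *state, void **ins, void **outs, int n) { }"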
Starting from int fun(void *state, void **ins, void **outs, int n) { } We get the following tree which is a lot easier to dissect: CTranslUnit [ CFDefExt ( CFunDef [ CTypeSpec ( CIntType () ) ] ( CDeclr ( Just "fun" ) [ CFunDeclr ( Right ( [ CDecl [ CTypeSpec ( CVoidType () ) ] [ ( Just ( CDeclr ( Just "state" ) [ CPtrDeclr [] () ] Nothing [] () ) , Nothing , Nothing ) ] () , CDecl [ CTypeSpec ( CVoidType () ) ] [ ( Just ( CDeclr ( Just "ins" ) [ CPtrDeclr [] () , CPtrDeclr [] () ] Nothing [] () ) , Nothing , Nothing ) ] () , CDecl [ CTypeSpec ( CVoidType () ) ] [ ( Just ( CDeclr ( Just "outs" ) [ CPtrDeclr [] () , CPtrDeclr [] () ] Nothing [] () ) , Nothing , Nothing ) ] () , CDecl [ CTypeSpec ( CIntType () ) ] [ ( Just ( CDeclr ( Just "n" ) [] Nothing [] () ) , Nothing , Nothing ) ] () ] , False ) ) [] () ] Nothing [] () ) [] ( CCompound [] [] () ) () ) ] () Entry: Next Date: Mon Nov 7 16:45:58 EST 2011 So.. Got unstuck with Language.C with the rest seemingly straightforward, a bit boring even. Entry: Meta Introduction Date: Sat Nov 19 09:30:57 EST 2011 ( This is an expansion of the blog header. ) Metaprogramming is about dealing with programs as data. This is a general principle that hides under the following names: - code generation - model based design - model driven development - active libraries - multi-stage programming - macro programming - language towers - domain specific languages - ... There is a continuum of different techniques and not really a standard one-size-fits-all approach. I focus on formal languages (programming languages) as models. In this approach the main problems are: - Design of a Domain Specific Language (DSL) that is used to express a model. A DSL lives in the trade-off space between: * Simplicity: Make analysis and compilation practically feasible. * Expressivity: Allow full specification of a solution in the problem domain. - Translation in the form of interpretation and compilation of DSL programs. * Static analysis: determine properties of the model without executing it in its final destination environment. * Code generation. Transfer from high level form that allows static analysis, to low level form that allows execution in the field, i.e. as machine (or C) code executing on a microcontroller. Translation can involve a wide spectrum of techniques, but the basic trade-off seems to be about how to add implementation hints to a model. This is a trade-off between: * A black-box compiler that does not need hints. This is usually catalogued under the name ``partial evaluation''. * A white-box compiler that uses implementation strategies explicitly provided by the programmer, next to the model specification. This is called ``staging''. Ideally, you'd want a black-box compiler to just solve the problem for you. However due to a usually astronomically huge implementation space this is not feasible. Aiming at a well-designed approach to providing implementation knowledge seems a second best approach. Concerning the tool set I focus on two approaches, both based on the idea that side-effect-free functional programs are easier to manipulate (transform, analyze, verify). * Hygienic macros/modules combined with first class units. Racketc (PLT Scheme) is a framework for building language towers based on the Scheme programming language. See Staapl[1]. * Polymorphism and types in typed functional programming. Haskell's type class based polymorphism, (Meta)OCaml modules and MetaOCaml multi-stage code types. 
[1] http://zwizwa.be/staapl [2] http://zwizwa.be/-/libprim Entry: PrettyC.hs Date: Sat Nov 19 12:28:57 EST 2011 got: Ret, Let, App, Lit, Ref (Var), Op todo: LetRec, If For LetRec we need the following form: { _ fun_0; _ fun_1; ... fun: { ... } { ... } } Argument variables are declared inside a new scope without initial assignment. Function body is a new scope, and the body of the scope for which the "fun" identifier is valid is also a new scope. Probably needs some nesting of the "fun" names to avoid clashes. Problem: argument variable naming: at call site we don't know names, and at definition site we can't change names. So an extra indirection is necessary to bind numbered names to given names. This also solves the name clash problem. Got this coming out as compilation of (letrec ((fun1 (lambda (a b) (fun1 b a)))) (fun1 1 2)) void fun() { { float fun1_0; float fun1_1; { fun1_0 = 1; fun1_1 = 2; goto fun1; } fun1: { float a = fun1_0; float b = fun1_1; fun1_0 = b; fun1_1 = a; goto fun1; } } } Extended to mutial recursion: (letrec ((fun1 (lambda (a b) (fun2 b a))) (fun2 (lambda (a b) (fun1 b a)))) (fun1 1 2)) void fun() { { float fun1_0; float fun1_1; float fun2_0; float fun2_1; { fun1_0 = 1; fun1_1 = 2; goto fun1; } fun1: { float a = fun1_0; float b = fun1_1; fun2_0 = b; fun2_1 = a; goto fun2; } fun2: { float a = fun2_0; float b = fun2_1; fun1_0 = b; fun1_1 = a; goto fun1; } } } Entry: Next Date: Sat Nov 19 22:21:35 EST 2011 Time to tie things together. Still have to find a way to do array mutations. Should this be hidden in combinators? Entry: Streams - Named outputs Date: Sun Nov 20 08:05:56 EST 2011 So... Where does the stream combinator go? Actually the real problem is named outputs, or unbound variables in the Oz sense. Where does it go? Should it be part of the Term language? Should I expose arrays, or maybe just "ports" or some other bind operator? Should this mimick the array monads in Haskell? The main question seems to be: do we keep side-effects out of Term or not? At some point there are going to be things like delay lines that could be implemented as assignment, or hidden behind some operator. One thing is clear: the Language.C AST is a lot more complicated to work with than the Term syntax. Moreover, the typed frontend will prevent access to unmanaged assignments, so maybe it's best to just add set! to Term? What about this change: - Use only a single Var type for: values, functions, lvalues (output ports). - Distinguish by Ref, FRef, PRef. I did this: - One Var - One Ref - A dumb Write statement in Term, directly translated to deref + assignment in PrettyC So PrettyC is dumb, and management of assignment needs to happen on the typed level that generates the Term structures. Entry: Term Write Date: Sun Nov 20 09:17:51 EST 2011 Time to cut the knot. Ok, day went by doing other stuff.. Can I tackle this now? (s,i) -> (s,o) -> (rs,ri,ro) -> m () Convert a system into something that binds to state, input and output references, where state is R/W, input is R and output is W. This problem has been there for so long that I essentially started to hate it :) On the one end there's the unification based approach such as used in Oz declarative programming. On the other hand there's the just take that damn pointer and store it approach. Let's look at Haskell's mutable arrays for inspiration. Interesting, but brings nothing really. It has get and set... This is really a non-problem. 1. It was good to solve the control flow problem. 
Now it's possible to write loops in a functional style, i.e. where loop state update doesn't need assignment. 2. The control flow problem has nothing to do with storing outputs and final state though. It's a separate issue that has more to do with the system we're embedding in (i.e. Pd). Still... where to put 2? Maybe I it should be mutable arrays? Think about it upside-down: the MArray is an *interface* which can probably be made to cover both sides: make a generic map that translates a system function to an MArray update function, and implement MArray for the Term / C target. This should then allow to run tests in Haskell on IO or ST arrays. in: (s,i) -> (s,o) out: (as,ai,ao) -> m r dimension info for low-level implementation where 'r' is some analysis variable, i.e. output array, or () for side effect only (recording / playback). Entry: Mutable arrays Date: Sun Nov 20 20:00:17 EST 2011 [1] http://www.haskell.org/ghc/docs/4.08/set/sec-marray.html [2] http://www.haskell.org/ghc/docs/4.08/set/sec-ioexts.html#SEC-IOARRAY [3] http://www.haskell.org/ghc/docs/4.08/set/sec-st.html Entry: Arrays or ports? Date: Sun Dec 4 09:02:30 EST 2011 So I wonder still, arrays or some kind of dataflow binding? Probably arrays are the better interface. However, I have to abstract allocation, so I can't just use MArray[1] as is. Maybe it is better to first abstract some kind of "store once" binding mechanism, or an output port. So I wonder, is there a way to model communicating processes in Haskell? I.e. sequential programs with 'read()' and 'write()'? There is CHP[2] but this might be overkill. What I am looking for really is "read process write, and do that multiple times", which fits nice in a "left-right fold" which I don't know how to name properly: [i] -> s -> (i -> s -> (o, s)) -> [o] So the point is not really this read/write thing, but to actually represent the implementation of array access in my target language. Let's say that again: I'm not looking for some fancy middle abstraction between functions and real arrays, I'm looking for a way to represent array read/write and index computation in the EDSL such that I can both compile to target language, and build a Haskell function maybe over a mutable array. The point is to not have to resort to tricks in the compiler that need to be duplicated: this functionality should be part of the *interface*, i.e. the syntax of the EDSL, or at least an extension of it to be able to distinguish pure code from code with array read/write. After writing this down, it becomes very clear that the basic difficulty is the representation of the array's metadata. Since we can't do allocation in the EDSL, the bounds need to somehow be specified. I see 2 ways of doing this: - all part of declaration: an array declaration binds a tuple (a,f,l,s) _array :: (r (a,f,l,s) -> m (r t)) -> m (r t) a : array reference (i.e. pointer) f : index of first element l : index of last element s : array stride (important to include in low-level rep!) - separate array declaration, binding a, and a _bounds operation that returns (f,l,s) as values: _array :: (r a -> m (r t)) -> m (r t) _bounds :: r a -> m (f, l, s) The first one seems to be more suited to dumb pointers where all have the same type: it concentrates bookkeeping in a single place. The second one is more modular but requires some machinery to represent array metadata in the target. 
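As a side note on the "left-right fold" shape mentioned earlier in this entry: up to argument order it is Data.List.mapAccumL. A pure reference model, not the EDSL form, and with runSys just a name picked here:

   import Data.List (mapAccumL)

   -- The [i] -> s -> (i -> s -> (o, s)) -> [o] fold, via mapAccumL.
   runSys :: (i -> s -> (o, s)) -> s -> [i] -> [o]
   runSys f s0 = snd . mapAccumL step s0
     where step s i = let (o, s') = f i s in (s', o)

This is only something to test against; the real point in the EDSL is to represent the array reads/writes and index computations themselves.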
I don't have much to go with, but the more explicit first option seems to be suited to what I'm currently thinking of (Pd and fairly non-parameterized code to allow for aggressive constant propagation and unrolling). Defining bounds early, exposed as constants, seems to make this easier.

Hmm... toying with this a bit, it seems that something like this really is better, since the array reference itself needs to be passed in from outside, so I guess this is really the 2nd bullet in a different form:

   _bounds :: (TMLprim v, TMLprim t) =>
              (r (a v)) -> (r f, r l, r s) -> m (r t) -> m (r t)

So what about when I have a collection of pointers and they all have the same array size? This can be represented as one array of records (i1, ..., in, o1, ..., om). I.e. it's better to make asserts implicit (one array with a given length is better than 2 arrays with the same length).

Ok, let's take this approach. This requires pointer chasing, so the return type of _get should not be a primitive.

Hmm.. it doesn't look like "begin" is a good idea. Another problem: it looks like '_let' really needs a monadic type in its first argument, i.e. this instead of (r a) as first argument:

   _let :: (TMLprim a, TMLprim t) =>
           m (r a) -> (r a -> m (r t)) -> m (r t)

The (r a) version works for things like literals, but not really for subexpressions. Otoh, do we want subexpressions?

   t1 = _let (lit 123) $ \a -> _ret a

I'm getting confused. This probably needs to be fixed first.

[1] http://www.haskell.org/ghc/docs/4.08/set/sec-marray.html
[2] http://www.cs.kent.ac.uk/projects/ofa/chp/

Entry: Problem: first argument of _let monadic?
Date: Sun Dec 4 12:20:19 EST 2011

I'm not sure.. Currently we need

   do v <- ...
      _let v (\v -> ...)

But this looks weird. _let should really take a monadic value. The alternative would be to not use _let but an explicit variable-introduction interface:

   do v <- nameVar $ ...
      ...

Let's fix that first. The thing is that apparently all results are already named, so this approach isn't necessary. Really, _let isn't necessary if all operations return named expressions. What is necessary is a way to explicitly store non-monadic values (i.e. literals or other variable references) in freshly named variables. So let's introduce _var. It turns out that _begin also isn't necessary. Maybe mVar can be modified such that a variable of type () is never created?

Entry: Representing arrays in Term
Date: Sun Dec 4 13:25:18 EST 2011

Instead of order being an integer, it might be a good idea to also encode the array in the variable definition.

Entry: Array types
Date: Sun Dec 4 15:12:29 EST 2011

Currently I'm assuming that the types of variables are scalar (constrained by mVar), but this won't work for arrays. Ok, this needs some deeper adjustment. The idea is that a type is really a base data type + indirection information (arrays/pointers). This type should be properly dereferenced for a _get in Code, and mVar should also support this.

Ha. Pointers behave a bit like curried functions, don't they?

I'm running into a problem: array sizes behave as type information, but they really are not. The same code can work on arrays of different types. Maybe just work around that for now. Array types can be fixed, in which case their parameters are host language integers -- not target language entities! -- or unknown.

Ok, I'm properly confused now. I would like to encode the pointer order in the Code type, but I don't see a good way to do that.

It seems that from _get :: r (a t) -> r Tint -> m (r t) this could also be r (a (a (a v))) -> r Tint -> m (r (a (a v))) but doing that I have trouble mapping this to Code and Term. Ok, I just use the input Var's Type info to derive the output's type. This avoids the typeclass stuff. This is what comes out now: t19' inArray outArray = do (inFirst, inLast, inStride) <- _bounds inArray (outFirst, outLast, outStride) <- _bounds outArray -- We assume that output and input dims are compatible, ignoring -- inLast (or equivalently outLast). i <- _var $ lit 1 v <- _get inArray i _set outArray i v _exit int n = Lit (Type AInt []) (show n) varArr :: String -> Code (CodeArray Tint) varArr name = Code $ Ref $ Var (Type AInt [()]) name t19 = term $ t19' (varArr "in") (varArr "out") This gives: (Let (Var (Type i []) "r0") (Lit (Type i []) "1") (Let (Var (Type i []) "r1") (Get (Var (Type i [()]) "in") (Ref (Var (Type i []) "r0"))) (Begin (Set (Var (Type i [()]) "out") (Ref (Var (Type i []) "r0")) (Ref (Var (Type i []) "r1"))) (Ret (Lit (Type i []) "0"))))) which looks about right. Entry: Next Date: Sun Dec 4 20:58:58 EST 2011 What to do with array bounds? Maybe it's just not necessary at this point. Entry: Haskell type classes and optimization Date: Mon Dec 5 17:30:49 EST 2011 I was worrying about optimization in the face of all this type class abstraction once I'm moving to STU arrays. But.. All numeric operations are type classes and they are optimised properly in tests I've seen onlin. It seems that once the types are specific enough (which would be the case for STU arrays), there is no reason why the "type class arguments" can't be eliminated. Entry: runSTUArray Date: Mon Dec 5 19:03:52 EST 2011 So... How to make Value work with STUArray / ST[2]. I think it's best to make the interface a bit more general though, and start with MArray[1][3]. import Control.Monad.ST import Data.Array.ST import Data.Array.MArray import Data.Array.Base This seems to be it: instance (Monad m, MArray a t m) => Array m Value (a Int) t where _get (Value a) (Value i) = liftM Value $ readArray a i _set (Value a) (Value i) (Value e) = liftM Value $ writeArray a i e Now, how to use it? To test it, let's try to write a "sum" program. This involves the whole bazar: computations, conditionals, loops and arrays. [1] http://www.haskell.org/ghc/docs/latest/html/libraries/array/Data-Array-MArray.html [2] http://www.haskell.org/ghc/docs/6.10.4/html/libraries/base/Control-Monad-ST.html#t%3AST [3] http://www.haskell.org/haskellwiki/Arrays#Mutable_arrays_in_ST_monad_.28module_Data.Array.ST.29 Entry: Types are too complex Date: Mon Dec 5 20:30:12 EST 2011 This doesn't want to infer: f8 arr = \f -> _lambda $ (\(accu :: r Tint, index :: r Tint) -> do v <- _get arr index index' <- add index (lit 1) accu' <- add accu v cond <- eq index' (lit 10) _if cond (_ret accu') (_app f $ accu' index')) I think it's at least typable. This should really just work. The following doesn't even infer correctly and needs a pin down to avoid occurs check error. f7 :: (StructPair (,) r, StructPrim s (r a), Monad m, Loop s m r) => m (r ((a,a) -> m a)) f7 = _lambda $ \(a,b) -> _ret a I think the problem lies with functional dependencies. I don't really quite understand how they work. I don't understand the error messages and I simply get stuck. Fiddling with the code a bit usually makes it work after a while, but that's really no way to do programming ay? Maybe type families are the solution? 
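A minimal sketch of what the type-family route hinted at here could look like: associate the nested-tuple type with its representation-structured form through a type function rather than an extra class parameter plus fundep. This only covers one direction (as -> ras), Repr is a hypothetical name, and Tint/Tfloat are stand-ins for the TML scalar types:

   {-# LANGUAGE TypeFamilies, KindSignatures, EmptyDataDecls #-}

   data Tint    -- stand-ins for the TML scalar types
   data Tfloat

   -- Hypothetical: compute the "structuring of representations"
   -- from the representation r and the nested-tuple type as.
   type family Repr (r :: * -> *) as
   type instance Repr r (as, bs) = (Repr r as, Repr r bs)
   type instance Repr r Tint     = r Tint
   type instance Repr r Tfloat   = r Tfloat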
Summary: I believe the recursive class definitions for nested tuples (see Struct), and the fundeps used there, mess things up. They probably express something else than I intend.. Entry: Occurs check Date: Mon Dec 5 20:50:43 EST 2011 So why does it do that? Why is there a good interpretation if the automatic interpretation finds something that can't unify? I'd say it would give an error that is an indication of a lack of information, but not of some information that leads to a false outcome.. How should I see this? Entry: Ambiguities.. Date: Mon Dec 5 21:15:33 EST 2011 *Main> :t _lambda $ \(a,b) -> do {v <- add a b; _ret v} :1:1: Couldn't match type `(,) as' with `(,) (as, bs)' When using functional dependencies to combine Struct s r t (r t), arising from the dependency `r as -> ras' in the instance declaration at Loop.hs:128:10 Struct s0 ((,) as) (as, bs) ((as, bs), (as, bs)), arising from a use of `_lambda' at :1:1-7 In the expression: _lambda In the expression: _lambda $ \ (a, b) -> do { v <- add a b; _ret v } What happens here is that r is somehow identified with the partially applied tuple constructor. Maybe r should be more constrained? Entry: Signatures.. Date: Mon Dec 5 22:08:24 EST 2011 I've seen this before. Using :t at the GHCi prompt gives a different type signature than specified: *Main> :t _lambda _lambda :: (Loop s m r, StructPair (,) r, StructPrim s r as) => (r as -> m (r t)) -> m (r (as -> m t)) class TML m r => Loop s m r | r -> m, r -> s where ... _lambda :: (Struct s r as ras) => (ras -> m (r t)) -> m (r (as -> m t)) ... After a night of sleep.. I removed the -XIncoherentInstances and I get an error: /home/tom/meta/dspm/Loop_test_arity.hs:74:6: Overlapping instances for Struct s r a (r a) arising from a use of `_lambda' Matching instances: instance [overlap ok] (StructPrim s r t, StructPair (,) r) => Struct s r t (r t) -- Defined at Loop.hs:(130,10)-(132,30) instance [overlap ok] (StructPair (,) r, Struct s r as r_as, Struct s r bs r_bs) => Struct s r (as, bs) (r_as, r_bs) -- Defined at Loop.hs:(111,10)-(114,43) (The choice depends on the instantiation of `r, s, a' To pick the first instance above, use -XIncoherentInstances when compiling the other instance declarations) In the expression: _lambda In the expression: _lambda $ \ (a, b) -> _ret a In an equation for `f7': f7 = _lambda $ \ (a, b) -> _ret a This is strange because the Prim/Pair heads should really be exclusive. Following [1] it says that GHC does not take into account the context of a rule when matching. Then -XOverlappingInstances makes it possible to have multiple matches as long as there is a specific one, and -XIncoherentInstances a choice is forced to pick the more general instance at compile time of a generic function when possibly that function might be specialized later and a more specific one would match. It looks like that is where the trouble comes from. Somehow one of these is chosen at a time when it shouldn't be: instance (StructPair (,) r, Struct s r as r_as, Struct s r bs r_bs) => Struct s r (as,bs) (r_as, r_bs) where instance (StructPrim s r t, StructPair (,) r) => Struct s r t (r t) where The second one is clearly more general, and this is indeed what the error message suggests. So.. How to fix that? Maybe it's a good idea to construct some simpler example that has the same problem and ask it on the Haskell list. If I recall I saw some trick that used an extra type argument to disambiguate the heads. Maybe this can be used here too? 
I run into fundep problems, so let's try to simplify it a bit. I removed the (,) argument in StructPair. By adding an explicit tag that reflects the basic structure of the instance, i.e. (), ((),()), ... I was able to at least make an unambiguous instance, but the trouble then is to invoke it, because the type needs to be known! Maybe this is the whole overlapping/existential thing?

Ok, I'm getting into random mutation territory. I removed the tagging and now I get to a point where I can't instantiate multiple representations in the same module, probably by changing one of the fundeps:

   as -> r ras, ras -> r as, r -> stx

which is of course incorrect.. It should be:

   r as -> ras, ras -> r as, r -> stx

This brings me back to square one. Maybe it's just necessary to properly tag the "atom" in the struct such that a tree node and a leaf node cannot be confused. ( Then after that works it might be possible to write something on top of that to make argument lists easier to use. )

Similar problems.. It disambiguates some things, but now I can't make a fundep ras -> r as because this will fix r. Maybe only fundep ras -> as and fix r in another way?

Ok, there was a bug! It should be

   Struct s r (StructAtom t) (StructAtom (r t)) where

instead of

   Struct s r (StructAtom t) (r (StructAtom t)) where

and StructPair should have the `atom' and `unatom' bijection that commutes the StructAtom tag with the representation. Following that through, it seems to be clear what's going on now, at the expense of some verbosity in the HOAS as it's littered with Cons and Atom constructors. Most type annotations in Loop_test_arity.hs can now also be removed.

[1] http://www.haskell.org/ghc/docs/6.6/html/users_guide/type-extensions.html

Entry: Mile stone
Date: Tue Dec 6 14:22:45 EST 2011

Now that _lambda and _app use an explicit data structure built with Atom and Cons, everything seems to infer just fine. It's probably possible to write some wrappers on top of these to hide this tagging and work directly with Haskell tuples. Maybe even explicit wrappers that implement 1,2,3,... args for curried functions directly. I.e. I want to automate this:

   app1 f a1       = _app f (Atom a1)
   app2 f a1 a2    = _app f (Cons (Atom a1) (Atom a2))
   app3 f a1 a2 a3 = _app f (Cons (Atom a1) (Cons (Atom a2) (Atom a3)))

   lam1 f = _lambda $ \(Atom a1) -> f a1
   lam2 f = _lambda $ \(Cons (Atom a1) (Atom a2)) -> f a1 a2
   lam3 f = _lambda $ \(Cons (Atom a1) (Cons (Atom a2) (Atom a3))) -> f a1 a2 a3

The thing which is different than before is that the return type is not so relevant. It's only a single value, i.e. an exit code. All the rest should be computed with side effects.

An important point seems to be that the language has only 1 lambda and app form, both taking a single argument. If we want different lambda/app forms, this would probably be implemented as a type class on top of Loop. So.. I can get to the base case.

   class LoopLam arg res fun | res -> fun, fun -> arg res where
     lam :: (arg -> res) -> fun
   class LoopApp arg res fun | res -> fun, fun -> arg res where
     app :: fun -> (arg -> res)

   -- lam1
   instance (Loop s m r, StructPrim s r a, StructRepr r) =>
            LoopLam (r a) (m (r t)) (m (r (Atom a -> m t))) where
     lam f = _lambda $ \(Atom a1) -> f a1

   -- app1
   instance (Loop s m r, StructPrim s r a, StructRepr r) =>
            LoopApp (r a) (m (r t)) (r (Atom a -> m t)) where
     app f a1 = _app f (Atom a1)

How to write the recursive case? It seems that I'm running into the same problem here as before: Can't "unpack" that lambda.
It seems the solution is to never pack it in the first place: it won't be the app and lam that can be recursed, but the underlying apply and abstraction ops. Hmm.. let's be smarter about this. The trick is probably to first write the explicit lam1,lam2,lam3 in a recursive way. Entry: Missing _letrec for Value Date: Tue Dec 6 14:41:04 EST 2011 The Y combinator seems to not work with the monadic types. Is this it? Non-monadic: _letrec open body = closed where closed = open closed Monadic: _letrec open body = do closed <- mclosed body closed where mclosed = do closed <- mclosed open closed Simplified: _letrec open body = app body mfix where mfix = app open mfix app f mfix = do {fix <- mfix ; f fix} Entry: Review Date: Tue Dec 6 17:47:02 EST 2011 What's next? * The app/lam class might be useful but I don't see it as a pressing point right now. It's not in the way. Higher orders can easily be added in an explicit way. * Merge SM and the recent Cons/Atom structs. This still needs some work. * Arrays: in Value, fold an SM over a STUArray. * Use the same for generating C code. Entry: Sys.hs Date: Tue Dec 6 18:19:37 EST 2011 I tried to port Sys.hs to using Cons instead of (,) but this turns out not to be possible because the Arrow instance relies on the (,) binary tuples. This seems like a hard constraint that's probably best to completely avoid by not using Cons in the code, and replacing it by (,). For the same reason we als needs instances for Nil or (). Just added. Another thing is that we probably don't want to use Atom/(,)/() structures inside Sym anyway. Sym is for composing functional descriptions, which is a different world than the struct argument passing that's done in Loop. Otoh, a Sym could contain Loop code.. how to work with that? There seem to be some things not so clear as to how these things should compose. In any case, there should be enough support for it now, as long as functions operate on Atom-wrapped atomic types. I don't know what to do with the Sys phantom type though. Currently it's empty (it still had the old CompileVar stuff) so maybe move towards that? Anyways.. This seems to work, but is awkward: compileLoop update init = _letrec (\loop -> _lambda (\s -> do i <- undefined -- FIXME: read input (s', o) <- update (s, i) -- FIXME: write output -- FIXME: terminate condition _app loop s)) (\loop -> _app loop init) Awkward because we can't get around the fact that composite state, even if it's built with (,) needs to be terminated in Atom nodes. So now, how to type the phantom type? data Sys p m i o = forall s. () -- FIXME: Add interface to phantom type => Sys ((s, i) -> m (s, o)) s Something seems deeply burried. I don't know what to express, and the error message for compiling this doesn't help: compileSys (Sys update init) = compileLoop update init /home/tom/meta/dspm/0test_TML_Sys.hs:48:32: Ambiguous type variable `as0' in the constraint: (Struct s1 r as0 s2) arising from a use of `compileLoop' Probable fix: add a type signature that fixes these type variable(s) In the expression: compileLoop update init In an equation for `compileSys': compileSys (Sys update init) = compileLoop update init Failed, modules loaded: TML, StateCont, Term, Code, Loop, Struct, Effect, Sys. Entry: Building Atom into TML? Date: Tue Dec 6 21:37:41 EST 2011 This means some notational overhead, or it means we need to propagate the Atom wrapper all the way down to TML. Is that actually possible? 
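Going back to the "Missing _letrec for Value" entry above: for a direct-evaluation representation, the monadic letrec is essentially a monadic fixpoint. A sketch, assuming the underlying monad has a MonadFix instance and the definition is lazy enough in its own reference (letrecValue is a name picked here, not something in the code base):

   import Control.Monad.Fix (MonadFix, mfix)

   -- Tie the recursive knot on the defined value, then run the body with it.
   letrecValue :: MonadFix m => (f -> m f) -> (f -> m a) -> m a
   letrecValue open body = mfix open >>= body

For IO or ST this is just the standard mfix; whether it terminates depends on open not forcing its argument before returning it.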
Entry: Next: fix Sys.hs Date: Tue Dec 6 21:59:56 EST 2011 It seems that with a little work this could be quite nice to have. As long as this existential type gets filled in properly data Sys p m i o = forall s. () -- FIXME: Add interface to existential type => Sys ((s, i) -> m (s, o)) s it should be little more than generalization and stub fill in of this: compileLoop update init = _letrec (\loop -> _lambda (\s -> do i <- undefined -- FIXME: read input (s', o) <- update (s, i) -- FIXME: write output -- FIXME: terminate condition _app loop s)) (\loop -> _app loop init) I was intuiting about this and I wonder why such a tree can't just be exposed as a thing over which you can map. The current Struct interface seems unnecessarily clumsy, especially with that r as ras stuff. It seems that translation between representation of structuring and structuring of representations is only needed for _app and _lambda. When we stay in the domain of structuring of representations (which I think is where Sys actually lives), these operations are not necessary. So sys doesn't need to support pack/unpack, just structSize, structVariables, structCompile. Maybe it's best to separate those out anyway? Entry: Just structuring of representation (<-> representation of structuring) Date: Wed Dec 7 00:55:14 EST 2011 I separated operations on structuring of representation (ras) and those that work both on ras and representation of structuring (r as). The latter are only used in Loop (packing/unpacking in _app and _lambda), and it seems that Sys only need to know StructComp. class StructComp stx ras | ras -> stx where -- Support for compilation: nb of atomic elements, typed variable -- creation, and compilation to flat syntax type. To select the -- right type, pass (undefined :: ) for the first arg. structSize :: ras -> Int structVariables :: ras -> [VarName] -> (ras, [VarName]) structCompile :: ras -> [stx] class StructComp stx ras => Struct stx r as ras | ras -> r as, r as -> ras, r -> stx where -- Commute representation and data structuring, -- i.e. (r a, (r b, r c)) <-> r (a,(b,c)) structPack :: ras -> r as structUnpack :: r as -> ras Entry: Sys.hs existentials Date: Wed Dec 7 01:34:52 EST 2011 Now moving on to trying to make this compilation step work, I'm stuck at fixing the types. This is where I got, which doesn't work. I suppose the culprit is Struct stx r su which sort of "opens up" the state.. Maybe existentials should be next on the list. Or.. I can give up on the Arrow abstraction and just use an alternative that doesn't need the data hiding. I don't know.. It seems a bit irrelevant. The goals is to be able to use the standard arrow interfaces so the arrow notation can be used, but is this really so important? compileLoop :: forall s i o stx m r su. (StructComp stx s, StructComp stx i, StructComp stx o, Struct stx r su s, Loop stx m r) => -- ((s, i) -> m (s, o)) -> s Sys stx m i o -> m (r Tint) compileLoop (Sys update init) = _letrec open body where open loop = (_lambda (\s -> do (s' :: s, o :: o) <- (update (s :: s, undefined :: i)) :: m (s, o) -- FIXME: write output -- FIXME: terminate condition _app loop s')) :: m (r (su -> m Tint)) body loop = _app loop (init :: s) Anyway, because stx already appears as a Sys parameter, it might be good to just dump the whole compiler in there, one that generates an init and update function. Ok, for now I've exposed sysSer, sysPar and sysPure as non-hiding primitive operations that represent systems as (update,init) pairs. 
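To make that last point concrete, a rough sketch of what such non-hiding combinators could look like, with a system as a plain (update, init) pair and serial composition tupling the two states. The SysLL synonym and exact argument shapes are hypothetical; the real Sys.hs definitions may differ:

   -- A system as an (update, init) pair, no existential hiding.
   type SysLL m s i o = ((s, i) -> m (s, o), s)

   -- Stateless lifting of a pure function.
   sysPure :: Monad m => (i -> o) -> SysLL m () i o
   sysPure f = (\((), i) -> return ((), f i), ())

   -- Serial composition: feed the first stage's output into the second,
   -- and pair up the two state components.
   sysSer :: Monad m => SysLL m s1 i x -> SysLL m s2 x o -> SysLL m (s1, s2) i o
   sysSer (u1, s1) (u2, s2) = (u, (s1, s2))
     where u ((a, b), i) = do (a', x) <- u1 (a, i)
                              (b', o) <- u2 (b, x)
                              return ((a', b'), o)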
TODO: It would be a good exercise to punch through the ignorance and really understand why this doesn't work instead of avoiding confrontation once again. Entry: So is it ready? Date: Thu Dec 8 22:55:57 EST 2011 Apart from some cosmetics (see Sys existentials in previous posts) it seems that most of it is working. I managed to generate some C code representing a composition of 3 integrators: type AC t = Atom (Code t) f :: (AC Tfloat, AC Tfloat) -> MCode (AC Tfloat, AC Tfloat) f (Atom s, Atom i) = do s' <- add s i return (Atom s', Atom s') fi = Atom $ lit 0 t5 = term $ compileSysLL fi $ (f,fi) <-- (f,fi) <-- (f,fi) Which gives void fun() { { float fun0_0; float fun0_1; float fun0_2; { fun0_0 = 0.0; fun0_1 = 0.0; fun0_2 = 0.0; goto fun0; } fun0: { const float a1 = fun0_0; const float a2 = fun0_1; const float a3 = fun0_2; const float r4 = add(a3, 0.0); const float r5 = add(a2, r4); const float r6 = add(a1, r5); fun0_0 = r6; fun0_1 = r5; fun0_2 = r4; goto fun0; } } } The next step is to fill in arrays. I updated compileLoop to include arrays. This required a bit of type annotation due to 2 occurances of lit. Is there a way around that? compileLoop :: (Struct stx r su s, StructRepr r, StructPrim stx r Int, Array m r arri i, Array m r arro o, Loop stx m r, TMLprim o) => r (arri i) -> r (arro o) -> r Tint -> ((s, Atom (r i)) -> m (s, Atom (r o))) -> s -> m (r Tint) compileLoop arri arro arrn update init = _letrec open body where open loop = _lambda (\(s, Atom n) -> do i <- _get arri n (s', Atom o) <- update (s, Atom i) _set arro n o more <- lt n arrn _if more (do n' <- add n (lit 1) _app loop (s', Atom n')) _exit) body loop = _app loop (init, Atom $ lit 0) varArr :: String -> Code (ACode Tfloat) varArr name = Code $ Ref $ Var (Type AFloat [()]) name t7 = term $ compileLoop (varArr "in") (varArr "out") (lit 64) f (Atom $ lit 0) Gives the following result, which is ready except for the outer function body and serialization of loop state. *Main> c t7 void fun() { { float fun0_0; int fun0_1; { fun0_0 = 0.0; fun0_1 = 0; goto fun0; } fun0: { const float a1 = fun0_0; const int a2 = fun0_1; const float r3 = *add(in, a2); const float r4 = add(a1, r3); *add(out, a2) = r4; const int r5 = lt(a2, 64); const int r6 = add(a2, 1); if (r5) { fun0_0 = r4; fun0_1 = r6; goto fun0; } else { return 0; } } } } Wait, there's a bug still in the code gen monad. If should delimit context inside the blocks. The 'mBlock' was missing. _if (Code c) mx my = do (Code x) <- mBlock mx (Code y) <- mBlock my return $ Code $ If c x y Which gives the following: *Main> c t7 void fun() { { float fun0_0; int fun0_1; { fun0_0 = 0.0; fun0_1 = 0; goto fun0; } fun0: { const float a1 = fun0_0; const int a2 = fun0_1; const float r3 = *add(in, a2); const float r4 = add(a1, r3); *add(out, a2) = r4; const int r5 = lt(a2, 64); if (r5) { const int r6 = add(a2, 1); fun0_0 = r4; fun0_1 = r6; goto fun0; } else { return 0; } } } } Entry: Toplevel definitions Date: Fri Dec 9 15:27:48 EST 2011 I've added (Topdef Var Term) to Term. This can represent toplevel definitions such as variable and function definitions. To make this work, each function should be compiled to 2 parts: an update function and an init constant (or an init method). The following should compile the update function. It typechecks but gives problems when it's actually used. 
compileLoopTop update = _lambda funBody where funBody (state, (Atom arri, (Atom arro, Atom arrn))) = _letrec loopExpr initExpr where initExpr loop = _app loop (state, Atom $ lit 0) loopExpr loop = _lambda loopBody where loopBody (s, Atom n) = do i <- _get arri n (s', Atom o) <- update (s, Atom i) _set arro n o more <- lt n arrn _if more (do n' <- add n (lit 1) _app loop (s', Atom n')) _exit The arrays seem to be the problem: *Main> :t compileLoopTop f compileLoopTop f :: (Array (StateCont.SC Code.CompState Term) Code a Tfloat, Array (StateCont.SC Code.CompState Term) Code a1 Tfloat, TMLprim (a Tfloat), TMLprim (a1 Tfloat)) => StateCont.SC Code.CompState Term (Code ((Atom Tfloat, (Atom (a Tfloat), (Atom (a1 Tfloat), Atom Tint))) -> StateCont.SC Code.CompState Term Tint)) Two things: the array type should probably have a fundep. It's fixed by the representation. Then the TMLprim (a Tfloat) is probably not right. Inded, after fixing the r -> a fundep, I get: *Main> :t compileLoopTop f :1:1: No instance for (TMLprim (ACode Tfloat)) arising from a use of `compileLoopTop' Possible fix: add an instance declaration for (TMLprim (ACode Tfloat)) In the expression: compileLoopTop f Actually, an array is a TMLprim: it's a machine word (pointer). TMLprim is a bit of an awkward name. It means "non-divisable". Primitives can be stored in Atom leaf nodes of a Struct binary tree. Trouble here is that I want a primType that's not just a TypeName but also a pointer order. So it might be better to rearrange the classes a bit to express this in a different way. Where is TMLprim used as a constraint? _set :: TMLprim t => r (a t) -> r Tint -> r t -> m (r ()) There's another one somewhere, I can't find it right away. The workaround below doesn't work because _get will try to "pop" the pointer order, which results in a pattern error. instance TMLprim t => TMLprim (ACode t) where primType _ = primType (undefined :: t) EDIT: I removed the TMLprim t constrained here and replaced it with TypeOf (Code t). So conceptually, TypeOf represents machine words while TMLprim represent primtive values, not pointers. TypeOf is not defined for StructPrim. instance TypeOf (Code t) => StructPrim Term Code t where primCompile (Code t) = t primVariable = codeVar $ typeOf (undefined :: Code t) where codeVar t n = Code $ Ref $ Var t n Hmm it's not that easy. The problem is that it's not possible at this point to recursively define primitives (higher order pointers) because Array and Struct dosn't allow this. I.e. the 'r' and 't' are naked. So the classes aren't sitting well. Maybe this is really a symptom of a conceptual error? Needs some thought. Entry: I'm done with these stupid type indexing tricks Date: Mon Dec 12 20:23:52 EST 2011 {- Primitive data types + Pointers -} class TMLword t tag where _wordType :: t -> (TypeName, Int, tag) wordType t = (name, order) where (name, order, _) = _wordType t instance TMLprim t => TMLword t () where _wordType _ = (primType (undefined :: t), 0, ()) instance TMLword t tag => TMLword t ((), tag) where _wordType _ = (name, order + 1, ((), tag)) where (name, order, tag) = _wordType (undefined :: t) How to just juse integers for these types? Entry: Deleted patch Date: Mon Dec 12 20:56:18 EST 2011 Too much trouble getting it to work. See patch belo. I need to figure out how to do this properly. What are pointers, what are arrays, what are primitives that support arithmetic & logic etc.. The type structure really isn't trivial! 
I guess the problem is really to make sure that mVar works for arrays also, but currently Arrays are implemented differently, and typed as Int -> ... -- Reify Haskell type information as type tag. mVar :: TypeOf (Code t) => (Code t) -> MCode (Code t) mVar term = mVarTyped (typeOf term) term tom@zoo:~/meta/dspm$ darcs wh hunk ./dspm/Code.hs 50 -instance TMLprim t => TypeOf (Code t) where - typeOf _ = Type (primType (undefined :: t)) [] - +instance TMLword t () => TypeOf (Code t) where + typeOf _ = Type name order where + (name, order, ()) = wordType (undefined :: t) hunk ./dspm/Code.hs 55 - typeOf _ = Type p (t:ts) where - Type p ts = typeOf (undefined :: Code t) - t = () + typeOf _ = Type p (1 + o) where + Type p o = typeOf (undefined :: Code t) hunk ./dspm/Code.hs 85 -mVar :: TypeOf (Code t) => (Code t) -> MCode (Code t) -mVar term = mVarTyped (typeOf term) term +-- mVar :: TypeOf (Code t) => (Code t) -> MCode (Code t) +-- mVar term = mVarTyped (typeOf term) term hunk ./dspm/Code.hs 88 +mVar :: TMLword t o => (Code t) -> MCode (Code t) +mVar term = mVarTyped (Type name order) term where + (name, order, o) = wordType (undefined :: t) + hunk ./dspm/Code.hs 116 -cl t = Code . (Lit (Type t [])) . show +cl t = Code . (Lit (Type t 0)) . show + +cop2 :: (TMLword i (), TMLword o ()) => Type -> TypedOp2 r i o hunk ./dspm/Code.hs 122 -icop2 = cop2 (Type AInt []) :: TypedOp2 r Tint Tint -fcop2 = cop2 (Type AFloat []) :: TypedOp2 r Tfloat Tfloat +icop2 = cop2 (Type AInt 0) :: TypedOp2 r Tint Tint +fcop2 = cop2 (Type AFloat 0) :: TypedOp2 r Tfloat Tfloat hunk ./dspm/Code.hs 125 -ibcop2 = cop2 (Type ABool []) :: TypedOp2 r Tint Tbool -fbcop2 = cop2 (Type ABool []) :: TypedOp2 r Tfloat Tbool +ibcop2 = cop2 (Type ABool 0) :: TypedOp2 r Tint Tbool +fbcop2 = cop2 (Type ABool 0) :: TypedOp2 r Tfloat Tbool hunk ./dspm/Code.hs 129 -cconv typ nam (Code t) = mVar (Code $ Op (Type typ []) nam [t]) +cconv :: (TMLword i (), TMLword o ()) => [_$_] + TypeName -> String -> (Code i) -> MCode (Code o) + [_$_] +cconv typ nam (Code t) = mVar (Code $ Op (Type typ 0) nam [t]) hunk ./dspm/Code.hs 187 -instance TMLprim t => StructPrim Term Code t where +instance TypeOf (Code t) => StructPrim Term Code t where hunk ./dspm/Code.hs 222 - let ref = Var (Type AVoid []) fName -- Create a symbolic reference Term. + let ref = Var (Type AVoid 0) fName -- Create a symbolic reference Term. hunk ./dspm/Code.hs 281 - _get (Code (Ref a@(Var (Type base (dim:dims)) _))) (Code i) = [_$_] - mVarTyped (Type base dims) (Code $ Get a i) + _get (Code (Ref a@(Var (Type base order) _))) (Code i) = [_$_] + mVarTyped (Type base (order-1)) (Code $ Get a i) hunk ./dspm/Code.hs 288 --- FIXME: Actually, an array is a TMLprim: it's a machine word --- (pointer). TMLprim is a bit of an awkward name. It means --- "non-divisable". Primitives can be stored in Atom leaf nodes of a --- Struct binary tree. This can use a better name. Also, make this --- more generic for all array representations. + + +-- TypeOf is a bad name for a class of values that represent machine +-- words, i.e. primitive types like integers and floats, and pointers +-- to machine words. Note that TMLprim only represents primitive +-- types. + [_$_] +-- FIXME: this is not recursive. Why isn't this ACode (Code t) ? 
hunk ./dspm/Code.hs 297 -instance TMLprim t => TMLprim (ACode t) where - primType _ = primType (undefined :: t) +instance TypeOf (Code t) => TypeOf (ACode t) where + typeOf _ = Type prim (1 + order) where + Type prim order = typeOf (undefined :: (Code t)) hunk ./dspm/Effect.hs 24 - - hunk ./dspm/Effect.hs 57 + + + hunk ./dspm/Loop.hs 37 - _var :: TMLprim t => - r t -> m (r t) + _var :: TMLprim t => r t -> m (r t) hunk ./dspm/PrettyC.hs 51 - (map (\_ -> CPtrDeclr [] ()) order) + (map (\_ -> CPtrDeclr [] ()) [1..order]) hunk ./dspm/PrettyC.hs 212 - add a b = (Op (Type AInt []) "add" [a,b]) [_$_] + add a b = (Op (Type AInt 0) "add" [a,b]) [_$_] hunk ./dspm/TML.hs 19 - Tint, Tfloat, Tbool, Tvoid, TMLprim(..), + Tint, Tfloat, Tbool, Tvoid, TMLprim(..), TMLword(..), hunk ./dspm/TML.hs 41 +{- Primitive data types + Pointers -} +class TMLword t tag where [_$_] + wordType :: t -> (TypeName, Int, tag) + [_$_] +type Suc x = ((),x) +suc x = ((),x) + +instance TMLprim t => TMLword t () where + wordType _ = (primType (undefined :: t), 0, ()) + +instance TMLword t tag => TMLword t (Suc tag) where [_$_] + wordType _ = (name, order + 1, suc tag) where + (name, order, tag) = wordType (undefined :: t) + + + [_$_] + + hunk ./dspm/Term.hs 75 -data Type = Type TypeName [TypeDims] +data Type = Type TypeName Int Entry: Indexing Date: Mon Dec 12 21:11:36 EST 2011 See, I'm trying to make two types, one indexed with ((), x) and one with Int -> x. Why not keep it at one and build an Array instance for (Code (Int -> x))? class TMLprim t where primType :: t -> TypeName primOrder :: t -> Int primOrder _ = 0 instance TMLprim Tbool where primType _ = ABool instance TMLprim Tint where primType _ = AInt instance TMLprim Tfloat where primType _ = AFloat instance TMLprim Tvoid where primType _ = AVoid instance TMLprim t => TMLprim (Tint -> t) where primType _ = primType (undefined :: t) primOrder _ = 1 + (primOrder (undefined :: t)) This gives a straightforward Array instance instance Array MCode' Code ((->) Tint) t where _get (Code (Ref a@(Var (Type base (dim:dims)) _))) (Code i) = mVarTyped (Type base dims) (Code $ Get a i) _set (Code (Ref a)) (Code i) (Code e) = mVoid $ Code $ Set a i e Then the TypeOf becomes straightforward: instance TMLprim t => TypeOf (Code t) where typeOf _ = Type (primType (undefined :: t)) dims where dims = map (\_ -> ()) [1..order] order = primOrder (undefined :: t) Entry: Compiling toplevel functions Date: Mon Dec 12 21:45:11 EST 2011 Next error: *Main> c $ term $ compileLoopTop f *** Exception: statement: Lambda [Var (Type f []) "a0",Var (Type f [()]) "a1",Var (Type f [()]) "a2",Var (Type i []) "a3"] (LetRec [(Var (Type v []) "fun4",Lambda [Var (Type f []) "a5",Var (Type i []) "a6"] (Let (Var (Type f []) "r7") (Get (Var (Type f [()]) "a1") (Ref (Var (Type i []) "a6"))) (Let (Var (Type f []) "r8") (Op (Type f []) "add" [Ref (Var (Type f []) "a5"),Ref (Var (Type f []) "r7")]) (Begin (Set (Var (Type f [()]) "a2") (Ref (Var (Type i []) "a6")) (Ref (Var (Type f []) "r8"))) (Let (Var (Type *** Exception: Term.hs:(33,3)-(35,19): Non-exhaustive patterns in function show There's some missing code to infer the return type of the function, but basically here's what's working: t9 = code $ termTopdef $ Topdef (Var (Type AInt []) "fun") t8 The array pointers are properly inferred. 
*Main> t9 int fun(float a0, float * a1, float * a2, int a3) { { float fun4_0; int fun4_1; { fun4_0 = a0; fun4_1 = 0; goto fun4; } fun4: { const float a5 = fun4_0; const int a6 = fun4_1; const float r7 = *add(a1, a6); const float r8 = add(a5, r7); *add(a2, a6) = r8; const int r9 = lt(a6, a3); if (r9) { const int r10 = add(a6, 1); fun4_0 = r8; fun4_1 = r10; goto fun4; } else { return 0; } } } } Next: - the state load/store - return type infer Entry: Toplevel definitions Date: Tue Dec 13 10:54:57 EST 2011 Should this be part of Loop class? What should be the return type? Nothing probably, it's a side effect. class TML m r => Loop s m r | r -> m, r -> s where ... -- Toplevel definition. _def :: String -> t -> m (r ()) So what about the type? What I need mostly is to map t to a primitive type, so it needs a TypeOf instance. What about this: class TML m r => Loop s m r | r -> m, r -> s where ... _def :: TypeOf (r t) => String -> m (r t) -> m (r ()) instance Loop Term MCode' Code where ... _def name mterm = do ct@(Code term) <- mterm return $ Code $ Topdef (Var (typeOf ct) name) term I didn't try yet, but one thing that's missing is obviously the instance of TypeOf for functions, and a way to collect toplevel definitions. This should manipulate the continutation, inserting the definitions in a Begin form. This can reuse mVoid: _def name mterm = do ct@(Code term) <- mterm mVoid $ Code $ Topdef (Var (typeOf ct) name) term Ok, that's it. After adding a workaround in PrimC for the Begin wrapper introduced by mVoid, this is what comes out: ctop t = code $ termTopdef t t10 = _def "fun" (compileLoopTop f) t11 = ctop $ term $ t10 *Main> t11 int fun(float a0, float * a1, float * a2, int a3) { { float fun4_0; int fun4_1; { fun4_0 = a0; fun4_1 = 0; goto fun4; } fun4: { const float a5 = fun4_0; const int a6 = fun4_1; const float r7 = *add(a1, a6); const float r8 = add(a5, r7); *add(a2, a6) = r8; const int r9 = lt(a6, a3); if (r9) { const int r10 = add(a6, 1); fun4_0 = r8; fun4_1 = r10; goto fun4; } else { return 0; } } } } Entry: C structs Date: Tue Dec 13 17:05:02 EST 2011 For load/store of loop state it seems C structures are necessary. This requires some support. Doing it just for arrays would probably be 90% of the work, since this needs some compilation support from the Struct class. Reiterate: Struct is mainly there so that we can: - compose state objects - keep everything "flat" in the implementation, i.e. local variables for function args It has no equivalent in Term. What needs to be done? Essentially this is just dereferencing and assignment using "." or "->" + structure type definition. Given a reference, construct a Struct object using those. Maybe extending structVariables with a nameSpace argument? Entry: C Struct's "." and "->" dereference. Date: Thu Dec 15 08:40:33 EST 2011 Is it necessary to express such a thing, or is it probably better to represent a struct as an array with a named index? Probably not because the distinguishing property of a struct is that it's a heterogenous product type. There seems to be a structural gap between what I want in a vague intuitive sense, and what I have in a concrete structural sense. Current structure: function application and abstraction. These are implemented by abstraction and assignment of structure types, so maybe they should be implemented as such? 
Transform a function operating on a variable (an in-place operation) into something that is abstracted as a pure function that "dereferences" the value, and a continuation function that "assigns" the value. That's nice and dandy, but doesn't eliminate the fact that at some point real assignment needs to be implemented somewhere. So let's just extend Get Set in Term to have composite types: [Type] instead of Type, and see where this gets us. No, that's the wrong track. It's important to distinguish: - primitive operations accept and produce primitive data (Let) - composite operations accept composite data (Lambda / App). they do not produce anything as they are in CPS. (Eexcept for invocation of Ret which does produce primitive data but this is a special case.) Another option is to pass the structs by value. However this doesn't solve anything since it just replaces "->" dereference with "." but still needs explicit support for dereference and assignment/definition. The short of it seems to be that there is no way around implementing proper structure support. The question is how to encode it both in Term and as proper (phantom) type signatures in Effect. Entry: Typed CPP Date: Sun Dec 18 09:13:03 EST 2011 [1] entry://../c/20111218-090742 Entry: C struct Date: Sun Dec 18 10:03:33 EST 2011 Thinking about this doesn't seem to give any solution. There's a missing realization; I'm glossing over something important. ( I guess this is because this is that a structure definition is the reification as a Term of Haskell type info, for which I have no direct support. ) Let's just start to build it and see if it pops up. First stop: convert a Term Struct to C struct in PrettyC.hs First thing that pops up: a the reason why Struct needs to be a Term constructor is that it has a name. Can it be nameless? I.e can we just extract the type from a nested variable declaration? Making Struct part of Term feels wrong, because it is not a value, just a type. Ok, so what is a structure? Optionally it has a name, but it's just an ordered list of variables, each with its own type. (Maybe String, [Var]) Where to start? The AST of some C code of course. a2 = parse "struct foo {int a; float b;};" CTranslUnit [ CDeclExt ( CDecl [ CTypeSpec ( CSUType ( CStruct CStructTag ( Just "foo" ) ( Just [ CDecl [ CTypeSpec ( CIntType () ) ] [ ( Just ( CDeclr ( Just "a" ) [] Nothing [] () ) , Nothing , Nothing ) ] () , CDecl [ CTypeSpec ( CFloatType () ) ] [ ( Just ( CDeclr ( Just "b" ) [] Nothing [] () ) , Nothing , Nothing ) ] () ] ) [] () ) () ) ] [] () ) ] () It's probably best to start at CTypeSpec. Filling it in top-down, replacing undefined by actual members I get to this, which reuses cVarDecl which maps Term Var to AST elements: cStructDecl name vars = CTypeSpec (CSUType (CStruct CStructTag (Just $ ident name) (Just -- [CDeclaration a] (map (\var -> cVarDecl Nothing var) vars)) [] -- CAttribute a ()) ()) The test (with code printing AST as C code) gives: tv vn = Var (Type AFloat []) vn tvs ns = map (\n -> tv ("mem" ++ show n)) ns t8 = cStructDecl "foo" (tvs [1..10]) *PrettyC> code t8 struct foo { float mem1; float mem2; float mem3; float mem4; float mem5; float mem6; float mem7; float mem8; float mem9; float mem10; } *PrettyC> So, next is integration. First I'd like to see if it's possible to use unnamed structs. That would separate dealing with the type and dealing with a shortcut name for it. What I want to do is something like this: float foo123(struct {float a, float b} arg1) { return arg1.a; } That doesn't work. 
So let's split it in two parts: 1. Find a canonical name for each struct. This is equivalent to using tuples (type products) vs. tagged tuples (abstract data types). 2. Somehow insert the tagged tuples into the C code generator. This probably means it will need to become monadic. Making the generator monadic is probably a pain, though there seems to be no way around it. Got this going. cTupleDecl vars = cStructDecl canonicalName vars where canonicalName = "tuple" ++ (concat $ map shortTag vars) shortTag (Var (Type typeName order) _) = "_" ++ bt typeName ++ ot order where -- ot order = concat $ map (\_ -> "x") order ot order = show $ length order bt AFloat = "f" bt AInt = "i" bt ABool = "b" t10 = cTupleDecl [ Var (Type ABool []) "mem0", Var (Type AInt [noTypeDims]) "mem1", Var (Type AFloat [noTypeDims, noTypeDims]) "mem2" ] *PrettyC> code t10 struct tuple_b0_i1_f2 { int mem0; int * mem1; float * * mem2; } Entry: PrettC : monadic generator necessary? Date: Sun Dec 18 12:08:11 EST 2011 Bummer.. It looks like the PrettyC Term -> AST conversion can't be done with a local transformation due to the fact that local struct definitions (I.e. inside function arguments) are not possible in C. Wait, the following might work to at least simulate that, and side-step the non-local AST building necessary for global struct declarations: int foo_op(void *arg1_v) { struct tuple_b0_i1_f2 { int mem0; int * mem1; float * * mem2; } *arg1 = arg1_v; return arg1->mem0; } This keeps the type definition local to the function. Initialization can then be hidden behind functions also: void foo_init(void *varg1) { struct tuple_b0_i1_f2 { int mem0; int * mem1; float * * mem2; } *arg1; arg1->mem0 = 0; arg1->mem1 = 0; arg1->mem2 = 0; } int foo_size(void) { struct tuple_b0_i1_f2 { int mem0; int * mem1; float * * mem2; } *arg1; return sizeof(*arg1); } Yep, compiles fine. Let's go for that. What about Term DestructureLet? Yeah, but we need assignment too. What about extending Get / Set to operate on structure types also? Maybe it already just works? There seems to be no restriction on the _get / _set of the Array class? class TML m r => Array m r a t | r -> a where _get :: r (a t) -> r Tint -> m (r t) _set :: r (a t) -> r Tint -> r t -> m (r ()) For now, let's just add a separate _getStruct _setStruct, and require Struct constraints like _app and _lambda. Entry: Structure get/set and _lambda / _app Date: Sun Dec 18 12:45:19 EST 2011 Maybe it really is simpler to do this reusing _lambda and _app. There is a very nice type correspondence between: _lambda <-> _structGet (binding in) _app <-> _structSet (binding out) Both have quite heavy Code implementations already. Can they just be placed on the same plan so this code is reused? Maybe the thing is to extend the current way of passing arguments into using structs directly? Wait.. What can be done is to cheat and make flat C struct objects into primitive types. In fact, it really doesn't matter to the Term/Code system as long as assignment works for both statement and declaration. Yep it's allready just there, what is needed is support as TMLprim and a pack/unpack (de)structuring operation part of Loop, probably simplest to do this in one go. It looks like what is necessary is to 1. define Array instances of Struct: (,) and Atom 2. Allow struct elements to reside in variables. The first one seems simple if TMLword (was TMLprim) is extended to struct types, which doesn't seem like a good idea. Stuck... Let's make some examples. 
Haskell and C type correspondence using the canonical tuple naming from before:

TMLprim:

  Tfloat                              // float
  Tint -> Tfloat                      // float *

Struct:

  Atom Tfloat                         // struct tuple_f0
  (Atom Tfloat, Atom Tfloat)          // struct tuple_f0_f0
  Tint -> (Atom Tfloat, Atom Tfloat)  // struct tuple_f0_f0 *

What about making struct ref/deref fit in array access (ignoring representation r)?

  (Tint -> (Atom Tfloat, Atom Tfloat))  -- pointer to struct
    -> Tint                             -- member index
    -> Tfloat                           -- member
    -> m ()

  (Tint -> (Atom Tfloat, Atom Tfloat))  -- pointer to struct
    -> Tint                             -- member index
    -> m Tfloat                         -- member

Almost but not quite.. It really should be done all at once:

  (Tint -> r as)  -- representation of (array of struct)
    -> Tint       -- index
    -> r_as       -- struct of representation
    -> m ()

  (Tint -> r as)  -- representation of (array of struct)
    -> Tint       -- index
    -> m r_as     -- struct of representation

So maybe it really can just be an instance? Let's try.

Nope, that will overlap in a nasty way with the scalar instance. Make the only array instance one of Struct, and wrap all scalar arrays in Atom? Pfff.. This doesn't work because pointers (to pointers ...) to structs are words, while the structs themselves are not words (currently). Meaning: the base type is different from the pointer type.

Really, this needs a TMLword instance for a struct, but that will probably mess up many other things. It's inconsistent.. unless the Struct used for grouping machine words and a struct that is itself a machine word are treated as different things. They are, really... But how to represent them? And how to bridge them (their recursion) and the Struct recursion?

Something really doesn't fit.. Ha, I'm confusing types and data again. So, let's try: make a primitive type (C struct) that resembles at least the structure of the Struct recursive class instances (binary tree). Here we go:

{- Structs are also words.  Is this bad?  The main reason this is here
   is to support C structs and maybe later unions.  These can behave
   as a real atomic type (a single variable), but support
   destructuring. -}
instance (TMLword t1, TMLword t2) => TMLword (t1,t2) where
  primType _ = APair (primType (undefined :: t1)) (primType (undefined :: t2))

data TypeName = AFloat | AInt | ABool | AVoid  -- atomic
              | APair TypeName TypeName        -- composite
              deriving (Eq)

Next: pack/unpack. Can these be made the same as Struct? Should these be implemented as structs? I.e. pack/unpack an "atomic" struct into a representation struct. This then uses the type

  unpack :: r as -> r_as
  pack   :: r_as -> r as

Note that in the Code interpretation of Loop/Effect, r as is never visible. This is currently only used in Value, where there is a struct representation.

Entry: Struct / Tuple
Date: Thu Dec 22 13:54:56 EST 2011

- Abstract the variable generation that's used right now for _lambda and _app and map it to a "default flattening destructuring bind" ref/unref.

- For PrettyC: map this flat tuple ref/unref to canonically named C structs.

Abstracting the var gen I get this:

mStruct ras = do
  nVars         <- return $ structSize (undefined :: ras)
  varNames      <- nameList "a" nVars
  (argCode, []) <- return $ structVariables (undefined :: ras) varNames
  argVars       <- return $ map unRef (structCompile argCode)
  return $ (argVars, argCode)

The "xxx <- return $ yyy" is to avoid nested do/let constructs. It's only the nameList operation that uses the monad state. Actually this can probably use return type dispatching.

Entry: Flat structures
Date: Sat Dec 24 08:59:57 EST 2011

So.. Should structures always be flattened?
Maybe there's something to say for Term keeping the binary tree structure. Flattening does seem a bit arbitrary. What's the real point? Eventually, compiling to C, the data structure *does* need to be flattened. Whether this is done in one of the two following ways doesn't matter much, except for how get/set is implemented:

struct {
  float m0;
  float m1;
  float m2;
};

struct {
  float car;
  struct {
    float car;
    float cdr;
  } cdr;
};

The latter seems a bit more "honest" to me. Also, it might be good to add tuple naming to Term to avoid having to do this in the C generator (non-monadic PrettyC). Could be a way out, though it seems rather awkward.. Let's not. Also, for arguments in lambda and apply it seems simpler to just stick with lists.

Back to confusion. How to specify struct member get/set? Backtracking a bit, the issue of indexing can be completely sidestepped if pack/unpack is done in one operation, so let's try that first. The problem with that is that there is no multiple-variable binding form at this point. The only such form is Lambda. Maybe that's the first thing to add? I.e. this would support primitive multi-valued ops too.

So, looping further.. Maybe a separate multi-var LetStruct is really better. This separates structuring and value return, which then could be a structure.

There is an asymmetry if we're going with all-at-once unpacking: the Unpack needs to be a binding form because it introduces multiple binders. The Pack operation can be the same as ordinary application. This asymmetry comes from a many->one asymmetry that's already present because terms are tree-like (expressions), not DAG-like. This is an essential structural element. Trees are really easier to handle in anything that has to do with notation. DAGs can be built on top of trees by introducing binders = opening up the graph structure.

Anyways, this is what _pack -> Pack looks like. I have no idea how to recover that type though.. After the flattening introduced by structCompile it's really gone. However, from the list of varrefs a type could be reconstructed at ->C compile time.

_pack ras = return $ packed where
  packed = Code $ Pack typ refs
  refs   = structCompile ras
  typ    = undefined -- FIXME: ???

Let's just kick that type out for now:

_pack ras = return $ packed where
  packed = Code $ Pack refs
  refs   = structCompile ras

t25 = term $ _pack (Atom $ lit (1 :: Int),
                    (Atom $ lit (2 :: Int),
                     Atom $ lit (3 :: Int)))

*Main> t25
Pack [Lit (Type int []) "1", Lit (Type int []) "2", Lit (Type int []) "3"]

For _unpack we have something that is very similar to _lambda, which is also a multi-variable binding form.

_unpack (Code as) body = do
  (structVars, structCode) <- mStruct "a"
  (Code body) <- mBlock $ body structCode
  return $ Code $ Unpack structVars as body

t26 = term $ do
  p <- _pack (Atom $ lit (1 :: Int),
              (Atom $ lit (2 :: Int),
               Atom $ lit (3 :: Int)))
  _unpack p (\(Atom a, (Atom b, Atom c)) -> _ret b)

Problem with the previous def of _pack is that it doesn't introduce a variable. Fixing that (replace 'return' with 'mVar'), a missing instance TMLprim (Atom t) arose.
After fixing that and a missing case of show Apair, we get: *Main> t25 Let (Var {varType = Type (int,(int,int)) [], varName = "t0"}) (Pack [Lit (Type int []) "1", Lit (Type int []) "2", Lit (Type int []) "3"]) (Ref (Var {varType = Type (int,(int,int)) [], varName = "t0"})) *Main> t26 Let (Var {varType = Type (int,(int,int)) [], varName = "t0"}) (Pack [Lit (Type int []) "1", Lit (Type int []) "2", Lit (Type int []) "3"]) (Unpack [Var {varType = Type int [], varName = "a1"}, Var {varType = Type int [], varName = "a2"}, Var {varType = Type int [], varName = "a3"}] (Ref (Var {varType = Type (int,(int,int)) [], varName = "t0"})) (Ret (Ref (Var {varType = Type int [], varName = "a2"})))) Entry: Why binary trees as primitive structure? Date: Sat Dec 24 10:18:37 EST 2011 Because it reflects the state composition of state space model Arrow composition, both parallel and serial. That makes perfect sense. However, I still see no real reason to flatten such datastructures in Pack and App. Though in Lambda and Unpack it seems to make sense: they are *really* unstructured: replaced by individual variable names. Let's take that as the reason: a list is just a convenient way to represent a set, and it has just the right amount of structure to also fix memory allocation in the lower layers. Entry: APair Date: Sat Dec 24 13:25:35 EST 2011 This is incorrect. data TypeName = AFloat | AInt | ABool | AVoid -- atomic | APair TypeName TypeName -- composite deriving (Eq) This cannot represent a pair of pointers, only pairs of primitives. It looks like the Typename / Type system needs to be adjusted a bit. Let's first factor it out. It seems that TMLword should just give Type, not TypeName. data TypeName = AFloat | AInt | ABool | AVoid -- atomic | AStruct [Type] -- composite deriving (Eq) data Type = Type TypeName TypeOrder deriving (Eq,Show) class TMLword t where primType :: t -> Type Have to stop for a bit. Looks like I broke many things, among them I introduced a "-> r" fundep that avoids mixing Code / Value interpretations in the same module. Entry: Binder for _unpack Date: Sun Dec 25 09:44:31 EST 2011 The use of a lambda body is probably overkill: _unpack :: (TMLword as, Struct s r as ras) => r as -> (ras -> m (r t)) -> m (r t) Can't this be done in a more straightforward way? _unpack :: (TMLword as, Struct s r as ras) => r as -> m ras Yes. This is the implementation: simply insert Unpack form once the type is destructured into a collection of variables. _unpack = mUnpack mUnpack (Code packed) = do (vars, unpacked) <- mStruct "a" SC $ (\state k -> Unpack vars packed (k state unpacked)) Hmm.. This can probably be generalized to other binding forms that use explicit lambdas.. Or not. The only remaining ones are actual functions, not representations of let-like forms. Looking good after adding Unpack in PrettyC (CInitList). Entry: Struct declarations Date: Sun Dec 25 11:33:44 EST 2011 Currently structs are not used inside functions to support any essential infrastructure: they only get data in/out of the function's context. Thus it seems OK to perform struct declarations in the argument list as it's done now. If memory allocation is solved by a sizeof() operation, the struct layout never needs to be part of a module's API. Entry: Multi-dim arrays Date: Tue Dec 27 08:40:17 EST 2011 Almost ready to integrate in Pd/SM. Next part is multi-arity input/output arrays. The simplest thing to do seems to be to use: struct { float *a0; int *a1; ... 
}; This seems to involve some type commutation (mapping) of dereference and structuring (struct of pointer <-> pointer of struct). Can this be expressed? This needs a type class with recursive instances to implement the recursion over types. ( Is there something like "fold" or "map" for recursive class instances? ) So in essence this doesn't seem hard, but it seems cumbersome and arbitrary to have to do this with a type class again.. Simply put: what I really want is just an array of structs, so I want to transform a struct of arrays into something that looks like an array of structs. Wait. Maybe this can be just another Array instance? So.. I'm looking for Array and Struct commutation. How to do this without introducing ambiguity? I.e. the following can't work because its head would be the same a proper array of a struct: instance (Array m r a t1, Array m r a t2) => Array m r a (t1, t2) It looks like this needs to be disambiguated by an explicit operation, otherwise inference will not know what to do and it will need to be type-annotated anyway.. This is the new (transposing) and old get: get' :: r (a t1, a t2) -> r Tint -> m (r (t1, t2)) get' :: r (Atom (a t)) -> r Tint -> m (r (Atom t)) get :: r (a (t1, t2)) -> r Tint -> m (r (t1, t2)) get :: r (a (Atom t)) -> r Tint -> m (r (Atom t)) = get :: r (a t) -> r Tint -> m (r t) Note that the last line accomodates both cases of get, but the two get' cases are structurally different. After some tinkering: class TML m r => SArray m r sa t where _get' :: r sa -> r Tint -> m (r t) _set' :: r sa -> r Tint -> r t -> m (r ()) instance (TMLword t, Array m r a t, StructRepr r, TML m r) => SArray m r (Atom (a t)) (Atom t) where _get' r_sa i = do (Atom a) <- return $ unatom r_sa v <- _get a i return $ atom $ Atom $ v _set' r_sa i v = do (Atom a) <- return $ unatom r_sa (Atom v) <- return $ unatom v _set a i v instance (SArray m r sa1 t1, SArray m r sa2 t2, StructRepr r, TML m r) => SArray m r (sa1, sa2) (t1, t2) where _get' r_sa i = do (a1, a2) <- return $ uncons r_sa v1 <- _get' a1 i v2 <- _get' a2 i return $ cons (v1, v2) _set' r_sa i v = do (a1, a2) <- return $ uncons r_sa (v1, v2) <- return $ uncons v _set' a1 i v1 _set' a2 i v2 Entry: Atom -> L for Leaf ? Date: Wed Dec 28 09:21:00 EST 2011 I'm getting tired of the visual clutter caused by the Atom constructor. Let's pick a one-letter constructor. What should it be? "L" Seems to be a good canditate. Structs are trees, and the end are leaf nodes. Entry: compilation without annotation Date: Wed Dec 28 10:06:58 EST 2011 I'm not sure why, but writing it down in separate toplevel functions like this makes it possible to infer the types: initExpr (L refs) loop = do s_packed <- _getp refs s <- _unpack s_packed _app loop (s, L $ lit 0) loopExpr (L refs, ((L arri, L arro), L arrn)) update loop = _lambda loopBody where loopBody (s, L n) = do i <- _get arri n (s', L o) <- update (s, L i) _set arro n o more <- lt n arrn _if more (do n' <- add n (lit 1) _app loop (s', L n')) (do s_packed <- _pack s' _setp refs s_packed _exit) compile update = _lambda funBody where funBody sio@(s, io) = _letrec (loopExpr sio update) (initExpr s) t14 = ctop $ term $ _def "fun" (compile f2) So, trying to move from _get to _get' I'm getting stuck somewhere. 
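As an aside, the recursion pattern that _get' is supposed to capture can be mocked up in plain Haskell, with lists standing in for target arrays and nested pairs for structs. This is only a sketch to keep the shape in mind (the Leaf / Gets names are made up, and there is no representation type or code-generation monad involved):

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances #-}

newtype Leaf t = Leaf [t]               -- struct leaf holding an "array"

class Gets sa s | sa -> s where
  gets :: sa -> Int -> s                -- read element i from every array

instance Gets (Leaf t) t where
  gets (Leaf xs) i = xs !! i            -- base case: index a single array

instance (Gets sa1 s1, Gets sa2 s2) => Gets (sa1, sa2) (s1, s2) where
  gets (a, b) i = (gets a i, gets b i)  -- inductive case: distribute the index

So gets (Leaf [1,2], Leaf [10,20]) 1 gives (2,20): a struct of arrays read as if it were an array of structs.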
Trying to do this first instead to make it simpler: (arrays of structs) i <- _get arri n (s', L o) <- update (s, L i) _set arro n o to i <- _get arri n (s', o) <- update (s, i) _set arro n o But then I get into trouble due to a missing leaf wrapper L: /home/tom/meta/dspm/0test_TML_Sys.hs:335:42: Couldn't match expected type `Code t0' with actual type `L (Code Tfloat)' Expected type: ((AC Tfloat, AC Tfloat), Code t0) -> StateCont.SC Code.CompState Term ((AC Tfloat, AC Tfloat), Code t1) Actual type: ((AC Tfloat, AC Tfloat), AC Tfloat) -> StateCont.SC Code.CompState Term ((AC Tfloat, AC Tfloat), AC Tfloat) In the first argument of `compile2', namely `f2' In the second argument of `_def', namely `(compile2 f2)' Failed, modules loaded: Type, TML, Sys, Term, Code, Loop, Effect, Struct, PrettyC, StateCont. Ok, I see. This needs unpack probably: loopExpr2 (L refs, ((L arri, L arro), L arrn)) update loop = _lambda loopBody where loopBody (s, L n) = do ip <- _get arri n i <- _unpack ip (s', o) <- update (s, i) op <- _pack o _set arro n op more <- lt n arrn _if more (do n' <- add n (lit 1) _app loop (s', L n')) (do s_packed <- _pack s' _setp refs s_packed _exit) initExpr2 (L refs) loop = do s_packed <- _getp refs s <- _unpack s_packed _app loop (s, L $ lit 0) compile2 update = _lambda funBody where funBody sio@(s, io) = _letrec (loopExpr2 sio update) (initExpr2 s) t15 = ctop $ term $ _def "fun" (compile2 f2) This generates proper code for I/O structs, but not proper C types: what should be "struct tuple_f0" is actually "float": int fun(struct tuple_f0_f0 { float m0; float m1; } * a0, float * a1, float * a2, int a3) { ... const float t8 = a1[a7]; const float a9 = t8.m0; ... const float t12 = { t11 }; a2[a7] = t12; ... } I found this in Code.hs which is probably wrong: instance TypeOf (Code o) => TypeOf (Code (i -> MCode o)) where typeOf _ = typeOf (undefined :: (Code o)) No that's not it, it's the return type of functions. This is probably it: instance TMLword t => TMLword (L t) where primType _ = primType (undefined :: t) The type of a struct with one member is not the same as its member. Looks like that was it. Here's the updated version: instance (TMLword t1, TMLword t2) => TMLword (t1,t2) where primType _ = Type (AStruct $ ts1 ++ ts2) 0 where -- Tuples need to be made up of other tuples, L or () nodes which -- are always AStruct types. Type (AStruct ts1) 0 = primType (undefined :: t1) Type (AStruct ts2) 0 = primType (undefined :: t2) instance TMLword t => TMLword (L t) where -- This instance terminates the recursion on (,) primType _ = Type (AStruct [primType (undefined :: t)]) 0 instance TMLword (L ()) where -- Empty fillers primType _ = Type (AStruct []) 0 Entry: Monadic anyway.. Date: Wed Dec 28 11:01:46 EST 2011 The problem is that, while it's possible to declare structs in argument lists, it's not possible to declare them twice. It might be best to just gather declarations in a monad side channel and spit them out later. Where to add this? Toplevel definitions? There seems to be no need to push the monad structure down to the statement/expression level. It might be as simple as a Writer monad. Generate in first pass, duplicates can be removed later. Or, a state monad that takes a list of toplevel expressions. Let's decide the form later and start witht he identity monad. So... everything seems to be concentrated in termTopDef, and a Monad doesn't seem to be necessary because juggling the side-channel of type definitions is something that can probably be managed. 
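Concretely, the non-monadic version of that side channel could be as simple as this sketch, with plain strings standing in for the real declaration and definition types (names here are made up, not the actual PrettyC code):

import Data.List (nub)

type StructDecl = String   -- stand-in for a struct declaration
type TopDef     = String   -- stand-in for a compiled toplevel definition

-- Each toplevel definition reports the struct declarations it needs;
-- emit the de-duplicated declarations once, before all definitions.
emitModule :: [(TopDef, [StructDecl])] -> String
emitModule defs = unlines (nub (concatMap snd defs) ++ map fst defs)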
However, it might be a good exercise to use some of the standard Writer / State monads. Actually this turns out to be quite simple. If the hidden state is a Monoid, then it's OK to just use MonadWriter class: execWriter, tell. So... I'm trying to only add structs that aren't there yet but it doesn't seem to work: -- Query the type tags. getDict = do ((), dict) <- listen $ return () return dict getTag typ = do dict <- getDict return $ find (\(tag,_) -> tag == typ) dict -- Save struct definitions of AStruct tags that appear in function -- arguments, so canonical struct names can be used in the function -- bodies. structDef var@(Var (Type (AStruct sub) _) _) = do typ <- return $ Type (AStruct sub) 0 tag <- getTag typ case tag of Just _ -> tell [] Nothing -> tell [(typ, CDeclExt (CDecl [(cTypeSub typ)] [] ()))] structDef var = tell [] I think "listen" doesn't really work the way I think it does. If there are subcomputations, are they actually executed in sequence, or are they executed separately with results concatenated? If that's the case then "listen" only gives part of the answer. From [1]: instance (Monoid w) => MonadWriter w (Writer w) where tell w = Writer ((), w) listen m = Writer $ let (a, w) = runWriter m in ((a, w), w) pass m = Writer $ let ((a, f), w) = runWriter m in (a, f w) What IS a value of m a? It is a Writer datastructure so indeed this is only part of what's going on. So.. I used Control.Monad.State combined with Data.Map and a hackish way to generate the Ord instance for Map: type Dict = Data.Map.Map TypeName (CExternalDeclaration ()) type MDict = State Dict instance Ord Type where compare t1 t2 = compare (typeHash t1) (typeHash t2) -- FIXME: this ignores the order, which is not used in the Map. Is -- there an automatic way to do this hashing of arbitrary data -- structures? typeHash (Type AFloat o) = 2 typeHash (Type AInt o) = 3 typeHash (Type ABool o) = 5 typeHash (Type AVoid o) = 7 typeHash (Type AFun o) = 9 typeHash (Type (AStruct ts) o) = product $ map typeHash ts [1] http://ogi.altocumulus.org/~hallgren/Programatica/tools/pfe.cgi?Control.Monad.Writer Entry: Full stack? Date: Wed Dec 28 16:31:00 EST 2011 The following compiles without warnings. struct tuple_f0 { float m0; }; struct tuple_f0_f0 { float m0; float m1; }; int fun(struct tuple_f0_f0 * a0, struct tuple_f0 * a1, struct tuple_f0 * a2, int a3) { { float fun4_0; float fun4_1; int fun4_2; { const struct tuple_f0_f0 t8 = a0[0]; const float a9 = t8.m0; const float a10 = t8.m1; fun4_0 = a9; fun4_1 = a10; fun4_2 = 0; goto fun4; } fun4: { const float a5 = fun4_0; const float a6 = fun4_1; const int a7 = fun4_2; const struct tuple_f0 t8 = a1[a7]; const float a9 = t8.m0; const float t10 = a6 + a9; const float t11 = a5 + t10; const struct tuple_f0 t12 = { t11 }; a2[a7] = t12; const _Bool t13 = a7 < a3; if (t13) { const int t14 = a7 + 1; fun4_0 = t11; fun4_1 = t10; fun4_2 = t14; goto fun4; } else { const struct tuple_f0_f0 t14 = { t11, t10 }; a0[0] = t14; return 0; } } } } With "gcc -O3 -c test.c" I get the following asm output. Looks pretty good. Can probably improve a bit still if the loop size is fixed. 
0000000000000000 : 0: f3 0f 10 47 04 movss 0x4(%rdi),%xmm0 5: 31 c0 xor %eax,%eax 7: f3 0f 58 06 addss (%rsi),%xmm0 b: f3 0f 10 0f movss (%rdi),%xmm1 f: 85 c9 test %ecx,%ecx 11: f3 0f 58 c8 addss %xmm0,%xmm1 15: f3 0f 11 0a movss %xmm1,(%rdx) 19: 7e 1d jle 38 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 20: f3 0f 58 44 86 04 addss 0x4(%rsi,%rax,4),%xmm0 26: f3 0f 58 c8 addss %xmm0,%xmm1 2a: f3 0f 11 4c 82 04 movss %xmm1,0x4(%rdx,%rax,4) 30: 48 83 c0 01 add $0x1,%rax 34: 39 c1 cmp %eax,%ecx 36: 7f e8 jg 20 38: f3 0f 11 0f movss %xmm1,(%rdi) 3c: 31 c0 xor %eax,%eax 3e: f3 0f 11 47 04 movss %xmm0,0x4(%rdi) 43: c3 retq Entry: SArray and inference Date: Thu Dec 29 08:38:03 EST 2011 I wonder if it isn't simpler to solve the problem: (L (a t1)), L (a t2)) <-> a (L t1, L t2) which is a more direct description of the isomorphism that is the core of this problem, instead of trying to fuse this type constructor commutation with _get / _set as is currently done in SArray. What does this operation look like in lowlevel code? Problem there is that it doesn't exist: it's entirely virtual. So what's the real problem with current factoring? There is some ambiguity left: No instances for (SArray (StateCont.SC Code.CompState Term) Code sa0 (L Tfloat), SArray (StateCont.SC Code.CompState Term) Code sa1 (L Tfloat), Struct Term Code as0 (Code sa0), Struct Term Code bs0 (Code sa1)) arising from a use of `compile3' Possible fix: add instance declarations for (SArray (StateCont.SC Code.CompState Term) Code sa0 (L Tfloat), SArray (StateCont.SC Code.CompState Term) Code sa1 (L Tfloat), Struct Term Code as0 (Code sa0), Struct Term Code bs0 (Code sa1)) Need to think about this... I'm missing something simple. The fact that this isomorphism isn't expressed directly somehow bothers me.. More concretely, what is it that actually happens? A single _get' / _set' is replaced with a _get / _set for each array member of the struct. Why is this so hard? Should something be said about properties of t? It is known that both sa ant t are part of struct instances. With the former a struct of array of type and the latter a struct of type. class TML m r => SArray m r sa t where _get' :: r sa -> r Tint -> m (r t) _set' :: r sa -> r Tint -> r t -> m (r ()) Giving up. Guess: Something isn't factored properly such that information that's necessary for inference is lost, but I can't put my finger on it.. Entry: SArray has no 'a' param Date: Fri Dec 30 08:36:31 EST 2011 Continuation of last post. Basically, I don't quite understand why it doesn't infer, but I guess this is because too much information is lost in the SArray class. (L (a t1)), L (a t2)) <-> a (L t1, L t2) Is it possible to do this somewhere else, implementing the isomorphism in terms of a substrate, i.e. "fake struct array". I.e. the isomorphism would map "struct of array" to "array of struct" only on the type level. The implementation would then do the wrapping such that these 2 cases can be distinguished. It feels a bit messy though, but I see no way to make this isomorphism work without somehow rooting it in representation. Alternatively, would it be possible to add a fake method to Loop class that exposes the array parameter? Ok, I tried for a bit and it's too opaque. I just don't understand it. All I do is fiddling to see what the checker says... Let's go for something simpler. Conclusions: * I can't get the translation _get' -> multiple _get to work on the typed level due to inference errors. * It might be simpler to solve it in Term, I.e. 
allow a representation of a struct (which would normally be only one Ref) to be a RefList. This is possibly simpler to understand. Entry: Supporting multi-arity references in Term? Date: Fri Dec 30 09:24:26 EST 2011 Still about supporting that isomorphism. The simplest modification seems to be to allow Var to also be a list of variables. -- Variables can refer to data values or functions. In the latter -- case type refers to the return value's type. data Var = Var {varType :: Type, varName :: VarName} | VarList [Var] deriving (Eq,Show) This is possibly very powerful, but it feels like a bit of a dirty hack though.. It "should" be the case that this can be done at the typed level, to make it more general. I don't understand the consequences yet.. This might be problematic. Entry: Simpler tests for SArray Date: Fri Dec 30 09:53:35 EST 2011 So.. Maybe the SArray declaration is just simply meaningless? These errors are so strange... b19 (L arri) = do xs <- _get' arri (lit 0) (L x1, L x2) <- _unpack xs x <- add x1 x2 y <- add x (lit (1 :: Tfloat)) _ret y f19 = _def "fun" $ _lambda b19 t19 = ctop $ term $ f19 /home/tom/meta/dspm/0test_TML_Sys.hs:406:21: No instances for (Struct Term Code t2'0 (L Tfloat), Struct Term Code Double (Code Double)) arising from a use of `f19' Possible fix: add instance declarations for (Struct Term Code t2'0 (L Tfloat), Struct Term Code Double (Code Double)) In the second argument of `($)', namely `f19' In the second argument of `($)', namely `term $ f19' In the expression: ctop $ term $ f19 Failed, modules loaded: Type, Array, TML, Sys, Term, Code, Loop, Struct, PrettyC, StateCont, SArray. The "Struct Term Code Double (Code Double)" part makes no sense. This is probably a non-tagged leaf node. This should never occur. Only the following instance should, which is defined: Struct Term Code (L Double) (Code (L Double)) I tinkered the type class constraints a bit, removing the Struct constraints to end up with this: -- Base case delegates to Array instance. instance (TMLword t, Array m r a t, StructRepr r, Loop stx m r) => SArray stx m r (L (a t)) (L t) where -- Inductive case. instance (Array m r a st1, Array m r a st2, SArray stx m r sat1 st1, SArray stx m r sat2 st2, StructRepr r, TML m r) => SArray stx m r (sat1, sat2) (st1, st2) where Then the following example compiles: b19 (L arri) = do xs <- _get' arri (lit 0) (L x1, L x2) <- _unpack xs x <- add x1 x2 y <- add x (lit (1 :: Tfloat)) _ret y f19 = _def "fun" $ _lambda b19 t19 = ctop $ term $ f19 But running it I get the error that uncons is not implemented for Code. This is weird. I conclude that Term support is going to be necessary anyway. Man this is over my head.. Interesting though. So, I officially don't know what I'm doing. Let's see where this leads to. I expect that the two approaches I've been describing are going to be equivalent and that yest, there is something in Term that needs to encode the intermediates of the _get' operation. Ok, it gets interesting. I've added the following which is what I thought it would be + an error clause to see what causes the unexpected match error: instance StructRepr Code where atom (L (Code a)) = Code (Atom a) cons (Code a, Code b) = Code (Cons a b) nil () = Code Nil unatom (Code (Atom a)) = L (Code a) uncons = uc where uc (Code (Cons a b)) = (Code a, Code b) uc (Code x) = error $ show x unnil (Code Nil) = () The match error is: *** Exception: Ref (Var {varType = Type (AStruct [Type AFloat 1, Type AFloat 1]) 0, varName = "a0"}) Meaning that the Term is a varref. 
What does this mean? It seems to hint that the implementation for StructRepr can be made completely virtual after all.. All this is is phantom types that are eventually resolved to something that's implementable. (guess) So.. This should be a variant of unpack. It's becoming too abstract. Anyways, let's try this route: use only virtual instances, don't do anything with the contents. See next post, this is really shotgun programming.. Entry: Horrible hack Date: Fri Dec 30 12:00:16 EST 2011 Horribly wrong. Don't try this at home. But intriguing nontheless. How can this be so wrong? -- Virtual data structures. Used by SArray instance StructRepr Code where atom (L (Code t)) = Code t cons (Code t, _) = Code t nil () = Code Nil unatom (Code t) = L (Code t) uncons (Code t) = (Code t, Code t) unnil (Code Nil) = () b19 (L arri) = do xs <- _get' arri (lit 0) (L x1, L x2) <- _unpack xs x <- add x1 x2 y <- add x (lit (1 :: Tfloat)) _ret y f19 = _def "fun" $ _lambda b19 t19 = ctop $ term $ f19 struct tuple_f1_f1 { float * m0; float * m1; }; float fun(struct tuple_f1_f1 a0) { const struct tuple_f1_f1 t1 = a0[0]; const struct tuple_f1_f1 t2 = a0[0]; const float a3 = t1.m0; const float a4 = t1.m1; const float t5 = a3 + a4; const float t6 = t5 + 1.0; return t6; } It seems that what is really missing is just representation of structure indexing. Something like uncons (Code (Ref (Var ...))) = (Code (RefCar (Var ...)), Code (RefCdr (Var ...))) It seems simpler to use this as structure representations, and solve the flattening in PrettyC. The C code then would become something like this: struct tuple_f1_f1 { float * m0; float * m1; }; float fun(struct tuple_f1_f1 a0) { const struct tuple_f1_f1 t1 = a0.car[0]; const struct tuple_f1_f1 t2 = a0.cdr[0]; const float a3 = t1.m0; const float a4 = t1.m1; const float t5 = a3 + a4; const float t6 = t5 + 1.0; return t6; } Let's see what happens when we go to bigger trees. No no no... This is wrong. The line "t1.m0" doesn't make sense. Entry: Binary trees all the way down Date: Fri Dec 30 12:28:25 EST 2011 It seems better to cut the knot and make Var support binary tree type nested structures. I.e. : a.car.cdr.cdr or a.m0.m1.m1 It seems that implementing StructRepr without these is quite impossible. So.. I run into a case where I need to destructure [a,b,c] into car/cdr. That doesn't work. It looks like AStruct needs to be binary. Temporarily PrettyC.hs is broken because it needs support for nested structs. *** Exception: _uncons: RefCdr (Var {varType = Type (AStruct (Type AFloat 1,Type (AStruct (Type AFloat 1,Type AFloat 1)) 0)) 0, varName = "a0"}) Type (AStruct (Type AFloat 1, Type (AStruct (Type AFloat 1, Type AFloat 1)) 0)) Aha. The basic idea seems to be that this is about Var, not about ref. So RefCar becomes Ref (VarCar ...). Seems to go ok. Next problem is translating such a nested variable into a proper dereferencing. I.e. given (VarCar (VarCdr ... (Var _ 1))) get the type Jently down the stream.. I'm fixing PrettyC.hs following the straightforward path guided by the types.. This is the first weird error: *** Exception: cRefMem VarPair (Var {varType = Type AFloat 0, varName = "t1"}) (VarPair (Var {varType = Type AFloat 0, varName = "t2"}) (Var {varType = Type AFloat 0, varName = "t3"})) Somebody is using a "virtual struct" as a reference. If this is part of an Unpack then that case should be handled also. Yep. This should be handled somewhere else, preferrably in Code so that it doesn't need to be undone in PrettyC. Hmm.. 
Maybe it's time for another unification: Write Lambda args in terms of Var / VarPair etc.. That should make it simpler deal with this Unpack (VarPair ...) business. Entry: Replace argument lists by Var / VarPair / VarCar / VarCdr Date: Fri Dec 30 14:54:20 EST 2011 Woo.. that seems like a deep change. Major headache is structCompile which now needs a user-specified composition instead of just using list concatenation. Abandoning it.. Very deep change, and probably leads to some simplification which probably also means to change a lot of code structure.. Hmm... maybe it could be as simple as StructRepr? All this stuff isn't necessary if structures are just collections of variables. Yes that would simplify greatly.. Just for laughs, what about deleting this whole StructComp crap and seeing where that goes? First thing seems to be that running away from Struct (pack / unpack) is not going to work: whole trees need to be transformed, which is the functionality provided by Struct.hs Second: the constraint that makes it possible to generate variabes for the argument type of Lambda means that this needs to be encoded somewhere. For now it seems to go surprisingly well. I'm using this for variable generation: class StructVar stx ras where structVar :: String -> Int -> (Int, stx, ras) First trouble I run into is in Sys.hs : It seems a generic composition instance is necessary if Sys.hs is supposed to be isolated from Code.hs instance (StructVar stx r1, StructVar stx r2) => StructVar stx (r1, r2) To make it work with the old [Var] typed code, I plugged in a Var -> [Var] function that flattens a variable tree. Next problem is that mUnpack still generates variables in cases where its argument is a virtual struct. So I still have an Unpack instruction for a virtual struct. Maybe it's simplest to just do this in PrettyC, because in Code.hs it seems the phantom type juggling is not cooperating. *Main> t20 *** Exception: Unpack VarPair (Var {varType = Type AFloat 0, varName = "a3"}) (VarPair (Var {varType = Type AFloat 0, varName = "a4"}) (Var {varType = Type AFloat 0, varName = "a5"})) VarPair (Var {varType = Type AFloat 0, varName = "t0"}) (VarPair (Var {varType = Type AFloat 0, varName = "t1"}) (Var {varType = Type AFloat 0, varName = "t2"})) What this looks like is just renaming. If the trees match (they should) this is probably straightforward. *** Exception: Unpack VarPair (Var {varType = Type AFloat 0, varName = "a3"}) (VarPair (Var {varType = Type AFloat 0, varName = "a4"}) (Var {varType = Type AFloat 0, varName = "a5"})) VarPair (Var {varType = Type AFloat 0, varName = "t0"}) (VarPair (Var {varType = Type AFloat 0, varName = "t1"}) (Var {varType = Type AFloat 0, varName = "t2"})) This seems to solve it: -- Structure unpack. st (Unpack mvars (Ref svars@(VarPair _ _)) body) = renames ++ st body where -- This is the result of unpacking a virtual struct, which is just -- a bunch of variable names. It would be better to avoid these -- being generated in the first place, but for now this will do. renames = ren mvars svars ren new@(Var _ _) old@(Var _ _) = [cVarInitOne new (ex (Ref old))] ren (VarPair new1 new2) (VarPair old1 old2) = ren new1 old1 ++ ren new2 old2 st (Unpack mvars (Ref svar) body) = -- Real C struct. cStructVarInit svar (varList mvars) ++ st body This comes out. 
struct tuple_f1_L_f1_f1_R_0 { float * m0; struct tuple_f1f1 { float * m0; float * m1; } m1; }; float fun(struct tuple_f1_L_f1_f1_R_0 a0) { const float t0 = a0.m0[0]; const float t1 = a0.m1.m0[0]; const float t2 = a0.m1.m1[0]; const float a3 = t0; const float a4 = t1; const float a5 = t2; const float t3 = a3 + a4; const float t4 = t3 + a5; const float t5 = t4 + 1.0; return t5; } It worries me that some of the names (numbers) are duplicated. Why is that? Also, these are argument names.. Why is that? EDIT: use "s" prefix for struct members and fixed typo for state update. Anyways, there's a problem: Lit is not support by _cons: it's expected to be all variables. Should this be changed in TML? It does seem better to do that.. Another bug: int fun(struct tuple_f0f0 * a0, float * a1, float * a2, int a3) { { float fun4_0; float fun4_1; int fun4_2; { const int t8 = 0; const struct tuple_f0f0 t9 = a0[t8]; const float s10 = t9.m0; const float s11 = t9.m1; const int t12 = 0; fun4_0 = s10; fun4_1 = s11; fun4_2 = t12; goto fun4; } fun4: { const float a5 = fun4_0; const float a6 = fun4_1; const int a7 = fun4_2; const float t8 = a1[a7]; --> const float s9 = t8.m0; const float t10 = a6 + s9; const float t11 = a5 + t10; const float t12 = { t11 }; a2[a7] = t12; const _Bool t13 = a7 < a3; if (t13) { const int t14 = 1; const int t15 = a7 + t14; fun4_0 = t11; fun4_1 = t10; fun4_2 = t15; goto fun4; } else { const struct tuple_f0f0 t14 = { t11, t10 }; const int t15 = 0; a0[t15] = t14; const int t16 = 0; return t16; } } } } This corersponds to: ip <- _get arri n i <- _unpack ip So arri should really be a 1-element float struct, meaning that the code is actually correct, but the type is not. Maybe this is because Var and VarAtom are not distinguished? What about this: -- Variables can refer to data values or functions. In the latter -- case type refers to the return value's type. A VarList represents -- a structure with membes bound to individual variables. data Var = Var {varType :: Type, varName :: VarName} | Tree VarTree data VarTree = VarPair VarTree | VarCar VarTree | VarCdr VarTree | VarAtom Var deriving (Eq,Show) Entry: It's a mess Date: Fri Dec 30 19:50:19 EST 2011 I messed it up good. It's time to start thinking about the structure of things. It looks like allowing Term where we really want Var or VarTree is just making things worse. So, TODO: - Think about the difference of Var, VarTree, Term Especially VarTree <-> Var can be quite confusing. The main problem seems to come from Unpack: Unpack VarTree Var Unpack VarTree Vartree Both are valid in some weird distorted way. The first is an actual C structure unpack, the second is a rebinding of variables caused by the use of virtual structs for the _get' and _set' "distributing" struct dereferencing. It would be good to get it back to a working state, and redo some of the changes with this in mind. Current state is too ill-defined to get anywhere. Entry: What is a a virtual struct? Date: Sat Dec 31 07:55:56 EST 2011 Note that this is still functional programm, so we really should have referential transparency. That means that it should be possible to replace a variable with its value at all times. However, once we start making a distinction between Var and Term this seems to disappear. So. What is the difference between a VarTree and a Term encoding a structure? It seems that the only problem to solve is the Unpack problem above, and it just needs to know the difference between a reference to a reified struct, and a bunch of variable names. 
Some issues:

- Conceptually separate Tree of binders from Tree of terms. This was separate in the original approach (binders were lists of variables). This solves the Unpack issue: if Unpack's argument is a single variable reference, it's implemented by a C struct. If it's a Term representing a tree terminated in variable references, it's a virtual struct.

- Since both trees are isomorphic, encode this isomorphism explicitly in Term.

Conclusion: both are needed as they represent different things, but there is an embedding from one (VarTree) to the other (Term).

So, making these changes seems to go well up to now. First thing I run into is that Get / Set should take a Term as first argument, not a Var. This can probably be generalized to other places where just Var appears, since we're now allowing structured variable references a la:

  (Car (Cdr (Ref v)))

Ok.. Working my way down to PrettyC.hs it seems that the main problem with that file and its heavy factorization is: are things represented as Term subtrees or AST subtrees? It's quite arbitrary, and seems to be a problem that is intrinsic in complex tree transformations, related to the order of recursion.

Next error:

t20
*** Exception: _unatom
Car (Ref (Var {varType = Type (AStruct (Type AFloat 1,Type (AStruct (Type AFloat 1,Type AFloat 1)) 0)) 0, varName = "a0"}))
Type (AStruct (Type AFloat 1, Type (AStruct (Type AFloat 1, Type AFloat 1)) 0))

Makes no sense. Is there anything else that works all the way down to C code generation? Yes. 0test_integration.hs : t18 works, and t15 generates code but still has the 1-element struct vs float problem. t19 gives a similar error. What does it mean?

First that type error. I suspect it's because of this typeOf call:

instance TMLword t => StructVar Term (L (Code t)) where
  structVar prefix id = (id + 1, Atom $ t, L (Code $ t)) where
    t = term $ typeOf (undefined :: Code t) where
      term typ = Ref $ Var typ $ prefix ++ show id

Maybe it's even deeper. Is there a primitive difference between struct of 1 and base type? Don't think so. Maybe it's even deeper. TypeName doesn't reflect the same binary tree structure as VarTree. Ok, fixing that as:

data TypeName = AFloat | AInt | ABool | AVoid  -- atomic
              | ATree TypeTree                 -- composite
              | AType Int                      -- indexed type (see PrettyC.hs)
              deriving (Eq,Show)

data TypeTree = AAtom Type
              | ANil
              | ACons Type Type
              deriving (Eq,Show)

data Type = Type TypeName TypeOrder
            deriving (Eq,Show)

And propagating this change down the chain, I get the following, which has an explicit representation of atoms as singleton structs.
struct tuple_f0 { float m0; }; struct tuple_f00f00 { struct tuple_f0 { float m0; } m0; struct tuple_f0 { float m0; } m1; }; int fun(struct tuple_f00f00 * a0, struct tuple_f0 * a1, struct tuple_f0 * a2, int a3) { { float fun4_0; float fun4_1; int fun4_2; { const int t8 = 0; const struct tuple_f00f00 t9 = a0[t8]; const float s10 = t9.m0; const float s11 = t9.m1; const int t12 = 0; fun4_0 = s10; fun4_1 = s11; fun4_2 = t12; goto fun4; } fun4: { const float a5 = fun4_0; const float a6 = fun4_1; const int a7 = fun4_2; const struct tuple_f0 t8 = a1[a7]; const float s9 = t8.m0; const float t10 = a6 + s9; const float t11 = a5 + t10; const struct tuple_f0 t12 = { t11 }; a2[a7] = t12; const _Bool t13 = a7 < a3; if (t13) { const int t14 = 1; const int t15 = a7 + t14; fun4_0 = t11; fun4_1 = t10; fun4_2 = t15; goto fun4; } else { const struct tuple_f00f00 t14 = { t11, t10 }; const int t15 = 0; a0[t15] = t14; const int t16 = 0; return t16; } } } } Trouble with this though is that inner structs should not be named. These are not necessary and they cause conflicts. Fixing this I run into a problem that the recursion patterns in PrettyC are becoming pretty unwieldy. I'm trying to fix the point where the struct name needs to be dropped. cTupleType' -> cVarDecl / cTupleType cVarDecl -> cType cType -> cType' cType' -> cTypleType' It's getting to a point where a monad would probably be better. Or a configuration structure that records recursion options and is passed down the tree.. Passing down multiple flags doesn't seem like a good idea. Done Entry: Next: failing test cases Date: Sat Dec 31 11:35:47 EST 2011 *Main> t19 *** Exception: _unatom Car (Ref (Var {varType = Type (ATree (ACons (Type (ATree (AAtom (Type AFloat 1))) 0) (Type (ATree (AAtom (Type AFloat 1))) 0))) 0, varName = "a0"})) As before. What does it mean? Yes that makes sense. This is a virtual struct that's deconstructed into a variable again. Just missing code. Type (ATree (ACons (Type (ATree (AAtom (Type AFloat 1))) 0) (Type (ATree (AAtom (Type AFloat 1))) 0))) 0 This probably requires an Unatom constructor in Term. Indeed. I had Atom, which is semantically a constructor not a destructor (accessor). Replacing this and propagating all the way down fixed it. Entry: Inference check Date: Sat Dec 31 12:00:33 EST 2011 Looks like the roller coaster ride of yesterday and a night sleep payed off. Next is to check if the inference for _get' works properly now in a real function body. It doesn't: /home/tom/meta/dspm/0test_integration.hs:194:33: No instance for (Struct Term Code as0 (Code (L (Tint -> Tfloat)))) arising from a use of `compile3' Possible fix: add an instance declaration for (Struct Term Code as0 (Code (L (Tint -> Tfloat)))) In the second argument of `_def', namely `(compile3 f2)' In the second argument of `($)', namely `_def "fun" (compile3 f2)' In the second argument of `($)', namely `term $ _def "fun" (compile3 f2)' Failed, modules loaded: Type, Array, TML, Sys, Term, Code, Loop, Struct, PrettyC, SArray, StateCont. Is this really just a missing instance of a symptom of something else. It seems that in t20 structs of pointers work just fine, so something else is going on here. 2 questions: - why doesn't as0 resolve to (L (Tint -> Tfloat)) - is Struct Term Code (L (Tint -> Tfloat)) (Code (L (Tint -> Tfloat))) defined? The latter should be the case because of TMLword t => TMLword (Tint -> t) TMLword t => TMLword (L t) So the question is, why isn't as0 fixed? 
Maybe now it's time to move to the approach where we just transform a struct of arrays into an array of structs. This means that the "distribution" needs to be solved at a different level, or somehow encapsulated.. Maybe it's just literal..

What about adding the Struct constraint here:

instance (TMLword t, Array m r a t, StructRepr r,
          Struct stx r (L (a t)) (r (L (a t))),  -- HERE
          Loop stx m r)
         => SArray stx m r (L (a t)) (L t) where

No, that didn't solve anything. I'm back (never away?) to not understanding the dependency structure and why it doesn't want to just infer, dammit!

l16 = compile3 f2

l16 :: Struct Term Code as (Code (L (Tint -> Tfloat))) =>
     StateCont.SC Code.CompState Term
       (Code ((L (Tint -> (L Tfloat, L Tfloat)), ((as, as), L Tint))
              -> StateCont.SC Code.CompState Term Tint))

How come that as isn't fixed?

Entry: Generalizing array instead
Date: Sat Dec 31 12:33:47 EST 2011

Going in loops again.. Why is it again that this can't be expressed as an ordinary Array? What about the following? This expresses more directly that a tree of arrays is also an array.

instance Array m r a (L t) => Array m r (L' a) t

Is there anything new? Yes: I did not allow trees of arrays before. This can't be expressed directly due to a kind mismatch. So this needs a transformer / isomorphism that I don't see how to express..

Entry: I'm sick of misunderstood morphisms..
Date: Sat Dec 31 15:38:45 EST 2011

Actually what needs to be expressed is a relation between 3 kinds of things, using L for structuring, r for representation, a for arrays and t for the base type. Substitute L with (,) in a straightforward way to find an exhaustive list of all the representations. With only L there are 3 * 2 = 6 combinations. I'm noting L (r (a t)) as L r a. This is all there is:

  L r a
  r L a
  a L r  (**)
  L a r  (*)
  r a L
  a r L  (*)

Which of these are meaningful? I don't think (a (r .)) is meaningful (*). This would be an array of representations of something. There is no need for a meta-language array. For the same reason (a (L (r .))) doesn't make sense either (**). Removing those cases leaves exactly what we expected:

  L r a   structure of representations of arrays
  r L a   representation of structure of arrays
  r a L   representation of array of structures

The operation I'm interested in is providing an equivalence between the last two. The first one is just a metalanguage structure and is not so interesting, but it is there if the second one is there. These 3 forms are related by these 2 transformations:

  r L a <-> L r a   pack/unpack: metalanguage embedding of data structures
  r L a <-> r a L   transposing of structuring and arrays

The trick now is to express those directly. The first one is independent of a, and we can forget about it. It's expressed by the Struct constraint. The second one is problematic because the representation r a L needs to be implemented in some way. So maybe it should be only virtual, i.e. a relation between types used for inference only.

Currently in the implementation some link is broken between these 3. So, if the problem is really just finding a representation of r a L then maybe something can be done about it. Anything "virtual" can usually be implemented by a structure. On the other hand, it might be enough to keep the current direct implementation and encode the morphism in type class constraints.
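For intuition only: with ordinary Haskell lists standing in for target-language arrays, the r L a <-> r a L transposition is just zip/unzip. It also suggests why the r a L side can stay virtual: reading element i of the "array of structs" is nothing more than reading element i of each member array.

soaToAos :: ([a], [b]) -> [(a, b)]   -- struct of arrays -> array of structs
soaToAos = uncurry zip

aosToSoa :: [(a, b)] -> ([a], [b])   -- and back
aosToSoa = unzip

This is a host-level analogy, not project code; in the generated C nothing is actually transposed.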
The relation r L a <-> L r a or really r L <-> L r is expressed in general as (with irrelevant constraints removed): Struct r (L t) (L (r t)) or Struct r s sr Thinking about this a bit more and finding an understandable naming scheme I get to this: {- The following used this naming scheme for types r representation (of anyting) a array (of anything) s structure (of base types) sa structure of arrays (of base types) as array of structurs (of base types) sra structure of representations of arrays (of base types) sr structure of representations (of base types) -} class (TML m r, StructRepr r, Struct r sa sra, -- structure of (a t_i) Struct r s sr) -- structure of t_i => SArray m r s sa sr sra where _gets :: r sa -> r Tint -> m (r s) _sets :: r sa -> r Tint -> r s -> m (r ()) -- Base case delegates to Array instance. instance (TMLword t, StructRepr r, Array m r a t) => SArray m r (L t) (L (a t)) (L (r t)) (L (r (a t))) where _gets r_sa i = do (L a) <- return $ unatom r_sa v <- _get a i return $ atom $ L v _sets r_sa i v = do (L a) <- return $ unatom r_sa (L v) <- return $ unatom v _set a i v -- Inductive case. instance (SArray m r s1 sa1 sr1 sra1, SArray m r s2 sa2 sr2 sra2) => SArray m r (s1,s2) (sa1,sa2) (sr1,sr2) (sra1, sra2) where _gets r_sa i = do (a1, a2) <- return $ uncons r_sa v1 <- _gets a1 i v2 <- _gets a2 i return $ cons (v1, v2) _sets r_sa i v = do (a1, a2) <- return $ uncons r_sa (v1, v2) <- return $ uncons v _sets a1 i v1 _sets a2 i v2 Entry: Cleanup Date: Sat Dec 31 16:11:46 EST 2011 Is it possible to remove that tourist stx parameter in Struct? Apparently, but that leads to some loose ends here and there. This error is strange / 0test_integration.hs t16 :1:1: No instance for (StructVar Term (Code (L (Tint -> Tfloat)))) arising from a use of `compile3' Possible fix: add an instance declaration for (StructVar Term (Code (L (Tint -> Tfloat)))) In the expression: compile3 f2 This shouldn't happen.. Should be L (Code ... instead of Code (L ... Maybe it's a faulty class constraint in SArray.hs After cleanup (see previous post) I still get the same !@#$ error. No instance for (StructVar Term (Code (L (Tint -> Tfloat)))) No instance for (Struct Code as0 (Code (L (Tint -> Tfloat)))) Entry: Debugging functional dependencies Date: Sat Dec 31 20:07:41 EST 2011 Is there a systematic way to see why something doesn't infer? I.e. is it possible to get a more verbose "half inferred" output? Entry: Fundeps Date: Mon Jan 2 08:11:48 EST 2012 To move forward I need a systematically different approach. I don't think the problem is correctness -- though that is still possible because I can't explain some of the error messages where the order of type constructors is flipped. The problem is really not understanding where the error originates. How to "query" a type? I had used this before to equate types: typeEq :: (Monad m) => t -> t -> m () typeEq _ _ = return () In a monad it's simple to insert a "statement" like that without affecting anything else. What about using this trick and add an extra argument to a function that's only used in conjunction with a variable that I want to know the type of. I.e. expr test = do ... ... typeEq somevar test ... Entry: _lambda type Date: Mon Jan 2 08:29:50 EST 2012 Why is the argument in _lambda a packed struct, i.e. r (L t) instead of an unpacked one, i.e. L (r t)? This might be the problem. In Loop it's clear it should be the latter (struct of representations identified by 'sr' type variable. _lambda :: (TMLword s, StructVar stx sr, -- var name generation. 
FIXME: can this be removed? Struct r s sr) => (sr -> m (r t)) -> m (r (s -> m t)) In loopExpr2 this is the same: *Main> :t loopExpr2 loopExpr2 :: (Loop stx m r, Array m r a as1, Array m r a s, Array m r a t2, TMLword as1, TMLword s, TMLword as, TMLword t2, StructVar stx (L (r Tint)), StructVar stx t1, StructVar stx t, Struct r as t, Struct r s t1, Struct r as1 sr1, Struct r t2 sr) => (L (r (a as1)), ((L (r (a s)), L (r (a t2))), L (r Tint))) -> ((t, t1) -> m (sr1, sr)) -> r ((as1, L Tint) -> m Tint) -> m (r ((as, L Tint) -> m Tint)) The base type of the state array (r (a as1)) is the same as the argument of the embedding function r ((as1, L Tint) -> m Tint) This means that as1 is a struct of representations. Or what? No. Following the definition of _lambda above, the representation for a function is indeed (r (s -> m t)) which is implemented in terms of something using struct of representations (sr -> m t). So this is correct. Both as1 should really be the same. First strange thing, in this: loopExpr3 :: (Loop stx m r, TMLword s, -- s can be represented in the target (as a C structure) Struct r s sr, -- s can be represented in the host .. StructVar stx sr, -- .. and it's rep can be deconstructed in variables StructVar stx (L (r Tint)) ) => (L (r (a s)), ((r pi, r po), L (r Tint))) -> ((sr, i) -> m (sr, o)) -- update is HOS: it works on struct of rep -> r ((s, L Tint) -> m Tint) -> m (r ((s, L Tint) -> m Tint)) It wants to (s ~ sr) Ok, this is because the loopbody is indeed type sr, not s. The rest was straightforward: loopExpr3 :: (TMLword s, -- s can be represented in the target lang (C structure) Struct r s sr, -- s can be represented in the host lang StructVar stx sr, -- .. and it's rep can be deconstructed in variables Array m r a s, -- s is provided in an array (singleton array == box) Struct r i si, -- relation between rep of struct and struct of rep .. Struct r o so, -- .. for input and output TMLword i, -- i,o can be repd in target lang TMLword o, StructVar stx si, -- si, so can be repd in host lang StructVar stx so, SArray m r i pi si srai, -- i,o are in "transposable" struct of arrays SArray m r o po so srao, StructVar stx (L (r Tint)), -- the loop index is a host-repd struct Loop stx m r) -- relation between host and target language => (L (r (a s)), ((r pi, r po), L (r Tint))) -> ((sr, si) -> m (sr, so)) -> r ((s, L Tint) -> m Tint) -> m (r ((s, L Tint) -> m Tint)) The last thing added was the SArray instances. The variables srai and srao are not used so they might be cause of trouble. The error later down the line is still: *Main> :t compile3 f2 :1:1: No instance for (StructVar Term (Code (L (Tint -> Tfloat)))) arising from a use of `compile3' Possible fix: add an instance declaration for (StructVar Term (Code (L (Tint -> Tfloat)))) In the expression: compile3 f2 What this means is that it wants to generate a representation of a struct of an array type. r (L (a Tfloat)) Is this meaningful? I thought it was not, that it should realy only be L (r (a Tfloat)) but maybe it is actually.. Is there room now to investigate where this constraint comes from? There's an error in the above. It should be: (L (r (a s)), ((srai, srao), L (r Tint))) instead of (L (r (a s)), ((r sai, r sao), L (r Tint))) nope.. These are correct, but that probably means there is some extra indirection there I'm not taking into account? Woah this is confusing.. 
What about the difference between these two: (L refs, ((arri, arro), L arrn)) (L refs, ((L arri, L arro), L arrn)) I'd think that because arri and arro are in direct correspondence to target language elements, they should be leaf nodes. Let's see if this infers.. Yep, that was it! Funny how some bugs are always in a different spot then where you're actually looking. Entry: Some missing Unatom Date: Mon Jan 2 10:06:32 EST 2012 Trying to compile t16' I get this: struct tuple_f00f00 { struct { float m0; } m0; struct { float m0; } m1; }; const struct tuple_f00f00 t13 = a0[t12]; const float s14 = t13.m0; // incorrect This should be const float s14 = t13.m0.m0; Probably a missing Atom / Unatom. PrettyC.hs seems to be correct since it generates proper dereferencing for the other cases. Code.hs? Can't find it from code inspection. I need a proper Term prettyprinter. It's 'pp' in the tests. Just a bit verbose. The relevant part of the Term struct is: const float s14 = t13.m0; const float s15 = t13.m1; Unpack ( VarCons ( VarAtom Var { varType = Type AFloat 0 , varName = "s14" } ) ( VarAtom Var { varType = Type AFloat 0 , varName = "s15" } ) ) ( Ref Var { varType = Type ( ATree ( ACons ( Type ( ATree ( AAtom ( Type AFloat 0 ) ) ) 0 ) ( Type ( ATree ( AAtom ( Type AFloat 0 ) ) ) 0 ) ) ) 0 , varName = "t13" } ) ( ... ) Inspecting Unpack points at termList / varList. Ok, this is hidden somewhere in a recursion in cStructVarInit. It's probably simplest to transform a variable type to a structured initializer (Cdr (Car ...)) and reuse the other Unpack case. Entry: Fix named fields Date: Mon Jan 2 10:48:58 EST 2012 Define them separately. It messes with the prettyprinter. Entry: Done? Date: Mon Jan 2 12:03:41 EST 2012 Looks like those last problems are ironed out. Next: - write to .c file, compile and run - Value implementation of Arrays Entry: Bug Date: Mon Jan 2 13:32:31 EST 2012 I'm not sure where this extra Atom wrapper comes from, but it does seem to be correct. Trouble is that the struct tuple_f0 is not declared. const struct tuple_f0 t16 = { t15 }; const float t17 = t7.m0[a11] = t16.m0; Is there a quick workaround, or does this meed statement compilation should also be monadic? Also, this is not correct: const struct tuple_f00f00 t21 = a0[t20] = t19; It's a var declartion where it should just be an array assignment. Ah, that's because of __set. Entry: From let -> do Date: Mon Jan 2 14:20:23 EST 2012 That's easy. But what with "where" inverted style? Is there a straightforward way to translate it? Entry: Sys.hs and monadic initial values Date: Mon Jan 2 19:10:19 EST 2012 Or more specifically, use a monadic (,) : cons a b = do a' <- a b' <- b return $ (a',b') Does this mean that all of this trickles down? That should really be avoided.. Recap. The monadic lit was necessary to ensure every input is a variable reference, and no literals can appear in expressions. The problem with that is that it is no longer possible to define values of (r Tint), (r Tfloat), ... that can be used in the initial state concatenation of Sys.hs It's possible to make that composition monadic such as mentioned above, but will that cause other problems? Let's see. Jep, straightforward after some wrong starts and type annotations. Entry: Another Atom bug? 
Date: Mon Jan 2 20:08:26 EST 2012 struct tuple_f00f00 { struct { float m0; } m0; struct { float m0; } m1; }; int fun(struct tuple_f00f00 * a0) { const float t1 = 0.0; const float t2 = 0.0; const struct tuple_f00f00 t3 = { t1, t2 }; const int t4 = 0; a0[t4] = t3; const int t5 = 0; return t5; } Initializer should be const struct tuple_f00f00 t3 = { { t1 }, { t2 } } Entry: Getting Sys.hs to work Date: Wed Jan 4 08:47:40 EST 2012 I'm collecting the state/array wrapper generation in Pd.hs Next is to try to make Sys.hs work with this. It seems that Sys type needs to be annotated with a lot of interfaces to make compilation work. Let's see where this leads. Note that Sys is still "optional". And only there to allow Functor / Applicative / Category / Arrow instances without having to deal with the state types. The alternative is to use dynamic typing as suggested in a reply to one of my Haskell Cafe questions. This might be good enough. Looks like the current problem in Sys.hs is an inproperly organized dependency structure. It needs to know that Arrays of structs are also arrays, which is currently encoded in the TMLword class, but doesn't seem to end up at the right place.. Yeah.. This is too hard. Isn't there a simpler way? Just define the "compile Sys to Term" thing in a separate interface and be done with it? Entry: SysCompile Date: Wed Jan 4 10:22:27 EST 2012 class SysCompile stx u i where sysCompile :: u -> i -> stx Hmm... not abstract enough. What about hiding all the operations in a SSM class, then doing *just* the phantom type in Sys. This is what it looks like, without interface: data Sys stx m i o = forall s. () => Sys (SSM m s i o) instance Monad m => Category (Sys stx m) where (.) (Sys f) (Sys g) = Sys (ssmSer f g) id = Sys (ssmPure id) instance Monad m => Arrow (Sys stx m) where arr f = Sys (ssmPure f) (***) (Sys f) (Sys g) = Sys (ssmPar f g) first a = a *** (arr id) second a = (arr id) *** a instance Monad m => Functor (Sys stx m i) where fmap f op = (arr f) . op instance Monad m => Applicative (Sys stx m i) where pure f = arr $ \_ -> f (<*>) f a = fmap (uncurry ($)) (f &&& a) The trick is now to add a compilation interface to SSM. One that guarantees the preservation of that interface after ser/par composition. Still note: this is entirely optional. SSM might be good enough! This is a strange thing. How to express in a general way that when the inputs of ssmSer / ssmPar / ssmPure are compilable, that also the outputs are? The only thing that happens is (,) But the way this is encoded in dependencies seems "upside down". Eventually in Loop / Array this is encoded in the recursive instances of TMLword and Struct. The trouble is that there are quite a lot of constraints that need to be satisfied before a function can be compilable, so maybe this just needs to be added as type constraints to the SSM data structure? The thing is that by itself, SSM is quite simple and doesn't need all that. The trouble really arrives somewhere else, so maybe these things should be expressed as part of Sys's constraints? So what about 3 layers: - SSM: basic types, no fuss - CSSM: all type class constraints for compilable SSM - Sys: only phantom state Hmm.. I won't get there with mindless manipulation. What's the point? Compositions of compilable SSM are compilable SSM. How to express that? Put it in a class with proper constraints. How to prove it? Provide instances. 
It's more directional: If a, b are compilable SSM then a `ser` b, a `par` b are compilable SSM I don't see a way to encode that in a recursive type class because the compile operation is completely opaque: you can't compile 2 parts of this "composition" separately. Maybe the key lays in not compiling to stx, but to another "growing" state. Pff.. mismatch between intuition and tools. Another one of those "glossed-over isomorphisms" that's not properly encoded. This probably just needs time. Or some rest. Entry: ArrowLoop Date: Wed Jan 4 13:27:39 EST 2012 I wonder, because my Arrow class is strictly directional, does this create problems with recursion? Nope. It's always a separate class, so simply don't provide an instance, or only do this for "iterative" algorithms where the recursion is finite (numerically bounded). Entry: Next Date: Wed Jan 4 13:37:36 EST 2012 - What to do with Sys.hs? There is an interesting problem at the core.. Why can't the hidden class structure be exposed? Where is this information lost? Is it really just making things more explicit? Just lift the Struct stuff to the function level? Is this yet another Struct something composition? - Finish the Pd / SM interface. This should be a routine task. Entry: Pack Date: Sat Jan 7 12:18:14 EST 2012 Needs to follow the same approach for left-hand side initializers. Currently it's wrong: it flattens the struct. So, in prettyC separate generation of initializer lists (which already exists) and following struct's structure,. FIXED. Entry: Renaming Date: Sun Jan 8 09:11:56 EST 2012 Let's find some better names for the different submodules: TML -> Data Loop -> Control + cleaned up tests. Entry: Still fundep problems Date: Sun Jan 8 13:01:00 EST 2012 While working towards building SM modules, I run into a problem with: pdInfo n = _lambda $ \() -> _exit (n :: Tint) pdModule update minit = do _def "sm_init" $ pdInit minit _def "sm_tick" $ pdUpdate update _def "sm_state_size" $ pdInfo 100 -- _def "sm_nb_init" $ pdInfo 100 If I add the uncommented line above, it doesn't infer: the representations are not the same. Adding fundeps in Data or Control seems to fix things too much, i.e. something then fixes the representation. Hmm.. I don't get it. This should really be straightforward: stx r m are all uniquely related through the Loop constraint: one defines the other 2, so anything that doesn't support this is a bug. I fixed the Value rep's monad to MValue, which is currently just an identity monad. Looks like _lambda is wrong *Main> :t _lambda _lambda :: (DataWord s, StructVar Value.SValue sr, Struct Value s sr) => (sr -> MValue (Value t)) -> MValue (Value (s -> MValue t)) Removing the code below, type is: _lambda :: (Control stx m r, DataWord s, StructVar stx sr, Struct r s sr) => (sr -> m (r t)) -> m (r (s -> m t)) niValue = "not implemented for Value representation" instance StructVar SValue t where structVar = error $ "StructVar" ++ niValue instance StructVarCons SValue where structVarCons = error $ "StructVarCons" ++ niValue structVarNil = error $ "StructVarNil" ++ niValue Replaced with, modeled after Code's instance niValue f = error $ f ++ " not implemented for Value representation" instance DataWord t => StructVar SValue (L (Value t)) where structVar = niValue "StructVar" instance StructVarCons SValue where structVarCons = niValue "StructVarCons" structVarNil = niValue "StructVarNil" Entry: SM updates Date: Sun Jan 8 13:13:07 EST 2012 Maybe best to move nodes/src/sm to meta/sm.. 
Some impedance matching problems:

- SM assumes fixed block sizes and doesn't pass the length in the sm_tick() routine. This might be good for allowing optimization, but is it really necessary?
- SM still assumes float array state. Make this abstract: only bytes.

Code now generates with info from types. After some fiddling I found out that simpler type annotations are also possible. That made it quite straightforward. The bindings on the structVar line are used for type equations only.

pdModule (update :: (s,i) -> m (s,o)) (init :: m s) = do
  _def "sm_state_size" $ pdInt (4 * ns)
  _def "sm_nb_init"    $ pdInt ns
  _def "sm_nb_in"      $ pdInt ni
  _def "sm_nb_out"     $ pdInt no
  _def "sm_blocksize"  $ pdInt 64
  _def "sm_init"       $ pdInit init
  _def "sm_tick"       $ pdUpdate update
  where
    -- Relate syntax type and s
    (_, stx, _ :: s) = structVar undefined undefined
    -- Type-indexed info.
    ns = structSize stx (undefined :: s)
    ni = structSize stx (undefined :: i)
    no = structSize stx (undefined :: o)

It runs in the SM C test suite. That's a nice milestone. More later.

Entry: replace stx by r s ?
Date: Mon Jan 9 20:46:20 EST 2012

Maybe that's a better approach, since only Code needs this, and Code knows how to unpack a singly wrapped Term. Also, the symmetry between (r s) and sr, as in Struct, is obvious. stx looks a bit like a hack.. However, this messes up StructVarCons, but is that still needed? It seems StructRepr is enough..

EDIT: Actually, that was quite straightforward. There are some more simplifications. structVar doesn't need to return both r s and sr: the StructRepr class is available, so only sr or r s is enough. Did that + implemented stub for variable reuse.

Entry: Next
Date: Tue Jan 10 10:14:31 EST 2012

SM seems done; next is the high-level stuff and cleanup. Did some more name cleanup, removing all the underscores and replacing _if with ifte and _lambda with lam. Also moved pack/unpack to Data (it's not control flow). What to do with Sys.hs ?

Entry: Upsampling : Constant Array / Discrete Events?
Date: Tue Jan 10 11:13:08 EST 2012

One of the nuisances of writing audio DSP code is the upsampling of control-rate signals: rate conversion and associated filtering. You really want to do this 1. automatically and 2. using in-line calculations, avoiding the use of intermediate memory. Can it be solved with current approaches? Yes. It seems the only component that's necessary is a "constant" array, and a way to get at the oversampling factor.

The input of an SM loop is ((i,o),n) where i and o are structures of array pointers. It could be made such that these pointers are simply floats/ints. The problem is to get the 'n' next to that float, or to compute some update coefficients at the start of the loop. This should go somewhere in the SM <-> update path.

The main question is: does this need extra state? The answer should probably be yes. It seems straightforward to do. This is mostly a transformation of the update method itself using serial composition. The problem then is to find a simple way to tag this in the type signature, i.e. something like:

liftUpdate :: LiftUpdate ((A,C),C)

Where A is audio rate (array) and C is control rate (constant float over the current audio frame). This should be made part of SSM because it's not just the update method that needs to be lifted, but of course also the state.

Thinking about it a bit, it's probably best to work bottom up: write some manual cases, then abstract the pattern that turns up as a class.
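As a first manual case, here is a sketch using plain Haskell values rather than the Code representation (the one-pole coefficient 0.01 is an arbitrary choice, and the names are made up):

    -- Zero-order hold plus one-pole smoothing for a single control-rate input.
    -- The held control value and the smoother output are both state.
    data Hold = Hold { heldValue :: Float   -- last control value received
                     , smoothed  :: Float } -- interpolation state

    -- Once per block: latch the new control value into the state.
    holdBlock :: Float -> Hold -> Hold
    holdBlock c s = s { heldValue = c }

    -- Once per sample: move the output towards the held value.
    holdTick :: Hold -> (Hold, Float)
    holdTick (Hold c y) = (Hold c y', y')
      where y' = y + 0.01 * (c - y)

Composed serially in front of the original update method, something of this shape hides the control input behind an ordinary audio-rate signal, which is the lifting that liftUpdate should tag.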
Entry: Evaluation of recent work
Date: Tue Jan 10 13:27:50 EST 2012

Looking back at the last month or so, the following can be noted:

- The type class structure is better, following a better understanding of type classes in general (overlap & fundep) and cleaning up some bugs and warts in the structure itself.

- C code generation turned out to be more difficult than expected. The Language.C AST seems to be quite usable, but C syntax in itself isn't as simple as one might think. It's definitely useful to have the intermediate Term language, and it might even be useful to make a better pretty-printed syntax.

- "Virtual structures" like (Unatom (Car (Cdr (Ref var)))) seemed to be an essential addition to Term, and they map directly to C's struct member dereference: var.m1.m0.m0. This allowed pack/unpack and app/lam to be placed on (almost) the same footing.

- Construction (Cons, Atom, Nil) and destruction (Car, Cdr, Unatom) are not the same thing. (duh! but apparently I got confused by Atom/Unatom).

- Related: structured variables and structured values are not the same thing, though they are related through morphisms (i.e. obtain a structured variable for a structure of variable references).

- All structures down to the C level are binary trees, ending in explicit 1-element leaf nodes. This seems to be essential for avoiding instance ambiguity (a thing and a 1-element collection of things are not the same!)

- C code generation needed to be monadic to enable naming of C struct types. It's not possible to do this in-place because the resulting types are not considered to be equal (no structural matching in C).

- The stx argument to Control (i.e. Term in the case of the Code rep) doesn't seem to be necessary. It's sufficient to use the struct-of-rep <-> rep-of-struct morphism and generate either struct of rep or rep of struct terms with variables, as the user (i.e. Code) knows how to deconstruct rep of struct (i.e. to Term).

- I found out that "partial" type annotation is possible. This makes it possible to avoid manual specification of large class constraints. Everything seems to infer as expected now.

- Existential types for Arrow instances are still a mystery. There seems to be no inherent limitation, just an organization of type class hierarchies that hides the recursion relations necessary for satisfying the Arrow (,) composition.

- Things get complicated when "commutating" multiple type constructors. Two of them, i.e. struct of reps of types <-> rep of struct of types, are relatively straightforward; when mixing in a 3rd one, i.e. rep of array of struct of types, it can get too complicated to understand at a glance.. To build intuition, it helps to create an exhaustive list of all combinations, eliminate those that are not meaningful or relevant, and explicitly name the morphisms between the remaining ones[1].

[1] entry://20111231-153845

Entry: Next
Date: Tue Jan 10 15:43:55 EST 2012

- Fun stuff: autodiff.
- Upsampling scalar inputs + type-directed specification?
- Sys.hs : probably largely irrelevant

Entry: Upsampler: config
Date: Tue Jan 10 15:48:52 EST 2012

The main problem seems to be in the configuration. Two conflicting properties of the current code base:

- Everything is generated automatically from the type of the update method. Synchronous sampling is assumed. This is good if all the types are the same (float*), but not so good if they then need to be individually specified (float or float*).

- Type-directed specification probably assumes the structure of the inputs. Is this always going to be straightforward?
It better be, because in Pd it needs to be.. From this it looks like the problem here is a tricky one. The good thing is that it doesn't exist on the SSM level. EDIT: see next entry Entry: Type Level Programming Date: Wed Jan 11 08:49:32 EST 2012 (Context: see prev entry) What this really needs is a simple way to access the type of the update function. The manipulations mentioned would be so easy to do if that type would be a simple data structure. So it looks like the thing I'm ultimately looking for is a more systematic way to do type-level programming. It's funny that there are already a couple of levels involved: - Haskell type level computations (now mostly just structural induction [1]) - Haskell class instance code (= Term / Value construction) - Term -> C compilation - C -> bin compilation - bin execution Though from a practical pov the middle ones can be grouped = all Haskell code. So, concretely. What is needed? - Given some map specified as a type, like (A,(C,C)), construct a SSM lifted over arrays, where each C node has an upsampler inserted. This can be split into 2 parts. * Construct the upsampler / pass from (A,(C,C)) * Perform simple serial composition with original update function The trouble is that these things live in "array space" but conceptually they need some form of "SSM space" composition. It seems that what is missing is some kind of lift operator that bridges these two spaces. The concept that is missing is indeed a "virtual array" abstraction. More specifically, an interface: v <- read in ; write out v Put differently: hide n. Or maybe: hide inputs in state? This looks a bit like partial application: turn inputs into generators. Something like this: - Abstract input dereference as (si,()) -> (si,o) where si contains all state necessary to perform the dereference. - Compose the 2 systems using ssmSer. Structurally, the main difference is between storing or not storing this intermediate state. It will have to be storing, because for any kind of interpolation we'll beed at least the previous inputs. Beyond that state and interpolation method could be abstract. So, the type is something like this: Array Tfloat -> SSM Tfloat -> Int -> SSM Actually, the state structure is more complicated than that. There are at least 2 components: - Persistent state, i.e. interpolation/filter state - One-time state constructed from the input (float values, array pointers, array size) only live in the current iteration. Simply put: translate "inputs" to "signals" = systems with no input. Probably good to also do the same for output / subsampling. [1] http://en.wikipedia.org/wiki/Structural_induction Entry: Meta-currying Date: Thu Jan 12 08:20:29 EST 2012 Is this actually possible? I'm trying to create a SSM with state = array index, that is parameterized by array pointers outside its update method. Will that work? Entry: CAnalyze Date: Sat Jan 14 11:20:06 EST 2012 I want to have a standard approach for analyzing C files. This means in practice to find a way to factor the recursion over a C AST into something that is manageable. I'm taking the approach of writing a single recursion, parameterized by a type class of analysis methods. This seems to go quite well, though the first hurdle is recursive structure declarations. In Language.C both toplevel declarations and member declarations are captured in the same CDeclaration type. 
We really want to separate those, because of the "just gather" semantics wanted for the toplevel declarations, and the recursive "compile tree" semantics of the member declarations. It seems that "managing the recursion" is 99% of the problem. Anyway, how to capture this? It's quite complex, but I think with a proper structure this can be captured in a good way. What assumptions to make?

1. The result should be a tree structure of a single type. This keeps type signatures simple.
2. Recursion should be monadic: this allows the user to collect statistics instead of just building a tree structure.
3. Toplevel forms should have m () return type because they don't fit the nested structure well. There should be a way to "ignore" things.

Yeah, this needs a different approach. Stacking all that functionality in a single class is not a good idea. Let's keep CAnalyze but rename it to CAtop: toplevel analysis. Then, when gathering lists of function / union / structure definitions / declarations, it's possible to write an individual analysis for each of them.

Not so simple. Maybe this needs a top-down approach: write the wanted ADT for the analysis result first.

So, for structs, the problem is that a structure type can be nested, i.e. for:

T = { T1 m1, T2 m2, ... }

In practice this doesn't happen so much, but it needs to be handled. The trouble is that in my handwaving I didn't think of this in the eventual result, which would be a Lua description of a binary data structure, used for parsing. There is no real issue though, just that it needs to be recursive.

It seems to be best to skip the typeclass and first write a recursion that produces a known type, then later replace the constructors with type class functions if that's still needed.

Struct is too complicated. Let's take an error-driven approach to implement the subset, starting with:

caData = struct where
  oops ast = error $ show $ strip ast
  struct = oops

Hm... messed it up again. It almost worked, but only for structs. Typedefs didn't show up, which made the type info incomplete. To make it complete, it needs all typedefs also. I'm also confusing type name bindings and structure information. Trouble is that in C syntax this all happily lives together.. What a mess!

Next: use the same error-driven approach, but make sure first that the semantics of the eventual type description structure is sane. Note that this needs two parts:

- declaration of types
- declaration of variables / definition of functions

It's probably best to keep those two COMPLETELY separate.

What about using an SSA-style form for the types, providing a name for each unnamed subtype? This then enables at least a flat representation. The main idea seems to be to separate type definitions from type references. What about this? It doesn't cover everything (such as arrays) but it seems like a good start.

data TypeBase = Prim String
              | User String -- typedef
              | Enum String
              | Struct String
              | Union String
              deriving (Eq, Show)

data TypeRef = TypeRef TypeBase Int
             deriving (Eq, Show)

data TypeDef = DefUser   String TypeRef
             | DefStruct String [TypeRef]
             | DefUnion  String [TypeRef]
             | DefEnum   String [(String, Int)]
             deriving (Eq, Show)

Maybe it's better to type the namespaces directly. So.. using the error-driven approach I get to something, but I probably need to refactor it a bit.. Also some more general parsing/matching will be necessary to parse annotation lists.

Entry: Analyzing C
Date: Sun Jan 15 11:27:14 EST 2012

Yesterday I took a stab at parsing a real-world C file. The goals are: 1.
Generate external representations of binary data structures defined as packed structures. Main goal: generate load/save routines in a language different than C. 2. More general problem: generate function and data wrappers for scripting languages. Aiming at Lua for my current real-world problem, but eventually this should fuel libprim[1]. The problem I ran into is the structure of C syntax. Getting C into an AST is not a trivial problem. However, the ad-hoc nature of C syntax makes it hard to work with it directly. It makes a lot of sense to transform a C file into something that's more manageable, like: - type definitions (struct, union, enum, typedef, function). - external (no storage) declarations for data / functions. - internal (storage) declarations for data / functions. For my current application the storage declarations of variables and functions can be ignored. I'm mostly interested in 1. types and 2. external declarations. The first approach I took was to write a parameterized recursion over raw C AST. It turns out that this is too complicatied. Before raw C syntax is usable, it probably needs to be translated into something simpler, reflecting the 5 categories of objects mentioned above. So I wonder, did anyone do this before? Looks like I didn't really do my homework. There's plenty in the Language.C package that deals with higher level processing. Let's have a look at it. Summary: I burned my fingers by failing to see the complexity of C syntax. Disappointing in that what I want to do seems more complicated than I thought, but one step closer to actually getting it done! [1] entry://../libprim Entry: A closer look at Language.C Date: Sun Jan 15 11:44:28 EST 2012 I've seen and used these parts before: Language.C.Data Language.C.Parser Language.C.Pretty Language.C.Syntax I probably don't need these yet because compiling and preprocessing is done outside of the Haskell tool: Language.C.System.GCC Language.C.System.Preprocess So what is this one about? Language.C.Analysis A description from [1]: Analysis of the AST. Currently, we provide a monad for analysis and analyze declarations and types. Especially note that there is no direct support for analyzing function bodies and constant expressions. NOTE This is an experimental interface, and therefore the API will change in the future. Looks like this is exactly what I'm looking for. EDIT: The question is, where to start? [1] http://hackage.haskell.org/packages/archive/language-c/0.4.2/doc/html/Language-C-Analysis.html Entry: Language.C.Analysis Date: Mon Jan 16 10:04:55 EST 2012 Goal: get at all (packed) struct definitions to generate Lua tables for translating between binary storage and table representation. Where to start in [1]? I'm looking at the source, and the first thing that looks familiar is this from AstAnalysis.hs : analyseAST :: (MonadTrav m) => CTranslUnit -> m GlobalDecls This uses MonadTrav[2] class. A predefined instance of this is Trav[3], which is parameterized by a user state. For now I'm using this, inside the IO monad for getting at the file system. info :: (CTranslUnit -> b) -> IO b decls :: IO (Trav () GlobalDecls) decls = info analyseAST Here info parses a C file from the file system into CTranslUnit using readFile, parseC, initPos and decls is the output of the analysis using a Trav monad without user state. So this gives C file to GlobalDecls[4] dat structure. 
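Jumping ahead a little, the pieces assemble into something like this sketch (the module layout and function names are as I remember them for language-c 0.4.2 and should be treated as assumptions, not checked facts):

    import Language.C                              -- parseC, CTranslUnit
    import Language.C.Data.InputStream (readInputStream)
    import Language.C.Data.Position    (initPos)
    import Language.C.Analysis        (analyseAST, runTrav_, gTags)
    import qualified Data.Map as Map

    -- List the struct/union/enum tags of an already-preprocessed C file.
    structTags :: FilePath -> IO [String]
    structTags path = do
      s <- readInputStream path
      let Right tu        = parseC s (initPos path)
          Right (gd, _ws) = runTrav_ (analyseAST tu)
      return $ map show $ Map.keys $ gTags gd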
GlobalDecls gObjs :: Map Ident IdentDecl gTags :: Map SUERef TagDef gTypeDefs :: Map Ident TypeDef Getting at the keys of gObjs and gTypeDefs: decls tu = fst ds where Right ds = runTrav_ $ analyseAST tu objIdents = keys . gObjs . decls typeDefIdents = keys . gTypeDefs . decls These don't seem to have structure definitions. That because they are in SUERef[5], which names anonymous types. tagSUErefs = keys . gTags . decls As a list of strings: onFile ((Prelude.map show) . tagSUErefs) Next: write a function that filters out all packed structures. That was relatively straightforward. However, trying to actually map that data to something exportable (size + basic data types like int / float) actually requires evaluation of C code in the most general case. So solving the problem in a generic way seems quite a hassle. Practically I run into initializations through enums, which show up as CVar. This probably needs access to a global symbol table. It does give me the impression that for many C analysis tasks, a C parser is not enough. An interpretation step is necessary, which can probably best be left to a C compiler! So: almost there. More after the break :) [1] http://hackage.haskell.org/packages/archive/language-c/0.4.2/doc/html/Language-C-Analysis.html [2] http://hackage.haskell.org/packages/archive/language-c/0.4.2/doc/html/Language-C-Analysis-TravMonad.html [3] http://hackage.haskell.org/packages/archive/language-c/0.4.2/doc/html/Language-C-Analysis-TravMonad.html#t:Trav [4] http://hackage.haskell.org/packages/archive/language-c/0.4.2/doc/html/Language-C-Analysis-SemRep.html#t:GlobalDecls [5] http://hackage.haskell.org/packages/archive/language-c/0.4.2/doc/html/Language-C-Data-Ident.html#t:SUERef Entry: Prettyprinting Date: Wed Jan 18 10:39:13 EST 2012 This[1] is simply fantastic. -- (useful subset of) Lua data syntax. data LuaData = LuaStr String | LuaNum String | LuaTable [(LuaData, LuaData)] deriving (Eq) -- Prettyprinting with Text.PrettyPrint.HughesPJ instance Show LuaData where show = (++"\n") . show . docLua docLua (LuaNum s) = text s docLua (LuaStr s) = quotes $ text s docLua (LuaTable xs) = doc where doc = nest 2 $ braces $ cat $ punctuate comma $ list xs list xs = zipWith pair [1..] xs pair i (k,v) = docPair i k $ docLua v -- If i is in sequence, don't print index. docPair i (LuaNum n) sv | show i == n = sv docPair _ n@(LuaNum _) sv = cat [ brackets $ docLua n, sv ] docPair _ (LuaStr s) sv = cat [cat [text s, equals], sv] docPair _ (LuaTable t) sv = error $ "Table as key." [1] http://www.haskell.org/ghc/docs/6.2.2/html/libraries/base/Text.PrettyPrint.HughesPJ.html Entry: DSL concrete syntax Date: Sat Jan 28 10:41:45 EST 2012 I've been claiming I don't really care about syntax, but I have to admit that I miss Scheme macros when working in Haskell. Also, I'm thinking of porting (a subset of) the Staapl framework to Haskell, for which it seems a concrete syntax might be handy. So how to go about that in Haskell? [1] http://www.haskell.org/haskellwiki/Template_Haskell Entry: Does GCC inline function pointers? Date: Tue Nov 13 23:18:55 EST 2012 With __attribute__((always_inline)) it's possible to keep modularity without loosing flat code. However, that's not the full story as this does not abstract over iteration structures (loops with holes). Does GCC inline function pointers? Apparently it does[1]. Let's try this. Indeed, it does, see inline_function_pointer.c It even vectorizes the addition. 
[1] http://stackoverflow.com/questions/5097917/can-gcc-inline-an-indirect-function-call-through-a-constant-array-of-function-po Entry: restrict Date: Tue Nov 13 23:54:46 EST 2012 In which cases is "restrict" more powerful than using C code that performs load/store explicitly? I.e. the code never reads twice from the same memory location? Still, if code is in a loop with only a single load/store, the compiler doesn't know that output and input are not the same so can not re-order the load/store. [1] http://en.wikipedia.org/wiki/Restrict [2] https://lwn.net/Articles/255364/ Entry: GCC AVR ABI Date: Mon Nov 19 10:30:10 EST 2012 r0 temp r1 zero r3:r2 call-saved r5:r4 r7:r6 r9 :r8 arg0 r11:r10 arg1 r13:r12 arg2 r15:r14 arg3 r17:r16 arg4 r19:r18 arg5 rv call-used r21:r20 arg6 rv .. r23:r22 arg7 rv .. r25:r24 arg8 rv .. ??? Not clear from [2] or [3], but easy to test. rv64 = r19:r18:r21:r20:r23:r22:r25:r24 rv64 = r25:r24:r23:r22:r21:r20:r19:r18 r27:r26 X call-used r29:r28 Y call-saved, frame pointer r31:r30 Z call-used From [2]: What registers are used by the C compiler? Data types: * char is 8 bits, int is 16 bits, long is 32 bits, long long is 64 bits, float and double are 32 bits (this is the only supported floating point format), pointers are 16 bits (function pointers are word addresses, to allow addressing up to 128K program memory space). There is a -mint8 option (see Options for the C compiler avr-gcc) to make int 8 bits, but that is not supported by avr-libc and violates C standards (int must be at least 16 bits). It may be removed in a future release. * Call-used registers (r18-r27, r30-r31): May be allocated by gcc for local data. You may use them freely in assembler subroutines. Calling C subroutines can clobber any of them - the caller is responsible for saving and restoring. * Call-saved registers (r2-r17, r28-r29): May be allocated by gcc for local data. Calling C subroutines leaves them unchanged. Assembler subroutines are responsible for saving and restoring these registers, if changed. r29:r28 (Y pointer) is used as a frame pointer (points to local data on stack) if necessary. The requirement for the callee to save/preserve the contents of these registers even applies in situations where the compiler assigns them for argument passing. * Fixed registers (r0, r1): Never allocated by gcc for local data, but often used for fixed purposes: r0 - temporary register, can be clobbered by any C code (except interrupt handlers which save it), may be used to remember something for a while within one piece of assembler code r1 - assumed to be always zero in any C code, may be used to remember something for a while within one piece of assembler code, but must then be cleared after use (clr r1). This includes any use of the [f]mul[s[u]] instructions, which return their result in r1:r0. Interrupt handlers save and clear r1 on entry, and restore r1 on exit (in case it was non-zero). Function call conventions: Arguments - allocated left to right, r25 to r8. All arguments are aligned to start in even-numbered registers (odd-sized arguments, including char, have one free register above them). This allows making better use of the movw instruction on the enhanced core. If too many, those that don't fit are passed on the stack. Return values: 8-bit in r24 (not r25!), 16-bit in r25:r24, up to 32 bits in r22-r25, up to 64 bits in r18-r25. 8-bit return values are zero/sign-extended to 16 bits by the called function (unsigned char is more efficient than signed char - just clr r25). 
Arguments to functions with variable argument lists (printf etc.) are all passed on stack, and char is extended to int. [1] http://www.avrfreaks.net/index.php?name=PNphpBB2&file=printview&t=75932&start=0 [2] http://www.gnu.org/savannah-checkouts/non-gnu/avr-libc/user-manual/FAQ.html#faq_reg_usage [3] http://gcc.gnu.org/wiki/avr-gcc Entry: ORC Date: Mon Nov 26 19:28:14 EST 2012 [1]: VOLK can take advantage of the Oil Runtime Compiler (ORC) to create cross-platform kernels relatively quickly. ORC is a higher-level language way to write SIMD code for different SIMD architectures. The ease of writing an Orc function can be offset by a less well-tuned architecture-specific kernel (generality versus speed). ORC can often be a good place to start writing VOLK kernels and then optimize as necessary. [1] http://gnuradio.org/redmine/projects/gnuradio/wiki/Volk Entry: Poly^2 Date: Sat Dec 22 18:27:23 EST 2012 If there is polymorphism over the domain (code, values), does it make sense to also use polymorphism over operators like scalars, vectors? I.o.w. where does operator algebra fit in? Is this strictly a different layer, or should it be integrated? The base layer is going to be scalar math, but there are surely going to be some occasions where an operation written for a scalar is going to be lifted to a vector. Will this just "inherit" the base polymorphism (code, values) ? Lots of things need to be clarified here. The most important "intuition error" in thinking about this stuff is level confusion (x or meta-x?). Somehow that doesn't always represent properly in my head.. Entry: Is multiple interpretation necessary? Date: Sat Dec 22 18:38:55 EST 2012 If I'm really only interested in compilation, the normal evaluation interpretation can probably be avoided. Entry: Monads too serial? Date: Sat Dec 22 18:39:36 EST 2012 Is there a structure that is less strict than a monad, but can capture a data flow network? Maybe monad isn't too serial? Entry: Monads and arrows Date: Sat Dec 22 18:42:57 EST 2012 To summarize: - monads are for memoization - arrows are for propagating state (sort of generalized "growing type" arrows) Entry: GreenArrays GA144 Date: Sun Dec 23 10:01:36 EST 2012 Is it possible to write a C compiler for a highly parallel Forth chip? In the end, once there is a dataflow application, it's the connections that are important, not the way arguments are encoded. Entry: Language parameterized by monad with pure semantics Date: Sun Dec 23 19:29:35 EST 2012 In DSPM/Data we have a language parameterized by a monad, but the language itself should have a functional semantics: the monad is there for implementing code _structure_ not meaning. Is there a better way to encode this than just the comments? Entry: Status Date: Sun Dec 23 19:36:56 EST 2012 Is it ready for writing SPUT? Can find this out by writing SPUT. Need to find last test case and add some. There are plenty of problems if I recall, all type stuff (is there every any other kind of problem in Haskell?) IIRC, the base problem just implements functions. State is handled through functions. The "arrow problem" is not solved. Probably it can be solved using existential types, but that requires a kind of magic (particular composition structure) I do not yet understand. 
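For reference, a toy sketch of the existential/composition structure being hinted at (Compilable, SSM and ser are stand-ins invented here, not the dspm classes): if the constraint is stored inside the existential and has an instance for pairs, it survives serial composition.

    {-# LANGUAGE ExistentialQuantification #-}

    -- Toy stand-ins: a state-space machine and a "compilable state" class.
    data SSM m s i o = SSM { ssmInit :: s, ssmTick :: (s, i) -> m (s, o) }

    class Compilable s where
      stateSize :: s -> Int

    instance Compilable Float where
      stateSize _ = 1
    instance (Compilable a, Compilable b) => Compilable (a, b) where
      stateSize (a, b) = stateSize a + stateSize b

    -- The existential hides the state type but keeps its dictionary around.
    data Sys m i o = forall s. Compilable s => Sys (SSM m s i o)

    -- Serial composition: the paired state is again Compilable by the
    -- (a, b) instance, so the result is again a Sys.
    ser :: Monad m => Sys m i x -> Sys m x o -> Sys m i o
    ser (Sys f) (Sys g) = Sys $ SSM (ssmInit f, ssmInit g) tick where
      tick ((sf, sg), i) = do (sf', x) <- ssmTick f (sf, i)
                              (sg', o) <- ssmTick g (sg, x)
                              return ((sf', sg'), o)

Whether the real constraint set (Struct, TMLword, Loop, ...) can be packaged this way is exactly the open question.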
Entry: Trying out test cases Date: Sun Dec 23 19:47:17 EST 2012 Let's start with the test cases: tom@tx:~/meta/dspm$ ls -al *test* -rw-r--r-- 1 tom tom 6756 Jan 20 2012 0test_integration.hs # broken -rw-r--r-- 1 tom tom 6182 Jan 20 2012 0test_Loop.hs # broken -rw-r--r-- 1 tom tom 829 Jan 20 2012 0test_Pd.hs # broken -rw-r--r-- 1 tom tom 1667 Jan 8 2012 0test_PrettyC.hs # broken -rw-r--r-- 1 tom tom 1759 Jan 8 2012 0test_TML.hs # Let expression Looks like a Haskell setup problem: PrettyC.hs:477:25: Couldn't match expected type `bytestring-0.9.1.10:Data.ByteString.Internal.ByteString' with actual type `Data.ByteString.Char8.ByteString' Expected type: InputStream Actual type: Data.ByteString.Char8.ByteString In the return type of a call of `Data.ByteString.Char8.pack' In the first argument of `parseC', namely `(Data.ByteString.Char8.pack c)' PrettyC.hs:479:9: No instance for (Show (f0 ())) arising from a use of `ast' Possible fix: add an instance declaration for (Show (f0 ())) In the first argument of `(.)', namely `ast' In the expression: ast . parse In an equation for `c2ast': c2ast = ast . parse Failed, modules loaded: Type, Data, Term, Control, Struct, StateCont, SSM. Entry: Too complex? Date: Sun Dec 23 20:03:39 EST 2012 It seems all a bit complex for such a simple task.. Entry: Data structures Date: Sun Dec 23 20:12:35 EST 2012 IIRC one of the tougher problems was how to implement argument lists to pass to functions. For a first-order language this required data structures, which where implemented as binary trees of scalars. For structures, there are 2 components: - primitive data types (implemented by the target language, i.e. C) - composition mechanism for datastructures in the DSL: binary trees in terms of (,) and () -> these are picked to support Arrow. There is also a mechanism for bridging target language's structure (i.e. C struct) with the DSL. Essentially this is about commutation of struct / rep. It is essential for implementing multi-arg lambda! Seems that this is the basic point. Entry: More use of GCC Date: Sun Dec 23 20:30:14 EST 2012 I wonder if it is possible to skip the "Cons" step in Code and generate arg lists directly. Also, why flatten out everything when C can just as well do the inlining? Multiple rvalues can just use pointers, and even function pointers get inlined properly. This eliminates a whole lot of shuffling.. Also, once there is type-correct "bunch of functions" C-like representation, translating this to something more lowlevel seems like a problem that can just be done separately. My type issues are mostly with high level constructs and "blessed floats". The problem seems to be one of inlining of state data accesses. Leaving this to C might lead to inefficient code. Not sure.. Maybe it's not a problem at all. Entry: Time for the difficult questions Date: Wed Jan 16 22:02:40 CET 2013 I like abstract interpretation. I like autodiff. But is using a typed language really necessary for straightforward DSP code? Requirements: - Polymorphism: abstract algorithms over types (scalar floats, ints, doubles, vectors, matrices, processess) - Some control over expansion/inlining - Multiple interpretations of code: direct eval, DSP code, GUI code. - Straightforward generation of C code Up to now I've been using Haskell for this, but is it really a good choice. Maybe it is better to go for Scheme? Would it be useful to start a small project to see if it makes sense to do AI in Scheme? 
The most useful part of the Haskell experiments has been modeling of dataflow graphs with state abstracted away. This should be straightforward to do in Racket. Trouble here is how much time is this going to take? Main disadvantage of Haskell is the inflexibility of the type-level tricks. Sometimes it is just too hard to make something work that is straightforward to do using direct metaprogramming. Entry: Abstract interpretation in Racket Date: Wed Jan 16 22:15:53 CET 2013 Let's start with a simple language with forms: `function', `locals' and `op'. This covers most of the necessary functionality and would be a good start to define abstract interpretations for: - type checking / inference - evaluation - typed code generation for C The first extension for this would be to lift functions over streams instead of scalars. The first significant representation decision is to represent outputs explicitly, i.e. use "node form" instead of lambda/let form, since lambda form is easily expanded as: (let ((z (op + a b))) ...) -> (node z) (op + (a b c) (z)) However, the main trick in embedding is to use the host language's binding structure. So let's stick to expression form. The AI layer can easily perform transformations to "node form" if that's necessary. In racket, to add multiple interpretations, the options are to use lambda expressions (low level) or the module system. To start out it seems easiest to use plain lambdas. One of the simplest AI requirements is the need for mapping a program to nb of i/o, preferrably with type (if input types or some other inference root are given). Should this go on top of typed scheme? Entry: Simple AI tests Date: Wed Jan 16 22:31:21 CET 2013 EVAL/SSA style AI is quite straightforward[1]. The real trick is to add state to the picture. Here eval is over (causal) streams, and code generation needs to take care of state initialization and propagation. Lets try that out. The simplest way to try this out is using delay which we call `z'. How to implement? Let's try eval first. What is 'evaluated stream processing code' ? It is a function taking streams. What are streams? Note that "streams" for (audio) DSP usually refers to "causal, IIR stream operations", meaning that functions applied to streams can be fed one item at a time and will store state. The interesting problem here is to try to capture output feedback in a lambda/expression syntax. A function's output is not named, so how to name its delayed output? The simplest problem is a summer: (lambda (x) (+ x (z out))) Where does 'out' come from? Shall we just add a magic name, or require state variables to be named? It seems to be only output feedback that causes this problem. Maybe add a new binding form? (let ((a ...) ((z a) ...) ...)) where `z' means next: occurance of (z a) form will add an `a' variable to the state and will bind its new value. This is really a sugar problem: the representation is simple: a mimo fuction where part of the i/o is the state that is fed back. So let's ignore it for now, as it is not core. (define-syntax-rule (lambda-system (si ...) (i ...) (so ...) (o ...) (b ...)) (lambda (si ... i ...) (let* (b ...) (values so ... o ...)))) Conclusions: - using lambda as representation allows lexical scope to be used in definitions: no mucking about with symbols - basic substrate is _initial state_ + _update functions_ (s,i)->(s,o) the latter solves the code generation problem: just SSA of lambda expressions using AI. 
- state initializers should be defined together with code - composition is important: use the arrow interface for this. - lexical abstractions go on top of basic initializer + function representation, i.e. use a 'z' or '@' form. [1] http://zwizwa.be/darcs/meta/rai/rai-ssa.rkt Entry: siso abstraction in Scheme Date: Thu Jan 17 00:30:05 CET 2013 When sticking to a pure functional representation, code generation is straightforward: it is just SSA eval. The real problem is manipulation of state and metadata such as inputs/outputs, and construction of composition operators such that state machines can be chained without manual state manips. COMPOSITION IS THE KEY! In Haskell, the cool part here was the use of '.' What I really want however is a way to use ordinary `let' forms to connect the inputs and outputs of state machines, but have the state be handled separately. (define op (i1 i2 ...) (o1 o2 ...) (let ((x (big-stateful-filter i1))) ...)) What I really want is something like Haskell arrow 'proc' notation[1]. How to distill the basic "state multiplication" approach used in dspm? The problem seems to be that state size is type information, so it's probably best to look at this problem at expansion time. The main problem is the "inference" of the state type. Using arrows it's simple, because arrow notation orders the compositions, such that we only need to implement binary composition operations. Does it make sense to do something similar for Scheme? Conclusion: using an arrow approach might be a good idea: it reuses a standard pattern, and allows composition to be limited to binary forms. ( A nasty question: what's the advantage over using straight up OO stateful approach? Lexically this approach here uses less code: all constructors are inferred from the body of the code, and all state updates can be done behind the scenes. The whole class/instance distinction disappears from view. ) To make this work in Scheme, some information needs to be attached to compile-time representation in order to expand (lambda (in) (let* (... (x (fn y)) ...) ...) (values out)) to (lambda ((... si ...) in) (let-values* (... ((so x) (fn si y)) ...) ...) (values (... so ...) out)) This doesn't seem like a big deal really. Because the let* form is linear, all state variables appear in order. This might give a huge benefit for cache management since all references are predictable. For a real application, state memory will probably dominate necessary intermediate storage.. Also it might even make it possible to split loops to trade off between state size and intermediate coupling size. Next: how to attach I/O size to a name at compile time? [1] http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/arrow-notation.html Entry: SISO in scheme Date: Thu Jan 17 01:11:26 CET 2013 - Code substrate is pure functions. Re-use host binding forms. - State threading is a compile-time operation which generates pure functions and initial state vectors. Practical Racket macro issues aside, this is all really straightforward. Advantages: when this composition mechanism is used to combine a large number of state machines into a small i/o processor, the memory access patterns are a lot easier to handle since state access will dominate interconnect. - All operations are local, which means that once state updates are computed they can be tucked away in slower memory until the next iteration. - Sequence of state memory accesses is linear, which means that reads can be prefetched. 
Combination of both means that state can be in higher-latency memory, as long as the bandwidth is high enough. - Most cache is made available for intermediate results, i.e. i/o connections between different state machines. Entry: Racket compile-time bindings Date: Thu Jan 17 16:34:56 CET 2013 Basic mechanism is: (require (for-syntax scheme/base)) (define-syntax ) Syntax identifiers are only accessible during transformation. E.g. the `slv' macro below lifts the value 123 from compile time to syntax through `datum->syntax'. (define-syntax foo 123) (define-syntax (slv stx) (syntax-case stx () ((_ id) (datum->syntax stx (syntax-local-value #'id))))) After that, the level reasoning gets difficult. Needs some rehash.. Ok, I got it: ;; Define a compile time entity representing a siso system. (define-syntax (define-system stx) (syntax-case stx () ((_ name (si i) (so o) b) #`(define-syntax name (list #,(length (syntax->datum #'si)) #'(lambda-system (si i) (so o) b)))))) ;; Expand compile time siso entity to function. (define-syntax (system-function stx) (syntax-case stx () ((_ id) (match (syntax-local-value #'id) ((list narg spec) spec))))) This allows definitions of compile-time entities (state size and body syntax) that contain structural information that can be used by other macros, such as `system-function'. This should be enough to build a composition macro that builds such a compile time object (size + syntax) from a specification syntax. In the examples above, it's not even necessary to cache the lenght as it can be recomputed from the system syntax: (define-syntax (define-system stx) (syntax-case stx () ((_ name (si i) (so o) b) #`(define-syntax name #'((si i) (so o) b))))) (begin-for-syntax (define (system-num-state id) (length (syntax->datum (syntax-local-value id)))) (define (system-body-def id) #`(lambda-system #,@(syntax-local-value id)))) (define-syntax (system-function stx) (syntax-case stx () ((_ id) (system-body-def #'id)))) (define-syntax (system-arity stx) (syntax-case stx () ((_ id) (system-num-state #'id)))) So the idea is now to make a `system' form containing a collection of bindings, which expands to a lambda form derived from the systems that are present in the body. Note this requires distinction between primitive and composite systems. (compose-system (in ...) (out ...) (bindings ...)) Since we're binding syntax to an identifier, there will be no code reuse if we're not being careful, so this might be added later: instead of expanding the body inline, record the references. This would allow generation of inline functions in C, where the C compiler decides on performing the inlining. For now, just inline. Ok, works. Minimal config: (begin-for-syntax (define (thread-state vdefs) (let* ((vs (for/list ((vdef vdefs)) (syntax-case vdef () (((var ...) (op arg ...)) ;; If it's a processor, generate state variables. (let-values (((si so) (system-state #'op))) (let ((g-si (generate-temporaries si)) (g-so (generate-temporaries so)) (l-op (system-body-def #'op))) (list g-si g-so #`((#,@g-so var ...) 
(#,l-op #,@g-si arg ...)))))) ;(else (list '() vdef)) ))) (g-sis (apply append (map car vs))) (g-sos (apply append (map cadr vs))) (state-vdefs (map caddr vs))) (values g-sis g-sos state-vdefs))) ) (define-syntax (define-system-comp stx) (syntax-case stx () ((_ name in out (vdef ...)) (let-values (((si so vdefs) (thread-state (syntax->list #'(vdef ...))))) #`(define-system-prim name (#,si in) (#,so out) #,vdefs))))) (define-system-comp foo (i) (o3) (((o1) (integrator i)) ((o2) (integrator o1)) ((o3) (integrator o2)) )) Next: should the form support only composition of siso systems, or should it lift pure functions too? Pure functions are useful as glue, but automatic lifting can be confusing...
Entry: siso.rkt Date: Fri Jan 18 22:45:01 CET 2013 While I like the basic simplicity of the approach, I'm not sure I understand why there is a real need to split the abstraction into two layers: ;; * CORE REPRESENTATION: For generating target C code and test suite ;; the technique of abstract evaluation is used. Code is represented ;; as a (pure) function which can be evaluated over several abstract ;; domains. ;; ;; * SYNTACTIC SUGAR: To generate the lambda syntax corresponding to ;; this functional representation, a collection of macros is used to ;; remove the notational burden of explicit state threading. Only the latter layer has information about "structure" of the code. The former is truly only a semantic analysis. What I find weird is the need for "reconstruction" of memoization structure while performing AI of the core representation layer. It seems a bit backwards: we have the info at expansion time, so why throw it away? Considering current time constraints, is there a way to cut this problem short? Does it matter? Will new insights make it hard to rewrite end-user code that is written in this 2-phase way?
Entry: Practical application Date: Fri Jan 18 23:21:06 CET 2013 ( Towards a practical application: audio processing code. ) * Will polymorphism be necessary? E.g. instantiate functionality over scalar and vector? * Will type inference be necessary? E.g. for constants? * What about conditional execution? This was a nontrivial part in the Haskell code. I'm thinking to cut these short and bring them up when they are actually needed. There is a "proper" way to do this (as in the Haskell code) and there is probably a "distilled" way to do this, keeping only the basic things of the algo: the leverage without all the layers of abstraction.
Entry: Doing it wrong? Date: Fri Jan 18 23:44:48 CET 2013 Is there a way to define this stuff on a higher level? Instead of working with pure update functions explicitly, why not work with operations on streams? Every value is a stream. Is there a way to perform AI on this such that the state-threading comes out directly? It seems that "AI composes", but the syntactic stuff does not. Let's try this out: - specification is on streams - abstract interpretation performs the insertion of state nodes Can't grasp it, some ideas: - It should be AI, and some of the AI runs at Scheme compile time, such that it will be possible to generate _SCHEME_ code that implements the abstract domain code. It's a great test. Basically, stay away from syntax in the basic functionality: really stick to functions, as functions compose.
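As a minimal illustration of what "functions compose" means at the value level (a sketch with made-up names, not the actual rai-ssa.rkt code): a siso system is just a pure update function from (state, input) to (new state, output), and serial composition concatenates the states.

  ;; two update functions: state and input in, new state and output out
  (define (integrator s x)
    (values (+ s x) (+ s x)))
  (define ((one-pole a) s x)
    (let ((y (+ s (* a (- x s)))))
      (values y y)))

  ;; serial composition: the composite state is the pair of component states
  (define ((siso-compose f g) sf sg x)
    (let*-values (((sf+ y) (f sf x))
                  ((sg+ z) (g sg y)))
      (values sf+ sg+ z)))

  ;; ((siso-compose integrator (one-pole 1/2)) 0 0 1)  =>  1 1/2 1/2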
Simply modifying the code in rai-ssa.rkt poses the problem that a state update function has at least 2 outputs: state and out. This needs to be represented in code. Entry: AI stream spec to state-threaded update function Date: Sat Jan 19 00:43:04 CET 2013 It seems possible. Let's go for it. EDIT: Indeed: very elegant piece of code. Next: - Level-shifting macro OK - Type inference? - Generate C code Entry: C code generation from scheme syntax? Date: Sat Jan 19 14:22:26 CET 2013 Generating Scheme syntax has the benefit that some checks can be performed on the generated syntax. It might be best to convert the Scheme syntax to C, instead of generating C directly. Trouble is though that processing state should be abstracted into structs at the very end (primitives should stick to scalar addressing). I'm just using a #define indirection to map the generated state names to float array elements. It seems best to also compile straight to chunk processing function. EDIT: Ok, works, this is what comes out: void fun(_ * restrict so, _ * restrict si, const _ ** restrict in, _ ** restrict out, int nb_samples) { for (int i = 0; i < nb_samples; i++) { #define si0_17 (si[0]) #define so0_18 (so[0]) #define si0_20 (si[1]) #define so0_21 (so[1]) #define i0_12 (in[0][i]) #define i1_13 (in[1][i]) _ o0_14; mul(&o0_14, i0_12, i0_12); _ o0_15; mul(&o0_15, i1_13, i1_13); _ o0_16; add(&o0_16, o0_14, o0_15); _ o0_19; integrate(&so0_18, &o0_19, si0_17, o0_16); _ o0_22; integrate(&so0_21, &o0_22, si0_20, o0_19); out[0][i] = o0_22; _ *tmp = si; si = so; so = tmp; } } Entry: Multi out Date: Sat Jan 19 20:01:35 CET 2013 Seems not trivial: - Won't work for expressions: some other "context" is necessary, i.e. let-values. - Might need separate partial evaluation step to determine # outputs Function output arity is simple: ;; This one is simpler: all primitive functions are replaced by ;; `void', mapping anything to #void, which yields a (values #void ;; ...) result where only the size is significant. (define (ai-arity p) (define eval-void (p void void void)) (define nb-in (procedure-arity eval-void)) (define args (for/list ((i (in-range 0 nb-in))) #f)) (define nb-out (length (call-with-values (lambda () (apply eval-void args)) list))) (values nb-in nb-out)) However, this raises the issue that inner functions cannot be abstractly evaluated. It might be best to remove the "poor man's module" approach now and use a different mechanism for propagating semantics. Anyway, why not use the same approach as in Haskell implementation? Type classes are implemented by threading the semantics through the code, passed as a first argument to each function. A similar approach might work better than units. Somehow it seems that units are not modular enough.. I.e. using syntax-parameterize. Yep, works! (define-syntax-parameter ai-prims #f) (define-syntax-rule (ai-lambda (arg ...) expr) (lambda (p arg ...) (syntax-parameterize ((ai-prims (lambda _ #'p))) expr))) (define-struct ai (add mul integrate)) (define-syntax-rule (add a b) ((ai-add (ai-prims)) a b)) (define-syntax-rule (mul a b) ((ai-mul (ai-prims)) a b)) (define-syntax-rule (integrate a b) ((ai-integrate (ai-prims)) a b)) Next is how to do this for nesting ai-lambdas? Basically this needs a re-implementation of %app Entry: Modifying %app syntax? Date: Sat Jan 19 22:02:22 CET 2013 Looks like the DSL needs a modifed %app syntax. The semantics threading seems like a good idea, but it really needs to go in the app form, not the primitives, otherwise composition isn't possible. 
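Roughly what that could look like, as a hedged sketch building on the `ai-prims' syntax parameter above (`ai-app' is a hypothetical name, not the actual rai form):

  ;; inside an ai-lambda body, every application passes the current
  ;; semantics as a hidden first argument, so user-defined functions
  ;; (themselves made with ai-lambda) get it threaded automatically
  (define-syntax-rule (ai-app fn arg ...)
    (fn (ai-prims) arg ...))

The primitive wrappers (add, mul, ...) stay as they are; the application form only has to take care of composite functions, which is precisely the composition case the wrappers alone cannot handle.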
So let's make a proper #lang module. All pretty straightforward up to the point where #%module-begin starts messing with the `define' form. It expands (define (name x) ...) -> (define-values (name) (new-lambda (x) ...)) Looks like #%module-begin needs to be overridden.
Entry: Multiple return values Date: Sun Jan 20 00:56:18 CET 2013 Seems that only "function" needs this. The primitives are all single-valued. Weird...
Entry: Phasors Date: Sun Jan 20 11:18:58 CET 2013 Should phasors be implemented as unsigned integers, followed by int->float conversion and a scaling operation, possibly some bit-twiddling, or should we just stick to floats? Another problem is delay lines: here there might need to be extra precision over the 24 bit in a float. Trouble with ints is that they make the code typed. A single type is the same as no type..
Entry: Time delay primitive Date: Sun Jan 20 11:42:22 CET 2013 Is it enough to just have a time delay primitive? Needed is both input and output delay. Let's try to make a composite integrator object that way. I don't think this works.. Since the feedback is local, it can only be done in the primitives. Makes sense! The whole point of the stream abstraction is to tuck away such feedback. Maybe this needs a different kind of operator. Something like the `rec' operation in ArrowLoop. I don't see a way around this. Currently it seems best to stick to biquad, one-pole and z as basic ops, and rely on partial evaluation to optimize away unnecessary operations.
Entry: Onepole Date: Sun Jan 20 12:10:31 CET 2013 Yeah I want to implement onepole in terms of twopole. There are no "larger" delays, except for real delay lines which need a separate abstraction anyway. Let's just go for twopole, or better a SVF: all input/output processing (non-feedback) can be removed later. So... I'm still using add/mul in the implementation.. No no no, this needs a different approach. I need a `rec' operator. Something like the ai-lambda form that performs state feedback. Basically, this is a type issue: - scalar functions - output feedback However, it feels as if this mixes two abstractions. Output feedback stream processors are an implementation detail. It's a tough one...
Entry: Stream fold Date: Sun Jan 20 13:17:55 CET 2013 - specification: (lambda/fold (s0 s1) (i0 i1) ... (values/fold (q r) (s t))) - compilation: somehow this needs to make it into the SSA compiler. what does it need to know? It doesn't seem accessible. This needs a second layer, i.e. a second "language". Something that can generate primitives that are plugged into the other language. It's really a type issue: the stream language operates on streams, not on scalar samples.. The point in the code where this information needs to be exposed is: (define ((e-op op) a b) (node 0 1 #`(#,op #,a #,b))) (define ((e-stream op) x) (node 1 1 #`(#,op #,x))) It will work if we can attach something to the `op' syntax identifier. Hmm.. this stuff is hard to reason about. This should be done in a bootstrap phase: Phase 1: Use make-ssa-compiler without the stream extensions to compile a bunch of functions that can be tagged as stream processors. Phase 2: Use make-ssa-compiler again, but this time extended with a bunch of extra "primitives". Note that primitives are modeled as Scheme functions now, not syntax, so this should be straightforward. Trouble is keeping all the ducks in a row. Roadmap: - Turn semantics into a list to make it easier to extend. Let's think about this some more.
Can't this be used to define a primitive that does this particular tagging? I.e. something like: (let* ((b (causal 1 (f a)))) ...) where `causal' indicates that the following function is the specification of a causal system with 1 state variable. Let's try this. The causal form then probably needs to be a macro such that it can delay the execution of the usual binding form. Wait.. should the language be abstracted as call by name to leave the evaluation order up to the abstract interpretation? Let's give that a try first. Hmm, seems to hard to change. Let's just use a macro for now then see what happens. It seems that "causal" needs to be a macro, creating a "state injection" form, something like (causal 2 (f a b)) -> (_causal 2 (lambda (s0 s1) (f s0 s1 a b))) Still, this looks like cheating, i.e. mixing implementation of a single feature in both the surface syntax and the abstract interpretation, but I do not see any way around this when using strict evaluation by default. Let's build it and see where it goes. Starts to make sense.. The trouble is opacity of types.. Ok, I got it working. Ticky part is to distinguish: - Binding of output variables: these are just primitive nodes! - Performing the state assignment separately Entry: This is going to be fun! Date: Sun Jan 20 17:47:07 CET 2013 Now i want something like this: (matmul ((so_x) (c_xx c_xy) (si_x)) ((so_y) (c_yx c_xx) (si_y))) :) Entry: ai-eval.rkt Date: Sun Jan 20 19:14:00 CET 2013 How should the direct evaluation of fold-state work? (fold-state sem 1 (lambda (s) (int-loop (sem) s x))) I'm not sure this will work since it mixes some types. fold-loop fundamentally alters the semantics of its subsyntax (from stream to scalar!). ( Confusions like this won't happen in a typed language ! ) Maybe it's best to change the outside of the abstraction a bit to at least make it more meaningful. Another thing this brings up is: what happens when code inside a fold-state form references a stateful processor? (define level-test (lambda ((s) (x)) (values (integrate s) (integrate x)))) (lambda (si0_2295 si0_2298 si0_2294 i0_2293) (let*-values (((o0_2296) (p_add si0_2295 si0_2294)) ((so0_2297) (p_set o0_2296)) ((o0_2299) (p_add si0_2298 i0_2293)) ((so0_2300) (p_set o0_2299)) ((so0_2301) (p_set o0_2296))) (values so0_2297 so0_2300 so0_2301 o0_2299))) After substitution: (lambda (si0 si1 si2 i) (let*-values (((o0) (p_add si0 si2)) ((so0) (p_set o0)) ((o1) (p_add si1 i)) ((so1) (p_set o1)) ((so2) (p_set o0))) (values so0 so1 so2 o1))) so2 = so0 <- (p_add si0 si2) ;; a, 2a, 4a ,.... so1 = o1 <- (p_add si1 i) ;; integral of i It seems to just work. The output is a single integral, coming from '(integrate x)', and a geometric series from '(integrate s)' Weird.. so what does it mean, stream-wize? The thing is, evaluation can be side-stepped since there is a compilation to Scheme part that can just compute the whole thing at once. I think it's probably best to tuck away the abstraction behind a lambda form only, and remove the `causal' form. ;; AI can call the function properly. (Leaky abstraction?) (define-syntax (causal stx) (syntax-case stx () ((_ nb-state (op arg ...)) (let* ((n (syntax->datum #'nb-state)) (s (generate-temporaries (make-list n 's)))) #`(fold-state (ai-semantics) nb-state (lambda #,s (op (ai-semantics) #,@s arg ...))))))) Entry: What are feedback stream operations? 
Date: Sun Jan 20 20:39:18 CET 2013 Feedback stream operations are still stream operations, but there is an imposed relation between the input state streams and the output state streams, namely that input is delayed output.
Entry: ai-eval.rkt Date: Sun Jan 20 21:13:15 CET 2013 The argument to feedback is a function that takes delayed streams to streams. The trick here is the representation of a stream. It somehow needs to be lazy. A neat lambda-trick is hiding here... Let's think of it the other way around: what is the output of the compilation step? Can it be just an update function? Passing values to the feedback function should return a function that takes a number of inputs. Let's just make all primitives delayed, see what happens. Yeah, I don't see it... I've updated ai-scheme.rkt to include a wrapper for stream processing, so this now works: (define intgr (ai-scheme-stream integrate)) box> (intgr '(1 1 1)) ((1 2 3))
Entry: Next Date: Mon Jan 21 00:49:27 CET 2013 - Figure out why eval semantics don't work.
Entry: Can #%app be overridden locally? Date: Mon Jan 21 11:08:51 CET 2013 What I'm looking for is probably a `lambda' macro that iterates over syntax and inserts `my-app' in the proper places. Yes, it does work. After reading the doc it says that `#%app' is inserted by the macro expander with the lexical context of the expression, so I tried the following: (let-syntax ((#%app (syntax-rules () ((_ . a) 'a)))) (foo 1 2 3)) => (foo 1 2 3) The trouble is that redefining #%app in a lambda form needs insertion of a non-hygienic #%app == same context as inserted by the expander. After a bit of experimenting, it seems that this is a bad idea. It seems better to implement a new lambda form that performs its own recursion over an expression through inserting its own `app' and `datum' references. The "global variable" #%app thing doesn't seem to be such a good idea, except when implementing a language that never inserts or is inserted into another language context. In short, in Staapl there are "local language contexts" in the form of macros that perform special-case interpretation. [1] http://docs.racket-lang.org/reference/application.html?q=%23%25app
Entry: Haskell vs. Scheme : DSL embedding with higher order abstract syntax Date: Tue Jan 22 12:48:59 CET 2013 The cool thing about Haskell is that Monads and the `do' notation are quite a strong abstraction that allows a lot to be done behind the scenes. In Scheme, I still need 2 kinds of syntax: a special case frontend that tucks away the semantics threading (cfr. environment monad) and evaluation order tricks, and one that does the actual abstract interpretation on the resulting lambda expressions. I.e. you "need" macros because of evaluation order restrictions.
Entry: Autodifferentiation Date: Wed Jan 23 14:39:24 CET 2013 Next interesting question is: how to add an abstract interpretation that returns new, modified abstract syntax? Autodiff comes to mind. First, this needs a protocol. How to represent the normal numbers? Best to just increase arity, i.e. (x y) -> z -> (x0 x1 y0 y1) -> (z0 z1) This makes the frontend work. The frontend then maps this double arity function to the single arity function evaluated over normal number structs. First however, start with 1-argument functions.
Entry: AI and single (concrete) interpretation Date: Wed Jan 23 15:14:36 CET 2013 Even if there is only a single "concrete" interpretation of syntax (i.e.
as generated SSA code), it still enables implementation of code transformers like autodiff, and lifting expressions over complex numbers. I wrote a unification algo in scheme before. Don't remember where I left it.. I have to do this from the ground up once, to see the connections, then solve it..
Entry: Testability / partial evaluation Date: Thu Jan 24 11:48:04 CET 2013 Currently there are Scheme and C primitives. These should be the same. How to enforce that? Is this a problem that will eventually be non-existent, or should we scrap the prim.rkt and only compile to C directly? Scheme prims are important for partial evaluation. This works now, except that things like (+ 1 (+ x 2)) won't work due to expression ordering.
Entry: Loadable modules in Pd Date: Fri Jan 25 13:18:24 CET 2013 To work around limitations of libld, the main_bin.c together with rai.ld defines a binary format used for dynamic loading of code. The rai code itself does not have any undefined references, so a simple mechanism is possible. See [1] for some more info. This required mprotect() to enable code execution, which required page-aligned memory allocated with posix_memalign(). [1] entry://../c/20130124-232227
Entry: C loop generation: split inputs in streams and parameters Date: Fri Jan 25 13:21:36 CET 2013 Straightforward: implemented in main_pd.c
Entry: Propagate parameter names Date: Fri Jan 25 13:26:42 CET 2013 Especially for the "constant" parameter part it would be nice to be able to transport the parameter names for run-time access. This needs Scheme surface syntax support.
Entry: Parameter structure Date: Fri Jan 25 15:47:30 CET 2013 Allow input / parameter to be organized in a tree structure. Some ideas: * from a C perspective (state and input management) we really just want flat structures, possibly named. Structure can then be recovered using a good naming convention. Doing this with compile time metadata is a pain. * flat C side means no need to separate inputs/params. simply make it easy to distinguish the boundary between the two (if the inputs structure is flattened, the offset of the flattened structure needs to be tracked) The question is then purely syntactical on the specification side: - define abstraction and application forms. - how to deal with loops? (later, just inline for now) EDIT: `map', adjusted for context threading, should really just work. The AI then needs to pass a list of values, or a representation of a list that the AI version of `map' understands. The problem is: how does the evaluator know the type? Summary: the problem is types: the evaluator needs to know what to put in. Maybe the trick is to separate definition and instantiation. This would avoid the need for a type system.
Entry: Plugin extension: SP Date: Fri Jan 25 18:22:11 CET 2013 signal processor simple plugin
Entry: Map and AI Date: Sat Jan 26 01:42:23 CET 2013 Wait... at the moment that map "consumes" an input variable, it knows it will be an array. Why not annotate this information to the dictionary? Types can be added in a destructive manner easily. It really can't be so hard to add a unification-based type inference to a simple language like this. The main driver is the desire to infer variable structure, which would be great. However, as a side effect, it will probably also solve the float / int distinction. Seems that this is a good approach: - map each variable to a type - every time there is some relation, like t1 = List t2, perform a substitution binding to the type binding table.
- substitute

  a : t1
  b : t2

  c = a + b      =>  t1 = t2

  a : t1
  b : t1
  c : t1

  d = map f c    =>  c,d : t1 = List t3

  a : List t3
  b : List t3
  c : List t3
  d : List t3

Basically, this can be done separately, but it would be pretty much the same derivation as SSA, so might be done together. It seems there should be no difficulty because there is no inference step: everything is ordered. [1] http://en.wikipedia.org/wiki/Hindley%E2%80%93Milner
Entry: bound-identifier=? Date: Sat Jan 26 11:42:16 CET 2013 (require (for-syntax scheme/base)) (define-syntax (s= stx) (syntax-case stx () ((_ a b) #`(bound-identifier=? #'a #'b))))
Entry: Type inference (unification) Date: Sat Jan 26 15:09:18 CET 2013 Turns out to be very simple. Components: - Type environment, binding node names to type names. (Initialized by setting each node to a separate type variable) - Every primitive application unifies the nodes that take part in the expression. t_0 = t_1 = ... = t_k - Every deconstruction application unifies the type variable of the input node with a type-constructor-wrapped deconstructed type: t_struct = List t_element - Unification can be implemented by brute force deep substitution of a symbol with its binding in the type environment. Questions: - Can the quadratic complexity of the brute-force substitution be linearized? EDIT: I think the trick is to just not unify (substitute) already equal types.
Entry: Trouble Date: Sat Jan 26 15:36:11 CET 2013 The general idea above works, but there's a slight problem. Type inference seems to work "backwards", meaning that going down the code actually affects type bindings that were made earlier. Trouble came with typing `in' :: L a -> a for a binding (e (in l)) and types e : e and l : l, the unification step is l = List e However, it is possible that l is not a type variable, but something like List e2. In which case the unification would be: List e = List e2 => e = e2 Ok, this works.
Entry: Loops Date: Sat Jan 26 23:34:47 CET 2013 The only reason I need structured types is to abstract loops. For now, the `in' operator infers to a (S t) type, and produces a `p_car' primitive in the output. The p_car : (S t) -> t should now be generalized to any type of fold operation. Maybe the `for/fold' form is a good place to start. (for/fold ((x (in xs))) EDIT: The thing is, loops should probably be propagated to the evaluator/compiler, so it can do what it needs to do. The trouble is that it introduces another hinge point.. How to make sure it doesn't interfere with the rest? To keep the concept clean, the best thing to do seems to be to not expose recursion directly (i.e. use combinators). This needs `map' and `reduce'. `reduce' is more useful (convolution, summing mixer, ...) so let's try that first in the same "algebraic inspired" way I've been working before, i.e. to try to identify how one component should work and then to connect the dots. Wow, that's starting out with a bad idea. What is necessary is a primitive way to do recursion, i.e. named let. That would probably lead to the simplest solution. Nope. The simplest thing is that which is already used: a state machine driver. Something that takes a SISO system and iterates it over an abstract data type. This way the C mainloop can also be abstracted in Scheme syntax. [1] http://docs.racket-lang.org/guide/for.html#(part._for/fold)
Entry: Mainloop over in/out/param blocks Date: Sun Jan 27 11:50:56 CET 2013 Find a way to lift an expression to in/param -> out notation, then see how state fits the picture. Let's focus on the lifting operation.
Maybe on the special case that's already there: lift the first n inputs to a block. Not easy.. Things like this: what does the C processor need? Actually we're just abstracting that information. Maybe this should be abstracted first to types? It seems that at least a little bit of the secret is in lifting the primitives over streams, i.e. where one argument is a stream and the other is a scalar. The C processor needs to know: - expression in scalar form - difference between input and output - difference between scalar and vector Maybe first some refactoring needs to be done here. Essentially, what the compiler gets is: (lambda (state-in vector-in scalar-in) .... ;; bindings (state-out vector-out)) state-in : t vector-in : V t scalar-in : t state-out : t vector-out : V t with the extra information that: - state is fed back - vector types are indexed
Entry: Loops: combinators or straight recursion? Date: Sun Jan 27 12:55:04 CET 2013 Straight recursion might be simplest, but this requires some support in the type checker and evaluator. Probably needs a Y-combinator to break the typing loop. So ordinary "hierarchical" combinators look more interesting. However, this requires type annotation of function bodies. Can this be avoided by using "macro combinators" that are type checked only when they are inlined? I.e. not supporting higher order functions in the language, just some recursive expansion guard. Maybe this should look to true SSA. How are loops handled in SSA? From a higher perspective: this stuff is really important. Not all loops are going to be SISO loops. Some are going to be proper folds with explicit state init, i.e. SIS + S0. Better make it work! The good thing is that the current abstraction for stream functions is pretty good. The trouble now is drivers. How to put that in the language? What does a driver do? - Feed back state I/O - Lift operations over arrays That should probably stay like that.
Entry: Macros, combinators, circuits. Date: Sun Jan 27 18:44:18 CET 2013 Maybe it's time for some harder questions. What is the real difference between higher order functions and higher order forms? Meaning, there are ways to implement abstract combinators without the need for higher order functions, but this means the combinators are not functions themselves, i.e. some kind of strict order of combinators. (E.g. Backus' Function level programming [1]). A lot of vague ideas and I'm not sure what to do with them. I'm missing a key insight. What is a loop? Can the data/codata idea help? A loop is the dual of a data structure, e.g. an array <-> linear iteration over elements. Is there something like a canonical combinator? Meaning, if you know the data type, you know the iteration pattern. Yes, probably, but does that help? Where to look... It seems, from a wider application point of view, that restricted recursion is a good way to go. Essentially, what I'm looking for is a specification of a largely static collection of mathematical operations on data acquired over time, possibly in blocks. Currently there is already one operator: feedback, implemented by feeding back the first n I/O streams. The same could be done for mapping onto vectors. Maybe the thing should be kept more functional: don't think about "running the same code more than once", but think in terms of variable binding. What a stream processor really is, is a *duplication* of the same functionality, bound on some kind of data flow grid. That is the fundamental viewpoint. Combinators are then ways to build grids from smaller *functional* pieces, i.e.
think in circuits. Loops combined with memory are just a way to implement self-similarity of circuits. So, currently there is really only one combinator or "circuit duplicator", which maps a functional i->o network on - first n_s I/O -> causal feedback - next n_i I and n_o O -> streams - last n_p I -> constants what is missing is an explicit mention of initial state and final combinator (something that reduces the last output). This is `for/fold' with outputs. Nice, but that doesn't bring much bread on the plank.. [1] http://en.wikipedia.org/wiki/Function-level_programming Entry: Practically: lazy encoding? Date: Sun Jan 27 19:17:10 CET 2013 Maybe it will help to make the language lazy. That would avoid some macros, and would expose combinators. Combinators also need to be typed. Entry: State threading context needs to be explicit. Date: Sun Jan 27 19:21:57 CET 2013 What is currently missing is the "end point" of state feedback. I.e. it is assumed that the "context" is stream processing. Basically, this needs to be made symmetric: `feedback' refers to an environment that collects all the threaded state introduced during execution. It needs to be delimited. This is a bit like a prompt or continuation marker. Summary: move from the idea that streams are "global" to "local". The misleading part is that the engineering context is a "global" stream that is essentially "open". There is no begin (technically there is: state initialized to 0) but there is no end: only the I/O is of importance. However, for "local streams", the end state might be the only thing we're interested in, i.e. a fold. Such operations need to be localized, such that they look like pure function evaluations on the outside. Also important: this implicit "feedback" trick is *different* from ordinary for/fold! It is a syntactic trick to ease notation. This is how it should be made explicit: (loop ((s1 0) ;; state (s2 0)) ((i0 (in S0)) ;; streams (i1 (in S1)) (i2 p2)) ;; parameter ... ;; explicit bindings (values s1+ s2+ ;; state o0 ...) ;; output streams The above would be the expansion of: (begin-feedback ((i0 (in S0)) (i1 (in S0))) ... ;; implicit bindings (values o0 ...)) So let's do this first. Entry: Loops in DSP: `feedback' is contextual Date: Sun Jan 27 19:47:41 CET 2013 * To make things more explicit it should be understood that the `feedback' form refers to some kind of "context". It is nothing more than a preprocessing trick to create fold-style loops where state feedback dominates. * Stream processors are an "infinite loop", so the above isn't immediately visible (no end, but there is an init that for now has been glossed over. But it makes a lot of sense to have the same kind of state feedback mechanism on a local level, i.e. for block processing or iterative numerical algorithms. * The big idea is that the siso system is the unifying structure for numerical algorithms that operate on streams of data -- i.e. not the 'fold', not the 'map'. In short: - Syntactical need: Some DSP algorithms can get heavy on local loop state. Removing the need to explicitly name this state helps a lot to focus on inter-module dataflow. - Representation need: Functional representation is good for analysis (as opposed to destructive state updates) Entry: Why not simply OO then? Date: Sun Jan 27 20:11:23 CET 2013 - In OO, the distinction of instance / class needs to be made. Using the `feedback' mechanism this is handled automatically: each reference of a stateful method introduces a new set of state bindings. 
- Analysis of a data flow structure that contains mutation (instead of binding) is more difficult. However, it can be recovered. This is what conversion to SSA form does, but it seems better to do it straight away.
Entry: Duplicate circuit Date: Sun Jan 27 20:48:25 CET 2013 The `feedback' thing is for later. A more pressing need is a `duplicate' function, where a piece of I/O processor is duplicated, turning each (parameter) input into a vector, or implemented as a list of parameters (this is a final binding issue, not a core issue). The core issues are: - How to syntactically specify duplication. - How to implement? The main problem seems to be: how to specify which inputs should be duplicated, and which not? At the lowest level, this translates to lifting the primitives. When a region can be isolated well enough such that the compiler can insert a loop context, it creates the freedom to do it as such, and the core language doesn't need to care, i.e. it becomes a type issue. If it is a type issue, there is only one place where this needs to be specified: the summation operator! Let's see what this looks like. The sum operator should be a bus operator, i.e. multiple values are summed individually. Practically: take out `in' and replace it with `bus', since it seems to be the same kind of functionality. (lambda (freq) (bus (phasor freq 0 1))))) This comes out: (values (lambda (v0 freq) (let*-values (((v1) (p_add v0 freq)) ((v2) (p_wrap v1 0 1)) ((v3) (p_sum v0))) (values v2 v3))) '(v0) '(freq) '(v2) '(v3) '((v3 . a) (v0 B a) (freq B a) (v2 B a) (v1 B a))) All variables are typed (B a) except for v3 which is a. This is correct, but it is too flat. All "bussed" operations are mixed with scalar ones. However... it can be kept as it's not totally bonkers. An advantage is that just expanding the "vector" primitives into a sequence of directly bound scalar primitives breaks data dependency and exposes SIMD parallelism[1]. If I rush over this and implement it in the backend using "lifted primitives", and add the same lifting operation in the Scheme primitives, how hard will it be to do it properly and expose the loop explicitly? I.e. (bus (fn A1 A2)) would be something like (bus ((a1 (in A1)) (b1 (in A2))) (fn a1 b1)) which will clearly delineate the "plug point" to insert an iteration loop. Yeah I don't feel good about using such a naked hack that's hard to recover.. It can be done syntactically by separating out the free references and adding a barrier. However, that could be used as a shorthand later. For now it's probably best to start with an explicit binding form. Let's write it in terms of map and fold/s (fold/s + (map (lambda (a1 a2) (fn a1 a2)) A1 A2)) Here the "/s" stands for a symmetric fold t t -> t that doesn't need an initializer. There is little need for an a-symmetric fold in arithmetic.. Chances are that folding these two operations into a single fold-map is better. But how to represent? The types are already ok. What is needed is a little syntactic marker that a loop is being performed. The types of the operands inside the loop are known, so it is known which need to be indexed and which not, but of course the sub-structure is only visible inside the loop. (let* ((v1 ...) (v2 ...) (v3 ...) (v4 (map/fold + (let* ((v5 ...) (v6 ...)) (values ...)))) (v7 ...)) (values ...)) For C/Pd representation, the loop can just be unrolled and the parameters flattened (individually named). The Scheme representation can use proper deep data structures.
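For reference, the intended meaning of that fold/s + map combination is simple at the value level (plain Scheme; this fold/s definition is just one possible reading, not rai code, and it ignores state threading and code generation):

  ;; symmetric fold over a non-empty list: no separate initializer
  (define (fold/s op xs)
    (foldl op (car xs) (cdr xs)))

  ;; the pattern above, written out on concrete lists
  (fold/s + (map (lambda (a1 a2) (* a1 a2)) '(1 2 3) '(4 5 6)))  ;; => 32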
So we're back to square one: this needs a local loop structure. The thing is that implementing this special case maybe allows later generalization to a single siso loop structure. map/fold is sis (no output, just state). So maybe we should just do it correctly. - Loops are a nesting structure that can be inserted in a normal SSA sequence. - The variables defined inside such a nested structure are local to the loop, but they can be bound to external outputs. - Two syntactic tricks are necessary: some kind of locality to delimit the iteration body, and a "blessing" of some variables as composite, i.e. (loop (X Y Z) (fn X Y Z a b c)) -> here the capitals are blessed as streams (will be indexed inside the loop) and the lower caps are constants. - The only thing that can leave such a nested structure is the output written during a loop ( stream ), and the final state ( scalar ). I.e. a loop returns a number of scalar and stream variables: t t t (B t) (B t) ( for summation, this is just the state variable. ) [1] http://gcc.gnu.org/projects/tree-ssa/vectorization.html Entry: Composition of structure: commutation of space and time combinators Date: Sun Jan 27 22:11:52 CET 2013 Problems: how to individually name dimensions? What I'm looking at is loop nesting: - state machine threading the time dimension (i.e. filter) - state machine threading the space dimension (summation of output of a couple of filters) - any other form of nesting Each "feedback" operation stores state at a particular loop level. How to identify this syntactically? Whether these levels commute depends on the commutation of the operations used. For summation there is no problem. Example: sum a bank of integrators (summers). signals s_n(t) n : 1..N t : 1..T the integrators sum each individual signal into N integrated streams the output at any particular moment in time is the "spatial" integral. the trouble now is: when I use the `feedback' mechanism to implement the integrators, does the feedback correspond to the time or space dimension? This is all quite simple when all multi-dimensionality is named explicitly. It is probably safe to assume that state feedback always refers to the inner-most loop. A more concrete example: for t in Time: bus = 0; for f in Freqs: bus += osc(f) Entry: Loop context Date: Sun Jan 27 22:38:40 CET 2013 What I'm talking about is nested loop with a-symmetric iteration = output feedback = loop accumulation. I got too deep into the shit of trying to unify two abstractions that I no longer know what I'm talking about. Is it really a problem? Can summation be handled without needing to The problem is state streams. If things happen behind the scenes, the indexing needs to be captured by the summation loop. Entry: Dimensions: TIME vs. SPACE Date: Sun Jan 27 22:46:49 CET 2013 Wow what a day.. some things to take home: - Feedback has a context == the "dimension" along which state is fed back. This creates problems - The whole game is about properly handling multi-dimensional arrays (data) which mirror directly in nested loop structure (code). This really isn't so hard when it is done explicitly, but it is a pain to work with indices directly. The feedback notation basically abstracts state. Why? For the kind of code I want to write (Music DSP), there are a lot of IIR filters (think damped sinusoids) which all have: - local state: only relevant in a very small part of the code - a lot of state accumulating at higher levels (when using functional notation) Keeping it behind the scenes helps. However... 
when there are more dimensions, it becomes hard to just tuck it away. But in practice, this might not matter. * The only point where such bulky local state is important is for things that have to do with time. * Other accumulations really are just accumulations, e.g. product, sum, min/max, ... of *finite* structures (a collection of modules, an FFT block, a signal block matrix, ...) and might just as well be expressed explicitly. Such state is also not retained. With time, we have an "open" loop. * A very special case would be an iterative numerical algorithm, which probably is also best expressed explicitly. Really.. the only thing I'm looking at is GLUE FILTERS. It is very useful for such things, but a pain for almost anything else. My needs are very specific. So, for now: * All system feedback syntax refers to TOP LEVEL TIME ONLY. All code is stream processing code embedded implicitly in a time iteration loop. * A separate fold/map operator allows spatial operations. All referenced variables are automatically lifted to streams by the type inference algo. ( Later, maybe add a "constant" operator: t -> S t ) Later: * Provide the ability to create "local time" in a loop. The only reason for doing this is expressing iterative numerical algorithms as a causal stream processor. The question is if this is really useful. * Design a separate set of abstractions for "symmetric" grid processing, i.e. where all accumulation is associative and there is no iteration direction introduced by state feedback. (time is a-symmetric, but space isn't) (Then there are two languages: TIME and SPACE ;)
Entry: Building an audio synth DSL: Space and Time Date: Sun Jan 27 23:16:19 CET 2013 Signal processing is working with number grids. For audio synthesis and processing, the algorithms can be split into two major classes, mostly distinguished by the difference between space and time. * Output feedback systems. E.g. IIR filters. These algorithms have mostly 2 dimensions: - time, highly a-symmetric: past is known, future is not. - object multiplicity (equalizer with 4 sections, mixer with 10 channels, synth with 20 oscillators, ...) Loops over these two kinds of dimensions are also different. Loops over time tend to carry with them an amount of filter state which is large compared to the dimensionality of the input/output they process. I.e. they are "fat". Loops over space are mostly "map" operations: large sections of dataflow that are largely independent, to be combined 99% of the time by a huge summing operation (very little state is "threaded" along spatial dimensions). The summing operation is moreover associative so there isn't even a necessary order. * Block processing systems. These split input/output signals in (overlapping / cross-faded) blocks and process each block separately. Such algorithms are more spatial. Often there is some tracking of state going on between different blocks creating some a-symmetry, but this is not where most of the "connection mess" is concentrated. The basic conclusion is that it makes sense to handle space and time differently, with different abstractions. This boils down to: - TIME: behind-the-scenes output state threading that conceptually lifts the "atom" from scalars to streams. - SPACE: focus on all dimensions being somewhat "equal" and symmetric, i.e. work with `map' and `fold', but don't specify any order. In short: think spatially, but lift out the time dimension for causal streams, and implement causal processors (output feedback) separately.
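To make the split concrete, a hypothetical surface-level example (the form names `feedback' and `bus' here are used as placeholders for whatever the DSL ends up providing, not the actual syntax):

  ;; TIME: output feedback is implicit; `s' is threaded behind the scenes,
  ;; so at the stream level this is just a function from stream to stream
  (define (leaky-integrator x)
    (feedback (s)
      (+ x (* 0.99 s))))

  ;; SPACE: an orderless, associative reduction over a finite bank;
  ;; no state is threaded along this dimension, only accumulated
  (define (mix-bank xs)
    (bus + xs))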
Entry: Think streams Date: Sun Jan 27 23:35:47 CET 2013 Practically, it seems best to keep thinking about streams, meaning the product of compilation is a stream processor, but there might be "loops over stream processors" that need to somehow be expressed.
Entry: Bus: practically Date: Mon Jan 28 00:07:44 CET 2013 - Just wrap an expression as is done now: for typing this does exactly what is needed: lift all types to streams. - Use lazy evaluation to do 2 things: - Insert a context/loop in the syntax - Determine the number of values to insert the bus summation.
Entry: Lazy syntax Date: Mon Jan 28 09:38:07 CET 2013 It seems best to move the laziness to `ai-app'. In this case, feedback becomes a function again. There is a choice: - application of primitives - application of all functions Already confused. The point is to catch the return "channel" of the arguments of a primitive before the primitive is executed. The `ai-app' form cannot distinguish primitives from composite functions. So it needs to insert a thunk, and the corresponding force should be inserted by the lambda form. Hmm... this is a mess.. Let's not do it that way.
Entry: The `bus' form Date: Mon Jan 28 10:57:23 CET 2013 Also, it's probably best to do explicit binding of collection -> element to avoid trouble later, and maybe write a "hack" form on top of that. Wow this takes time to sink in. That's not what I decided yesterday! I *can't* do explicit binding because of the state threading stuff. The simplest way to solve this is to do implicit lifting, and just make sure there is a loop context created. Then to *undo* the lifting using the `constant' form. This corresponds to the way it is actually used, so even if it feels a little dirty, it's probably best that way. There is a problem however: the output is no longer flat, so it's probably best to first fix that. So how does the C code generator work? It needs to know it is in a loop context, so it can index any variable that is typed as a container. The question then seems to be, do we insert a nested loop, or goto and labels? I think I got it. Definitely, all temporary nodes should be scalar nodes in a loop context, while input/output/state nodes are indexed. It's probably enough to keep the dictionary global, but to insert "begin" and "end" loop hooks that delimit the loop context. This even fits in the SSA line syntax: (i) (p_loop_begin 0 n) () (p_loop_end) This then translates to _ i; for(i = 0; i < n; i++) { ... } EDIT: It seems to work. I'm separating code generation and context (loop) dependent variable mapping. It's quite an intensive transformation though. But it doesn't look more dirty. That's good. Something to note: the state is interleaved in the wrong way. The C app doesn't care about this since the total state size is the same, but it might be good to add this as a parameter. TODO: - actual sizes - propagate size to C compile time - const doesn't work - indexed access stx gets generated too early - trouble with (values 123 (bus ...)) (values in (bus ...)) - accumulator doesn't work properly Next: accumulator. [1] http://llvm.org/docs/tutorial/LangImpl5.html
Entry: Accumulators Date: Mon Jan 28 16:04:11 CET 2013 Interesting problem: the accumulator nodes need to be generated before the number of outputs is actually known. At the point when the nb of inputs is known, the SSA sequence is already written to the dictionary. Looks like we need to call ai-arity here. It's neat that this is possible, but I don't want to introduce that dependency.
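For reference, what such a call would look like (a hedged sketch, assuming the representation used in the `ai-arity' definition above, where a program is a function taking the primitive semantics as arguments):

  ;; a hypothetical 2-in / 2-out program in that representation
  (define (prog add mul integrate)
    (lambda (x y)
      (values (add x y) (mul x y))))

  (ai-arity prog)   ;; => (values 2 2)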
It makes more sense to fork off the bindings using `parameterize' and insert them manually. EDIT: This was a bit of a shuffle, but it seems to work now. It's not the most elegant code due to the presence of accumulator registers, though it is quite general: fold a function over a fixed static type (vector), using an arbitrary accumulation operation, and automatically lift all operations over the structure. (display (ai-c (stream (lambda (freq) (fold B + 0 (phasor freq 0 1)))) #:nsi 0 #:tc '((B . 10)))) #ifndef PROC #define PROC(x) proc_##x #endif #define proc_nb_state 10 #define proc_nb_in_stream 0 #define proc_nb_in_param 10 #define proc_nb_out 1 #define proc_in_list(m) \ m(0, freq) \ INLINE void proc_loop(aa_ s, aa_ i, aa_ o, int n) { for (int t = 0, p = 0; t < n; t++, p ^= 1) { _ v3 = 0; for(int i0 = 0; i0 < 10; i0++) { _ v1; p_add(&v1, s[p][0+i0], i[0][0+i0]); p_wrap(&s[p^1][0+i0], v1, 0, 1); _ v4; p_add(&v4, v3, s[p][0+i0]); p_set(&v3, v4); } o[0][t] = v3; } } box> Entry: Occurs check Date: Mon Jan 28 21:46:07 CET 2013 Something is working, but the types are all messed up. - const doesn't work properly - what's the type of i? It should be (B t) I think I see what's going on: the accumulation operation will unify the accumulator itself with (B t). This means I'm missing an occurs check. Probably what happens is that t and (B t) are unified. Yes. Added the occurs check. Should make it more verbose though, with syntax info and stuff.. Found the problem: the weird type lifting stuff in loops makes that the accumulator has the lifted type. It needs to be cast to the lower type. This can be done by inserting an assignment and manually unifying the 2 node types. Entry: Fixed const Date: Mon Jan 28 22:33:21 CET 2013 Loop indices are now added only to structured types. EDIT: Remaining question is: what is better, the 'const' notation, or the dual approach, setting everything to const by default and using a 'multiple' notation? For now it seems more meaningful to use const. I.e. constants are more "important" because their values have more global effect. Default is local scope. It would be nice to have a const operator that knows what the structure is we're iterating over. Use syntax parameters here. Actually, when everything is a vector, 'const' really doesn't need an argument. Entry: It works! Date: Tue Jan 29 09:39:43 CET 2013 Got a supersaw (= fold over collection of detuned oscillators) running in Pd! This is another milestone. The rest is cleanup and generalization, which will probably take a while. Entry: Next Date: Tue Jan 29 11:02:47 CET 2013 - Param naming (done) - Multi-dim indexing - Generalized HO fold - Formalize the "lifted types" approach: it's different; why is it good? - Switch to nested structure for SSA dictionary? Entry: Generalize loop Date: Tue Jan 29 13:35:55 CET 2013 It's probably best to generalize the loop to a proper fold over a higher order function, then rewrite the current The trick is really the binding of the accumulators and the typing, which still needs to be specified properly. Basically, the key insight is that the time and space loops are different. The time loop is "hardcoded" because the state that is associated to output feedback operators is hidden, which enables them to stay purely functional at the stream level: state is an implementation detail. Entry: Why not simply OO then? 
- part II Date: Tue Jan 29 13:55:23 CET 2013 DSP is about loops over grids, but the key insight is that the time loop is different from the space loops: - it is open - it is a-symmetric - it has a lot of associated "accumulator" state Hiding the time loop and loop state from the view of the programmer however creates an abstract view: at the level of streams, the language is completely functional: there is no mention of state. Operations are proper maps from streams to streams. This is distinctly different from the OO approach where the time loop is explicit (because it is open), and while state variables might be abstracted, the fact that there is state is clearly visible through the class/instance distinction.
Entry: Nested structure for dictionary? Date: Tue Jan 29 15:43:06 CET 2013 It would be easier to map straight to Scheme this way. However, it requires a slight adaptation of the C generator, to traverse over the tree, or to simply pre-process and flatten it.
Entry: Grid index computation Date: Tue Jan 29 20:13:05 CET 2013 It might be better to define offset variables at the beginning of each loop such that loop indexing is just: offset = offset_super + y * x_max; arr[offset + x]; it doesn't make sense to use things like: arr[... + y * x_max + x] I'd say to use stride increments in the for loops, however there needs to be a way to recover the proper linearly increasing coordinates because the loop body might need them to compute stuff. Let's leave it open, and insert something that works. Anyways, I've abstracted the grid traversal code a bit. This can be simplified a bit more even. It's probably best to convert all type info to grid dimensions, strides and offsets as soon as possible. Indexing the arrays requires offsets of a particular input parameter or state structure in the bulk array + a stride structure for indexing. EDIT: fixed
Entry: Language factory Date: Wed Jan 30 23:33:00 CET 2013 [1] http://hashcollision.org/brainfudge/
Entry: Enhancements Date: Sat Feb 2 22:43:49 CET 2013 - Support both float and int. Should be simple with the type inf. - Add more generic fold-map. The issue here is to solve the "auto-lift". It might indeed be best to solve it, meaning to do explicit dereferencing read and store. - Const is not well-defined
Entry: Const is not well-defined Date: Mon Feb 4 10:46:15 CET 2013 What I want it to mean: this input is a constant. What it really means: this one is a constant across the loop dimension of the current fold. -> It seems best to add the const form to the `bus' macro, essentially "spreading out" some values over a data structure.
Entry: BUG: Voice alloc Date: Mon Feb 4 22:43:41 CET 2013 After playing a chord, some notes seem to stick, even when the gate signal is 0. However, they don't stick with full volume. Very strange.. For later.. I'm not sure if it's a core problem. All code looks alright. Something very strange.
Entry: Generalizing fold Date: Tue Feb 5 20:12:12 CET 2013 Basic idea is that we're mapping many to one. This is not a stream processor with stream outputs. In this way, the body of the code could be generalized to an (s,i)->s map, where s comes from [s], generalized container. So this is really proper fold: (i -> s -> s) -> s -> [i] -> s The trouble is with the type trickery. What I actually use is fold over a lifted (mapped) operation. Maybe not so useful, since I really only need fold +.
Entry: Delay lines Date: Tue Feb 5 22:48:56 CET 2013 These are just another feedback relation. No big deal. Maybe make it part of fold?
There are mainly 2 ways to organize delay memory: shared or separate. The only thing that's important is memory locality, which is not going to be very predictable. Maybe best to organize delay lines in cache line chunks, i.e. multi-channel.
Entry: Types Date: Sun Feb 10 13:52:17 CET 2013 I need integers.. How to do that? This requires a small adaptation in the type inference. Currently there are only type variables at the lowest level. This needs to be extended to concrete types.
Entry: Loop inference. Automatic map? Date: Tue Feb 12 13:36:30 CET 2013 The important parts are: - mappings are common - distributions are common (const over vector) - most folds are + (with a remote second: min/max and maybe *) Type inference at this point does automatic lifting of all lexical variables to streams, and then uses this typing to implement a loop. Basically, there is no mention of a loop in the representation, only of types. The advantage is that inserted bindings (state) do not need to be mentioned explicitly. The language needs to distinguish between scalar and composite data, with respect to a single fold. State is *always* composite, so it makes sense to make composite the default, and use the `const' construct to introduce constant distribution. So: - Default is parallel - Distribution ("environment flow") is explicit: `const' - The folding function is generic. The main difference with the standard approach is that the scalar function that is mapped over the composite data is not visible explicitly. It is visible only indirectly through "lifted primitives", which are in practice always array lookups relative to the local loop context == index variables.
Entry: Possible bug Date: Tue Feb 12 14:03:56 CET 2013 Currently, no distinction is made between distributions on 2 different levels: for(i) { for(j) { a[i] } } for(i) { for(j) { a[j] } } TODO: Make this all a bit more explicit. A 'const' introduces a stride of 0 for that particular level, meaning that it doesn't count. See the difference between: a[ 100 * i + 1 * j ]; x[ 0 * i + 1 * j ]; Though this doesn't work for: y[ 100 * i + 0 * j ]; But the idea might be interesting, since the latter can be represented by z[ 1 * i + 0 * j ]; or a stride sequence z:(1 0) and y:(0 1) compared to a:(100 1). Another idea is to type these as arrays of dim 1, but that doesn't capture the idea that they get distributed. The multiplication by zero does that very well! Basically, the type (1 0 2) says 2 things: - It can be lifted to (1 ? 2) - It has a memory representation of (1 2) But the iteration pattern has strictly more information than the memory representation. So the trick is in the implementation of the `const' typing. It should give a particular "wildcard" array type that has no size, but maps to stride multiplication 0, and can be unified with any size. So what if vectored inputs are used in different configurations, i.e. transposed? Would that give type consistency? Maybe this would just be a typing trick, i.e. "casting" (A (B t)) to (B (A t)) by flipping indices.
Entry: Next Date: Sat Feb 16 11:33:18 CET 2013 First, let's fix the typing issue for "distributed" things. I'm not sure if the way suggested above is really good, but it makes sense in some way. Maybe it's time to also look at APL again for inspiration.
Entry: Fixing typing Date: Mon Feb 18 15:20:34 CET 2013 I don't like the fact that I'm doing something special. I have to write down the principles every time: - Instantiate state variables for the outer *temporal* loop, keeping the frontend purely functional.
- Provide a useful notation for *spatial* iteration. This mostly boils down to the idea of "distribution". A problem with the current approach is that loops are not represented well, and the types of "distributed" variables are not correct. Due to the hidden nature of the state variables, the representation is somewhat upside down: it seems easier to lift the primitives to stream operations, than to use an explicit "for" structure where a scalar variable is taken from a composite (array) object, but the operations themselves remain scalar. Something keeps escaping me: where do the hidden composite (vector) state variables and scalar types meet? Maybe lifting the primitives is simpler. It reminds me of the Haskell approach, where all this is more obvious: this is a problem of commutation of type constructors: there is an isomorphism between different orders, but once you pick an order, the implementation does have a specific look. Let's stick with what works now, and simplify it.
Entry: Lifting primitives Date: Mon Feb 18 16:40:15 CET 2013 Fold reduces rank, e.g. accumulation of vector -> scalar, or, for the sake of argument, from list [t] to scalar t. The main problem here is typing. The language does not use explicit array indexing or any other kind of data structure deconstruction. In a sense, the containers are always hidden, and it is assumed every container has a fold combinator, and all operations can be lifted over all containers (Functors). In a traditional functional language, a fold over [t] would use a type (t, t) -> t as accumulator, or t -> t -> t in curried form. Here we take the inside-out approach of lifting all operations from t to [t], which means that the accumulation operation also has type [t] -> [t] -> [t]. In essence: the accumulation combines two sequences to a new sequence. One of the input sequences is the sequence we're accumulating, the other input is a shifted version of the output. Using the lifted approach, the accumulation ends up with a value of [t], being the stream of all intermediately accumulated values, of which we only need the last element. Picking that last element is an operation of type [t] -> t. Practically, this translates to a simple assignment operation since of course we did not keep track of the accumulator sequence. However, the key part is that this assignment can be typed differently on both sides, conceptually performing the "pick last element" operation. So it looks like this [t] -> t is at least meaningful and actually corresponds to a physical operation.
Entry: Distributing indices Date: Mon Feb 18 17:04:30 CET 2013 It might be best to start working with the dim and stride operations before making this work. Currently, a scalar node is cast to a vector node, and this is compensated for in the "stride map". But that doesn't seem to be correct. Currently we always bind to the innermost loop indices, but it might be that this is not correct. Actually, that is probably what the bug is about.. Basically, const unifies a node with (? x) where ? is the "distribution dimension". Then both dim and stride need to handle this correctly. The principle is that all variable occurrences inside a loop have the same rank, but that some dimensions might be "skipped". I.e. (4 ? 5) corresponds to a rank 2 grid used in a rank 3 or 3-level loop where the second level is ignored in the indexing of the rank 2 grid. So it's clear that: - const introduces a special rank wrapper '?' - stride needs to insert a 0 What about dim? Where is it used?
Only in the generation of stride lists. Instead of modifying the code, it seems simpler to do this in a pre/post approach. I.e. define an operation that takes a dims list and returns a list with infinities removed, and a function that will take a list and re-insert zeros in the correct places. Maybe, before doing this, it might be best to simplify the type representation. The nesting (A (B t)) is unnecessary and can be flattened to (A B t) or even (A B) since the last element is not necessary for the grid typing. This way, dims are the same as the type reps themselves. What about this, as soon as possible, replace the type map with a dims map. Then only use dims. However, this makes it impossible to use the "skip dimension" trick.. This all needs to be made a lot simpler. It's probably also better to use offset variables instead of relying on common subexpression elimination. I'm starting to think that it might actually be easier to perform explicit indexing, i.e. use plain C arrays. This will make the generated code easier to read. The whole thing can be avoided by making the state/input vector into a proper struct. Entry: Named let Date: Tue Feb 19 12:15:57 CET 2013 Maybe it's time for a second interpretation to Scheme syntax using named let with type annotation, and then compile this to C. It seems that there is already too much complexity in the part that does SSA -> C. I.e. loops should be somewhere else. Instead of named let, it could also be a for/fold with accumulators, and explicit indexing for the arrays. Big question is that indexing: abstract, or numeric. Important for much code is to work with indices explicitly, so let's keep it explicit. Entry: for/fold to accumulator assigments Date: Tue Feb 19 16:01:27 CET 2013 Modified for/fold (no generic input streams: explicit index sequence). (let*/values ((a1 a2) (for/fold ((s1 0) (s2 1)) ((i (in-range 0 n))) (values (+ (arr i) s1) (* (arr i) s2))))) Question is: how to make this easier in C? The above is still an expression language, while C is an assignment language.. So let's keep the above form, since it keeps us in the expression domain, and it also makes C code simpler, since we can juse use _ a1 = s1; _ a2 = s2; The modified for/fold form then can be (fold-dim i n ((s1 0) (s2 1)) ) However, that one is not a map/fold as the one we're abstracting, so maybe stick to the current approach first, then move to an "abstract" accumulator later (for/fold style). Entry: Tension between SSA and "named output" Oz dataflow style. Date: Tue Feb 19 16:21:41 CET 2013 Since eventually all results go into buffers, it might be better to switch from SSA form to Oz style named binding forms, instead of mixing SSA and array assignments. Wait.. this is not necessary. Array output nodes are bound properly, one element at a time. Entry: control is only necessary for memoization Date: Tue Feb 19 17:01:11 CET 2013 Looks like nested loops can just be done with parameters. Entry: Auto-lifting Date: Tue Feb 19 19:24:40 CET 2013 It's an interesting feature, but it remains a bit strange. Maybe "implicit" is the word. Implicit might be good for programming, but not really for compiling. It might be that I'm a bit overambitious after the preliminary success. But, otoh, a syntax of binding array outputs makes a lot of sense. I.e. inside a loop, all temporaries are scalars, but those that make it to the outside as input/output/state will be indexed bindings. 
This means that the algorithm is essentially two-pass: - node expansion / type inference in parallel - index annotation While const stuff needs wildcards to be lifted to the grid type of a loop, the intermediates take the type of the other expressons in the loop. They are actually grids. The only difference is that their values are not retained to the next loop iteration. Anyways, just chugging along I get something like this: (let-values ((((v12)) (p_mul (v10 i j) (v2 i j))) (((v13)) (p_mul (v11 i j) (v3 i j))) (((v14)) (p_add v12 v13)) ((((v15 i j))) (p_add v14 (v0 i j))) (((v16)) (p_mul (v10 i j) (v4 i j))) (((v17)) (p_mul (v11 i j) (v5 i j))) (((v18)) (p_add v16 v17)) ((((v19 i j))) (p_add v18 (v1 i j))) (((v22)) (p_add v20 (v15 i j))) (((v23)) (p_add v21 (v19 i j)))) ...) Something not right with the parenthesis but it looks ok. All binding forms that have indices are actually assignments to externally provided grids. It's probably also better to use array indexing like this, and give the C compiler access to the structure of the state. Also better for debugging. Entry: Single assigment Date: Tue Feb 19 22:32:52 CET 2013 The border between single assignment and "true" memory assigment remains a bit tricky. On thing though: even external memory is assigned to only once. The only difference is that *references* are passed into the function. I.e. the essential transition is one from values to references. Internal node: declaration + assignment. External node: only assignment. Output is still a-symmetric. The reason is probably that the current structure does not allow vector outputs. The app doesn't need it, but should it be taken into account? It might be solved by making the time loop explicit. One structure pops up though: at all points, the values statements can be replaced with assignments. Entry: C arrays Date: Wed Feb 20 00:32:12 CET 2013 The interesting thing is that mixing * and [] types in C is actually useful for what I'm trying to accomplish.. Anyways, got an imperative form in s-expressions: '(proc ((v10 v11) (v15 v19)) ((v0) (v1 v2 v3 v4 v5) (v30 v31)) (loop (t ?) (block (v26 (p_copy 0)) (v27 (p_copy 0)) (loop (i N) (block (v6 (p_copy i)) (v7 (p_copy N)) (v20 (p_copy 0)) (v21 (p_copy 0)) (loop (j M) (block (v8 (p_copy j)) (v9 (p_copy M)) (v12 (p_mul (@ v10 i j) (@ v2 i j))) (v13 (p_mul (@ v11 i j) (@ v3 i j))) (v14 (p_add v12 v13)) ((! v15 i j) (p_add v14 (@ v0 i j t))) (v16 (p_mul (@ v10 i j) (@ v4 i j))) (v17 (p_mul (@ v11 i j) (@ v5 i j))) (v18 (p_add v16 v17)) ((! v19 i j) (p_add v18 (@ v1 i j))) (v22 (p_add v20 (@ v15 i j))) (v23 (p_add v21 (@ v19 i j))) (p_set v20 v22) (p_set v21 v23))) (v24 (p_copy v20)) (v25 (p_copy v21)) (v28 (p_add v26 v24)) (v29 (p_add v27 v25)) (p_set v26 v28) (p_set v27 v29))) ((! v30 t) (p_copy v26)) ((! v31 t) (p_copy v27))))) Entry: State buffer swapping Date: Wed Feb 20 13:02:29 CET 2013 One annoying thing is that state input/output bindings are fixed in the intermediate representation, but need to be ping-ponged in C. Ok, wasn't too hard.. for (int t = 0; t < t_endx; t ++) { struct proc_si * restrict si = (struct proc_si *)(&state[(t^0)&1]); struct proc_so * restrict so = (struct proc_so *)(&state[(t^1)&1]); ... } Entry: Fold vs accumulate Date: Fri Feb 22 11:42:32 CET 2013 Thing is that the rep doesn't have explicit accumulators, while fold does. It's possible to do this but I think it's actually not so useful. I.e. currently a parallel implementation is possible, since the form is not specified. 
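To make the accumulator question concrete, here is a small Racket sketch
(illustrative only, not code from the project) contrasting the two styles:
an explicit for/fold fixes a sequential accumulator, while a tree-shaped
reduce only assumes the operator is associative, which is what leaves room
for a parallel implementation.

;; Explicit accumulator: evaluation order is fixed (strictly sequential).
(define (sum-fold xs)
  (for/fold ((acc 0)) ((x (in-list xs)))
    (+ acc x)))

;; Form left unspecified: only associativity of `op' is assumed, so the
;; reduction tree can be rebalanced (e.g. evaluated in parallel).
;; Uses take/drop from racket/list (available in #lang racket).
(define (reduce op unit xs)
  (cond ((null? xs) unit)
        ((null? (cdr xs)) (car xs))
        (else
         (let* ((n (quotient (length xs) 2))
                (l (take xs n))
                (r (drop xs n)))
           (op (reduce op unit l) (reduce op unit r))))))

;; (sum-fold '(1 2 3 4))   => 10
;; (reduce + 0 '(1 2 3 4)) => 10, computed as (+ (+ 1 2) (+ 3 4))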
Entry: Compiling to C and Scheme is difficult Date: Fri Feb 22 14:51:32 CET 2013 I'm a bit amazed that the semantics isn't simple; compilation to Scheme is a bit problematic. Maybe this should have two levels: - One Scheme side that compiles to array updates. Entry: feedback/n trouble Date: Fri Feb 22 16:56:31 CET 2013 It's a bit hard to understand.. Made for juggling state nodes, so can it be used for someting else? I'm reverting to the old implementation. This is the current doodle: (define-syntax (ai-lambda-feedback stx) (syntax-case stx () ((_ (state ...) (in ...) expr) ;; Define the function body of a standard stream processor. #`(let ((update (ai-lambda (state ... in ...) expr))) ;; Pass it to the implementation through feedback/n construct. ;; The result behaves as (ai-lambda (in ...)) (ai-app feedback/n '(state ...) '(in ...) update))))) (define (make-feedback-prim make-state-nodes register-state-nodes!) (lambda (sem state-args args update) (make-ai-function ;; Every time this function is evaluated (i.e. for stateful ;; operation instance) a set if state i/o nodes is created. (lambda (sem . in) (let* ((nb-state (length state-args)) (state-in (make-state-nodes state-args)) (exprs (values-list (apply (ai-function-proc update) sem (append state-in in)))) (state-out (take exprs nb-state)) (out (drop exprs nb-state))) (register-state-nodes! state-in state-out) (apply values out)))) args)) Entry: Output feedback Date: Fri Feb 22 16:58:30 CET 2013 To make it fit in the current approach, the output feedback needs a trick - some kind of lazy evaluation. The semantics seems to have this trick built-in.. Focus on 1. Having a processor object (leap of faith) 2. Iterating it over a stream It needs to be opened up a little bit to make sure there is access to the update method. This can't be hidden.. So, the update fuction is exposed. Now, is it possible to reuse the node injection code for lazy streams? (define (make-feedback-prim make-state-nodes register-state-nodes!) (lambda (sem state-names update in-nodes) (let* ((nb-state (length state-names)) (si-nodes (make-state-nodes state-names)) (exprs (values-list (apply (ai-function-proc update) sem (append si-nodes in-nodes)))) (so-nodes (take exprs nb-state)) (out (drop exprs nb-state))) (register-state-nodes! si-nodes so-nodes) (apply values out)))) No, the trouble is that we need multiple executions of the body code. So this function should do the iteration and feedback. Once the update part is exposed, it's not all that hard. Just call it in a loop. Entry: Z transform Date: Fri Feb 22 19:27:01 CET 2013 For linear operations it might be possible to make an interpretation that computes the Z transform. Main trouble again is what to do with: - multiple inputs (just have it compute complex gain, not z transform). - feedback? The question is: given an update function, what is the transfer function with output feedback? This is 1 / 1 + x right? Nope, it's more interesting than that. This is a "loop closing" thing, since we have a generic formula. This seems akin to partial derivatives. Since everything is linear, we can put a 1 in each of the state inputs subsequently and see what comes out, then combine the results. Is it possible to get the full picture this way? Another way is to compute it iteratively. Trouble is that time constants might be quite high. Can they be artificially enhanced, then compensated? Hmm.. something simple is hidden here, but I can't see it. It's a matrix inverse. 
The matrix itself can be reconstructed by probing all inputs with 1 and 0.
Once obtained, it's just linear algebra.

It would be good to keep this composable, i.e. to be able to generate C
code that computes the transfer function. This means that the Gaussian
elimination should be implemented directly instead of taken from a
library. There's some old code somewhere..

First, focus on computing the system matrix. Can this be done without
performing a matrix inverse? It doesn't seem so, unless using some
constraint propagation thing.. Functions can be arbitrarily dense, so
reducing it to a matrix seems simplest.

It's going a bit fast... really need a break.

Questions:
- compositional?
- how to implement in-place algorithms?

One thing though: if this is a 1-1 transfer function, it might be better
to reduce it to fractional form first.

Entry: Gaussian elimination
Date: Fri Feb 22 22:01:07 CET 2013

How to express something like Gaussian elimination in the stream language?
It probably requires the use of a double buffering scheme to eliminate the
in-place approach. However, it might be possible to just flatten all the
computations into SSA form, and let the compiler decide. Most practical
matrices are going to be small, so it might actually make a lot of sense
to do this. Trouble there is pivoting.. But let's assume the condition is
good.

Entry: Transfer function
Date: Fri Feb 22 22:13:27 CET 2013

For 1-1 functions, Gaussian elimination is not necessary, as there is
always a way to write the transfer function as a rational function, i.e.
without matrix inversion / equation solving. How does that work? This is a
computation on the coefficients.. It seems that in general this is still a
matrix operation, i.e. to move to the all-pole form:

a b c d
1 0 0 0
0 1 0 0
0 0 1 0

I.e. given a program..

Another thing: in practice, most feedback is going to have 1 or 2 state
vectors. Can't we just exploit that and special-case the formulas? Even if
it's special-cased, Gaussian elimination might be a simpler way to express
it!

Entry: Conclusions: transfer functions
Date: Fri Feb 22 22:17:38 CET 2013

- Generic Gaussian elimination is a nice idea to keep in mind for later,
  but is probably not necessary if we take a shortcut for 1 and 2 element
  feedback functions.

- It is probably possible to perform GE in a value-oriented way, just by
  performing the algorithm at the "macro level". The result is a
  computational network that still has parallelism (for low-level opti)
  and can be pruned by 1 and 0 ops.

- A mechanism is necessary to write "macro functions" in Scheme, i.e.
  "DAG generators": things that operate on compile-time data structures.

Entry: Compile time data structures.
Date: Fri Feb 22 22:29:39 CET 2013

These should probably take shape as pattern-matching binding forms and
constructors.

(let-vector (k l m)
  (gauss-elim (matrix (a b c d)
                      (1 0 q r)
                      (z 0 0 1))))

The cons/des seems straightforward. How to implement the function? It's
going to be problematic since there are 2 levels.. It's probably best to
do it explicitly once, and then see where it goes.

EDIT: cleaned up a bit, moved all such functions to stream-meta.rkt. It
seems quite straightforward.

Entry: Testing linear functions
Date: Sat Feb 23 13:21:53 CET 2013

Given an opaque function, determine the following:

- split linear variables vs. multiplicative parameters
- produce a matrix for the linear variables.

What determines whether a variable is a parameter?

- Only multiplied with signals.
- Never added to signals.
Maybe it is possible to assume that it is known which variables are
signals, because they are either:

- given as input, or
- act as state feedback

The rest is lifting. But.. there are ambiguities if multiplication of
constants is encountered. This requires some kind of logic programming
approach that can represent these ambiguities. To make it simpler, we have
only 2 possible outcomes, one unknown and a bunch of relations.

(a b c d)
 1 1 0 ?

exprs              node types
c = add a b   =>   a == b == c
    sub
c = mul a b   =>   a != b, c == 1 || a == b == 0, c == 1
c = div a b   =>   a == b == c = 0
c = unop a    =>   a == c == 1

This requires flat unification combined with backtracking for each of the
|| alternatives. Probably not too hard to set up. Nice exercise.

Entry: Prolog
Date: Sat Feb 23 14:20:42 CET 2013

This is the draft for the interface of a not-quite Prolog language:
unification + backtracking (as `unify-or'), but no support for
"functions", i.e. expressing a != b is not possible. However, it does look
like this is enough to solve the problem.

I wonder if there is a real Prolog-like language for Racket. Yep, there is
[1].

EDIT: I was able to express the problem as a Racklog problem. However, due
to the variable number of logic variables, this seems to need a 2-phase
approach:

- Convert AI higher order syntax to Scheme syntax (Racklog)
- Evaluate

Probably there is a list approach to do this.. Somehow encode the
predicates so they operate over lists? I guess this is always possible
with symbols. I.e. write the whole thing in one predicate, but then there
still needs to be input. It seems the static nature is just part of the
deal..

Yeah, this can't be so limiting. It's just me being inexperienced with
logic programming and data structures. After taking a look at the source
of the definition of the `%which' form, it seems there is little magic.
Variables can be created using `(_)', and the which form could just return
a list bound to a dummy variable.

(let ((x (_))
      (y (_)))
  (%find-all ($)
    (%and (%= $ (list x y))
          (%member x '(1 2))
          (%member y '(1 2)))))

So let's try to build the clause like this. EDIT: Works

[1] http://docs.racket-lang.org/racklog/

Entry: Guessing types of linear functions : linear variable vs. multiplicative parameter
Date: Sun Feb 24 11:24:28 CET 2013

Is it necessary? Aren't all types known? No. One of the problems to pull
this off might be that not all inputs are known. I.e. there might be inner
nodes that add constants, which for a linear program interpretation act as
inputs. I.e. it results in responses instead of transfers. Maybe responses
are what we're looking at? The problem doesn't seem very well-defined yet.

Entry: Linear function test functions
Date: Mon Feb 25 19:52:06 CET 2013

I got to a point where it's possible to convert an opaque function to:

- A partition of the inputs into linear variables and multiplicative
  parameters
- A (parameterized) state space representation[1] of the linear map

One of the things that pops up is nilpotent matrices, particularly in the
function that implements the unit delay:

(define z (lambda ((s) (i)) (values i s)))

This has the system matrix:

0 | 1
--+--
1 | 0

The eventual goal is to compute the I->O transfer function of the system,
which involves a matrix inverse. Should this be special-cased to support
the pure-delay case, which probably always corresponds to a nilpotent
matrix?

Testing another function gives another nilpotent matrix.
(define test-z3 (lambda ((s1 s2 s3) (i)) (values i s1 s2 s3)))

0 0 0 | 1
1 0 0 | 0
0 1 0 | 0
------+--
0 0 1 | 0

By reasoning, for pure delays the output should just be multiplied by
z = e^jw. I wonder if there is a better way to do this. It's probably best
to special-case the `z' case, and throw an error if a zero pivot shows up
during GE.

Wait! The transfer function follows from

z s = A s + B i
  o = C s + D i

or

  s = (zI - A)^-1 B i
  o = C s + D i

or, as a transfer function,

  o = C (zI - A)^-1 B i

Even if A is nilpotent, this should still just work. I.e. in the case of
`z' above, A = 0, so this becomes:

  s = z^-1 i
  o = s

or

  o = z^-1 i

[1] http://en.wikipedia.org/wiki/State_space_representation

Entry: Matrix operations
Date: Tue Feb 26 00:53:07 CET 2013

This might be interesting[1]. However, at some point I probably need
transparent algorithms. But for now it's all numbers. Hmm... this looks
like a fairly new thing. Let's do it the hard way.

Implemented Gauss elimination. Gauss-Jordan is probably better. ( Why? The
N^3 coefficient is half as large. Is there also a numeric stability
constraint? However, for N small it might not make such a difference. )

[1] http://docs.racket-lang.org/math/matrices.html

Entry: Composing semantics
Date: Thu Feb 28 12:35:23 CET 2013

- Map over matrices

Entry: Rings and multiplicative inverses
Date: Thu Feb 28 12:36:19 CET 2013

Turn the language into a ring? I.e. use inverse instead of division as a
primitive? Maybe this isn't necessary: division for matrices is actually
more efficient to implement, as Gauss-Jordan elimination. I.e. for the
transfer function:

  o = (C (zI - A)^-1 B + D) i

However.. multiplication is not commutative, so what would A/B mean?
A * B^-1 or B^-1 * A? If it's the former, how to express the latter?
Explicit inverses are probably better in this respect.

In practice, for matrices the form will mostly be:

  A^-1 * B

which corresponds to x as the solution of the linear system

  A x = B

Following this approach, a right inverse B * A^-1 corresponds to

  B = x A

so it's the same thing (still a set of equations) but placed in transposed
form, i.e. it is the same as

  B^T = A^T x^T

Does it matter for the transfer function equation above which RHS is used
to solve the system, i.e. C or B?

Entry: Transfer functions
Date: Thu Feb 28 16:40:10 CET 2013

Currently, the typing of the `update' function doesn't use globally
available information. I wonder if it's possible to run a whole-program
type analysis and then pass this to the linear evaluator.

The basic problem here seems to be the construction of an "annotated"
representation, i.e. add an extra param to the primitives which is the i/o
types. -> lift a semantics to an annotated semantics.

Entry: Lifting semantics : adding type information
Date: Thu Feb 28 17:20:14 CET 2013

Adding type information to a network seems straightforward: just prefix
the values of the parent semantics with node types. Trouble here is that
types are also outputs. How to handle that extra node? Ha! Convert to CPS!

So at least the principle seems straightforward. The implementation seems
a bit problematic. Mostly the delay involved: something needs to be
suspended until the whole type node system is resolved. I.e. primitives
cannot be called directly. How to do that? Each computation node can be
represented as a function of all the program's input nodes. Maybe CPS is
actually essential to make this work.

Entry: Focus
Date: Thu Feb 28 19:19:22 CET 2013

Maybe time to focus on the core stuff, and build some synths :)

Entry: Plugin format: sp: just data?
Date: Fri Mar 1 14:38:10 CET 2013 It might be best to keep the .sp format as dumb as possible, i.e. do not put any behavior except for the main loop. So I'm taking out the dispatch routine: static void PROC(dispatch)(word_t cmd, const void *in, word_t in_size, void *out, word_t out_size) { // rai_cmd_t cmd = _cmd; switch(cmd) { default: break; } } .dispatch = (u64)PROC(dispatch), Entry: Controls Date: Thu Mar 7 19:55:35 EST 2013 The trick is to never compute any exponentials, i.e. for envelopes or z transform mappings, but use multiplicative updates instead. Maybe there is a way to do this as a proper transform, i.e. without having to "weave" the envelopes with the coefficients. So, essentially this is 1st or 2nd order splines. Next: Look at libm implementation of exp / sin (see [1] math/*exp*) Had to make a small program to find out where it is actually going: *__GI_(...)(long long, float *) (x=0) at ../sysdeps/ieee754/flt-32/w_expf.c:41 41 ../sysdeps/ieee754/flt-32/w_expf.c: No such file or directory. This delegates to __ieee754_expf() in e_expf.c It's not a trivial operation.. What worries me a bit is the table lookup, so it seems that to make this fast, it's best to perform all the expf() operations in sequence, to keep everything in the cache. So, definitely: it is worth making an approximation. To find out: - are there any special routines in intel architecture to compute exp / sin or at least a starter value that can be refined iteratively? [2][3]. - is exp2f faster? [1] http://ftp.gnu.org/gnu/glibc/glibc-2.17.tar.bz2 [2] http://software.intel.com/sites/products/documentation/hpc/mkl/vml/functions/exp.html [3] http://gruntthepeon.free.fr/ssemath/ Entry: VCF / VCO Date: Fri Mar 8 11:24:23 EST 2013 The problem seems to be the generation of complex exponentials. VCF and VCO are not that different, VCF takes a complex exponential as input == feedback coefficients. Basically, VCO is integral of VCF coefficients. Entry: Approach for linear exponentials Date: Sat Mar 9 11:41:09 EST 2013 See [1]. Evaluate 2 exponentials: exp(P_k) -> reset p_k exp((P_k_+1-P_k)/n) -> q_k p_k+1 = p_k q_k^n Approx error is then: exp(P_k+1) - p_k+1 Note that the limiting factor is the accuracy of exp(P_k). Find out: - what the pure numerical error is with exact exp() - what the full error is with approx exp() (define (run P0 P1 n [exp exp]) (let* ((q (exp (/ (- P1 P0) n))) (p0 (exp P0)) (p1 (for/fold ((p p0)) ((i (in-range n))) (* p q)))) (values p1 (exp P1)))) box> (run +1i +.9i 100) 0.6216099682706652+0.7833269096274834i 0.6216099682706644+0.7833269096274834i From this it seems that numerical error due to accumulation can probably be largely ignored. It pretty much stays in the lower significant bits. The real problem is in the approximation of exp. We need something that satisfies exp ((P1 - P2 / n)) ^ n == exp(P1) / exp(P0) + e up to an acceptable e. (define (error P0 P1 n) (lambda (exp) (- (expt (exp (/ (- P1 P0) n)) n) (/ (exp P1) (exp P0))))) box> (define e (error +1i +.9i 100)) box> (e exp) 3.774758283725532e-15-4.579669976578771e-16i ;; Test (define e (error +1i +.9i 100)) ;; exp Taylor series (define (exp-taylor order [acc 0]) (lambda (x) (call-with-values (lambda () (for/fold ((acc acc) (xk 1) (kfac 1)) ((k (in-range (add1 order)))) (values (+ acc (/ xk kfac)) (* xk x) (* kfac (add1 k))))) (lambda (acc . 
_) acc)))) box> (e (exp-taylor 4)) -0.0029735148625679164-0.0017139087382160162i box> (e (exp-taylor 6)) 9.001294569177531e-05+5.099546987226422e-05i box> (e (exp-taylor 8)) -1.4694914934887393e-06-8.294693956273358e-07i That isn't too bad. [1] entry://../math/20130309-104518 Entry: Linear exponentials, without reset? Date: Sat Mar 9 17:08:02 EST 2013 Since exp((P1-P2)/n) is going to be a lot more precise than exp(Pi), it might be possible to forget about resetting the state at all, and just focus on data-oriented approach (threshold), or to reset the state only once every couple of blocks. Entry: Linear exponentials, conclusions up to now Date: Sun Mar 10 09:21:31 EDT 2013 base: - lin for radius: probably won't change much. - exp(i lin) for phase: linear probably effects state too much correction: - straight exp(), or "lowpas pull" to exp(), or - no correction? solve it at upper level? I think it is now understood well enough to start an implementation. What is missing in the architecture is control rate operations. Control rate ops act as state ops that are pushed out of the loop. This needs some kind of syntax. It could be done first for things like computing 1/samplerate. Entry: Explicit time loop? Date: Sun Mar 10 09:39:19 EDT 2013 Probably the simplest approach is to make the time loop explicit. A time loop is the reference point for "feedback" loops (state update) and provides a time coordinate. Practically, a time loop is a fold with a marker for feedback. While thinking in hacks is easier, the important part is to keep the semantics clean. Is an ai-lambda a stream processor or not? In current use, there are at least a couple of points where a simple scalar language is useful. It seems this boils down to making the top time loop into an accumulation loop. Doing this incrementally: first pushing it from stage 2 to stage 1 in ai-array.rkt First attempt is to make the toplevel time loop an accumulate loop, but that's no good: it is expected to be a values form. This needs a different representation. Hmm... this is going to require some deeper insight. Entry: Cleanup Date: Sun Mar 10 09:41:38 EDT 2013 Cleanup the toplevel define-values form such that `define' works properly. OK, works. Entry: Support for piecewize time processing Date: Sun Mar 10 12:54:53 EDT 2013 Current semantics is reflected best in the t-lambda form. '(t-lambda (sx sy) (a b c d e f) (nodes (((v22 v23) (accumulate (v0 v1 N) ((v18 0) (v19 0)) (nodes (((v16 v17) (accumulate (v2 v3 M) ((v12 0) (v13 0)) (nodes (((v4) (p_mul sx c)) ((v5) (p_mul sy d)) ((v6) (p_add v4 v5)) ((v7) (p_add v6 a)) ((v8) (p_mul sx e)) ((v9) (p_mul sy f)) ((v10) (p_add v8 v9)) ((v11) (p_add v10 b)) ((v14) (p_add v12 v7)) ((v15) (p_add v13 v11))) (values v14 v15)))) ((v20) (p_add v18 v16)) ((v21) (p_add v19 v17))) (values v20 v21))))) (t-values (v7 v11) (v22 v23)))) Essentially, this is stil a functional program, which has two isomorphic interpretations related through the scalar <-> stream lift: - Original stream->stream map with the shifted state i/o stream relation made explicit - Scalar interpretation for elements of the stream. The looped scalar interpretation is an implementation of the stream semantics. The move from functional -> imperative is possible due to causality of data relations. Now, I'd like to extend this to the following operational (imperative) interpretation: some scalar code needs to run before the scalar loop code runs. 
The type of the code is: (si i) -> si' Based on inputs, modify the algorithm state, and pass this state to the loop. Since it runs only once per loop, it doesn't fit in the original semantics. An operator (magic wand) needs to be introduced for this. Also, the problem is: where to define? This should probably be at the point of ai-feedback: add some state-massaging operations which then can be moved up the chain. Entry: Support for truncated power series Date: Sun Mar 10 13:47:16 EDT 2013 I.e. compile-time unrolling of polynomials. This is a partial evaluation problem. Where to fit it in? There is no interface for it. Evaluating polynomials is probably possible using the fold op. It might be good enough, since the loop count is available to the SSE layer optimizer. However.. for most polynomials the coefficients are known. Should they be inputs or constants embedded in the program? For the latter, a special form is probably better. Let's try that first. Entry: Tension between functional (e.g. Haskell) and function-level (FP) programming Date: Sun Mar 10 14:05:19 EDT 2013 The thing is: higher order functions are replaced by operators. Operators are special. The question is: is this good or bad? Good for what, bad for what? Good: keeps the algebra simple Bad: seems to need more ad-hoc structure Does the ad-hoc structure keep the algebra simple? I.e. structure that is not too powerful to escape into turing-completeness? Entry: Control rate Date: Wed Mar 13 09:20:40 EDT 2013 I see 2 ways: - Stay in the same framework, and see it as an input (parameter) TX, where moving things outside the loop is treated as an optimization (meaning stays the same). - Add an extra state processing step. It's hard to look into the future for this. Control rate ops are going to be all over the place, so why not make them explicit? Entry: Tailor series Date: Wed Mar 13 10:47:37 EDT 2013 I need (numerical) Tailor series for formulas like: ( exp(P0) / p0 ) ^ 1/N How to express this? Essentially this is just convolution. The trouble is to be careful with numerical instabilities, so it might be good to keep some form of symbolic computation going, e.g. don't add 1 to a small number e , but keep track of 1 + e such that subsequent - 1 is still precise. The trick here is probably to use lazy sequences. However, that seems like a pain to implement. Or not? EDIT: Representing series as functions on natural numbers seems like a good idea. All primitive series are simple functions, and any numerical series could be represented as a function also. Entry: Autodiff vs taylor Date: Wed Mar 13 11:26:44 EDT 2013 I'm writing ai-taylor.rkt but it seems quite similar to autodiff. Maybe it is possible to put both on the same footing? I remember vaguely from Conal's presentation that there is a way to derive Taylor sequences too.. Entry: FLUNK presentation Date: Thu Mar 14 15:15:41 EDT 2013 Bio - A/V DSP, graphics, generic numerical. - Systems programming: Linux & embedded uC - Student of programming language design & embedded DSLs. - Racket projects: - Staapl: metaprogramming for PIC, based on Forth & Scheme http://zwizwa.be/staapl - DSP code gen (work in progress). - Haskell projects: - Metaprogramming experiments - Visual Poem. Abstract interpretation FLUNK 2013-03-13 - Informal/practical: -> multiple interpretations of a programming language. 
  -> main purpose: program analysis

- Formal mathematical framework exists:
  -> Cousot, ordered sets
     http://en.wikipedia.org/wiki/Abstract_interpretation

- Basic idea: abstract syntax / abstract interpretation is based on maps
  from semantics to evaluators:

  (define program
    (lambda (add mul)  ;; Semantics
      (lambda (a b)
        (let ((a2 (mul a a))
              (b2 (mul b b)))
          (add a2 b2)))))

  For creating DSLs, this approach allows re-use of the embedding
  language's name binding structure (i.e. lambda expression / lexical
  scope) to represent syntax as code instead of data. Basic engineering
  principle: re-use. In this case, re-use of the idea of "variable".

  Evaluators are constructed by passing concrete semantics to abstract
  syntax.

  (define program-eval    (program + *))
  (define program-compile (program (lambda (a b) `(+ ,a ,b))
                                   (lambda (a b) `(* ,a ,b))))

- Representations:
  -> Basic: based on lambda expressions.
  -> Haskell: type classes.
  -> Racket: "environment monad" style syntactic abstraction.

- Application: AI for Audio DSP.

- Problem:
  - C/C++/Fortran/ASM is necessary for efficiency.
  - Too low level: structure inherent in the code is lost.
  - AI allows capturing structure that is present in the original syntax
    to perform code analysis.
  - AI avoids "manual compilation", e.g. translation from a high level
    mathematical model to C code.

- RAI (Racket Abstract Interpretation):

  Basic language:
  - purely functional stream processing language
  - support for output feedback / state space models

  Interpretations:
  - Evaluation to (inefficient) Scheme program for quick manual
    prototyping at the command line + unit tests.
  - Evaluation to typed SSA form
    -> Converts to C program / LLVM code.
    -> Compiles to machine code
  - Automatic differentiation (dual numbers)
    http://en.wikipedia.org/wiki/Automatic_differentiation
    http://vimeo.com/6622658
  - Linear frequency domain transfer function.
  - Constraint satisfaction / logic programming.
    Convert syntax to racklog program
    http://docs.racket-lang.org/racklog/

Entry: Series expansion vs. differentiation
Date: Thu Mar 14 18:38:20 EDT 2013

This is a dead end. No beef here.

Entry: Control rate
Date: Fri Mar 15 10:31:02 EDT 2013

Where to insert the (s,i)->s function? This should probably be done at a
fairly deep level, i.e. when defining feedback functions. Doing this in
the ai-lambda-feedback form, passing an extra `setup' argument to
ai-feedback/n. This gives implementations access to the function body of
the state setup function.

Using a special-cased approach: doesn't work. It should really be a
composition of two functions:

- One feedback processor run at control rate
- One .. at sample rate

The beef is in how parameters are passed between the two.

Still confused about the approach to take:

1. Implicit: don't change semantics, but just depend on loop hoisting[1]
   to take care of the efficiency issues.

2. Explicit: allow for control-rate state machines

If an approach based on hoisting is used, it might be good to just leave
it entirely to the low-level compiler (GCC / LLVM). The good thing here is
that I can focus on semantics now, and focus on
implementation/optimization later. This means the only missing parameter
is N, the current block size. This will be a semantics extension to the
feedback/n code.

What about this: add an extra set of parameters to feedback functions,
expressing the time loop index and block size. The subtle but nice thing
here is that the time invariance / block-based properties are implicit:
using/not-using t (current time) or T (max time == block size) fully
determines this.
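As an aside, this is what the implicit/hoisting option boils down to,
mocked up in plain Racket (names and the coefficient formula are purely
illustrative, not RAI code): anything that does not depend on the time
index can be computed once per block and closed over; only the state
update runs per sample.

;; Setup: computed once per block (control rate), since it ignores t.
(define (make-smoother cutoff)
  (define coef (exp (- cutoff)))         ;; block-invariant -> hoistable
  (lambda (s x)                          ;; per-sample state update
    (+ (* coef s) (* (- 1 coef) x))))

;; Run one block of samples; returns the final state.
(define (run-block cutoff xs s0)
  (define step (make-smoother cutoff))   ;; setup, once
  (for/fold ((s s0)) ((x (in-list xs)))  ;; loop, once per sample
    (step s x)))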
One problem though: this does not work for state normalization operations. It might be best to include those in the operations. Conclusion: control rate ops split into 2 cases: - hoistable, no effect on state (just expose t and T) - performed on state [1] http://en.wikipedia.org/wiki/Loop-invariant_code_motion Entry: Control rate combinators Date: Fri Mar 15 15:16:02 EDT 2013 Since hoistable expressions don't cover all cases, it's probably best to split this into two parts: a state normalization / initialization function, explicitly computed before the loop is executed, (and possibly looping over a couple of states), and a loop function. Trouble is: the state initialization function might produce constant parameters that are not state parameters of the inner loop. How to glue these together? (s,i) -> (s,l) ;; Once (s,i,l) -> (s,o) ;; Multiple times This is actually not such a bad idea, since it also gives a way to integrate block processing. Let's let this ferment a bit more. On the surface this isn't much more than a single state update, and a way to push the `l' variables from setup block to loop block. However, it does need to indicate to the caller where the final state is stored, e.g. returning t&1. Otoh, a state init phase could be mandated for all loops. This should probably be defined at one place, i.e. ai-lambda-feedback. Maybe a state post-roll should be defined also, for symmetry's sake. This would allow some statistics to be gathered as outputs? EDIT: So, we're there. A setup method exists and is pushed into feedback/n. However, it is not clear yet how to use it. For ai-array.rkt this needs to somehow capture all the bindings and push them outside of the loop. EDIT: There's a double assignment that shows up in the state setup code. This can be split by defining 2 so structs. Entry: Naming parameters Date: Sat Mar 16 14:22:37 EDT 2013 While naming state/output/input nodes is useful for debugging, they do need to be disambiguated. Entry: State names Date: Sat Mar 16 16:17:42 EDT 2013 Still some problems for the state names: there is no unification betwen the state nodes. EDIT: Added a node -> pretty names map. Input/param are still named. They should not give conflicts as they come from a single lambda expressions. Entry: Testing the control-rate functionality. Date: Sat Mar 16 18:03:02 EDT 2013 This is a big milestone. It looks like it fits about alright. Next is to perform some tests. A time-variable filter would be nice. Probably best to do this in Pd. Possible application: low pass, driven by the c/s samples. EDIT: elements necessary to make testing easier - Load compiled C code into racket - Fix ai-stream.rkt OK - Fix ai-array.rkt based new ai-stream.rkt semantics (sample at t=0) OK Entry: Rough edge: setup depends on streams? Date: Sun Mar 17 09:47:00 EDT 2013 Probably the setup routine should be run with t param bound to t=0. This makes the semantics a bit clearer, i.e. explicit sub-sampling of streams. Entry: ai-stream.rkt Date: Sun Mar 17 11:44:53 EDT 2013 Updated reference implementation to include state setup routines and state init. It might actually be better to implement ai-stream.rkt such that it returns sequences instead of lists. This could be done by wrapping a lazy list (stream[1]) in a sequence. Actually, racket/stream is just an interface on top of sequences[2]. To have real lazy lists, use lazy racket. Hmm, it does seem to be solved now, at least, `stream-cons' is lazy. EDIT: Changed implementation to sequences/streams. 
[1] http://docs.racket-lang.org/reference/streams.html [2] https://groups.google.com/forum/#!topic/racket-users/iFiBiU-YNhI Entry: Graphs Date: Tue Mar 19 09:47:54 EDT 2013 Gnuplot would be easiest, then dump it out in the build system to .png for display in emacs buffer. Entry: TODO Date: Tue Mar 19 09:48:43 EDT 2013 - graphs - run gen C code in racket - simple osc + filter Entry: Lowpass filter Date: Tue Mar 19 09:49:31 EDT 2013 Start with a 2-pole. What is needed is a quick way to play back a sound, probably best in racket. It's time to start trusting the reference implementation and treat differences between reference and C as bugs. Pulseaudio gives me trouble: error messages interact with snot. Really not usable. Something else is needed. Probably a native app that speaks OSC. I do have some code for this. EDIT: Jack wrapper is working and part of the build process. [1] http://planet.racket-lang.org/package-source/clements/rsound.plt/1/3/planet-docs/rsound/index.html Entry: Filter math Date: Tue Mar 19 17:19:20 EDT 2013 I've been going at this for a while, and what I found is mostly confusion, and plenty of approximation routes. Knowledge gained is of course useful, but is there a way to do this without spending all that time? Some urgency is required here.. Maybe not... This stuff is important, and I lost a lot of good intuitions. Entry: Autodif pow(x,y) Date: Thu Mar 21 10:36:50 EDT 2013 I need powers and radicals for solving optimization problems. However, I'm a bit puzzled by the autodiff rule of the pow function, since one of the arguments is a constant, #:pow (op ((b db) (e de)) (pow b e) (* e (pow b (- e 1)))) meaning `de' is not used. Is this correct? How to make this correct even in the case that de is not zero? The thing is, we're computing the derivative towards x of x^y. So to make it correct it should be: d/dt f(x,y) = @/@x f dx/dt + @/@y f dy/dt Yep, works: ;; Note that `pow' is redundant as it can be written in terms of ;; `log' and `exp', but the primitive is there to allow exact math ;; when de = 0 and b is an integer. (define d-pow (op-match ((b db) (e de)) (let ((b^e (pow b e))) ;; maintain sharing (make-dual b^e (+ (* db e (pow b (- e 1))) (* de (log b) b^e de)))))) Entry: Autodiff full Date: Thu Mar 21 11:38:02 EDT 2013 It should be easier now to do some more diff-based experimentation. I started a compalg.rkt module which can be used for symbolic analysis at the command line, using the stream.rkt language syntax. Entry: Problem in ai-freq.rkt Date: Sat Mar 23 12:02:54 EDT 2013 All linear nodes need to be z-dependent, so should be implemented as functions. Looks like the current implementation is wrong. It can probably be fixed, but needs a little bit of rewrite. It would be best to separate the type inference, since that could take into account some global information. However, this requires to solve the annotation problem: the language is pretty much functional (input -> output), so it's not clear how to make it into something that is (input, input-type, output-type -> output). It is like a transformation of prog into (environ -> prog). EDIT: no, it's probably enough to type-analyze the whole program and then just feed the correct type into the new ai-freq. EDIT: The trick is to indeed represent linear variables as z->amp, and use some memoization to avoid having to evaluate all matrices whenever z changes. EDIT: I also removed the dependency on the linpar type inference. 
This can still be used separately as a whole-program analysis, to automatically determine L/P distinction. For ai-freq it now needs to be passed in correctly or it will trigger a type error. Entry: Environment extension Date: Sat Mar 23 12:12:19 EDT 2013 Add a transformation that allows a transformation of the primitives themselves. This is useful for encapsulating type information. Entry: Small signal analysis Date: Sun Mar 24 12:37:26 EDT 2013 I've added small signal analysis to ai-freq.rkt based at 0. However, this could be generalized to any point, by using ai-autodiff.rkt to construct linear approximations. The probing part can probably be completely replaced by the computation of a set of derivatives. The thing there however is to distinguish linear from nonlinear outputs. Let's give it a try. Roadmap: - Implement evaluation over offset + signal correctly - Evaluate over normal numbers to find matrix Computing the derivative matrix was straightforward, just folling the same pattern. Changing the number implementation to small-signal is a bit stranger though. Maybe this can now be simplified a bit. Entry: BCR2000 Date: Mon Mar 25 23:55:20 EDT 2013 Unit is a bit flakey. Can't seem to program it with the old scheme code for writing bcl files to the device. Anyways, it's possible to use one of the predictable presets, like P-2, which uses channel strips with each row mapped to a CC. I wrote a small C program to read it. It's probably best to map it to OSC. There's an OSC racket lib[2]. I forgot. OSC is not a text format. Maybe using Pd format is better for now. OSC doesn't matter until later. [1] http://planet.racket-lang.org/package-source/clements/osc.plt/1/0/planet-docs/osc/index.html Entry: Solving polynomial constraints Date: Tue Mar 26 10:01:13 EDT 2013 It might be good to write a polynomial constraint solver. I have all the elements, and keep making grave calculation mistakes doing this by hand. Entry: Fast floating point modulo Date: Tue Mar 26 11:01:32 EDT 2013 It seems that the use case of fast float modulo usually comes with wanting both the integer quotient and the floating point remainder. The deeper question is whether integers should be reprsented explicitly, or whether the int modulo should be just the `div' part. Currently it seems too much work to add integer support. This is just an optimization. It's perfectly fine to work with floats. So for now, I'm going to take away the f_mod primitive and change it to f_floor. Entry: Missing ordinary data-dependent loops for initialization Date: Tue Mar 26 13:24:35 EDT 2013 Example: integer power, e.g. as part of computation of 2^x. For now it could be hacked as part of p_pow. Hmm... not really. The constraint that the input is an integer is too much. Principle: DONT ADD PRIMITIVES They are a pain to implement, as every semantics needs a version. This kind of looping is hard to do without an integer type.. Painted into a corner. Anyways, I couldn't resist. Basic idea for a primitive: /* Approximation to 2^x, accurate for "human" mapping. FIXME: This is a bit of a hack since the core language can't do the conditional branching. How to do this better? 
*/ OP p_pow2_approx(_ exp) { i_ n = p_ifloor(exp); if (unlikely(n < 0)) { return p_div(1, p_pow2_approx(-exp)); } _ x = exp - n; // approx 2^x on [0,1] as p(x) = 1 + 2/3 x + 1/3 x^2 _ r0 = p_add(1, p_mul(x, p_add(2/3, p_mul(x, 1/3)))); _ r1 = 1; _ pow2 = 1; while(n) { pow2 *= 2; if (n&1) r1 *= pow2; n >>= 1; } return p_mul(r0, r1); } Though this has a while loop, so can't be done in parallel. To do it in parallel, unroll the loop a couple of times and limit the range of exp. Then still it is possible to use the approx over a larger range, i.e. based on 2^x = (2^{x/n})^n. It seems there are a lot of possibilities. EDIT: Successive squaring seems to work quite well. There is no need for the p_pow2_approx hack so I'm removing it from the code. Entry: BCR2000 Date: Wed Mar 27 14:04:04 EDT 2013 ./bcr2000 (s,o) decomposed as: (s,i[0],p,t) -> (s,l) (s,i,p,t,l) -> (s,o) The problem with the current approach is that due to composition, the i[0] can actually be streams represented as temporary, scalar nodes inside the time loop. So basically, for any computation that depends on i, there should be 2 versions: one based on i[0] to be included in setup calculations, and one based on i[t] to be included in the main loop. So maybe, this should just evaluate the program two times. Combined with lazy nodes, this will also take care of unused intermediates. Problem solved. Roadmap: - implement lazy network construction - implement 2-step evaluation Entry: Implementing dual/multiple-rate programs and dead code elimination Date: Mon Apr 1 09:59:59 EDT 2013 Basic ideas: - ai-array should construct a lazy network - lazy networks won't create dead nodes, as it is demand-driven - lazy networks require lexical binding of environment (not dynamic params!) - multiple evaluations can perform separate evaluations at different sample rates Maybe it's best to spit the array stuff into explicit (s,i,p) -> (s,o) lifts? Roadmap: - lazy network + proper bindings handling - 1: box all environments - 2: replace all parameterize ((bindings)) calls with binding to an indexed array/list access of a bindings box - 3: add one abstraction layer that creates new boxes on each evaluation (keep referentially transparent) Entry: ai-type Date: Mon Apr 1 20:02:12 EDT 2013 What is a typed program? It annotates each primitive node with a type environment, i.e. each primitive would be type-parameterized (ta,tb,tc) -> ((a,b)->c) Entry: Type composition Date: Tue Apr 2 00:45:54 EDT 2013 Make this[1] kind of class composition work. Deligation can be done dynamically as long as we don't depend on return types, i.e. constants will probably not work. Basically it needs a single interpretation that has data-dependent delegation. I.e ai-generic.rkt [1] http://blog.sigfpe.com/2006/08/geometric-algebra-for-free_30.html Entry: TODO Date: Tue Apr 2 00:51:39 EDT 2013 - lazy eval + proper connection of setup + loop - how to keep track of functions as being composed of 2 components? can't inline! - host parameters out of loop? or is this just an array opti? Entry: Cleanup Date: Tue Apr 2 09:35:29 EDT 2013 Maybe it's best to first separate out the type unification problem to simplify the ai-array code a bit. Hmm... it seems impossible to separate the structure from the typing.. There might be a big confusion going on here.. What I want in the end is to reduce everything to a single setup routine and a single update routine. Can't this be solved at the type level? The trouble is the implicit "subsampling" operation. 
Maybe this can be solved through an extra primitive? I.e. p_hold Hold takes any computation and projects it down to a sample that is constant over the update loop. This gives a "typing bridge". The remaining problem is then to perform state setup. This allows setup code to be simpler: setup: (s,i,t) -> (s,l) update: (s,i,l,t) -> (s,o) comp: (s,i) -> (s,o) As a result, this should give a single loop expression where each node either depends on t or not. If it doesn't, it can be hoisted out of the loop. Let's try the typing first. The problem is too convoluted to solve at once. How to break it down? Hack: split all loop-local nodes? No. Split all inputs passed to setup and wrap them with "p_hold". While this duplicates the whole network, it should do the right thing in combination with lazy eval. This is essentially the same hack as "const". It doesn't seem possible or desirable to separate things by separating setup and update functions. Entry: Setup code : duplicate input network Date: Tue Apr 2 10:09:27 EDT 2013 The solution seems to be to simply duplicate the whole input network. However, this only needs to happen when the input network is a signal, not when it is a parameter. So I wonder, where is the problem actually created? External inputs are used in two different ways: as param and as What about this: - construct a lazy version of the dependency network. - gather all state setup out and local binding nodes - force calculation of these -> creates all pre-loop bindings Hmm... needs some more experimentation. Looking at the stream semantics, what really happens is that at each feedback function, the inputs are treated as streams and subsampled. This needs to be done exactly the same in the array language. How to implement subsampling? Two problems are interfering: - State setup, and keeping track of state variables so the proper loops can be generated in C. - Subsampling of inputs. Tough.. What does the C code generator need? It just needs node names. s,i,p,t_endx # user-provided input + state from previous run s,i,p,l # state setup nodes + intermediary t # loop index s,o # output and state output So I cleaned up the t-lambda form a bit. Now for the real work. The problem: we're interleaving the creation of the setup and update bindings, which leads to some of the setup operations depending on update bindings. What needs to happen is that these 2 phases need to be separated, probably by creating the 2 functions (setup and update) explicitly. What about that? Create a semantics that splits the setup and update functions into two different programs with some metadata? So.. I went through the motions to make the node generation lazy. Time to test it with the cases that failed before. Test case in: test-dual-rate2 I think I found a hack: after evaluating the nodes that the main loop depends on, the boxes associated to the main loop should not have any nodes in it. If so, those can be moved up. However, that doesn't solve the duplication problem. What is needed is to effectively duplicate the operation. Essentially, run the whole program twice, properly binding the nodes. Binding the nodes is going to be tricky, since they are not exposed explicitly. Let's just give it a try. First, some cleanup: The names "setup" and "update" are then maybe not correct, since the nodes will get woven, and some tricks are necessary to separate them. If there are going to be tricks, it's probably best to implement them differently. 
Maybe it is easier to mark nodes as local on a per-node basis, but that
doesn't solve the state initialization problem. However, if a state node
is pinned as sub-sampled, maybe it will work? More fermentation needed...

There are 2 problems:
- State composition
- "these nodes run at a different rate"

First evaluation: compute only state output and local nodes. Really, what
this does is extend the state. The "setup" routine is just like the
"update" routine, but it has more state, which are the local nodes. Using
this approach, it should be straightforward to move the composition
elsewhere.

Maybe the problem I'm running into is just ill-defined. No, we just need
to stick to the original meaning from ai-stream.rkt: the local nodes only
depend on subsampled inputs, and produce a constant stream. In
ai-stream.rkt this doesn't need extra bookkeeping. However, when we want
to optimize it such that the constant nodes get optimized out of the loop,
some untangling is necessary. I.e. some code does need to run multiple
times.

Entry: Control rate ops
Date: Tue Apr 2 15:52:17 EDT 2013

Maybe this should just be a different type signature? The setup has "extra
state" that can be used by the update as input.

setup:  (s0, s1, i) -> (s0, s1)
update: (s0, s1, i) -> (s0, o)

Probably not necessary yet. The current scheme can do the above if
necessary by setting s = s0 x s1.

Entry: Ill-defined?
Date: Tue Apr 2 16:03:18 EDT 2013

Not necessarily. The problem is that the upsampling/downsampling points
don't have much structure. Mixing state update and local node computation
does make things entangled and harder to manipulate.

So I really see no other way: evaluate the code twice:

- mode 'setup'  -> run the update method, and pull (force) only the
                   updated state nodes and loop constants
- mode 'update' -> don't run the update method, but plug in the nodes
                   from the 'setup' phase.

The trouble here is associating each feedback/n node with the nodes
computed in the previous phase.

Entry: Lazy nodes
Date: Tue Apr 2 19:12:36 EDT 2013

Yeah.. it would be nice, but it leads to too many hacks. The code has this
strict side-effecting feel that's hard to get right by inserting boxes
everywhere.. The stumbling block is the "accumulate" form, where it's hard
to properly propagate the lazy semantics. Basically, due to laziness, some
expressions will "grow". If it were just bindings, it would be easy, but
it turns out there are many patch points that make it very hard to read
the code. This needs a different approach. Rewind.

Entry: State machine setup
Date: Tue Apr 2 19:31:35 EDT 2013

The code is ugly. How to go about doing this in a cleaner way?
Semantically, what I want is a computation that is executed only at t==0.

_ r2;
for (t=0; t<T; t++) {
    if (!t) { r2 = ...; }
    ...
}

For a "parallel +", the arity of i is the same as s. In general this is
not the case. Still, there doesn't seem to be a way to find the arity
without running the function, so maybe that should be done first. Arity
comes from the 'values' part, so maybe this is enough. Let's just try it
out.

This is map - reduce: separate out the parallel part and the accumulation
part. Depending on the associativity properties of the accumulator, more
things are possible. How to test associativity?

Conclusion: got it +- working, but there are a couple of issues.

- Fold arity: now it takes 2x the map arity
- Performing data-dependent iteration?
- Lifting

This needs a lot of work, probably a week full-time. Don't have that right
now.
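For reference, a plain-Racket sketch of the intended map/reduce split
(illustrative names, not the actual reduce/n implementation): the map part
stays trivially parallel, and only the fold imposes an order.

;; Map phase: apply the scalar function to every element.
;; Reduce phase: fold the resulting sequence with the accumulator.
(define (map-reduce f op unit xs)
  (for/fold ((acc unit)) ((y (in-list (map f xs))))
    (op acc y)))

;; Example: sum of squares.
;; (map-reduce (lambda (x) (* x x)) + 0 '(1 2 3)) => 14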
Entry: ai-stream.rkt : implementation of reduce/n is wrong Date: Thu Apr 4 13:24:36 EDT 2013 Due to automatic lifting the current implementation is wrong: Todo: - perform the map -> this uses automatic lifting and produces sequences. - fold the sequence. FIXED Entry: hold and setup Date: Thu Apr 4 19:48:41 EDT 2013 One of the features I'm looking for is `hold', which computes an expression once at t=0, then holds that value. The other one is `setup' which takes the first expression at t=0 and the second one otherwise. (define (test-subsample (s) (x)) (let* ((s (setup (+ s 1) s)) (y (hold x)))) (+ s x y))) The thing is that these mechanisms are distinct, but they are unified in the "setup/update" 2-phase approach. This seems to be the cause of a lot of confusion. An advantage of implementing them separately is that we can unroll the loop for t=0 and t>0 and just directly implement the 2 behaviors without resorting to assignments, just variable shadowing. setup: t=0 / t>0 each take one of the branches. practically, every prim can be tagged. hold: first unroll defines the variable, second unroll doesn't define the variable. Now, by defining (hold x) as (setup x #f) this might be implemented using a single mechanism. So let's start with setup. Capturing the bindings seems to be simple. However, it might be better to implement this as a type-annotation, since it's a bit special anyway. Think about it a bit.. Looks like there is a solution. This can be implemented using an explicit "p_phase" primitive which performs an actual (t==0) comparison. However, using proper annotation generated during AI, the nodes can be tagged to be left out of one of the 2 phases: setup (t=0) and loop (t>0). Trouble is that "hold" doesn't work this way: it needs explicit support. The rest is for later. It's a big change, but looks straightforward. Entry: Changing the hold / setup stuff Date: Fri Apr 5 09:28:38 EDT 2013 - fix "hold" for ai-stream: not correct inside setup method. -> TODO - keep current syntax -> NO - hold / setup for ai-array: OK Can "hold" be translated into a state machine? since it is actually a stateful operation. Entry: Ai-stream state stuff Date: Fri Apr 5 12:33:52 EDT 2013 Should not loose stream semantics for state! For this it seems that some kind of knot tying is necessary. How to express it in scheme? Entry: Better plotting Date: Sat Apr 6 12:11:24 EDT 2013 Need better plotting, including multiple graphs. Entry: Racket plotting Date: Sat Apr 6 16:43:35 EDT 2013 (require plot) (plot (function sin (- pi) pi #:label "y = sin(x)")) That is really nice. Looks like I've found a reason to move to racket for editing. Or, ditch snot and move to geiser[2] instead. [1] http://docs.racket-lang.org/plot/intro.html#%28part._.Plotting_2.D_.Graphs%29 [2] http://www.nongnu.org/geiser/geiser_3.html Entry: Delay lines Date: Mon Apr 8 12:43:46 EDT 2013 Delay lines are special in that they don't fit the state picture very well. Delay lines can be stored head-to-tail which makes addressing them more straightforward, but they reduce locality in the case of small delay lines. Anyways, implementation can be optimized later. The specification is as follows: - A delay line has a type: its length - Otherwise it behaves as a function taking an integer index. Problems: - Where to solve bounds checking? It's probably best to keep the implementation as simple as possible, meaning to solve bounds checking in the DSP side, and just retun undefined value for out-of-bounds (but not crash!). 
This points to a big shared 2^n table. I'm having a hard time representing this without having an implementation in mind, so for now I'm assuming:
- One global delay memory state
- p_delay() passes an offset into this state

p_delay() is only for reading.. Maybe this can somehow be captured in the existing state mechanism. This would need a modification from s->s to (D s)->s. Where to put the type annotation? Since there is only one write point, it seems best this is done there. There might be multiple read points.

Using state, the following

  (define (test-delay (dl) (x))
    (let ((sum (+ x (dl-read dl 1) (dl-read dl 2) (dl-read dl 3))))
      (values (dl-update 10 sum)
              sum)))

translates to:

  struct proc_si { float r2[10]; };
  struct proc_so { float r10[10]; };
  float r0 = p_copy(t);
  float r1 = p_copy(t_endx);
  float r3 = p_delay(si->r2, 1.0);
  float r4 = p_delay(si->r2, 2.0);
  float r5 = p_delay(si->r2, 3.0);
  float r6 = p_add(r4, r5);
  float r7 = p_add(r3, r6);
  float r8 = p_add(param->x, r7);
  float r9 = p_copy_dl(r8);
  so->r10 = p_copy(r9);
  out->r11[t] = p_copy(r8);

which would work for the p_delay operator, i.e. it could be something like:

  #define p_delay(line, index) line[(offset + index) % length(line)]

where length() is a sizeof() style element counter, and offset is the current delay line offset stored somewhere. This could be a global counter shared by all delay lines. If the % is power-of-two this should be straightforward (a C sketch of this indexing scheme follows this entry).

The trouble is at the write point. Somehow the type needs to be extended to indicate that this is not a normal array, but a single write update. Also, the delay line memory needs to be excluded from the double buffering scheme. Maybe what is needed is an annotation for "sample memory", or just arrays that can be indexed and written to by the DSP loop. It requires a bit of a cleanup for the type system. Essentially, it needs a type-specific implementation of single-assignment binding.

Note: keeping SSA semantics is probably a good idea. Trouble is really the output copying that gets in the way. What this means is that `dl-update' is really like 'cons'. Because of functional semantics, the input line needs to be mentioned explicitly. It looks as if there is more freedom this way (i.e. returning a consed delay line in another state output than the input).

I removed the state copying, which does give a primitive expression which has all the info (binding + full functional constructor) in one place:

  so->r9 = p_delay_update(r8, si->r2);

NEXT:
- fix type system to distinguish Array from Delay

Entry: copy-nodes
Date: Mon Apr 8 15:25:18 EDT 2013

Two conflicting things:
- To bind high level objects, node copying is probably not a good idea.
- Node copying is necessary to avoid some direct in->out paths.

It's probably harder to avoid node copying than to simply support some kind of "virtual binding" operation. I.e.

  r9 = p_delay_update(r8, si->r2);
  so->r10 = p_copy(r9);

In any case, removing the copy-node part seems possible and gives a clearer view.

Entry: type cleanup day
Date: Wed Apr 10 16:57:13 EDT 2013

Lots of cleanup but it seems to have exposed a bug. Unification between x and Array x. See nphasor in test-ai-array.rkt. Only fails in some specific test cases. Not for the real world stuff. Maybe they are just wrong? Anyways, need to push the types from unify.rkt into the C generation, or add some delay-specific generation stuff to phase 2.
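Rough C sketch of that indexing scheme (hand-written; dl_read/dl_write/dl_tick are made-up names, not the p_* primitives; the sign of the index offset depends on which way the shared counter advances):

  #include <stdint.h>

  #define DL_LEN  1024                /* power of two: modulo becomes a mask */
  #define DL_MASK (DL_LEN - 1)

  struct dline { float buf[DL_LEN]; };

  static uint32_t dl_offset;          /* one global offset shared by all lines */

  /* Read a tap `index` samples back.  The mask keeps any index in range,
     so an out-of-range tap yields a stale value but never a crash. */
  static float dl_read(const struct dline *d, uint32_t index)
  {
      return d->buf[(dl_offset - index) & DL_MASK];
  }

  /* Single write point per tick: the `dl-update' / p_delay_update role. */
  static void dl_write(struct dline *d, float x)
  {
      d->buf[dl_offset & DL_MASK] = x;
  }

  /* Advance once per sample, after all lines have been written. */
  static void dl_tick(void) { dl_offset++; }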
Entry: Type system thoughts + array language
Date: Thu Apr 11 11:19:57 EDT 2013

The ai-array.rkt language should produce only array accesses in its output language (phase 2, before C generation). The C generation step should be "dumb". All low-level mapping should be done in phase 2. The passes:

1. Convert abstract syntax to type-annotated t-lambda form. This is a functional form with stream semantics. All primitive operations operate on streams, though some nodes are defined as "phase 0", meaning only computed at t=0.
2. Convert t-lambda form to imperative block language. Maps all types to arrays and places all lifted operations inside loops over explicit time/space indices.
3. Convert to C (dumb).

Concerning delay lines: level 1 output has SSA bindings for whole delay line structures while level 2 output has in-place update for one sample value only. The idea is to keep that SSA semantics. The assignment of an array element is still semantically a one-time binding (cut off old, cons on new).

Entry: Delay memory
Date: Thu Apr 11 14:01:37 EDT 2013

For implementation, the important part is to keep things simple. It's probably best to implement all delay lines in a single circular buffer, as this requires only a single modulo operation, but it does require some global state management. So let's not. The only global thing is the delay offset counter; the rest should be handled locally, i.e. per delay array.

I'm having a little trouble putting this in the right way. It should be an operation on an identifier. It works fine for lvalues, but the delay_read operation has the modulo before the addition. What about making all array assignments modulo assignments? Then removing the modulo check is an optimization, which can be done inside of loops, since the extents are known. ( added FIXME to the code ).

Entry: Structured vs. random access arrays
Date: Sat Apr 13 11:48:30 EDT 2013

It seems there are 2 kinds of array accesses, which need to be handled differently:
- structural (map / reduce) read / write
- random access: data-dependent table read / write, optional modulo indexing.

Entry: Transfer function computation
Date: Fri Apr 19 21:15:47 EDT 2013

According to Porat, matrix inversion is not necessary. However, for my app, the transfer function is not always SISO. Still it might be more efficient to compute all polynomials and memoize them, to accelerate repeated evaluation of the transfer function, instead of performing a matrix inversion per frequency point.

EDIT: To spit out C code for transfer function computation, it might be best to use this approach.. Otoh, for 2D systems, the matrix inversion isn't going to be problematic.

Entry: Encoding "float types"
Date: Fri Apr 26 12:13:14 EDT 2013

Problem: some meta info needs to be transported to the GUI (e.g. a VST host like Ableton Live). However, these are just "float" types to the C code generator. Essentially, the core doesn't care about the limits. What does the host need? The real problem is one of user feedback. All params are normalized, being 0-127 for MIDI, or 0-1 for VST plugins, meaning that the operation and automation is transparent.

I'm inclined to add the type information in the name of the parameter. I don't see another way to do this except for introducing a lot of complication on the C side. The wrapper (.g.h -> code) could then do some run-time interpretation? Even the rkt code gen could add some macros to centralize this.

Naming scheme: _____

  input format:
    m = MIDI range 0-127
    v = VST range 0-1
    p = exact parameter, no transformation ??
    r = 0-max (inclusive) range, m=r127 v=r1?
  scale-type:
    a = absolute = linear
    r = relative = logarithmic/exponential
    c = circular (angle)
  min / max: output scale ranges [], sign = p/m, dot = d
  name

  e.g.:
    r127_
    m_r_p20_p2000_Hz_Frequency
    m_r_m10_p10___dB_Gain

Spec in rkt should probably use a macro to perform the name mangling, so the rep can change:

  (Frequency MIDI rel 20 2000 Hz)
  (Drive MIDI rel -10 10 dB)

This needs to be a special form that does:
- Name mangling or annotation
- Insertion of converter code, e.g. `midi-log/i' (a sketch of such a converter follows this entry)
- Separate "user" and "system" controls (e.g. samplerate)
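Rough sketch of what the converter behind something like m_r_p20_p2000_Hz_Frequency could do on the C side (hand-written; midi_log/midi_lin are made-up names, not the actual `midi-log/i' implementation):

  #include <math.h>

  /* Map MIDI 0..127 to [min,max] on a logarithmic ("relative") scale,
     e.g. 20 Hz .. 2000 Hz for m_r_p20_p2000_Hz_Frequency. */
  static float midi_log(float midi, float min, float max)
  {
      float u = midi / 127.0f;             /* normalize to 0..1 */
      return min * powf(max / min, u);     /* exponential interpolation */
  }

  /* Map MIDI 0..127 to [min,max] on an absolute ("linear") scale,
     e.g. a -10..+10 dB gain control. */
  static float midi_lin(float midi, float min, float max)
  {
      float u = midi / 127.0f;
      return min + u * (max - min);
  }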
Entry: Input types
Date: Sat Apr 27 11:37:43 EDT 2013

I'm trying to add the input type annotation using a special form, which copies the annotation to a symbolic form in the ai-function object, and a parameter conversion routine. However, the latter can be multi-valued. I have 1->2 but n->m is probably also a possibility. In general it seems not possible to separate these. Maybe it needs to be tagged in the type system?

The thing is that this is really type information in a generalized sense. Meaning, it is not something the program uses, but it is something that a program "meta-thing" uses. Can it be stored somewhere else? I.e. it should not be used in unification, since the difference between a float and a float in [0,1] goes away after it's been "blessed". Also, it's possible to add a log float and a lin float (i.e. a polynomial approximating "exp" bridges the linear/exponential "type"). Maybe this is more something of a "unit".

I'm thinking Haskell "phantom types". It seems that going that way is a bit overkill for the Scheme thing.. I really just need units and scales. Some kind of annotation that is only used outside of the whole code generation thing. Sure, it would be useful to apply more general "theorems" about values, but it might be hard to express things like "linearity". Basically, there is no "logarithmic" scale: it is only an artifact of the curvature of some polynomial approximation.

So what is the real problem?
- scales
- bounds
- units

Maybe this should be considered as a 2nd level type system. The problem of scales is the same as that of units, i.e. Hz or logHz. The important distinction with the "machine" type system is that this 2nd level type system cannot be applied to the primitives. It only makes sense at a level that glosses over certain details. This requires a big change: a dataflow network has 2 parts:
- A high-level semantics that can be typed in the "units" sense.
- A low-level implementation

What about this:
- The current float/int array type system is part of a specific interpretation.
- The unit-correct type system is a *different language* that expands into the flat number-crunching stream language.

The current I/O specification problem is exactly this: the processor is the number-crunching part, which is abstracted as a higher level program that can be typed in terms of higher level units. Note that there is a possibility to add higher level semantics, since there is already a function object as part of the representation. This corresponds very well to the idea of approximation. It could even be used to encode exact and approximate functions and (automatically) find approximation errors etc.. Some nice beef is hidden here. But I need to move on.

Let's stick to the plan: store the annotation in the ai-function-args field, or add an extra "types" tag to the function object.
Yeah, this is not good. It needs to use a typing approach. The idea is not ready. Time to put the annotation in there manually, then connect it up later.

Entry: Parameter ranges
Date: Sun Apr 28 11:53:32 EDT 2013

Another way to solve it is to use a type-directed approach, i.e. polymorphism using type classes.

Entry: Parameter meta info
Date: Tue Apr 30 16:42:24 EDT 2013

What I have is C node names. These need to be attached to a struct with meta information. This meta info is (for now) unrelated to the semantics of the code (later, use this for "typing" approximations). There is a provision to attach meta info to nodes already, in ai-array.rkt.

Entry: Tagging nodes
Date: Wed May 1 10:24:25 EDT 2013

It seems simplest to just add a tag primitive to add arbitrary information to a dataflow node, i.e. pass it through a primitive that has pass-through run-time semantics.

Entry: Scales
Date: Fri May 3 11:24:03 EDT 2013

Just noticed that a scale has 2 ranges:
- What is presented to the user
- What is computed in the algorithm

Basically, there might be an "exp" conversion in there. This distinction is reflected in the code by having param specification and param interpolation separate!

Entry: Annotating approximations
Date: Sat May 4 10:49:19 EDT 2013

It seems like a good idea to annotate a function that is a (polynomial or rational) approximation with the exact function, to aid some automatic comparisons. It seems to be a good form of documentation also.

Entry: Events
Date: Sat May 4 13:12:46 EDT 2013

How to implement event triggering? There are only signals, no events, so this needs either direct state modification or some workaround. Instead of working with pulses, it might be simpler to work with transitions, since those are easier to define as a primitive. After that a positive/negative edge can be generated. Really, a simple differentiator should do the trick.

  y = x (1 - z^{-1})

Note that problems like this are created by the restriction that direct state access is not allowed. This is really hard to express without vectored conditional assignment. Currently that is not supported. I want this:

  (define (env-AD (level setpoint rate prev-gate) (gate attack decay))
    (let*-values
        (((next-level) (+ level (* rate (- setpoint level))))
         ((next-params) (< prev-gate gate
                            (vector attack 1)             ;; 0->1 start new cycle
                            (< 0.9 level
                               (vector decay 0)           ;; attack done, start decay
                               (vector rate setpoint))))  ;; keep coefs
         ((next-rate next-setpoint) (unvector next-params)))
      (values next-level next-setpoint next-rate gate
              ;; out. 1 sample delayed to pipeline computations
              level)))

Maybe ai-array prims should automatically lift? Still, in this case it would compute the conditionals multiple times. Not ready yet!

Entry: How to make something difficult?
Date: Sat May 4 17:48:19 EDT 2013

So I got stuck at an invisible point: simulation code (ai-stream) looks fine for both AR and AD, but in Pd, the AD doesn't give any output. ( hmm... just thinking, might be a param vs. state interaction )

Maybe what is needed is a standalone C .elf that can be run in the debugger. Actually, this already exists: ( changed the code a bit to limit the number of params, then run it like this: )

  make synth.g.elf ; ./synth.g.elf .1 1 1 1 0 .1 .1
  gdb -i=mi --args synth.g.elf .1 1 1 1 0 .1 .1

Suspicious code line:

  so->r36[i] = p_lt(si->r31[i], so->voice_gate[i], r24, r35);

that should be param->voice_gate[i]

  struct proc_so {
    float r34[1];
    float r38[1];
    float r36[1];
    float voice_gate[1];
    float r55[7][1];
  };

WTF?
The thing that happens here is that there is a direct path from (state) input to (state) output, meaning that the nodes are not unique. In ai-array this needs an additional decoupling. So, no panic. This was a true compiler bug :) Entry: Feldspar Date: Mon May 6 23:54:33 EDT 2013 Maybe time to look into this again, see if there is anything to pick from.. [1] http://hackage.haskell.org/package/feldspar-language Entry: Fixed point DSPs Date: Tue May 7 01:01:11 EDT 2013 Time to go for the real challenge: generating code for fixed point DSPs (i.e. low power apps). I already have the dsPIC, but no free C compiler. There's an academic lite version[2], and the evaluation version is 60 days[3]. Trying mplabc30-v3_31-windows-installer.exe under Wine, dsPIC version. Command line programmer? [5] Yes. You need to install MPLAB and it is here. C:\Program Files\Microchip\MPLAB IDE\Programmer Utilities\ICD3\ICD3CMD.exe That might be good enough to get started. Apart from NXP, I don't have any other chips that might be a good strategic bet. [1] http://wiki.maemo.org/Programming_the_DSP [2] http://www.microchip.com/stellent/idcplg?IdcService=SS_GET_PAGE&nodeId=1406&dDocName=en536656 [3] http://www.microchip.com/stellent/idcplg?IdcService=SS_GET_PAGE&nodeId=1406&dDocName=en535363 [4] http://www.microchip.com/pagehandler/en-us/family/mplabx/#downloads [5] http://www.piclist.org/techref/postbot.asp?by=thread&id=%5bPIC%5d+Use+ICD3+directly+from+the+command+line&w=body&tgt=post Entry: BUG: problem with time / space commutation Date: Tue May 7 13:25:01 EDT 2013 A 'hold' inside a bus context doesn't work: it would need an array of values... Entry: Higher order functions? Date: Tue May 7 13:35:11 EDT 2013 Note that it is apparently possible to pass functions as arguments. A nice surprise, but why is it a surprise? Entry: Bret Victor / Live coding Date: Tue May 7 23:13:03 EDT 2013 Watching "Inventing on Principle" [1]. To point to a number in the code and have it change the output immediately is just awesome. Can we have that in emacs? [1] http://vimeo.com/36579366 Entry: Speeding up the dev cycle Date: Wed May 8 09:21:39 EDT 2013 Two things are important next: - User interface - Speeding up the dev cycle From Bret Victor's presentation I get the idea that it might be better to combine both. Focus more on visual feedback. How fast can the whole compile-edit-run cycle go? Entry: #lang vs. macro Date: Thu May 9 10:54:24 EDT 2013 Hi list, I have a "#lang s-exp" that has #%app redefined, but I would like the ability to embed it into a macro. E.g. #lang s-exp ... (f a) ;; special application vs. #lang racket ... (g x) ;; normal application (my-lang (f a)) ;; special application I tried an approach using (let-syntax ((#%app ...)) ...) which gives trouble. FIXME: what trouble? It might be simpler just to use submodules: ;; stream syntax submodule (define-syntax (begin-stream stx) (syntax-case stx () ((begin-stream form ...) #`(begin (module stream-forms "stream.rkt" (provide (all-defined-out)) form ...) #,(datum->syntax stx '(require 'stream-forms)))))) Entry: Automatic lifting Date: Thu May 9 12:15:08 EDT 2013 Trouble: can't just put this in the unification algo because it will "spread". I.e. this needs specialization. The trouble is the following: - we're in a loop context i,j,k - the type of some parameter is not Ni,Nj,Nk but say Nx,Ny how to specify which constants to drop? How to integrate? 1. Type inference works correctly across the "const" 2. Some info is lost here: i.e. which index to ignore. 
What we need is something like this:

  p_const_ref(param->c, i, j, k, 1-0-1)

where 1-0-1 is the "ignore mask". So every time we go into a loop, all nodes need to have their indexing info updated. Basically, ai-array needs to keep track of the reduce stack. That's really it.

Entry: Refactoring
Date: Thu May 9 12:54:34 EDT 2013

Making a couple of changes.
1. Loop vars are explicitly typed as Int
2. First pass keeps the current loop stack.

What about this: perform the lifting in the first step, and perform the optimization (re-using internals) in the second. Hmmm... makes me doubt the whole auto-lifting approach again. It seems simple in the abstract: introduce references.

Entry: Scrap the whole thing?
Date: Thu May 9 14:11:35 EDT 2013

I think I've hit a wall. The ai-array is too complex. The trouble is that this all requires a lot of local bookkeeping, while using an object-oriented approach would avoid that. Maybe it's time to look at it in a different way.
- Every node is a grid
- Grids can be constructed from grids (transposes?)
- The whole thing can be compiled to a stateless update function + explicit state objects.

Entry: const
Date: Thu May 9 14:49:44 EDT 2013

Is not a good abstraction. It really can't mean anything else than: for all accumulation layers below this, the value is constant. Anything else requires some abstraction of indexing. The above already works. However, it needs invocation of "const" on each level, otherwise it breaks.

Entry: Abstract array references
Date: Thu May 9 14:56:44 EDT 2013

What about making all references abstract? I.e. a node can be a compile-time transformation of another node.

  (r123 (thunk))

If it is represented as an i -> j -> k function, Const can take a function and produce another function that ignores the current loop index. This is also the main reason why loop indices cannot be bound in the first pass, because indexing operations might be composed. So that's it: a "transposer" decouples the real type from the presented type, and produces a list of indices from a list of inputs. This seems to be the right way to go.

However, there's a conflict between implicit and explicit indexing. What if a node is explicitly indexed, but moves inside a new loop nesting? How to add the other coordinate? Maybe it needs some more fermentation. The question is: what is a node? Is it a grid, or a grid point, or another subset of a grid for that matter? Focus on dereferencing of nodes. This always happens in a context.

Entry: Outer product
Date: Thu May 9 16:45:00 EDT 2013

Maybe something to try would be an outer product. This gives a better idea of what "transpose" means.

Entry: Getting rid of all syntax stuff in ai-array
Date: Thu May 9 17:19:17 EDT 2013

Yeah.. long term project.

Entry: Array references.
Date: Thu May 9 18:42:59 EDT 2013

I'm going to make all read references explicit. Write references are explicit, since they are all the same. Wow this is really a minefield. It needs two passes because:
- binding needs to know what are external nodes (only available at the end)
- typing is available only at the end

So how to clean up things so the second pass is simpler? Ok,
- removed the syntax stuff
- just pass dictionary from pass 1 -> pass 2
- removed the delayline stuff

A bit blurry since it was on autopilot.. I don't think I changed that much, but I had to remove the delayline hacks, which can probably be implemented on direct array reads.

Entry: Delay lines
Date: Fri May 10 10:32:34 EDT 2013

- Uses a pool of delay lines cycled in a single buffer.
- Effective memory use is delay_size + 1. One element is used for writing.
- Apart from modulo, bounds are not checked on the lowest level. For variable length delay, use an extra abstraction.

The problem isn't really the low-level stuff, it's how the specification fits into the update wrapper. Basically, in the implementation above, a delay line's state is virtual. So how to handle virtual state objects? They do not need to be saved in the global state dictionary. Well, let's save them anyway in the first pass, but skip them in the second pass. I.e. instead of passing #, this is translated to index 10 into a delay line.

Basic principle: no in-place update! Updating an array always needs to go through a virtual object that simulates the replacement of the whole array with a new one. So, I'm going like this:
- Generic vbuf mechanism to abstract in-place array updates.
- Build delay lines on top of that using Int index computations

TODO: How to get at the delay offsets? Should it be part of the vbuf read reference?

So indexing works (using vbuf attributes); now just modulo indexing. Maybe this is something to solve at the lower layer, using a sizeof() macro? Since the size is not yet available when the code is actually generated.

So it looks like it's all there. Next: support. Add "proc_storage" allocation in the host.

Ok, basic idea works. Delay is going the wrong way though (needs decrement) and it's probably best to stick to "and" for buffer wrap-around.

Entry: Differentiated integral interpolation
Date: Sat May 11 10:47:37 EDT 2013
Type: tex

The general form of DII (Differentiated Interpolated Integral) is to perform an interpolation or resampling operation after integration, and differentiate the resulting signal. I wonder if it is possible to find a workable representation of this technique in the frequency domain.

Entry: Type system trouble: delay lines
Date: Tue May 14 10:45:19 EDT 2013

It should be possible to derive types (sizes) from constants present in the code, otherwise there is a level-shifting problem: can't parameterize delay length in a function! Is there a way to do unification with more generic constraints, like "size > 1000"? Since I don't have a type system to embed in, I could use any logic program to do the actual constraint specification. Currently, it might be possible to solve this elsewhere: add a type family parameter, and put a constraint on it.

FIXME: So there's a workaround for the most important use: currently only fixed number indices are possible to make FDNs work. ai-array/1 will unify this with the Delay array type. The multi-tap problem can then be solved later.

Entry: KVR / music-dsp / pd-list post
Date: Tue May 14 11:41:42 EDT 2013

Is there anyone here interested in Functional Programming and C code generation for DSP code? I'm working on a system for DSP code development based on the principle of Abstract Interpretation: http://en.wikipedia.org/wiki/Abstract_interpretation

Basically, it will allow several interpretations of a single specification program:
- specification as a pure functional program
- imperative C code generation (for passing to a C compiler or LLVM)
- Z transform for frequency plots of (linearized) transfer functions
- automatic differentiation for all kinds of derivative-based tricks
- ...

It is written in Racket: http://racket-lang.org/

It's starting to get to a point where it is actually useful (there is a basic synth core) and could use some feedback, but please note it is very experimental and still needs a lot of work.
Familiarity with Racket is probably a prerequisite at this point to make sense of it. Current code is here: http://zwizwa.be/darcs/meta/rai/ It is part of a darcs archive that can be fetched using darcs get http://zwizwa.be/darcs/meta

[1] http://www.kvraudio.com/forum/viewtopic.php?p=5355958#5355958
[2] http://www.kvraudio.com/forum/viewtopic.php?p=5355958#5355958

This is a very valid point, and I must say my main point of despair. Maybe it is so that a language like C (or C++ for those so inclined) is really the best available abstraction once you take all these tricks into account. Ultimately we write code for state machines, not some pretty high-level model interpreter. Ultimately, we need to just make it work.

What interests me though is how many of those hacks and tricks are essential? When you start out with a simple language that makes the dirty hacks impossible but somehow incorporates the "smart" ones as simple abstractions, are you left with something that is uglier than what you would do in straight C with a good coding standard?

Another approach I've been thinking of is to go the other way: write analyzers for C programs. However in this case, the unessential hacks get in the way very quickly simply because C is too permissive and doesn't capture enough structure. There has to be a middle way somewhere, but it could very well be that this only works for a limited set of primitive combinations.

Another valid point. Maybe to clarify: I am not talking about stateless functions at the sample level. I'm talking about functional / stateless in the same way I understand it is done in Faust: at the stream level. I'm looking at functions as stream operators, where delays and IIR filters are implemented in one interpretation as a C struct containing current state; essentially, update functions. (In another interpretation, the stream operator can be a linearized transfer function.) However, the program never manipulates this state directly. It is essentially what is abstracted in the ArrowLoop class in Haskell.

Entry: Delay lines
Date: Wed May 15 22:05:49 EDT 2013

Should be simple 1-1 ops. The length needs to be available to the z-transform evaluator! If they share an input, they can be optimized into tapped delay lines. The thing is: single-sample delays all come from the state feedback operator. Essentially, delays other than the unit delay should be implemented the same way.

Something isn't right: two concepts are mixed. A simple z(n) operator won't work, since this can't do output feedback. Maybe the current approach is really the best way to go: a delay line is a matrix signal.

Entry: Compile time vectors
Date: Fri May 17 14:12:15 EDT 2013

Maybe it's good to allow state nodes to be compile time vectors. This requires an extension in ai-array. The trouble with that is the type needs to be known at input. Auto-lifting could be used. It is getting dangerously complicated. This is the constructor commutation problem I ran into in Haskell, i.e. generalized transpose: foo of bar to bar of foo.

It would actually help to have a proper type system to annotate the nodes. Something that runs "backwards". How to construct an annotated semantics? A program is a sequencer of primitives. Maybe the primitives should be extended to take a context? They already do for semantics, but not node-specific. I.e. adding context requires the introduction of an identity. Every primitive node is unique. Currently, the isomorphism between the output and the primitive node is implicit.
What does a type checking transformation look like? I'm thinking, everything points in the direction of automatic lifting.

Entry: Type Checking
Date: Mon May 20 10:10:05 EDT 2013

Maybe it's time to make the typed interpretation rigid, and provide a standard interface to the primitive operations in the form of a context.

Entry: Macro stepper
Date: Mon May 20 12:50:55 EDT 2013

  (require macro-debugger/stepper)
  (expand/step #'( . . . ))

Entry: Type analysis stuff
Date: Mon May 20 15:03:36 EDT 2013

Trying to implement this, it seems easier to abstract away primitive execution in the first ai-array.rkt pass than to try to do it the other way around. It doesn't seem to be possible to separate the basic structure of the array code from the type analysis, but it does seem possible to avoid actually executing the primitives.

However, what does come to mind is a different class-based approach to this. The output of pass one could be a different language that can also have abstract interpretation, where the extension is like a class extension / subtype. The "only" thing this does is to abstract the recursion pattern inside the representation. The trouble is, that probably requires laziness. Still, it is getting a bit convoluted with the different small hacks like const, hold, setup, delay, ... Anyways, the basic structure is not too hard to abstract as a function. Just wrap substructure in thunks.

Entry: Is it too crufty?
Date: Mon May 20 15:50:47 EDT 2013

The collection of ad-hoc extensions makes it hard to see what is happening where. The ai-array code is too complicated. However, the meaning of the language itself is quite straightforward, which can be seen in the relative simplicity of ai-stream.rkt. So maybe this should just be left as is, since the complexity is quite isolated.

Entry: Abstracting recursion
Date: Mon May 20 15:54:31 EDT 2013

One of the interesting aspects of the abstract syntax approach is the embedding of the recursion mechanism in the syntax. Maybe this should be generalized? Currently there is only one recursive structure, which is accumulate. What is the output of stage1?

Entry: Delay #f
Date: Mon May 20 19:51:29 EDT 2013

Delay lines are infinite length from the language's pov. ai-array implements a subset, requiring at least one literal indexing operation, taking the maximal index as the implementation length. Now, the notation of delay lines is a bit awkward, since stream semantics is confused with scalar semantics. Semantically, the `feedback' construct enforces equality between input and output state streams, except for a unit delay. The `dl-read' operation is not a problem: it shifts a stream. It should be renamed to `dl-shift'. However, `dl-update' doesn't update anything. It *is* the stream. Maybe that should then be renamed `dl-bind'?

Entry: Tension between compile time and run time aggregate data structures
Date: Mon May 20 22:04:58 EDT 2013

I need an abstraction that can both represent indexed data structures (arrays) and lists of discrete nodes (registers). Maybe this is some memory from the FeldSpar paper poking through? The idea is that a list is both a "real list" and an "indexing function".

Entry: ai-array.rkt
Date: Tue May 21 10:04:00 EDT 2013

Gotten a bit depressed about the state of the C code generator. Maybe it is time to realize that this is not a simple thing! It's actually doing a lot of work translating high level concepts into low-level code and data allocation.
EDIT: Cleaned it up a bit, following the idea that PASS1 builds global node annotation dictionaries and a half-annotated program tree containing "postponed" operations in the form of virtual primitives, and PASS2 completes the annotation from the dictionary info. It seems hard to factor it into smaller chunks since this requires the definition of some intermediate language at each point. Target mapping (-> C) is straightforward if the PASS2 output is structurally C-like and fully type-annotated, and can be done in a single pass (PASS3).

Entry: C-like Surface syntax
Date: Tue May 21 15:10:43 EDT 2013

I was thinking about finding a good input syntax to avoid running into s-expression arguments. For some people, apparently, syntax is a real road block. So I was thinking, why not pick a subset of C? That's what all the big guys are doing, i.e. think of shader languages.

Entry: Implicit sharing in Haskell
Date: Wed May 22 03:27:07 EDT 2013

A solution to the problem of sharing using hashed expressions. This would remove the main problem with using Haskell as an embedding language for the RAI language.

[1] http://okmij.org/ftp/tagless-final/sharing/ExpI.hs
[2] http://okmij.org/ftp/Haskell/DSLSharing.hs

Entry: To inline or not?
Date: Wed May 22 10:49:02 EDT 2013

Some more things to do:
- Block-based transform processing
- Vectors & matrices

Essentially, this is about generating non-accumulating for loops, or at least generating temporary buffers. Is it possible at this time? The main missing link is the execution of feedback systems as sub-loops. Maybe the time loop needs to be made explicit? What about unifying feedback and reduce, or making reduce operate on feedback structures? Making the time node explicit means introducing an interface point between two sides of a loop. Essentially this is about accumulator setup and pass-through. It's unclear at this time. Let's fix the FDN problem first. It seems reduce needs to be extended to allow for output streams. That will make the body a standard siso system.

Entry: Delay inside loop?
Date: Wed May 22 11:47:59 EDT 2013

There is probably a bug in the delay operator allocation: if a node is a vector, it needs multiple delay line allocations. For now it's probably possible to put the delay operator inside an accumulate loop.

Entry: unpack-function
Date: Wed May 22 11:50:19 EDT 2013

Need a way to lift a function that takes a state vector and expand it to a scalar.

Entry: FDN: pipelining
Date: Wed May 22 11:56:57 EDT 2013

One way to work around the awkwardness of FDN mixing is to pipeline the feedback matrix. This makes the lines parallel, but requires some more state memory to store the result of the mixing.

Entry: Compile-time vectors
Date: Wed May 22 12:13:31 EDT 2013

This works well for real I/O as the pack/unpack operations can be done explicitly. However for the behind-the-scenes state feedback it might become problematic. It would be good to allow for some kind of "lazy lifting" on the state nodes. I.e. whenever a map operation is attempted on a single node, splice it open. Trouble there is to put it back together again, i.e. how to bind a vector node to a list of scalars. This can actually be done in the ai-array.rkt implementation. When a return value of any function is not a single node, insert a parallel assignment statement.

( While there is definitely some worth in just doing it, running into walls, and climbing over them, I do miss a bit of a global idea. )

Let's see where this goes incrementally.
As long as the size of the array is known, it's fairly straightforward. However, there is one point where it isn't, which is state input. How to split that into a proper list? Problem: type inference needs to complete before the operation can be executed.

Entry: Circular deps between typing and evaluation
Date: Thu May 23 09:23:45 EDT 2013

In practice it seems possible to wiggle around this, but it needs a real solution. There are some circular dependencies between evaluation and typing, i.e. a "vector" operation can determine a type, but "unpack" needs type info. How to fix this? A separate type inference step is probably unavoidable.

Entry: Faust
Date: Thu May 23 18:11:08 EDT 2013

Faust Workshop Part 1 - Feb 23, 2013 - CCRMA Stanford University

Relationship with Faust:
- Everything in Faust is a signal processor. This is more similar to a functional Forth dialect than to a Scheme dialect, where everything is a value. Though in RAI, all functions are signal processors / operators.
- Except the algebra combinators, which are higher signal processor processors :)
- It does "parametric code", basically higher order functions, allegedly built on pattern matching.
- There's an IDE called FaustWorks
- There is an embeddable compiler tying into LLVM.
- Effects library written by Julius Smith
- The LaTeX output is cute :)

I'm wondering if a lot of the things in RAI aren't unconsciously inspired by Faust examples. E.g. the "bus" word.

[1] http://www.youtube.com/watch?v=Q4CKgIhdyVg
[2] http://repmus.ircam.fr/_media/mamux/saisons/saison11-2011-2012/orlarey-2012-02-03.pdf
[3] http://www.youtube.com/watch?v=B9QJFHE7lVo

Entry: Common subexpression elimination
Date: Thu May 23 21:38:48 EDT 2013

A possible addition might be to use CSE.

Entry: Faster code gen
Date: Thu May 23 21:57:01 EDT 2013

What about a small VM for faster updates?

Entry: Albert Graef - Pure
Date: Thu May 23 22:10:54 EDT 2013

[1] http://blueparen.com/node/6
[2] http://www.musikwissenschaft.uni-mainz.de/~ag/ag.html

Entry: Extra pass?
Date: Thu May 23 22:17:06 EDT 2013

Looks like it is time to insert an extra pass. Finalize type inference before performing things like input node unpacking. Though.. still that won't work. Something is cyclic! Where exactly is the problem? A type annotation is necessary, without performing any of the data unpacking tricks.

Entry: What is a stream?
Date: Fri May 24 15:31:45 EDT 2013

That is the real question. The problem with the array implementation is that it is hard to interpret in terms of imperative code. So is there a way to keep more true to the definition? The main problem seems to be situated when moving from a pure operation on a stream, to a non-pure stateful operation on the next element. Somehow, something is lost. Who owns that context, the state streams?

Entry: FeldSpar
Date: Fri May 24 15:39:19 EDT 2013

From [3]: The current version of Feldspar deals only with pure data processing; although, we have initiated work to extend the language to encompass control. It looks like they avoided feedback state.

Basic ideas:
- Symbolic arrays: map f (Indexed l ixf) = Indexed l (f . ixf)

Maybe the idea of symbolic arrays can be reused. Currently I rely on type inference, but abstracting the indexing procedure in the form of curried functions is a possibility. Basically, an opaque node `s' that is passed to a map function will return a new opaque node. The map doesn't need to be forced until the type is known. Let's give this a try. (A rough C rendering of the idea follows this entry.)

[1] http://www.cse.chalmers.se/~ms/
[2] http://dsl4dsp.inf.elte.hu/
[3] http://www.cse.chalmers.se/~ms/MemoCode.pdf
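Here is a rough C rendering of the symbolic-array idea (illustrative only; struct and function names are made up, this is neither FeldSpar nor RAI code): an array is a length plus an indexing function, and map composes indexing functions instead of filling a buffer; nothing is computed until the array is forced.

  #include <stddef.h>

  /* A "symbolic" array: a length plus an indexing function over an environment. */
  struct sym_array {
      size_t len;
      float (*ixf)(const struct sym_array *, size_t);
      const void *env;          /* closure data: a buffer, or an inner sym_array */
      float (*f)(float);        /* element function, used by mapped arrays       */
  };

  /* Index function for a plain buffer. */
  static float ix_buf(const struct sym_array *a, size_t i)
  {
      const float *buf = a->env;
      return buf[i];
  }

  static struct sym_array sym_from_buf(const float *buf, size_t len)
  {
      struct sym_array a = { len, ix_buf, buf, 0 };
      return a;
  }

  /* Index function for a mapped array: apply f to the inner array's element. */
  static float ix_map(const struct sym_array *a, size_t i)
  {
      const struct sym_array *inner = a->env;
      return a->f(inner->ixf(inner, i));
  }

  /* map f (Indexed l ixf) = Indexed l (f . ixf): no buffer is allocated.
     The inner array must stay alive as long as the mapped one is used. */
  static struct sym_array sym_map(float (*f)(float), const struct sym_array *inner)
  {
      struct sym_array out = { inner->len, ix_map, inner, f };
      return out;
  }

  /* Forcing is the only place elements are actually computed and stored. */
  static void sym_force(const struct sym_array *a, float *out)
  {
      for (size_t i = 0; i < a->len; i++)
          out[i] = a->ixf(a, i);
  }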
Entry: Grid
Date: Fri May 24 15:50:01 EDT 2013

What is the basic idea? DSP is about computations on a grid. There are different ways of constructing grids. What we're interested in is local context: given a point on a grid, there is a way to get to a neighbouring point.

One limitation is important: the time dimension is special. If the time dimension is causal (computations never refer to the future) it is possible to re-arrange the computations in such a way that they take the form of a state space model.

The key point is to be able to separate the description of grid computations and their dependencies from an actual implementation as loops. It's probably time to take a look again at APL and J.

Entry: Roadmap
Date: Fri May 24 18:19:33 EDT 2013

- Generalize reduce to foldmap (accumulation + output streams)
- Allow for compile-time vectors? Maybe not necessary (just use constant arrays?) Anyway, it would be useful for meta-programming code shuffling tricks, which is different from straightforward array programming.

Entry: Catchy name: Kauzal
Date: Fri May 24 18:52:11 EDT 2013

As the whole point was to be able to use causal IIR filters. Cowzal? That doesn't sound right in English though.

Entry: A proper map?
Date: Fri May 24 19:06:10 EDT 2013

Seems that the reason I've used the implicit map and strict fold (no outputs) approach is that it avoids the necessity for intermediate arrays. This might be quite a strong restriction in general, but for audio it makes sense: everything will be accumulated at some point. Is it possible to always write code in this style? This requires manual fusing. E.g. can the FDN be implemented like that? What it has is a map/reduce structure, i.e.

  -f--\
  -f-- r
  -f--/

It doesn't look like this can express parallelism followed by a mixing relation like a matrix multiplication:

  -f--|--g-
  -f--m--g-
  -f--|--g-

The advantage of the mapreduce is that it maps directly to vector operations. It works nicely for synth voices. The basic idea for the mixing is that it necessarily requires more than one pass, unless it is causal and can be expressed as a "triangular" mixer (a C sketch of this multi-pass structure follows this entry). The more general mixing is currently only possible using explicit indexing. This is where the FeldSpar idea of symbolic indexing comes in. Once map is implemented in such a way, it is probably best to do away with the mapreduce / autolifting abstraction as a basic building block.

So what would map look like? It seems easiest to try to implement it for state i/o first, since that has no artificial restrictions on arity.

  (define (_map sem fn arr-in)
    (let ((arr-out (lambda (i)
                     (let ((el (arr-in i)))
                       ;; cross-rank unify el and arr-in/arr-out
                       (fn el)))))
      arr-out))

What if all nodes are arrays, and can be dereferenced (specialized) at any time? When is it time to evaluate? At some point it will need to be stored. This delayed execution will wreak all kinds of havoc on the strict evaluation / imperative approach... Maybe it should be explored as a side path first. Maybe this delayed eval is the reason why the "where" clause appears in FeldSpar? Or is that nonsense?

To put things in perspective: there is absolutely no problem for functionality: we can always unroll and use compile-time operations. The only "problem" there is, is one of efficiency for e.g. large arrays.

Conclusion:
- when does it eval the array?
- how does it interact with fold?
- does it need to memoize?
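To see why general mixing needs an intermediate pass, a rough hand-written C sketch (hypothetical, not generated code): a per-line map writes into a temporary array, a full mixing matrix consumes it, then a second map runs. The tmp buffer in the middle is exactly the intermediate array that the pure map/reduce style avoids.

  #define N 4   /* number of parallel lines / voices */

  /* Pass 1: parallel per-line map; its result must be stored. */
  static void map_f(const float in[N], float tmp[N])
  {
      for (int i = 0; i < N; i++)
          tmp[i] = 0.5f * in[i];            /* stand-in for the per-line function f */
  }

  /* Pass 2: full mixing matrix; every output depends on every tmp element,
     so this cannot be fused into the first loop. */
  static void mix(const float m[N][N], const float tmp[N], float mixed[N])
  {
      for (int i = 0; i < N; i++) {
          float acc = 0.0f;
          for (int j = 0; j < N; j++)
              acc += m[i][j] * tmp[j];
          mixed[i] = acc;
      }
  }

  /* Pass 3: another per-line map g, e.g. feeding the delay lines back. */
  static void map_g(const float mixed[N], float out[N])
  {
      for (int i = 0; i < N; i++)
          out[i] = mixed[i];                /* stand-in for g */
  }

  static void fdn_tick(const float m[N][N], const float in[N], float out[N])
  {
      float tmp[N], mixed[N];
      map_f(in, tmp);
      mix(m, tmp, mixed);
      map_g(mixed, out);
  }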
Entry: Now, what is the point?
Date: Fri May 24 20:50:29 EDT 2013

  f :: (s,a)->(s,b)
  g :: (s',b)->(s',c)
  f . g :: ((s,s'),a) -> ((s,s'),c)

1. Does it make sense to have those things as objects in Haskell?
2. Is it really necessary to have them at the C level even?

I mean, it is nice and clean to have a siso system in the end, but what is the point really? Is it just to expose all the variables more clearly? If AI is necessary, the fact that the original language is a stream language should be enough. The thing is that the primitives are not stateful. It's groups of primitives, and in order to do this in an OO approach, that grouping needs to be abstracted. Because of nesting, this can maybe get a bit ugly, so the explicit threading approach actually seems simpler.

Entry: Intermediate vectors
Date: Sun May 26 11:40:51 EDT 2013

This needs separate declaration and assignment. I was thinking about the map/fold/map problem, and it seems a solution might be the following:
- extend the reduce mechanism to allow for vector output. This would enable multiple passes with accumulation.
- implement the FeldSpar-style symbolic indexing as a higher level operation on top of explicit map/reduce loops

On the lower level, this needs the ability to allocate temporary arrays on the stack. It should be straightforward to write a test app and let it drive the changes. Instead of modifying reduce/n it's probably better to write a completely new abstraction: fold/n.

Looking good. Next: prevent copying of output nodes. Probably better to assign directly inside loops instead of creating temporaries? Alternatively: introduce copy loops. -> done

Entry: for/n
Date: Mon May 27 09:13:45 EDT 2013

Looks like this is going to be the better abstraction. Next: see if it can support the "bus" structure. Because of the auto-lifting (dual representation?) the input lists do not need to be mentioned explicitly. I'm still not sure if this is a good idea. It complicates things, but seems necessary to implement the time feedback state threading. Or is it? Currently the loop indices are known at the time the state feedback is performed, so it is possible to infer those without relying on typing tricks. But on the other hand, why not infer the input indexing if the state indexing is already inferred? It remains a point of confusion. Why? Because it is different..

What about this:
- for/n only needs state inputs for the body, the rest can be bound elsewhere
- feedback/n uses the same interface

So, the basic conceptual abstraction is state feedback, with two iteration mechanisms: time and space. Auto-lifting makes it possible to avoid having to specify array inputs. The same is probably true for feedback/n.

Roadmap:
- later: make for/n and feedback/n use the same interface
- now: replace reduce/n with for/n in bus

Entry: auto-lifting
Date: Mon May 27 10:33:24 EDT 2013

The auto-lifting is really clumsy to deal with. In practice it isn't even needed so much: most parameter use seems to be constants from context, so it might be better to explicitly dereference arrays. For the internal state, auto-lifting is essential, since there is no way to annotate it in the text, and it will always follow the same pattern.

So, let's get rid of the `bus' thing altogether, and use an explicit deconstructor like the Racket sequences. If it is done this way, types inside loop bodies will be scalar, except for the outputs and the explicitly mentioned inputs, which will be arrays. What happens for state i/o? It needs to be wrapped with dereferencing operations. This is a big change. Is it compatible?
There is already explicit indexing, so it looks like it would be possible. Let's give it a try. Yes, by tracking the current loop type, it is possible to insert unpack/pack instructions for the state threading. The rest should be straightforward except for details of type annotation, but it is a permanent change. Will break `reduce/n'. This probably needs virtual buffers to implement multi-dim arrays.

Auto-lifting is removed from for/n. Up to now everything looks good. Insertion of virtual access is going to need some work. meta/rai test synth works. Next: access to loop indices.

Entry: Array sizes
Date: Mon May 27 15:22:00 EDT 2013

Something doesn't sit right: the connection between types and values. Maybe it's time to look at Haskell solutions to this. What is the essence of bridging both stages? This is for later.. Maybe it's going to be possible at some point to just put the array size in a node and use a "pierce through" to the type annotation. Anyways, I've removed the array size parameterization.

Entry: Trouble
Date: Mon May 27 17:05:12 EDT 2013

Delay lines don't mix well with the state pack/unpack. Problem: vbuf state nodes need to be removed from the state node list. Not load/store-wrapping the scalar nodes fixes the scalar delay lines. Trouble is with node-pack. Putting that in the right place seems to fix it.. Not sure it's 100% correct though. Interaction between -store and -unpack?

Entry: Cleanup
Date: Tue May 28 10:41:37 EDT 2013

Removed some of the implicit indexing. Currently only time indexing needs to be inserted. Except deep copy. Can that be avoided? It seems to be only the case for output. Maybe direct binding instead of creating a temp array. Seems to work. Duplicate accu node?

Entry: Delays and loops don't mix
Date: Tue May 28 17:58:49 EDT 2013

Trouble is that each line has to be allocated separately. A simple -load/-store inside a loop doesn't work. Time to think a bit about it. Maybe it is better to put it in feedback/n? Or have some magic lifting for state inputs that perform delay operations. That will probably be best. Delays can be added in the second pass: at that time, it is clear what the size of the loop is. Alternatively, they can be special-cased in store. Well... as it is currently implemented, this can't work unless the delay offsets are stored in a vector. Looks like the first thing to make work is constant vectors.

Entry: Stream vs. param
Date: Wed May 29 08:50:46 EDT 2013

What about making inputs referred to in a `hold' automatically into a param?

Entry: Time feedback
Date: Wed May 29 08:51:49 EDT 2013

Still doesn't sit well. Doing stuff behind the scenes seems unnatural. While composition of recursive systems is nice, it is not clear where the state goes. So having an explicit "capture" for the state feedback might be a better approach. Making the time loop explicit will also remove the awkward "hold" and "setup" constructs. But it will break stream semantics. Essentially, what is needed is some construct that will transform what we have now into an explicit loop form like this:

  (loop (t (t_endx 64))
        ((acc ...))
        ((in ...))
    body)

where all the acc stuff is threaded. The state threading really is an essential feature, but it has an intrinsic conflict: can't get at that which is hidden! So, the stream semantics is essential.

Entry: RAI doc merge
Date: Wed May 29 09:16:06 EDT 2013

RAI


Introduction


  • Go over FeldSpar and its goals and inspirations. This paper has a nice list of references to Spiral, Pan, Obsidian, ...
  • Compared to existing systems, RAI is probably most similar to Faust, a functional stream processor language developed at GRAME in France. The main differences with Faust are:
    • RAI is an extension to an established programming language, i.e. Racket, a Scheme dialect with extensive support for writing embedded DSLs.
    • A core element of RAI is the ability to easily construct alternative language interpretations.
    • RAI supports control rate programming, allowing slow rate update of parameters that are expensive to compute.
    • Feedback in RAI follows a more traditional approach using a state-space update equation, while Faust uses a 2-operand combinator (see the sketch after this list).
    • On the practical side, Faust is over 10 years old and has a lot of support and libraries.
    • Faust has a more sophisticated compiler/optimizer.
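To make the state-space feedback point concrete, here is a hand-written one-pole smoother in that style (a sketch only, not RAI-generated code; names are made up). The update function is a pure mapping from (state, input) to (next state, output); the caller owns the state, and the time loop lives outside.

  struct smoother_state { float s; };

  /* One-pole lowpass as a state-space update:
     s' = s + a*(x - s),  y = s'. */
  static float smoother_update(struct smoother_state *st, float x, float a)
  {
      float s_next = st->s + a * (x - st->s);
      st->s = s_next;
      return s_next;
  }

  /* The explicit time loop: this is where stream semantics becomes an
     imperative per-sample update. */
  static void smoother_run(struct smoother_state *st, const float *in,
                           float *out, int n, float a)
  {
      for (int i = 0; i < n; i++)
          out[i] = smoother_update(st, in[i], a);
  }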

Entry: Take a break
Date: Wed May 29 17:22:21 EDT 2013

I'm nearing burnout. Some things to think about later:
- add a `feedback' operator, taking a pure operator into a causal stream operator.
- fix delay lines so they do not need an output spec, only an input reference:
  * for input delay, this inserts a siso delay
  * for siso delay: automatically insert the output wrapper.

Delays should probably be implemented after type inference has finished.

Entry: Delay: cut out dl-bind.. what happens?
Date: Wed May 29 17:31:33 EDT 2013

I reverted the change. Doesn't look like a good plan for now. It really is a bit problematic to do the type annotation if there is no explicit dl-bind to balance the dl-index. Still, the abstraction is quite low-level. Not sure what to do with it..

As for the multiple-state problem, it is probably possible to perform proper multi-delay expansion at the dl-bind level, since the loop index is known there. Still, it's all a bit fishy. What is really needed is a good way to loop over lists of constants. It's perfectly possible to put these in the code, though it is a bit harder to perform the necessary juggling to mix compile time and run time computations, as in the case of delay lines, where the max length is needed for allocation and a run time index is used. So, for now: delay lines are serial only.

Entry: Delay lines already are SISO systems
Date: Wed May 29 19:00:52 EDT 2013

Wait. Why not stick to the simpler approach? Delays are implemented using state space systems. Abstract that! And it looks like this is already the case. The abstraction is good, it's the naming that's problematic, or the relation to the basic idea: arrays of shifted streams.

  (define (delay (s0 s1 s2) (x))
    (let* ((delay-out s2)
           (feedback 1/2)
           (delay-in (+ x (* feedback delay-out))))
      (values delay-in s0 s1 ;; shift all delays
              delay-in)))    ;; output

So instead of using s0 s1 ... sn, the line is abstracted as a single array, but it remains possible to access all the taps. The limitation lies in the recombination possibility: only shift is allowed, not a generic permutation.

Entry: Moved
Date: Mon Jun 10 10:18:56 EDT 2013

The RAI-related messages are moved to http://zwizwa.be/-/rai

Entry: State machines
Date: Sat Jul 6 12:15:02 EDT 2013

I spent a couple of hours writing an ISO7816 protocol parser for a logic analyzer (LA), with the bit stream accessible to a C program. I used the Saleae Logic which has an SDK available. It boils down to RS232 + some byte-level parsing.

Interesting as it brings up that tension between switch-style explicit state machine programming and thread-style or linear temporal programming. Basically, writing switch-style state machines is error prone. In this particular case I found it also hard to use a self-contained test-driven approach. Instead I used a live target to produce the LA data.

Some ideas:
- I wrote some C macros to do thread-style programming[2]. It would be good to give that a try for the protocol parsing lib.
- Look at MyHDL[3][4]. Basic idea is to use Python shallow coroutines to define state machines.
[1] http://community.saleae.com/DeviceSdk
[2] https://github.com/zwizwa/shaco
[3] http://www.myhdl.org/doku.php
[4] https://speakerdeck.com/jandecaluwe/myhdl-designing-digital-hardware-with-python-pycontw-2013

Entry: State machine compiler
Date: Sat Sep 28 16:24:17 EDT 2013

In essence, this can be solved by taking a function and:
- adding a state struct parameter
- replacing *all* local variable definitions with state members
- adding a dispatcher to the code's entry point
- annotating all yield points with labels

I wonder if this works in Language.C? It would need support for GCC's computed goto. (A rough sketch of the transformed shape is at the end of these notes.)

Entry: Haskell Embedded
Date: Mon Jan 13 23:24:02 CET 2014

These deserve some attention:

  http://leepike.github.io/Copilot/            Copilot, on top of Atom
  http://smaccmpilot.org/languages/index.html  Ivory & Tower

Entry: Conclusions from RAI
Date: Tue Jan 14 01:14:52 CET 2014

So the basic conclusion from using the Scheme approach for audio plugin DSP code generation is:
- there is a lot of ad-hoc structure
- composing structures is hard

What I'd like to do next is to capture the structural transformations in Haskell, with the purpose of:
1. abstraction, composition in the solution domain
2. separation of concerns in the implementation/compilation domain

Entry: The Disciplined Disciple Compiler (DDC)
Date: Mon Jan 27 09:32:42 CET 2014

  http://disciple.ouroborus.net/
  https://github.com/DDCSF/ddc

Entry: Ragel state machine compiler
Date: Mon Jan 27 09:33:50 CET 2014

Ragel compiles executable finite state machines from regular languages.

  http://www.complang.org/ragel/

Entry: Real time and stacks
Date: Fri Mar 28 18:45:02 EDT 2014

So.. if you have a real-time system, isn't it so that it has to be an FSM? Anything with stacks and tasks has the possibility to go unbounded, and if all tasks are bounded they are actually FSMs. What is this? Why does everyone keep talking about multitasking while what we really need is a way to better write FSMs without pretending they are (finitely used, infinite) stack machines? Using tasks when FSMs will work is wasting resources (and introducing a whole class of problems that FSMs don't have.)

Why am I getting infatuated with FSMs? Because they somehow seem to drag in message passing / event systems instead of relying on the combination of shared resources + non-deterministic scheduling.

Entry: P Language: event-driven state machines
Date: Sun Oct 9 20:37:31 EDT 2016

  https://github.com/p-org/P

Entry: Recap
Date: Sat Mar 18 10:00:26 EDT 2017

  meta/siso   see siso/DSPM_RAI.txt
  meta/dspm   dead end, code gen reuse?
  nodes       dead end
  rai         works, but could use types
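Following up on the state machine compiler idea above, a rough sketch of the transformed shape using GCC's computed goto (hypothetical macro and names, hand-written, not the output of any existing tool): a blocking-style two-byte parser becomes a resumable function whose locals live in a state struct and whose entry point dispatches on a saved label.

  /* Original, thread-style pseudocode:
       b0 = get_byte(); b1 = get_byte(); handle(b0, b1);
     Transformed: locals move into a state struct, a dispatch on a saved
     label resumes at the last yield point. */

  struct parser_state {
      void *resume;        /* label to resume at; NULL (zero-init) at start */
      unsigned char b0;    /* former local variable, now persistent        */
  };

  #define YIELD(st, label)              \
      do { (st)->resume = &&label;      \
           return;                      \
      label: ; } while (0)

  static void handle(unsigned char b0, unsigned char b1)
  {
      (void)b0; (void)b1;  /* user code goes here */
  }

  /* Called once per received byte; never blocks. */
  static void parser_push(struct parser_state *st, unsigned char byte)
  {
      if (st->resume) goto *st->resume;  /* dispatcher */

      st->b0 = byte;
      YIELD(st, after_b0);               /* wait for the next byte */
      handle(st->b0, byte);
      st->resume = 0;                    /* restart from the top next time */
  }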