Mon Jun 15 13:48:49 CEST 2009

Filesystems as Graphs

I ran into an interesting pattern trying to solve .tex -> .dvi -> .png
conversion.  It is a way to manage temporary files used in
orchestrating the invokation of external programs.

Classical file systems, by the way they are _used_ in unix-like
utilities, are quite low-level data structures.  They cannot support
garbage collection because _references_ to files are not explicit.

A filesystem is a finite function (hash table).  Since it cannot be
guaranteed that this function won't be evaluated at some arbitrary
point, it has to be kept around in its entirety.  This kind of late
binding makes garbage collection impossible.

By replacing this data structure with a graph (a Scheme code/data
structure) files can be managed using the graph memory manager.  Wrap
temporary files in graph nodes, and ensure a 1-1 correspondence to
these (meta)objects and the file's content, either in memory or on


  * This is essentially independent of the data storage / caching

    It is possible to perform operations on objects by temporarily
    serializing them to disk, running external programs that produce
    more files, and bring those back into memory.  The most elegant
    solution would be a filesystem interface towards external
    programs, but simply save+execute+load is good enough as a first
    attempt to implement the essential logic.

  * The effect of external programs can be localized.

    Filesystem operations (unix utilities) still work in this view.
    What is better though is that effects can be managed locally:
    create a temp directory with files, perform some external
    processing on it, and map the relevant results back into the graph