Mon Jun 15 12:18:59 CEST 2009

juggling binary objects -> caching continuations

With latex + dvipng, the problem isn't really computation time:
rendering is fast.  So why cache it?

What about this: create an abstraction that maps the operation of
creating a directory of files onto the creation of a collection of
objects, i.e. a hash table.
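
A minimal sketch of that mapping, assuming Python and a hypothetical
helper name `run_in_scratch_dir` (neither is part of sweb): run a
command in a throwaway directory and hand back whatever files exist
afterwards as a hash table.  The command here is a stand-in so the
sketch runs without latex installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_scratch_dir(argv, inputs=None):
    """Run argv in a fresh temporary directory; return its files as a dict.

    inputs: optional {filename: bytes} written before the command runs.
    Returns {relative path: bytes} for every file present afterwards.
    The directory itself is deleted before this function returns.
    """
    with tempfile.TemporaryDirectory() as scratch:
        root = Path(scratch)
        for name, data in (inputs or {}).items():
            (root / name).write_bytes(data)
        subprocess.run(argv, cwd=root, check=True)
        return {
            p.relative_to(root).as_posix(): p.read_bytes()
            for p in sorted(root.rglob("*"))
            if p.is_file()
        }


# Stand-in for a real latex/dvipng invocation: any command works.
files = run_in_scratch_dir(
    [sys.executable, "-c", "open('out.txt', 'w').write('hi')"],
    inputs={"in.txt": b"x"},
)
```

After the call nothing is left on disk, so the lifetime of the
results is just the lifetime of the dict.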

Then, in sweb, once these files have been transferred to the client,
they can be garbage-collected.

It would be even better to cache the .dvi in memory (it's not that
large) and let dvipng extract individual pages from it.

Actually, this completely solves the problem.  The hairy part is the
fact that a .dvi or .tex represents a _collection_ of pages, while
http requests are always about individual items.  So when modeling the
data structure, you need to think about dependencies and then apply
memoization there.  Simply put: dvipng can use indexed addressing.
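
As a rough sketch of memoizing at the dependency rather than at the
requested item: the .dvi is computed once per source, and each request
indexes into it.  The class name `PageServer` and the two callables
are assumptions for illustration.  A real `render_page` would
presumably shell out to dvipng with a page selection; the stand-ins
below only fake that.

```python
import hashlib


class PageServer:
    """Serve single pages while memoizing the shared intermediate."""

    def __init__(self, compile_tex, render_page):
        self._compile = compile_tex   # hypothetical: tex bytes -> dvi bytes
        self._render = render_page    # hypothetical: (dvi bytes, n) -> png bytes
        self._dvi = {}                # source digest -> dvi bytes

    def page(self, tex, n):
        key = hashlib.sha256(tex).hexdigest()
        if key not in self._dvi:      # compile once per distinct source
            self._dvi[key] = self._compile(tex)
        # Rendering is cheap, so it is redone on every request.
        return self._render(self._dvi[key], n)


# Stand-ins so the sketch runs without latex/dvipng installed.
compile_calls = []


def fake_compile(tex):
    compile_calls.append(tex)
    return b"|".join(tex.split())     # pretend each word is one page


def fake_render(dvi, n):
    return b"png:" + dvi.split(b"|")[n]


server = PageServer(fake_compile, fake_render)
first = server.page(b"alpha beta gamma", 0)
third = server.page(b"alpha beta gamma", 2)
```

Two page requests against the same source trigger a single compile.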

Then the 2nd problem: done this way, the filesystem storage used as a
scratchpad during the execution of a program can be abstracted away
completely.  This is what makes things a whole lot less messy.


  * HTTP requests are about objects.  This should be reflected in the
    in-memory model.  

  * Some documents have a _logical_ hierarchical structure.  I.e. html
    + embedded images.  This can be reflected in the in-memory model
    using dependencies.

  * Intermediates can produce multiple objects which are requested
    asynchronously through http.

The problem is the asynchronous nature of http.  Because it has no
concept of containment (which sucks if you ask me), this containment
needs to be modeled elsewhere.  Because multiple objects are produced
from one intermediate, some memoization is a good idea.  As long as the memoized
data isn't too large (in this case .dvi files are only slightly larger
than .tex files) it can be kept around in memory.  If not, some disk
caching strategy might be necessary.
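
One possible shape for such a strategy, purely as a sketch: keep blobs
in memory up to a byte budget and spill the least recently used ones
to disk.  `SpillCache`, its parameters and the eviction policy are
assumptions of this sketch, not something decided above.

```python
import hashlib
import tempfile
from collections import OrderedDict
from pathlib import Path


class SpillCache:
    """Keep small blobs in memory; write what doesn't fit to disk."""

    def __init__(self, max_bytes, spill_dir):
        self.max_bytes = max_bytes
        self.spill_dir = Path(spill_dir)
        self.mem = OrderedDict()      # key -> bytes, least recently used first
        self.used = 0                 # total bytes currently held in memory

    def _path(self, key):
        return self.spill_dir / hashlib.sha256(key.encode()).hexdigest()

    def put(self, key, blob):
        old = self.mem.pop(key, None)
        if old is not None:
            self.used -= len(old)
        if len(blob) > self.max_bytes:        # never fits: straight to disk
            self._path(key).write_bytes(blob)
            return
        self.mem[key] = blob
        self.used += len(blob)
        while self.used > self.max_bytes:     # spill oldest entries
            victim, data = self.mem.popitem(last=False)
            self.used -= len(data)
            self._path(victim).write_bytes(data)

    def get(self, key):
        if key in self.mem:
            self.mem.move_to_end(key)         # mark as recently used
            return self.mem[key]
        path = self._path(key)
        return path.read_bytes() if path.exists() else None


with tempfile.TemporaryDirectory() as d:
    cache = SpillCache(max_bytes=10, spill_dir=d)
    cache.put("a.dvi", b"12345")
    cache.put("b.dvi", b"123456")     # over budget: a.dvi is spilled to disk
    in_memory = list(cache.mem)
    spilled = cache.get("a.dvi")
    missing = cache.get("c.dvi")
```

Note that this only answers where the data lives, not when it can be
dropped for good.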

The real insidious problem here is that you can't really
garbage-collect anything: the client might request sub-documents, or
might not.  This is the central problem in server-side continuation
management, for instance.  You really want to transfer all this
information to the client.

Hey.. Can this be done for intermediate data also?  I.e. instead of
keeping the memoized .dvi around, can't we just dump the .dvi in its
entirety to the client, then ask the client to give us the .dvi it
wants rendered?

It's the same thing as you'd want to do with continuation storage.
The problem is really that continuations themselves tend to be large,
and passing them back and forth between client and server is not a
good idea.  So caching is in order, and that is where the problems
start.

So essentially, to solve the web continuation problem you just need a
caching approach that works.  That's all.  But it's definitely not
trivial, as it's hard to define what a good caching strategy is.

[1] entry://../compsci/20090615-131905