Wed Mar 21 12:01:45 EDT 2007

distributed filesystems

looks like there is really no ready-made solution. what i want is
something that works a bit like 'darcs':

- archives are completely distributed and self-contained (cache and
  backup)

- transaction based: all 'editing' changes are propagated.

- deleted files are not retained: no editing history

- separate directory tree and storage pool

- file aliases

basically this is rsync, but with proper 'merging'. in rsync, there is
always a master. so what about these paths:

1. rsync with proper merge
2. darcs with symlinks and a data pool

my original idea of putting the directory tree in a darcs file, and
using a pool with MD5 hash names isn't so bad really. the tree could
really be an s-expression, and a copying garbage collector (copying
live objects from one pool directory to another) should work just fine.
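
a rough sketch of the pool + copying collector idea, in python rather
than scheme just to keep it short. everything here is an assumption
for illustration: the names pool_add, live_hashes and copy_collect are
made up, and the tree is modelled as nested dicts (directory = dict of
name -> subtree, file = md5 string) standing in for the s-expression.

```python
import hashlib
import shutil
from pathlib import Path


def pool_add(pool: Path, data: bytes) -> str:
    """store data in the pool under its md5 hex digest; return the digest."""
    digest = hashlib.md5(data).hexdigest()
    pool.mkdir(parents=True, exist_ok=True)
    target = pool / digest
    if not target.exists():          # identical content is stored only once
        target.write_bytes(data)
    return digest


def live_hashes(tree) -> set:
    """collect every md5 referenced by the tree.

    assumption: a directory is a dict of name -> subtree, a file is an
    md5 string (a stand-in for the s-expression representation)."""
    if isinstance(tree, str):
        return {tree}
    live = set()
    for child in tree.values():
        live |= live_hashes(child)
    return live


def copy_collect(tree, old_pool: Path, new_pool: Path) -> int:
    """copying gc: copy only the objects reachable from the tree into
    new_pool. whatever stays behind in old_pool is garbage."""
    new_pool.mkdir(parents=True, exist_ok=True)
    copied = 0
    for digest in sorted(live_hashes(tree)):
        shutil.copyfile(old_pool / digest, new_pool / digest)
        copied += 1
    return copied
```

file aliases fall out for free in this model: two names in the tree
that carry the same md5 point at the same pool object.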

elements
- fuse for interface
- scheme for handling the internal representation + daemon
- rsync for transferring pools

everything seems technically feasible, except for the 'merge' idea.
AFS seems really heavy, and is client/server.

this seems close:
http://wiki.apache.org/nutch/NutchDistributedFileSystem



let's have a go at this:

1. all operations on the store need to be serialized
2. nodes can perform operations in parallel
3. the merger needs to handle conflicts

the operations are:
* add file
* change file properties (permissions / name)
* delete file
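
a minimal sketch of how logs of these operations could be replayed and
merged, under heavy assumptions: the store is flattened to a
path -> md5 dict, 'change file properties' is reduced to rename only,
and Op, replay and merge are hypothetical names. the merge is a plain
three-way comparison against the common base, not a darcs-style patch
theory.

```python
from typing import NamedTuple, Optional


class Op(NamedTuple):
    kind: str                    # "add", "rename" or "delete"
    path: str
    arg: Optional[str] = None    # md5 for "add", new path for "rename"


def apply(tree: dict, op: Op) -> None:
    """apply one operation to a flat path -> md5 mapping, in place."""
    if op.kind == "add":
        tree[op.path] = op.arg
    elif op.kind == "rename":
        tree[op.arg] = tree.pop(op.path)   # KeyError if the path is gone
    elif op.kind == "delete":
        tree.pop(op.path, None)
    else:
        raise ValueError(f"unknown operation: {op.kind}")


def replay(base: dict, log: list) -> dict:
    """replay a serialized log on a copy of the base tree."""
    tree = dict(base)
    for op in log:
        apply(tree, op)
    return tree


def merge(base: dict, log_a: list, log_b: list):
    """three-way merge of two logs that start from the same base.

    returns (merged tree, sorted list of conflicting paths). a path
    conflicts when both sides changed it and ended up different."""
    a, b = replay(base, log_a), replay(base, log_b)
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(a) | set(b)):
        o, x, y = base.get(path), a.get(path), b.get(path)
        if x == y:
            winner = x           # both sides agree
        elif x == o:
            winner = y           # only b changed this path
        elif y == o:
            winner = x           # only a changed this path
        else:
            conflicts.append(path)
            winner = o           # keep the base version, report the conflict
        if winner is not None:
            merged[path] = winner
    return merged, conflicts
```

each node serializes its own operations into a log; the parallelism
is between nodes, and the merger is the only place where conflicts
have to be dealt with.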

what about a hard-linked pool, and syncing only the pool? hard links
are better than symlinks because they are not directional.

it would be really nice to have a standard representation: something
that can easily be transferred to non-managed space, and is also easy
to debug and regenerate.

so

file tree   <-->   pool + file log (md5 + path)
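
the two directions of that arrow could look roughly like this. a
sketch only: checkin and checkout are made-up names, the log is a list
of (md5, relative path) pairs, and hard links are used so tree and pool
must live on the same filesystem, with the pool outside the tree.

```python
import hashlib
import os
from pathlib import Path


def checkin(tree: Path, pool: Path) -> list:
    """file tree -> pool + file log.

    hard-links every regular file into the pool under its md5 and
    returns the log as (md5, relative path) pairs, sorted by path."""
    pool.mkdir(parents=True, exist_ok=True)
    log = []
    for path in sorted(p for p in tree.rglob("*") if p.is_file()):
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        if not (pool / digest).exists():
            os.link(path, pool / digest)   # same inode, no data copied
        log.append((digest, path.relative_to(tree).as_posix()))
    return log


def checkout(log: list, pool: Path, tree: Path) -> None:
    """pool + file log -> file tree, regenerated with hard links.

    assumes the target tree does not contain these paths yet."""
    for digest, rel in log:
        target = tree / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        os.link(pool / digest, target)
```

one caveat with hard links: editing a file in place also changes the
pool object, so this only holds if files are replaced rather than
modified.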


this is for another time. don't have enough context in my head for
it.. would be a nice opportunity to give scheme shell a try though.



