Sun Sep 7 11:44:55 CEST 2014
Backups and content-addressed storage
Basically, I do not trust your backup software :)
Version control (darcs, git) has worked well for keeping my own data
duplicated and managed. However, for the rest of the family and for
the big files, it's best to start using content addressing.
To keep things simple, create an sqlite database with:
It's important to identify individual volumes as well, i.e. when
Let's write this in Python to make cross-platform management easier.
What are the connections to make?
- filesystem hard link from hash name to file data
- text file / table with hash <-> path link
This is the basic data structure.
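
A minimal sketch of those two connections, under some assumptions
that are not in the notes above: SHA-256 as the hash, a flat store
directory, and a plain text index with one "hash path" pair per line.
The names file_hash and add_file are made up for illustration. Hard
links only work when the store and the file live on the same
filesystem.

```python
import hashlib
import os


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def add_file(store_dir, index_path, path):
    """Hard-link path into the store under its hash, record hash <-> path."""
    digest = file_hash(path)
    os.makedirs(store_dir, exist_ok=True)
    target = os.path.join(store_dir, digest)
    if not os.path.exists(target):
        # Hard link: a second name for the same data, no copy is made.
        os.link(path, target)
    # Assumed index format: one "hash path" pair per line.
    with open(index_path, "a", encoding="utf-8") as index:
        index.write(f"{digest} {path}\n")
    return digest
```

Calling add_file("store", "index.txt", "photo.jpg") would leave a
second name for the same data at store/<hash> and append one line to
index.txt.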
Backups = rsync store + metadata file
Cleanup = delete versions no longer linked to files
Deduplication -> manual, find files with same hash
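
Cleanup and deduplication could look roughly like this. It is a
sketch that assumes a store of hard links named by hash and a text
index with one "hash path" pair per line; read_index, duplicates and
cleanup are hypothetical names. A store entry whose link count has
dropped to 1 is taken to mean that no file links to that version any
more.

```python
import os
from collections import defaultdict


def read_index(index_path):
    """Parse 'hash path' lines into a dict: hash -> list of paths."""
    by_hash = defaultdict(list)
    with open(index_path, encoding="utf-8") as index:
        for line in index:
            line = line.rstrip("\n")
            if not line:
                continue
            digest, path = line.split(" ", 1)
            by_hash[digest].append(path)
    return by_hash


def duplicates(by_hash):
    """Keep only hashes recorded under more than one distinct path."""
    return {h: sorted(set(p)) for h, p in by_hash.items() if len(set(p)) > 1}


def cleanup(store_dir):
    """Delete store entries whose data has no other name left."""
    removed = []
    for name in sorted(os.listdir(store_dir)):
        entry = os.path.join(store_dir, name)
        # st_nlink == 1: the store holds the only remaining link.
        if os.stat(entry).st_nlink == 1:
            os.remove(entry)
            removed.append(name)
    return removed
```

duplicates(read_index("index.txt")) only reports candidates; deciding
which copy to keep stays manual. The link-count test in cleanup only
holds if files are deleted or replaced, not edited in place, since an
in-place edit changes the data behind the store name as well.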