Sun Apr 24 22:31:05 EDT 2011

Robustness : external mutators

Follow-up to the last post.  How to make a data storage system robust?

 - In a reliable system (no externally introduced faults),
   inconsistent state has to be a consequence of bad design, as there
   are tried and true principles to build such systems correctly.

   The main idea is to clearly define what a state change is, and to
   allow it either to occur or not, but never to expose any
   intermediate state.

   This is captured in more detail in the ACID[1] rules used in
   database design:

     * Atomicity: a transaction succeeds or fails.  No intermediate
       (inconsistent) state is ever visible.

     * Consistency: each transaction maintains consistency rules.

     * Isolation: concurrent transactions should not interfere with
       each other.

     * Durability: a completed transaction persists.

   All these are reasonably obvious, especially if you stick to the
   simpler approach of serialization as the isolation principle:
   transactions run one at a time, and each state change either
   succeeds or fails.
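
   As a minimal sketch of the succeed-or-fail idea, assuming an
   in-memory SQLite database and a made-up "account" table (both are
   illustrations, not part of any system discussed here): the CHECK
   constraint stands in for a consistency rule, and the "with conn:"
   block commits on success and rolls back on an exception.  An
   in-memory database says nothing about durability.

```python
import sqlite3

def make_db():
    # Hypothetical schema: the CHECK clause is the consistency rule.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("a", 100), ("b", 0)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; return True on commit, False on abort."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            # Credit first on purpose: if the debit then violates the
            # CHECK rule, the rollback must also undo this credit.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))
```

   A transfer of 30 from "a" to "b" commits; a transfer of 1000 aborts
   and leaves both balances exactly as they were before the attempt.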

 - In a system with transient errors, recovery is possible through
   transaction abort+retry if inconsistencies are discovered soon
   enough.  Here "soon enough" means before an inconsistent read leads
   to a write that would not otherwise have occurred.  Let's call such
   transient errors "read faults" or "wire faults".

   In practice, such errors can be caught using redundancy:
   checksums, error-detecting and error-correcting codes, ...

   As an optimization, nested transactions can be retried locally.
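
   A rough sketch of detection plus local retry, assuming a CRC32
   appended to each record.  The names seal, unseal, read_with_retry
   and FlakyWire are invented for this example; FlakyWire only
   simulates a transient read fault by flipping one bit.

```python
import zlib

def seal(payload: bytes) -> bytes:
    # Append a 4-byte CRC32 as redundancy.
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def unseal(record: bytes) -> bytes:
    # Recompute the checksum; a mismatch means the read cannot be trusted.
    payload, stored = record[:-4], int.from_bytes(record[-4:], "big")
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch")
    return payload

def read_with_retry(read, attempts=3):
    # Retry only the faulty read, not the whole enclosing operation.
    for _ in range(attempts):
        try:
            return unseal(read())
        except ValueError:
            continue
    raise IOError("read fault persisted after %d attempts" % attempts)

class FlakyWire:
    """Simulated channel: corrupts the first `faults` reads."""
    def __init__(self, record, faults):
        self.record = record
        self.faults = faults

    def read(self):
        if self.faults > 0:
            self.faults -= 1
            corrupted = bytearray(self.record)
            corrupted[0] ^= 0x01  # single bit flip, always caught by CRC32
            return bytes(corrupted)
        return self.record
```

   The corrupted data never reaches the caller, so it cannot lead to a
   bad write.  A fault that outlasts the retry budget is reported
   instead of being silently accepted.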

 - In a system with "external state mutation", consistency maintenance
   requires extra effort, i.e. actual "repair".

   This is hard, as it requires "intelligence", i.e. knowledge that is
   not inside the system and its consistency rules.

   It seems that the best approach is to design the system in such a
   way that such repair operations are kept to a minimum.  If the
   possible external mutations are known in advance, the system could
   catch them before they take effect, so that they never cause an
   actual inconsistency.
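
   One possible shape for this, as a sketch only: route every known
   external mutation through a gate that applies it to a copy, checks
   the consistency rules, and commits only if they hold.  The Store
   class and the refs_resolve rule (every index entry points to an
   existing blob) are hypothetical examples.

```python
import copy

def refs_resolve(state):
    # Example consistency rule: every indexed name refers to a stored blob.
    return all(h in state["blobs"] for h in state["index"].values())

class Store:
    def __init__(self, state, invariant):
        if not invariant(state):
            raise ValueError("initial state is inconsistent")
        self._state = copy.deepcopy(state)
        self._invariant = invariant

    def snapshot(self):
        # Hand out copies so callers cannot mutate the state directly.
        return copy.deepcopy(self._state)

    def apply_external(self, mutate):
        """Apply an external mutation only if the result is consistent."""
        candidate = copy.deepcopy(self._state)
        mutate(candidate)             # the mutation only sees the copy
        if not self._invariant(candidate):
            return False              # rejected, nothing to repair
        self._state = candidate       # accepted in a single step
        return True
```

   With a state like {"blobs": {"h1": ...}, "index": {"notes.txt":
   "h1"}}, deleting blob "h1" alone is rejected, while removing the
   index entry and the blob together is accepted.  This only covers
   mutations that actually pass through the gate; anything that
   bypasses it still needs real repair.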

[1] http://en.wikipedia.org/wiki/Database#The_ACID_rules