Sun Apr 24 22:31:05 EDT 2011
Robustness : external mutators
Follow-up to the last post. How to make a data storage system robust?
- In a reliable system (no externally introduced faults),
inconsistent state has to be a consequence of bad design, as there
are tried and true principles to build such systems correctly.
The main idea is to clearly define what a state change is, and
allow it to occur or not, but never to show any intermediate state.
This is captured in more detail in the ACID rules used in databases:
* Atomicity: a transaction succeeds or fails. No intermediate
(inconsistent) state is ever visible.
* Consistency: each transaction maintains consistency rules.
* Isolation: concurrent transactions should not interfere with
each other.
* Durability: a completed transaction persists.
All these are reasonably obvious, especially if you stick to the
simpler approach of serialization as the isolation principle:
transactions run one at a time, and each state change either
succeeds or fails.
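A minimal sketch of the "occur or not" idea, assuming a toy in-memory store. The names Store, transact and transfer are made up for illustration, and durability is not covered. A lock serializes transactions, each change runs on a private copy, and the copy replaces the real state only if the consistency rule still holds.

```python
import copy
import threading


class Store:
    """Toy in-memory store; every name here is hypothetical."""

    def __init__(self, state, invariant):
        self._state = state
        self._invariant = invariant      # consistency rule: state -> bool
        self._lock = threading.Lock()    # serialization as isolation

    def transact(self, change):
        """Apply `change` atomically; True on commit, False on abort."""
        with self._lock:                        # one transaction at a time
            draft = copy.deepcopy(self._state)  # work on a private copy
            try:
                change(draft)
            except Exception:
                return False                    # abort: state untouched
            if not self._invariant(draft):
                return False                    # abort: rule would be violated
            self._state = draft                 # commit in a single step
            return True

    def read(self):
        """Return a copy of the last committed state."""
        with self._lock:
            return copy.deepcopy(self._state)


def transfer(amount):
    """Build a state change that moves `amount` from account a to b."""
    def change(accounts):
        accounts["a"] -= amount
        accounts["b"] += amount
    return change


def no_negative_balance(accounts):
    return all(v >= 0 for v in accounts.values())


store = Store({"a": 100, "b": 0}, no_negative_balance)
ok = store.transact(transfer(30))         # commits
rejected = store.transact(transfer(500))  # aborts, a would go negative
```

The second transfer leaves no trace: readers only ever see the state before or after a committed change.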
- In a system with transient errors, recovery is possible through
transaction abort+retry if inconsistencies are discovered soon
enough. Here "soon enough" means before an inconsistent read leads
to a write that would not otherwise occur. Let's call these "read faults" or
In practice, such errors can be caught using redundancy:
checksumming, error-detecting and error-correcting codes, ...
As an optimization, nested transactions can be retried locally.
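A rough sketch of detection through redundancy followed by abort+retry, assuming a CRC32 prefix on each record. ReadFault, pack, unpack, with_retry and FlakyMedium are hypothetical names, and the medium simulates a transient error by corrupting only the first read.

```python
import zlib


class ReadFault(Exception):
    """Raised when stored data does not match its checksum."""


def pack(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 (the redundancy)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def unpack(record: bytes) -> bytes:
    """Return the payload, or raise ReadFault if the checksum disagrees."""
    stored, payload = int.from_bytes(record[:4], "big"), record[4:]
    if zlib.crc32(payload) != stored:
        raise ReadFault("checksum mismatch")
    return payload


def with_retry(transaction, attempts=3):
    """Run `transaction`; abort and retry it if a read fault is detected."""
    for _ in range(attempts):
        try:
            return transaction()
        except ReadFault:
            continue          # abort: nothing was written, so retry is safe
    raise ReadFault(f"still failing after {attempts} attempts")


class FlakyMedium:
    """Hypothetical medium that corrupts the first read only."""

    def __init__(self, record):
        self.record = record
        self.reads = 0

    def read(self):
        self.reads += 1
        if self.reads == 1:
            corrupted = bytearray(self.record)
            corrupted[-1] ^= 0x01          # flip one bit: transient error
            return bytes(corrupted)
        return self.record


medium = FlakyMedium(pack(b"balance=70"))
value = with_retry(lambda: unpack(medium.read()))
```

The fault is caught at read time, before any write depends on the bad data, which is the "soon enough" condition. A persistent fault exhausts the retries and surfaces as an error instead.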
- In a system with "external state mutation", consistency maintenance
requires extra effort, i.e. actual "repair".
This is hard, as it requires "intelligence", i.e. knowledge that is
not inside the system and its consistency rules.
It seems that the best approach is to design the system in such a
way that such repair operations are kept to a minimum. If the
external mutations are known, they could possibly be handled by
design, so that they cannot cause actual inconsistencies.