Thu Jul 29 10:46:03 CEST 2010

Error handling and "smart code"

For very stateful objects (e.g. file systems) that have many possible
local errors that could in principle be solved locally, it might
sometimes be better to separate the code into two layers:

  * A dumb layer that gives up at the first sign of trouble,
    cancelling the transaction (not modifying the state).

  * An error recovery layer at the API entry point that handles
    retry/resolve for all recoverable conditions in one place,
    restarting the transaction from the start.
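
A minimal sketch of the two layers in C.  Everything in it is made up
for illustration: the growable buffer "struct store" stands in for a
stateful object, and the names store_append_once, store_grow and
store_append are hypothetical.  The only recoverable condition here is
"buffer full", which the entry point resolves by growing the buffer and
restarting the append.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stateful object: a growable byte buffer. */
struct store {
    char  *data;
    size_t len;
    size_t cap;
};

enum status { OK = 0, ERR_FULL, ERR_FATAL };

/* Dumb layer: gives up at the first sign of trouble.
 * On any error *s is left exactly as it was. */
static enum status store_append_once(struct store *s, const char *buf, size_t n)
{
    if (buf == NULL)
        return ERR_FATAL;           /* nothing anyone can fix locally */
    if (n == 0)
        return OK;
    if (n > s->cap - s->len)
        return ERR_FULL;            /* recoverable, but not handled here */
    memcpy(s->data + s->len, buf, n);
    s->len += n;
    return OK;
}

/* Resolution for ERR_FULL: double the capacity.
 * Returns 0 on success, -1 on failure (old buffer stays valid). */
static int store_grow(struct store *s)
{
    size_t new_cap;
    char *p;

    if (s->cap > SIZE_MAX / 2)
        return -1;                  /* doubling would overflow */
    new_cap = s->cap ? s->cap * 2 : 16;
    p = realloc(s->data, new_cap);
    if (p == NULL)
        return -1;
    s->data = p;
    s->cap = new_cap;
    return 0;
}

/* Recovery layer: the API entry point.  All retry/resolve logic lives
 * here; the transaction is restarted from the start after each fix. */
enum status store_append(struct store *s, const char *buf, size_t n)
{
    for (;;) {
        enum status st = store_append_once(s, buf, n);
        if (st != ERR_FULL)
            return st;              /* success, or not recoverable */
        if (store_grow(s) != 0)
            return ERR_FATAL;       /* the fix itself failed */
    }
}
```

A caller starts from a zeroed "struct store" and only ever calls
store_append; the dumb layer never has to know how a full buffer gets
resolved.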

The basic idea is: don't _ever_ leave your object in an inconsistent
state between method calls.  This can sometimes conflict with "use
small method calls" and pushes you towards thinking very hard about
what a decently sized transaction does.  If a state transition
consists of a lot of small increments with intermediate
inconsistencies, make sure it can be aborted at any point.  In C this
can be done by keeping the updated state in local variables, and only
committing it to the object once every step has succeeded.
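
The local-variable trick could look something like the sketch below.
The "struct account_pair" and the transfer function are hypothetical
examples, not taken from any real code base.  The intermediate state
(money withdrawn but not yet deposited) only ever exists in the local
copy, so every early return is a clean abort.

```c
#include <limits.h>

/* Hypothetical object with an invariant: from + to stays constant. */
struct account_pair {
    long from;
    long to;
};

/* Returns 0 on success, -1 on failure.  On failure *p is unchanged. */
int transfer(struct account_pair *p, long amount)
{
    struct account_pair tmp = *p;   /* work on a local copy */

    if (amount < 0)
        return -1;
    if (tmp.from < amount)
        return -1;                  /* insufficient funds */
    tmp.from -= amount;

    /* tmp is inconsistent here, but *p has not been touched. */
    if (tmp.to > LONG_MAX - amount)
        return -1;                  /* deposit would overflow */
    tmp.to += amount;

    *p = tmp;                       /* commit: everything went right */
    return 0;
}
```

The commit is a single struct assignment at the very end, so there is
no point in the function where bailing out requires undoing anything.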