Sun Apr 24 21:59:35 EDT 2011
Robust filesystem
I'm burning my fingers on a current project. Some things that went
wrong:
- No specification, evolutionary what-can-we-get-away-with-cheaply
design. Considering the application, that was actually not such a
bad approach: functional requirements were very simple and
straightforward. Eventually there was one ill-specified
requirement that caused a bit of complexity.
- Complete underestimation of non-functional requirements:
robustness.
- Difficult refactoring: merging two subsystems that were "almost
the same" caused many headaches when they were actually placed on
top of the same abstraction.
- Splitting another part of the code into separate modules proved
difficult due to insufficient understanding of the coupling
involved.
- Premature optimization. Not for speed, but for memory usage, in
this case disk buffers. This led to an implementation that was
very hard to change.
- Inadequate test suite: the stateful nature of the system, and the
nature of the errors that should be recovered from (lots of invalid
intermediate state), make it very hard to test.
If I can name one major point, it is state. The system as it is at
this moment has too many degrees of freedom. This makes many things
very difficult:
* Testing: almost impossible to cover all corner cases.
* Change: the highly sequential nature of operation makes it very
difficult to separate responsibilities over multiple objects, or
perform simple, incremental changes.
* Ownership: at least one bad factorization is due to unclear
ownership of data structures.
* Temporary storage management: A non-functional requirement is to
use a minimal memory footprint.
If state is the problem, the solution that comes to mind almost
immediately is to switch to a transaction-based approach where pre-
and postconditions can be expressed clearly (even if only in the test
suite).
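To make that concrete, here is a minimal sketch in Python of a
transaction guarded by a consistency check. Everything in it is
hypothetical and not taken from the project: `Store` is an in-memory
stand-in for the filesystem state, `invariant` is an assumed
consistency condition, and `transact` and `write_file` are
illustrative names.

```python
import copy


class Store:
    """Hypothetical in-memory stand-in for the filesystem state."""

    def __init__(self):
        self.files = {}  # name -> bytes
        self.used = 0    # total number of bytes stored


def invariant(store):
    """Assumed consistency condition: the byte counter matches the contents."""
    return store.used == sum(len(data) for data in store.files.values())


def transact(store, operation):
    """Run operation on a private copy and publish it only if it is consistent."""
    assert invariant(store), "precondition violated"
    draft = copy.deepcopy(store)
    operation(draft)  # may raise; the original store is untouched if it does
    if not invariant(draft):
        raise RuntimeError("postcondition violated")
    store.__dict__.update(draft.__dict__)  # commit the new state


def write_file(name, data):
    """Build an operation that stores data under name and updates the counter."""
    def operation(store):
        store.used += len(data) - len(store.files.get(name, b""))
        store.files[name] = data
    return operation
```

With this shape the test suite only needs to check the invariant
before and after each transaction, instead of enumerating every
intermediate state. The price is a full copy of the state per
transaction, which conflicts with the minimal memory footprint
requirement mentioned above.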
In a fully transaction-based approach, there is a very simple, even
_stupid simple_ way of handling errors: whenever it fails, retry.
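The retry idea can be sketched in a few lines of Python. This is only
an illustration under a strong assumption: a failed attempt leaves no
trace, because each attempt works on a private copy of the state. The
names `run_atomically`, `retry` and `FlakyAppend` are made up for the
example.

```python
import copy


def run_atomically(state, operation):
    """Apply operation to a copy of state; replace state only on success."""
    draft = copy.deepcopy(state)
    operation(draft)  # if this raises, state is never modified
    state.clear()
    state.update(draft)


def retry(state, operation, attempts=3):
    """Retry the whole transaction on failure; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        try:
            run_atomically(state, operation)
            return attempt
        except OSError:
            if attempt == attempts:
                raise  # give up after the last attempt


class FlakyAppend:
    """Simulated operation that fails midway the first `failures` times."""

    def __init__(self, failures):
        self.failures = failures

    def __call__(self, draft):
        draft["log"].append("half-done")  # invalid intermediate state
        if self.failures > 0:
            self.failures -= 1
            raise OSError("simulated transient failure")
        draft["log"][-1] = "done"
```

For example, `retry({"log": []}, FlakyAppend(2))` succeeds on the
third attempt, and the "half-done" entries from the two failed
attempts never reach the real state.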
In the current implementation that approach doesn't work completely:
physical errors are themselves system state changes that take the
system from a consistent to an inconsistent state. The difficulty is
in recovering from that.