So I loved this essay on the USENIX blog comparing the IT attitude toward server availability with the practices of the aviation profession. As the author notes, the aviation industry has a century more experience with these topics, and so has much to teach us about what it takes to achieve the reliability levels we desire:
I’m talking about the methods, the drive, and the sheer determination to discover, at all costs, the root cause of the issues that occur in the aviation profession.
Don't miss this link to the author's more detailed essay about redundancy and disaster planning. Beware the fiber backhoe!