Michael Nygard's Release It! is now five years old, which seems like an eternity in the world of Cloud Computing. I was prepared for it to seem stale and outdated, but in fact it still feels fresh and relevant.
Nygard nicely describes the goal of his book in the very first paragraph of the Preface:
In this book, you will examine ways to architect, design, and build software -- particularly distributed systems -- for the muck and mire of the real world. You will prepare for the armies of illogical users who do crazy, unpredictable things. Your software will be under attack from the moment you release it. It needs to stand up to the typhoon winds of flash mobs or the crushing pressure of a DDoS attack by poorly secured IoT toaster ovens. You'll take a hard look at software that failed the test and find ways to make sure your software survives contact with the real world.
In this compact description, Nygard conveys all the reasons that I found his book to be well worth the time:
- It's based in experience. Nygard shares dozens of real situations that he's encountered while building modern Internet systems, and isn't afraid to reveal his mistakes and how he learned from them.
- It provides solutions, not just problems. Nygard survived his mistakes and kept good records about basic, solid approaches that he used successfully in his work.
- It's lively and fun to read. If you've ever tried to spend time digging into Cloud Computing technology in areas such as networking, security, or resource virtualization, let me tell you: it is dry, dry, dry, full of acronyms and abstraction. Nygard's writing style is a bit breezy, but it's light and entertaining and he succeeds, for the most part, at taking some very dull material and making it, well, at least bearable.
It's probably worth comparing Release It! to a broadly similar book, Site Reliability Engineering, which I discussed briefly about six years ago.
The Google SRE book is longer and more in-depth. It is also considerably harder to read, partly because each chapter of the SRE book is written by a different set of authors, all of whom are experienced in their subject matter, but each with a different writing style and approach. Moreover, the Google SRE book is focused on Google-specific solutions to problems.
Nygard's book is shorter and a lot more fun to read, making it more likely, frankly, that you'll get through it and actually learn things and remember them.
And Nygard doesn't assume you work at Google.
I'd suggest: read them both! Read Nygard's book first, to get a broad and readable grounding in lots of important subjects. Then, if you find yourself actively working in a particular area, dive into the corresponding material in the Google SRE book for additional depth and detail.
Of course, neither of these books alone will make you a successful Infrastructure Engineer at a Cloud Computing company. For that, you'll need a lot more study, a lot more practice, and the opportunity to actually work in such an environment and learn from your peers. None of this comes easy.