Following an interwebs link, I recently ran across a paper by Sandy Clark, Stefan Frei, Matt Blaze, and Jonathan Smith titled: Familiarity Breeds Contempt: The Honeymoon Effect and the Role of Legacy Code in Zero-Day Vulnerabilities.
This is the sort of academic research there should be more of: with fresh eyes, the authors take a hard look at precepts long held to be Truth, run them through the maw of Hard Data, turn them on their heads, offer some suggestions as to why the surprising results might actually hold, and point to areas that have been under-considered.
So, what is the Received Wisdom that they investigate? Well, they consider the 40-year-old notion of Software Reliability Models (SRM), and the younger, but still widely known, notion of Vulnerability Discovery Models (VDM), both of which make statements about the expected behavior of a body of software over time.
The implications of such a VDM are significant for software security. It would suggest, for example, that once the rate of vulnerability discovery is sufficiently small, the software is "safe" and needs little attention. It also suggests that software modules or components that have stood the "test of time" are appropriate candidates for reuse in other software systems. If this VDM model is wrong, these implications will be false and may have undesirable consequences for software security.
In other words, how do we assess risk when we are building software? If we reuse software that has a long pedigree, should we trust that reused code more, less, or about the same as we trust new code?
For other measures, such as the cost and speed of development, reusing existing code has many well-known benefits, but here the authors are specifically considering the implications for security. As they say:
It seems reasonable, then, to presume that users of software are at their most vulnerable, with software suffering from the most serious latent vulnerabilities, immediately after a new release. That is, we would expect attackers (and legitimate security researchers) who are looking for bugs to exploit to have the easiest time of it early in the life cycle.
But after crunching lots of numbers, they find out that this presumption does not hold:
In fact, new software overwhelmingly enjoys a honeymoon from attack for a period after it is released. The time between release and the first 0-day vulnerability in a given software release tends to be markedly longer than the interval between the first and the second vulnerability discovered, which in turn tends to be longer than the time between the second and the third.
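The shrinking intervals the authors describe can be captured with a simple ratio: how long the honeymoon (release to first vulnerability) lasts compared to the interval from the first vulnerability to the second. A minimal sketch, using entirely hypothetical dates:

```python
from datetime import date

# Hypothetical dates for one software release (not from the paper's dataset).
release = date(2010, 1, 15)
first_vuln = date(2010, 11, 3)   # first 0-day vulnerability discovered
second_vuln = date(2011, 2, 20)  # second vulnerability discovered

p0 = (first_vuln - release).days      # honeymoon: release -> first vuln
p1 = (second_vuln - first_vuln).days  # first vuln -> second vuln

ratio = p0 / p1
print(f"p0={p0} days, p1={p1} days, ratio={ratio:.2f}")
# A ratio above 1 means the honeymoon outlasted the first-to-second interval,
# which is the "positive honeymoon" pattern the authors report.
```

The paper's finding, in these terms, is that this ratio tends to be greater than 1 across software classes, license models, and release years.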
Furthermore, this effect seems pervasive:
Remarkably, positive honeymoons occur across our entire dataset for all classes of software and across the entire period under analysis. The honeymoon effect is strong whether the software is open- or closed- source, whether it is an OS, web client, server, text processor, or something else, and regardless of the year in which the release occurred.
So, why might this be?
The researchers have several ideas:
One possibility is that a second vulnerability might be of a similar type to the first, so that finding it is facilitated by knowledge derived from finding the first one. A second possibility is that the methodology or tools developed to find the first vulnerability lowers the effort required to find a subsequent one. A third possible cause might be that a discovered vulnerability would signal weakness to other attackers (i.e., blood in the water), causing them to focus more attention on that area.
Basically, time is (somewhat) more on the side of the attackers than the defenders here.
From my own experience, I'd like to offer a few additional observations that are generally in agreement with the authors' findings, although from a slightly different perspective.
- Firstly, bug fixers often fail, when fixing a particular bug, to consider whether the same (or similar) bugs might exist elsewhere in the code base. I often refer to this as "widening the bug", and it's an important step that only the most expert and experienced engineers take. It also takes time, which is all too often scarce. Attackers, though, are well-known users of this technique: when an attack of a certain type is known, it is common to see attackers attempt the same attack with slight adjustments over and over, looking for other places where the same mistake was made.
- Secondly, fixing old code is just plain scary. In practice, old code tends to have fewer test suites; you are unsure how subtle changes in its behavior will affect all of the various places where it's used; and the original authors may have departed, or may have forgotten how it worked or why they did it that way. Developers may well be aware of bugs in older code but decide that fixing them is too expensive or too risky, and so knowingly leave them in place.
- Lastly, that "blood in the water" sentiment is real, even though it's hard to interpret. A manufacturer's reluctance to advertise information about known vulnerabilities, often labeled "security by obscurity," is in many ways an easy impulse to comprehend: "If we tell people there's a security bug in that old release, aren't we just inviting the bad guys to attack it?" Although security researchers have done excellent work to educate us all about the problems with vulnerability secrecy, it's hard to educate away an instinct.
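The "widening the bug" step from the first observation above can be at least crudely mechanized: once one instance of a risky idiom has been fixed, sweep the tree for sibling call sites. A minimal sketch, assuming a C codebase and using `sprintf()` as a stand-in for whatever pattern was just fixed:

```python
import re
from pathlib import Path

# Hypothetical pattern: the risky idiom we just fixed in one place.
PATTERN = re.compile(r"\bsprintf\s*\(")

def find_siblings(root: str) -> list[tuple[str, int, str]]:
    """Return (file, line number, line text) for each suspect call site
    under root, so each can be reviewed for the same mistake."""
    hits = []
    for path in Path(root).rglob("*.c"):
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), 1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

A textual scan like this is exactly the kind of cheap, repeatable search that attackers are happy to run against a whole codebase, and that defenders too rarely make time for.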
Overall, this is a fascinating paper, and I'm glad I stumbled across it, as there's a lot to think about here. I'm certainly not going to give up my decades-long approach of reusing software; it has served me well. And I'm far from persuaded by some of the alternative solutions proposed by the authors:
research into alternative architectures or execution models which focuses on properties extrinsic to software, such as automated diversity, redundant execution, software design diversity might be used to extend the honeymoon period of newly released software, or even give old software a second honeymoon.
Some of these ideas seem quite valid to me. For example, Address Space Layout Randomization is quite clever, and indeed is a very good technique for making the life of the attacker much harder. But "software design diversity"? Harumph, I say.
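ASLR's effect is easy to observe directly: the same allocation lands at a different address on each run of a program, so an attacker can't hard-code addresses. A small probe, assuming a system with ASLR enabled (the default on modern Linux, macOS, and Windows):

```python
import subprocess
import sys

# Each child process reports the address of a freshly heap-allocated
# ctypes buffer; with ASLR active, these addresses typically differ
# between runs (though nothing guarantees it on a given system).
probe = ("import ctypes; "
         "print(ctypes.addressof(ctypes.create_string_buffer(4096)))")

addrs = [int(subprocess.check_output([sys.executable, "-c", probe]))
         for _ in range(2)]
print(addrs)  # usually two different addresses when ASLR is active
```

That per-run unpredictability is what makes the attacker's job harder: a memory-corruption exploit that worked once may point at garbage the next time.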
I suspect that, in this area, there is no easy answer. Software security is one of the most challenging intellectual efforts of our time, with many complexities to consider.
For now, I'm pleased to have had my eyes opened by the paper, and that's a good thing to say about a research project. Thanks much to the authors for sharing their intriguing work.