I was surprised to find an email from one of our outsourced service providers in my inbox two days ago, saying that they had to do emergency maintenance on their servers. Specifically, to take them offline and install the patch for MS08-067, a wormable RPC vulnerability in the Windows Server service.
The patch was deemed by Microsoft to be worthy of out-of-band release. Based on what I’ve read about it, I applaud that decision. It’s a severe bug. Waiting until November to publicly release the patch would have been a bad idea.
A certain amount of chaos ensues when such a patch is released. For example, the service I mentioned above went down on relatively short notice – and I’m paying for it regardless. But that outage was handled professionally.
As another example of chaos, this eWeek article includes a suggestion by a security professional that organizations bypass their internal testing process and just deploy the patch immediately to all affected servers. That’s bad advice. After all, the notes accompanying the patch explain how the threat can also be mitigated via a firewall. And if the patch were to cause a compatibility problem, what good is a broken server?
Another example: do a web search on MS08-067 and take a look at some of the copies of the original bulletin that appear. Not all of them are complete, and most of them lack links to additional authoritative information. Incomplete, or even inaccurate, information spreads like wildfire on the internet.
The chaos, as well as the replication of incomplete information, is happening for a reason: lots of companies, and millions of users, are dependent upon Windows in some way. Service providers and news organizations are trying to keep up.
Millions of dollars in commerce, and probably much more than that, depend upon Windows. Whether it’s direct access to critical line-of-business applications, something indirect like hoping that your bank’s network doesn’t crash before you cash your paycheck, or even something mundane like checking internet email from home (or blogging; that probably falls into the mundane category as well), most people in industrialized countries are affected by Windows, for good or ill.
This is a tremendous amount of responsibility. I used to work at Microsoft and I know what that feels like.
Thus, I think it’s fair to ask what’s being done to prevent problems like MS08-067 from happening in the first place. Frankly, the question didn’t even occur to me until I read this blog post from Michael Howard. It’s an informative post, and I especially recommend reading it if you have a development background.
However, in light of the responsibility mentioned above, which must be borne by Microsoft, as well as the cost paid by the industry in testing and deploying each new patch, the response laid out in Michael’s blog post is inadequate. Microsoft is not doing enough to prevent this problem from recurring.
I’ll summarize a few points made in that post: first, that it’s difficult to design automated tools that can catch the kind of buffer overflow bug that led to this bulletin. It’s not stated whether such tools exist elsewhere, but it is stated that Microsoft’s tools can’t do it. I accept this claim at face value, but there’s more to be said. I’ll come back to this.
Second, the observation is made that security features in Windows Vista and Server 2008 mitigate, although don’t eliminate, the threat. My observation: the patch still needs to be installed on those systems. Plus, the deployed base is predominantly Windows XP SP2 and earlier on the client, and Windows Server 2003 and earlier on the server. So I don’t find those comments relevant. While the new security features point to a positive trend from a technology perspective, the blog post doesn’t explain what’s being done to reduce the impact of these bugs, as well as of the patches themselves, on Microsoft’s customers. How is TCO being reduced in this area?
Third, the claim is made that Windows Vista, as well as Microsoft’s Security Development Lifecycle process, came out as winners (I’m paraphrasing). That’s true from a certain perspective. After all, the catastrophe scenario of another widespread internet worm was probably averted. But in light of the observations above, this claim strikes me as insensitive to customer perception.
Finally, the one action item, so to speak, accepted by the blog post on Microsoft’s behalf is to do a better job of fuzz testing (aka fuzzing). Here’s my concern, though: fuzzing is a non-deterministic technique. Is that really the best Microsoft can do?
This brings me back to the first point regarding automation tools. The timing of this patch, coinciding with Microsoft’s earnings announcement, is … awkward. The company netted well over $4 billion this quarter. Think about that, then consider, again, the impact of each security bug and each out-of-band patch on the bottom line of each of Microsoft’s millions of customers, due to downtime, servicing, and testing.
Microsoft must do a better job of reducing TCO. Making a significant, new investment in proactively and deterministically finding and eliminating security bugs should be a key pillar in its strategy for doing so. I can’t and don’t accept that a company with that kind of profit can’t do better than updating its fuzz-testing heuristics.