“There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know.” — Donald Rumsfeld, former US Secretary of Defense.
I watched in horror recently, as did many of you I suspect, as the WannaCry ransomware crippled thousands of systems around the world and wreaked havoc in almost every country on the planet.
I have, you might say, slightly more than a casual interest in the matter. For several years, I managed the engineering team responsible for the SMB (Server Message Block) file sharing protocol, the vector used for the attack. I was also the head of security for Microsoft for a few years. Although, to be clear, I held both of these positions during the Vista and Windows 7 days, long after the protocol was first designed and implemented.
I didn’t personally write a single line of code in that protocol. That task was delegated to much smarter people, but I did manage the teams responsible for building, testing, maintaining, advancing, and securing it for several years.
I was shocked, like everyone else, to see it at the center of an international meltdown of unprecedented proportions. More than 150 countries impacted? Over two hundred thousand systems stymied? Hospitals? Emergency rooms? WTF?!?
The malware takes advantage of a previously undisclosed bug in Microsoft’s implementation of the SMB protocol to take over a computer, encrypt all its files, and put up a message telling the owner to pay up or lose the data. The NSA had known about the bug for years but didn’t disclose its existence so it could be used as an espionage weapon. The exploit only surfaced “in the wild” earlier this year, when a group called the Shadow Brokers leaked a trove of stolen NSA hacking tools.
Microsoft fixed the bug back in March of this year, but it had been present in all versions of Windows for years. Many companies were caught off guard: they didn’t realize the potential impact and didn’t deploy the patch, despite Microsoft’s recommendations, thereby leaving their systems exposed to the attack.
This is a protocol, mind you, that was designed back in the eighties, when computer networks were still few and far between and the internet did not even exist. It first shipped in 1990, was documented as part of the CIFS protocol in the late nineties, and has been used for nearly thirty years for file sharing in every Windows and Windows-compatible product in the world.
Now take one of these servers that benignly implements this file sharing protocol so you can, guess what, share files across a supposedly secure local area network. Add a pinch of magic dust and send it a really screwy malformed request, one that no reasonably written piece of software would ever send. That malformed request triggers a bug in the implementation of the SMB protocol that allows the caller to gain supervisor access to the system. Game over. You can encrypt all my data behind my back and ask for ransom to release it.
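The class of bug at play here can be sketched in a few lines. This is a hypothetical, drastically simplified parser for a made-up packet format (it is not the actual SMB code, and the function names are mine): it trusts an attacker-supplied length field, which is exactly the kind of mistake that turns a malformed request into a takeover.

```python
def parse_packet_buggy(packet):
    """Toy packet format: 2-byte big-endian length field, then payload.

    Buggy version: trusts the attacker-supplied length field without
    checking it against the bytes actually received.  In C, copying
    `declared` bytes into a fixed-size buffer based on this field would
    write past the end of the allocation -- the classic memory-safety
    bug behind SMB-style remote exploits.
    """
    declared = int.from_bytes(packet[:2], "big")
    return packet[2:2 + declared]  # silently accepts bogus lengths

def parse_packet_safe(packet):
    """Fixed version: validate the length field before trusting it."""
    if len(packet) < 2:
        raise ValueError("truncated header")
    declared = int.from_bytes(packet[:2], "big")
    if declared != len(packet) - 2:
        raise ValueError("length field does not match payload size")
    return packet[2:]
```

Hand the buggy parser a request whose length field claims 65,535 bytes when only two follow, and it shrugs and carries on; the fixed version rejects it outright.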
“Hedley Lamarr: Unfortunately there is one thing standing between me and that property: The rightful owners.” — Harvey Korman, Blazing Saddles.
Microsoft’s Brad Smith immediately blogged about the need for all parties to share this kind of vulnerability information in order to secure software. It is inconceivable to me, knowing what I know about the teams and processes at Microsoft, that they would not have fixed this bug had they known about it.
I am not here to apologize for Microsoft or the Windows team or the SMB protocol or the history of computer science. I’m here to say simply that more such bugs will be found in the future, for the simple reason that “we didn’t know what we didn’t know back then” and it’s crazy to continue to depend on such software in today’s world where billions of people are connected to the internet, where nefarious actors abound, and where automated tools can be used to scan ports and sniff out vulnerabilities.
“Plan to be spontaneous tomorrow.” — Steven Wright.
We spent years designing this software; we spent years testing it; we spent years standardizing it in cross-industry committees and sharing it with partners; we spent years building a community around a protocol that is supported by billions of systems around the world. Our goals at the time were primarily interoperability, usability, and compatibility. We even spent thousands of man years fuzz testing the APIs to make sure attackers couldn’t trigger vulnerabilities in the code using malformed packets. We used specialized tools that generated all kinds of random bit patterns in the packets, and we worked hard with the community of white hat security experts around the world to responsibly document and fix security-related bugs in all our software.
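The “random bit patterns” style of fuzzing described above can be sketched in a few lines. This is a toy illustration of the idea, not any of the actual tools we used:

```python
import random

def mutate(packet, flips=8, seed=None):
    """Return a copy of `packet` with a few randomly chosen bits flipped."""
    rng = random.Random(seed)
    buf = bytearray(packet)
    for _ in range(flips):
        pos = rng.randrange(len(buf))      # pick a random byte...
        buf[pos] ^= 1 << rng.randrange(8)  # ...and flip one of its bits
    return bytes(buf)

def fuzz(parser, base_packet, iterations=10000):
    """Feed mutated packets to `parser` and collect any inputs that make
    it blow up.  Every crasher is a candidate security bug to triage."""
    crashers = []
    for i in range(iterations):
        candidate = mutate(base_packet, seed=i)
        try:
            parser(candidate)
        except Exception:
            crashers.append(candidate)
    return crashers
```

The limitation is visible right in the code: a fuzzer only exercises the bit patterns it happens to generate.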
But guess what. No one tried this particular random pattern of bits — except the NSA. And they chose to keep it to themselves because they felt they could use it to spy on people.
Some may argue that open source software is more secure because additional eyeballs on the code can identify such issues more readily. But the story is not that simple. A white hat security expert may indeed have found and reported such an issue, but the chances are just as high, if not higher, that a black hat hacker, given source code access, would find the vulnerability and keep its existence under wraps for their own nefarious purposes.
Note also that “automated updates” (à la Windows Update) are not a solution. The fix for this particular bug had been released months earlier but was never installed on the servers that were attacked. Unlike consumers, most organizations around the world spend months retesting software patches after they are released to make sure they don’t break compatibility with business-critical applications, then spend several more months rolling them out across their complicated networks of thousands of servers.
The very same corporations and entities that are the slowest to adopt released security patches are the ones most in need of them: the ones that are highly regulated, fairly antiquated in their processes, and entirely unprepared to deal with a global security event of these proportions.
To me, this is the last nail in the coffin of on-prem shrink-wrapped software and the reason more and more services will move to a SaaS delivery model. I’ve blogged multiple times in the past about the public cloud (on the death of on-prem infrastructure and its rebirth in the cloud, on the architectural advantages of the public cloud, and on the need to re-architect aging enterprise applications for the internet age). I hope WannaCry will serve as a wake-up call for all those continuing to depend on on-prem shrink-wrapped software, most of which was designed before we understood all the security implications of the open internet.
Much will be written about this event and how it could have been avoided or more quickly remedied. But the real answer is much simpler than all that, so I’ll spell it out: we didn’t know what we didn’t know back then. More such bugs will continue to surface, not just in SMB but in the millions of lines of operating system and application code written over the past few decades and running our businesses today. And the juiciest bugs will be hoarded by hackers and used to wreak even more havoc on our systems.
Most people don’t maintain and repair their own cars or the plumbing and electrical wiring in their homes. Yet they insist on doing the same with their computer hardware and software, which are many orders of magnitude more complicated and critical in nature. The real problem is that this is a broken model of service delivery, one that relies on local system administrators or, worse, government bureaucrats to decide when to install a patch. We’ve just seen an example of what that means in real life.
So the hackers will keep finding the bugs, knowing that inertia is in their favor. And they will hide them from others so they can weaponize them, monetize them, benefit from them. Think about that. It’s human nature. And we are all in denial of it. The motive, be it industrial, government, or criminal espionage, is almost secondary in nature.
The days are gone when it made sense to have so much device specific code running on-prem. The pipes are so much fatter and faster these days that the same services can be offered much more securely from the cloud. Dumbing down the client and moving complexity to the cloud doesn’t solve all the problems, of course, but it does make it much easier to roll out a security patch when the proverbial fertilizer hits the cooling system.
The more code you have on your local system, the bigger the attack surface. The more compatibility you offer with legacy systems, the more successful you are as a platform with the on-prem software delivery model, the longer the tail of companies that will be at risk of exposure — for decades to come.
As an industry, we figured all this out a while ago and moved to the cloud as a much more robust and supportable service delivery model but the rest of the world hasn’t caught up with that model yet. Legacy is a bitch.
Did I mention that this particular version of the SMB protocol was officially deprecated by Microsoft four years ago, precisely because it was known to have fundamental security flaws in its design? Not that it matters. As became obvious with WannaCry, hundreds of thousands of businesses were still depending on it to run their applications.
We can sit here and blame Microsoft, but that would be a mistake. It’s true that none of the thousands of eyeballs that looked at that particular piece of code noticed it would misbehave in a peculiar way when handed parameters it was never designed to handle. Some smart kid somewhere figured it out, and it became weaponized. Trust me, there are many other such pieces of code out there. You and I and the rest of the world will pay the price for the next two decades, guaranteed. That’s how long it takes to replace these systems in regulated industries.
The only saving grace for on-prem software, be it on clients or servers, is that each such installation is a bespoke environment. Consumers or administrators pick and choose from hundreds of available options — Windows or Linux, Oracle or SQL, McAfee or Symantec, etc. — and put together an environment that is unique. By doing so, they reduce the likelihood that any single bug can impact their environment more than others.
But make no mistake: each and every one of those pieces of software contains security vulnerabilities. And no end user or system administrator is capable of keeping track of, monitoring, and patching or upgrading the systems in question on a daily basis.
The cloud model of service delivery, where the vast majority of the code runs in the cloud and is always up to date, conceptually bypasses all of these operational problems. If the code is running on our servers in the cloud instead of on your systems on-prem, it’s much easier to monitor the environment and patch problems quickly before they become a liability. And trust us: we know how to maintain the servers running the code. Better than you, Mr. Hospital in the U.K., anyway.
Fundamentally more secure architectural solutions have evolved over the past two decades that cleanly address most, if not all, of the security concerns we deal with every day in an enterprise context. Yet we continue to rely on twenty-year-old technology and complain vociferously when it fails to stand up against our latest requirements and innovations.
Continuing to run ancient software in today’s hyper-connected world is akin to riding a horse and buggy down the freeway, complaining that it can’t keep up with the neighbor’s latest Google-controlled autonomous vehicle, and blaming the poor horse when its knees buckle under the pressure.
If you think your particular application isn’t offered over the web as a service, I urge you to take another look. Meanwhile, depending on software designed thirty years ago, implemented twenty years ago, and deprecated ten years ago to run your business and trusting government bureaucrats to know when and how to maintain those systems is a recipe for disaster. It’s naive and it’s irresponsible in the world we live in.
WannaCry is just the first of many. There will be more and they will be worse. I’m sure of it.