Shortly before taxes were due this year, accounting firms got a surprise. Some of their tax preparation software failed to run because of a Microsoft update that Windows' Automatic Update feature had applied on its own. The result was frantic firefighting at the software vendors and at Microsoft to resolve the issue.
The lesson here is not about bashing Microsoft but about managing risks associated with changes.
Microsoft can't catch a break. If it doesn't help individuals and small businesses update their systems, the company is faulted for not doing enough for security. If it releases an update that works fine on 99% of systems, it gets bashed when the remaining 1% fail. The truth is that a great many of the issues people associate with updates from Microsoft are not Microsoft's fault.
A great many systems have been downloading and applying updates from Microsoft over the Internet from the day they were plugged in. The problem is that every change to a system represents a risk. There is absolutely no way -- given the current Microsoft Windows architecture -- that Microsoft or any other vendor can ensure that the unique combination of applications, hardware and device drivers on a given computer will always allow a patch to be installed without a problem. Vendors do perform extensive testing, but there are, and always will be, unique combinations of configurations that cause unforeseen failures.
There are a number of largely process-centric ways to address these risks of failure. To minimize them, a defined change management process should be used to manage Windows updates and their risks to the local computing environment.
With this in mind, let's step through some basic recommendations:
First, turn off the automatic update feature in Microsoft Windows and in any application or driver that has one. You want to control the timing and selection of what is installed.
Second, test your systems if possible. If all your computer configurations are different, then little testing can be performed. The more standardized the computers are, the easier it is to set up a test platform and verify that nothing breaks in the OS or critical applications. Ideally, the test procedures are documented and even automated so that regression tests can be performed.
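As a rough sketch of what such automated regression checks might look like, the harness below runs a list of named smoke checks and reports which passed and which failed. The one check shown (that the local interpreter can still run code) is only a placeholder; a real environment would register checks that exercise the OS and the critical business applications after a patch is applied.

```python
import subprocess
import sys

def interpreter_still_runs():
    # Placeholder smoke check: verify the local interpreter can still
    # execute a trivial program. Real checks would exercise the OS and
    # the applications the organization relies on.
    result = subprocess.run(
        [sys.executable, "-c", "print('ok')"],
        capture_output=True, text=True,
    )
    return result.returncode == 0 and result.stdout.strip() == "ok"

# Hypothetical check registry: (check name, zero-argument callable).
SMOKE_CHECKS = [
    ("interpreter-still-runs", interpreter_still_runs),
]

def run_regression_suite(checks):
    """Run every check, treating exceptions as failures, and return
    the lists of passing and failing check names."""
    passed, failed = [], []
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception:
            ok = False
        (passed if ok else failed).append(name)
    return passed, failed
```

Because the suite returns machine-readable results, the same script can be rerun after every patch cycle, which is the whole point of documenting and automating the procedure.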
For enterprises that can use distribution tools, the updates can be packaged and delivered using a management tool such as Windows Server Update Services, Microsoft System Center Configuration Manager 2007 or third-party tools. These tools speed up distribution and reduce the chances of human error during manual installations.
It's important to note that if the distribution packages aren't tested, distribution tools can increase the likelihood that large numbers of systems will crash when a problematic update is applied.
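One way to encode that caution is to gate the distribution step on the pre-production test results. The sketch below assumes test results arrive as (test name, passed) pairs; the package name shown in the usage is illustrative, not tied to any real update.

```python
def safe_to_distribute(package_name, test_results):
    """Gate a packaged update: clear it for release to the distribution
    tool only if every pre-production test passed; otherwise hold it
    and report which tests failed.

    test_results is a list of (test_name, passed) tuples."""
    failures = [name for name, ok in test_results if not ok]
    if failures:
        return False, f"hold {package_name}: failed {', '.join(failures)}"
    return True, f"release {package_name} to distribution"
```

With a gate like this, the speed of a distribution tool works for you rather than against you: an untested or failing package never reaches the fleet.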
Small businesses with limited resources, and individuals who lack test environments and distribution tools, should wait for a window in which they can install the patches and still recover if things go wrong.
For each system, create a restore point in Windows and make a full system backup, if possible. Then manually install the patches. Test the operating systems and critical applications as best you can, focusing on the key functionality that the organization relies on. If the testing shows there are problems, then you have three options, depending on your situation.
Do some research on the Internet to see whether others are reporting the same symptoms or error codes and whether known fixes exist. If you find something applicable, take the corrective action specified and proceed.
Boot Windows in safe mode and roll back to the saved restore point to see if that resolves the issue. If not, go to the next step.
The failsafe is to restore either the system backup made just before installing the updates or the last known good backup.
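The three options above form a fixed escalation order, which can be sketched as a simple decision function. The inputs and option names here are illustrative labels for the steps in the text, not any tool's actual vocabulary.

```python
def next_recovery_step(known_fix_found, rollback_fixed_it):
    """Pick the recovery option in the order described above:
    1. apply a documented fix turned up by research,
    2. boot to safe mode and roll back to the saved restore point,
    3. failing both, restore the full system backup (the failsafe)."""
    if known_fix_found:
        return "apply-known-fix"
    if rollback_fixed_it:
        return "roll-back-restore-point"
    return "restore-system-backup"
```

The value of writing the order down, even this informally, is that whoever is on the keyboard at 11 p.m. follows the same escalation path every time instead of improvising.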
One concern some may have is that applying updates less frequently creates security risks. While that may be true, it's also true that applying changes without a defined process creates risks of its own.
In an interesting study, the Information Technology Process Institute found that high-performing IT organizations actually patch less frequently. The study reported that these organizations patch pre-production systems first, with adequate testing and deployment planning. They can afford to do this because they have multiple layers of countermeasures in place, including firewalls, intrusion detection and prevention systems, and antivirus tools.
All Windows shops should stop and look at the security measures they have in place and review their change management processes to see whether real-time updates are truly necessary. To get predictable operations from IT systems, we need to manage changes rather than have them manage us. Groups that understand the risks on both sides, and are prepared to manually review and deploy updates, are best served by following a procedure instead of playing Russian roulette with updates whose local consequences are unknown.
George Spafford is a principal consultant with Pepperweed Consulting and helps IT organizations in the areas of technology and process implementation. Spafford is an experienced practitioner in many aspects of IT operations, including strategy and audits. He is a speaker on topics that range from security and IT governance to IT process improvement. He is the co-author of The Visible Ops Handbook.
This was first published in May 2008