Why wasn't the box checked? Was it ever checked? Did someone uncheck it? Who did or did not do this? Most likely you will never really know.
You: Hello.
Angry user: Why isn't X working?
You: I'm sure there is a logical reason (blood pressure begins to climb).
Angry user: I don't care about logic. I want X working, like, yesterday!
You: I'm on it. I'll report back to you as soon as I have figured it out (all-nighter commences).
You (next day without sleep): Joe forgot to check the box on X. I checked the box, and now everything is working.
Angry user: It doesn't make sense to me why you missed checking the box (all trust and credibility are now lost).
No matter how well planned out, how many test scripts are run or how many times you explain it, something always goes wrong. Just like death and taxes, problems are inevitable, but you can avoid those phone calls.
Go back to that nice relaxing Friday, but this time start in the morning and pull up your daily monitoring report and see that a critical box has been unchecked by Joe. You follow up with Joe and find out it was done for a good reason, but Joe forgot to recheck the box when he was done.
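A daily monitoring report of this kind can be as simple as comparing this morning's settings against an approved baseline. The sketch below assumes the settings have already been collected into dictionaries by some other means; the setting names, the values and the `find_drift` and `format_report` helpers are all hypothetical and not taken from any particular tool.

```python
# Hypothetical baseline: the approved state of each critical setting.
BASELINE = {
    "x_feature_enabled": True,
    "audit_logging": True,
    "max_password_age_days": 90,
}


def find_drift(baseline, current):
    """Return {setting: (expected, actual)} for every setting that differs.

    A setting missing from the current snapshot is reported with an
    actual value of None.
    """
    drift = {}
    for name, expected in baseline.items():
        actual = current.get(name)
        if actual != expected:
            drift[name] = (expected, actual)
    return drift


def format_report(drift):
    """Render the drift as plain-text lines suitable for a daily e-mail."""
    if not drift:
        return "No critical settings have changed."
    lines = ["Critical settings that differ from the baseline:"]
    for name, (expected, actual) in sorted(drift.items()):
        lines.append(f"  {name}: expected {expected!r}, found {actual!r}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical snapshot taken this morning: the box has been unchecked.
    today = {
        "x_feature_enabled": False,
        "audit_logging": True,
        "max_password_age_days": 90,
    }
    print(format_report(find_drift(BASELINE, today)))
```

The report only says that something changed, not who changed it or why, which is why the follow-up conversation with Joe still matters.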
The difference in the two scenarios is timely information on critical configuration. So how do you get that to happen? Monitoring controls. The first step to having monitoring controls is to have an appropriate design. I use the following methodology when designing monitoring controls:
Identification of key processes
Everyone has some type of defined area of responsibility.
Having high-level bullet points of responsibility is a great tool. It will help you in so many ways, from managing your priorities to disaster recovery to monitoring for errors.
- Privileged user administration (local system administrator access)
- Server configuration (GPO, services, and so on)
- Server patches and upgrades
The next step in the design process is to take a deeper look at each of your key processes. The goal is to ultimately identify where things are likely to fail and cause you problems. This is done by taking a process and asking a few questions.
Let's analyze privileged user administration. Here are the questions I would consider when performing a failure analysis:
- Do I have a standard process? If I don't have a standard process for provisioning – and de-provisioning – administrator access, how can I ever feel comfortable that the right people have access?
- Is there a "workaround" you frequently use? None of us will publicly admit this, but we all have workarounds – ways to override the standard process – to get things done. I'm not saying to get rid of these, but you have to be aware that they exist.
- Where is this process likely to break? And if it breaks, what is the impact? Maybe you don't have a good way to reset all of the admin passwords. Have you considered that your local system administrators probably have full access to any MS SQL databases that reside within your server farm?
You are likely to come up with several more questions that will help lead you to your final list of all the areas in which your process is likely to break and result in a problem.
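One way to test the first question in practice is to reconcile the accounts that actually hold administrator access against the list that was approved. The following is a minimal sketch under the assumption that both lists have already been exported as plain account names; the `reconcile_admins` helper and the sample names are hypothetical.

```python
def reconcile_admins(approved, actual):
    """Compare approved administrators with the accounts that hold access.

    Account names are compared case-insensitively. Returns a dict with:
      "unapproved": accounts that have access but were never approved
      "missing":    approved accounts that do not currently have access
    """
    approved_set = {name.lower() for name in approved}
    actual_set = {name.lower() for name in actual}
    return {
        "unapproved": sorted(actual_set - approved_set),
        "missing": sorted(approved_set - actual_set),
    }


if __name__ == "__main__":
    # Hypothetical data: who should have access versus who really does.
    approved = ["alice", "bob"]
    actual = ["Alice", "joe"]
    result = reconcile_admins(approved, actual)
    print("Unapproved:", result["unapproved"])
    print("Missing:", result["missing"])
```

Accounts in the "unapproved" list are often the trace left behind by a workaround, so they are a useful starting point for the failure analysis rather than proof of wrongdoing.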
Identification of control points
Armed with your key processes and your points of failure, you can now go through the last exercise in control planning – identifying your control points. A control point is something that gives you that warm fuzzy feeling. Knowing that it is in place and working allows you to trust the system.
To get a clear picture of where your control points are, look over your failure analysis. You should have a quick response to each of the weaknesses. Maybe you have a weak provisioning process with lots of workarounds. If you have a program that logs administrative logins or sends an alert every time someone logs into the box with a local account, then you can still feel comfortable that the environment is secure.
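The alerting control described above could look something like the sketch below. It assumes login records have already been parsed into simple objects; the `LoginEvent` shape, its fields and the host and account names are hypothetical simplifications, not the format of any real event log.

```python
from dataclasses import dataclass


@dataclass
class LoginEvent:
    """One login record in a hypothetical, simplified shape."""
    host: str
    account: str
    is_local_account: bool


def local_account_logins(events):
    """Return only the events where a local (non-domain) account logged in."""
    return [e for e in events if e.is_local_account]


def alert_message(event):
    """Build the alert text for a single local-account login."""
    return f"ALERT: local account '{event.account}' logged in to {event.host}"


if __name__ == "__main__":
    # Hypothetical events: one domain login and one local login.
    events = [
        LoginEvent("web01", "CORP\\alice", False),
        LoginEvent("web01", "Administrator", True),
    ]
    for event in local_account_logins(events):
        print(alert_message(event))
```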
If you are like most other IT managers, you will have identified some gaps at the end of this process that you don't have controls for. There is no time like the present to design and implement controls to address the potential failures.
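Finding those gaps amounts to checking each point of failure for at least one control that covers it. Here is a small sketch of that bookkeeping; the failure points paraphrase the questions from the failure analysis, while the mapping to controls and the `find_control_gaps` helper are hypothetical.

```python
def find_control_gaps(failure_points, controls):
    """Return the failure points that have no control mapped to them.

    `controls` maps a failure point to a list of control names. A missing
    key or an empty list both count as a gap. Original order is preserved.
    """
    return [fp for fp in failure_points if not controls.get(fp)]


if __name__ == "__main__":
    failure_points = [
        "No standard provisioning process",
        "Workarounds bypass the standard process",
        "No good way to reset all admin passwords",
        "Local admins have full access to MS SQL databases",
    ]
    # Hypothetical mapping of failure points to existing controls.
    controls = {
        "No standard provisioning process": ["Alert on local account login"],
        "Workarounds bypass the standard process": ["Alert on local account login"],
        "No good way to reset all admin passwords": [],
    }
    for gap in find_control_gaps(failure_points, controls):
        print("No control for:", gap)
```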
With a clear picture of your process, failure and control environment, you can now look forward to the next step in the process – implementation of control monitoring. Here you can pull out your WMI scripts, Active Directory queries and other tools.
Russell Olsen is the CIO of a medical data mining company and previously worked for a Big Four accounting firm performing technology risk assessments. He co-authored the research paper "A comparison of Windows 2000 and Red Hat as network service providers." Russell is a CISA, GSNA and MCP.
This was first published in October 2007