Lipstick - Part II
In a conversation with Jared, he shared that the RC roadmap includes plans to improve its built-in model. And clearly, new product development always requires tough choices, trade-offs, and continual improvement, so I understand and empathize. He also reiterated that RC’s first focus was on portfolio management — i.e., helping security practitioners prioritize and communicate, which is a critical need in our profession. My reply was that if prioritization is going to be based on risk, then the method/model used to evaluate risk is foundational to the product’s value. He agreed, but seems to have a much higher level of confidence than I do in how our profession approaches risk.
At Jared’s suggestion, I logged into the RC demo and dug into the information in their help file that serves as guidance for its users. However, rather than post specific observations about RC, I thought it would be more helpful if I simply provided a brief “Thinking Person’s Guide to Risk Assessment Tool Selection”. Okay, maybe “Things to watch out for” is a better description. Regardless, I hope you find it useful.
“First, do no harm” (Auguste Francois Chomel)
The phrase above was borrowed from Douglas Hubbard’s book “The Failure of Risk Management”. (Buy it. Read it.) One chapter in the book is entitled “Worse Than Useless”, and in it he describes “structured” scoring methods that can, in fact, lead to worse decisions than if no scoring method were used at all. To limit the length of this post, I’ll refer you to Hubbard’s book rather than repeat it here. Suffice it to say that, among other things, he describes the same concerns I’ve already posted about ordinal scales and scoring.
There’s likelihood and then there’s “likelihood”
Many information security risk assessment tools view “Likelihood” as a measure of how likely it is that an attack will be successful. This is VERY different from a measure of how likely it is that an attack will occur and be successful. Without including the likelihood of occurrence, we could rate the “Likelihood” of my being attacked by a polar bear on the streets of Dayton, Ohio as “high” because I have no effective defenses against such an event. Bottom line: understanding likelihood of success is not very useful if I don’t also understand the likelihood of occurrence.
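To make the distinction concrete, here is a minimal sketch that treats the frequency of successful events as attempt frequency multiplied by probability of success. The function name and every number in it are hypothetical, picked only to illustrate the point; they are not measurements and they are not taken from any particular tool.

```python
def loss_event_frequency(attempts_per_year: float, p_success: float) -> float:
    """Expected successful events per year: how often it is tried times how often it works."""
    if attempts_per_year < 0:
        raise ValueError("attempts_per_year must be non-negative")
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    return attempts_per_year * p_success


# Hypothetical numbers for illustration only.
# Polar bear in Dayton: almost certain to succeed, almost never attempted.
polar_bear = loss_event_frequency(attempts_per_year=0.000001, p_success=0.95)
# Phishing: rarely succeeds per attempt, but attempted constantly.
phishing = loss_event_frequency(attempts_per_year=200, p_success=0.02)

print(f"polar bear: {polar_bear:.7f} events/year")
print(f"phishing:   {phishing:.1f} events/year")
```

A tool that rated only probability of success would put the polar bear far ahead of phishing. Once occurrence is included, the ordering reverses.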
Of course, the first argument that someone’s likely to raise is, “But we don’t know how often some of these events occur!” I’ll talk more about this in a future post, but the short answer is:
- Baloney. I sit down regularly with clients who need to evaluate the risk associated with “rare” events or events where no direct evidence exists to draw from, and we’re able to arrive at frequency ranges that make sense and can be defended. The key here is the term “ranges”. We may not have the information we need to state exactly how frequently events might occur, but we absolutely have the means to generate frequency as a range. Again — read Douglas Hubbard’s work.
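As a rough illustration of what working with a frequency range can look like, the sketch below turns a calibrated 90% confidence interval into a sampled distribution. It assumes a lognormal shape, which is one common choice for quantities that cannot go below zero; that choice, the function names, and the example interval of 0.05 to 0.5 events per year are all assumptions made for the example, not a prescribed method.

```python
import math
import random

Z_90 = 1.645  # z-score that bounds the central 90% of a normal distribution


def sample_frequencies(low, high, trials=10_000, seed=42):
    """Sample annual frequencies from a lognormal whose 5th/95th percentiles are low/high."""
    if not 0 < low < high:
        raise ValueError("need 0 < low < high")
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z_90)
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.lognormvariate(mu, sigma) for _ in range(trials)]


def percentile(sorted_values, fraction):
    """Simple nearest-rank percentile on an already sorted list."""
    index = min(int(fraction * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


# Hypothetical estimate: "between once in 20 years and once in 2 years".
samples = sorted(sample_frequencies(low=0.05, high=0.5))
summary = {
    "p05": percentile(samples, 0.05),
    "median": percentile(samples, 0.50),
    "p95": percentile(samples, 0.95),
}
print(summary)
```

The output is a range with a most-likely region rather than a single number, which is the form of estimate that can be defended when hard data is thin.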
Ambiguity and overlap
Besides the problems inherent in ordinal scales and scoring, another very significant problem is the lack of clarity and specificity in the elements being measured. Unfortunately, many of the models I see in use are very poorly defined, with lots of ambiguity and overlap/redundancy between variables. The result is that things are accounted for and measured multiple times. Combine this with the ordinal scale problems, and the results are not defensible under any sort of scrutiny.
CMM limitations
Some models use a CMM scale to rate the effectiveness of controls. Although CMM is useful for rating process maturity, it is neither intended for nor effective at rating technical controls.
Compensate not, lest ye go awry
Many models have only one place to rate controls, and those control ratings tend to be applied solely to the Likelihood component of risk. (This was a problem in the first version of FAIR.) Typically, what happens then is that users throw compensating controls in that bucket too, even though some compensating controls (e.g., recovery capabilities) affect Impact rather than Likelihood. As a result, the effect of these controls is accounted for in the wrong part of the equation.
Chicken Little
The Impact ratings in most assessment models focus on what “can” result — i.e., what’s “possible”. And, being the paranoid lot that most of us are, we turn this into an estimate of what a worst-case outcome might look like. I don’t know what your experience has been, but out of all of the incidents I’ve been witness to and victim of over the years, not one has approached a worst-case result despite the fact that some of them had significant potential for really nasty outcomes. In fact, as I’ve discussed this with colleagues in the past it’s become clear that worst-case outcomes are extremely unusual. By characterizing risk events purely in terms of worst-case outcomes we provide an exaggerated view of risk, which management recognizes intuitively and writes off as “Chicken Little syndrome”.
The simple fact is that outcomes from incidents can range from inconsequential to catastrophic. And although we can’t predict precisely which will occur from any future event, there are factors that we can use to help us understand and communicate the range of possible outcomes from worst-case to best-case and even what’s most likely. If we want to communicate useful and believable risk information to management, we need to be able to deal with loss magnitudes other than just the worst-case outcome.
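One simple way to express loss magnitude as a range is a three-point estimate: best case, most likely, and worst case. The sketch below samples a triangular distribution built from those three points. The triangular shape, the function name, and the dollar figures are assumptions chosen for illustration; other distributions may fit real loss data better.

```python
import random


def simulate_losses(best_case, most_likely, worst_case, trials=10_000, seed=7):
    """Draw loss magnitudes from a triangular distribution over a three-point estimate."""
    if not best_case <= most_likely <= worst_case:
        raise ValueError("need best_case <= most_likely <= worst_case")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # random.triangular takes (low, high, mode), in that order.
    return [rng.triangular(best_case, worst_case, most_likely) for _ in range(trials)]


# Hypothetical three-point estimate, in dollars.
BEST, LIKELY, WORST = 1_000, 20_000, 1_000_000

losses = simulate_losses(BEST, LIKELY, WORST)
mean_loss = sum(losses) / len(losses)
# Share of simulated outcomes that land within 10% of the worst case.
near_worst = sum(1 for loss in losses if loss >= 0.9 * WORST) / len(losses)

print(f"mean simulated loss: {mean_loss:,.0f}")
print(f"share of outcomes near worst case: {near_worst:.1%}")
```

Even with a shape as crude as this, only a small share of simulated outcomes comes close to the worst case, which is the point management tends to grasp intuitively.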
To summarize…
There are other issues I could raise, but here’s the short list:
- Be very skeptical of methods that use addition, subtraction, multiplication, or division with ordinal scales. If you do choose to use them, recognize that at the end of the day you’re not going to be able to defend the results as truly quantitative, and you may have a very difficult time defending their legitimacy.
- Make certain that Likelihood includes a frequency component or, better yet, that Frequency is used instead of Likelihood. Regardless, without some reference to the frequency/probability of occurrence the information’s usefulness is significantly reduced.
- Elements being measured, particularly if math is involved, must be as clearly defined as possible so that redundancies and overlaps can be avoided. This also helps to prevent having the wrong element in the wrong part of the equation.
- “Quality” scales like CMM should only be used to evaluate the things they’re intended for.
- If the tool only allows the user to describe one level of Impact (e.g., “High”), there’s a significant likelihood that users will choose a worst-case outcome. This almost invariably inflates the risk rating well beyond the actual level of risk, which increases the probability that management won’t take the results seriously.
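The first point in the list, about doing arithmetic on ordinal scales, can be shown with a small sketch. Ordinal labels carry order only, so any relabeling that preserves order is equally valid. If a ranking built by multiplying the labels changes under such a relabeling, the ranking reflects the labels rather than the risks. The two risks and both scales below are hypothetical.

```python
def rank_by_product(risks, scale):
    """Rank risks by (likelihood label x impact label), highest product first."""
    scores = {
        name: scale[likelihood] * scale[impact]
        for name, (likelihood, impact) in risks.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Two hypothetical risks rated on a 1-5 ordinal scale as (likelihood, impact).
risks = {"A": (1, 5), "B": (2, 3)}

# Two labelings of the same five ordered categories. Both preserve the order.
linear = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
doubling = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16}

rank_linear = rank_by_product(risks, linear)      # A scores 5, B scores 6
rank_doubling = rank_by_product(risks, doubling)  # A scores 16, B scores 8

print("linear labels:  ", rank_linear)
print("doubling labels:", rank_doubling)
```

Nothing about the risks changed between the two runs, yet the priority order flipped. That is why results produced this way are so hard to defend under scrutiny.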
Bottom line: if we want our risk analyses to be taken seriously, it’s critical that we challenge the assumptions and models (including FAIR) underlying our tools. Unfortunately, much of what I encounter in our industry’s risk assessment tool kit exhibits faux sophistication and poor definition. Is it any wonder, then, that many within our profession struggle to accept risk analysis as a viable approach?
So how long DOES it take the Sun to orbit the Earth?
I do need to reiterate my concern about a “model-less” analytic tool. As Jared clearly states, RC is not constrained to any one model for measuring risk. It’s intended to be an efficiency tool that allows the use of any risk assessment model. From a marketing perspective that may be pure genius, I don’t know. I suppose it could translate into a larger potential market because they wouldn’t be locking out those clients who are strict adherents of one model or another. And certainly, if a user is leveraging a reasonably accurate model, then the tool’s effect should be very positive. Unfortunately, as I described in the first part of this post, much of what our profession uses to model risk is junk. In that case, a model-neutral tool is a bit like saying to an astronomer, “Hey, if you want to analyze the solar system by modeling the planets and Sun orbiting the Earth — go for it. And while you’re at it, if you’d rather measure gravitational pull in bushels rather than units of acceleration, that’s cool too. We’ll still allow you — in fact we’ll help you — to present the results as valid astronomy.”
The fact is, models matter. A lot. In my next post I’m going to talk about the role models play and I’ll also draw a distinction between the different types of models I see our profession using.


