Some thoughts on “Physics Envy”
My apologies. I know I’m already late in writing a follow-up to my post on aggregate risk, but client work has been fast and furious lately (a Good Thing). I’m still working on it though. That said, I just can’t resist the need/desire to comment on someone else’s misleading post on the subject of risk analysis.
This time it’s Richard Bejtlich, who appears to be on the warpath again. And, while Richard is extremely intelligent and well established as an expert in many security-related matters, I’d argue that he’s not an expert in risk. It’s clear that he reads on the topic, but he appears to interpret it through an infosec lens which, I believe, tends to be badly distorted by some unfortunate/inaccurate biases in the industry. That said, let’s examine some of his concerns and see what we can learn…
“False precision”
On this point, Richard and I agree — the notion of precision in risk analysis (whether infosec-related or some other form of risk) is absurd. The future is uncertain, and risk analysis is fundamentally a discussion of the future. A precise statement (i.e., prediction) of exactly when something will happen, or how often, or to what effect, just isn’t feasible in a complex problem space like risk. Where Richard and I appear to differ, however, is in our understanding of what risk analysis is and isn’t.
Risk analysis isn’t (or shouldn’t be) put forth as a prediction of the future, but rather as a statement of probabilities given what’s known or believed. It’s much like stating that there’s a 1/36 probability of rolling snake-eyes, given that each die in the pair has six sides and that the two dice are independent. Nobody in their right mind would believe they could predict on which roll the dice will come up snake-eyes, but it’s still very useful for a decision-maker to know what the probabilities are.
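To make the dice example concrete, here is a minimal Python sketch. It assumes two fair, independent six-sided dice, and the helper name p_at_least_once is mine, purely for illustration. It shows that the 1/36 figure falls out of simply counting outcomes, and that we can also state the chance of seeing snake-eyes at least once in a given number of rolls without ever predicting which roll it will be.

```python
from fractions import Fraction
from itertools import product

# Enumerate every equally likely outcome for two fair, independent six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Snake-eyes is the single outcome where both dice show a 1.
snake_eyes = [roll for roll in outcomes if roll == (1, 1)]
p_snake_eyes = Fraction(len(snake_eyes), len(outcomes))


def p_at_least_once(rolls):
    """Probability of at least one snake-eyes in the given number of rolls."""
    return 1 - (1 - p_snake_eyes) ** rolls


print(p_snake_eyes)                 # exact probability for a single roll
print(float(p_at_least_once(36)))   # chance of at least one in 36 rolls
```

Note that even after 36 rolls, snake-eyes is far from guaranteed. That is the difference between a statement of probability and a prediction.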
Analysis results also should reflect the degree of (un)certainty involved, so that people making decisions based on the analysis have realistic expectations. This is where the use of distributions and ranges becomes very useful in portraying uncertainty and imprecision. However, if I read Richard’s post correctly, he considers all risk analyses to be useless because they can’t predict the future (i.e., aren’t precise).
This “if it ain’t precise, it ain’t useful” position is one I run into frequently in the infosec community, presumably because the profession is made up of so many people with engineering backgrounds who are used to measuring things relatively precisely. Or, maybe, Richard’s just pointing out that many risk analyses are flawed because they don’t do a good job of conveying the degree of imprecision involved. He’s really not clear on that, and seems to paint the entire issue with a single broad brush stroke.
“Overweighting things that can be counted”
Here again, I tend to agree with Richard about the fundamental problem. There is an unfortunate tendency to look around us for things that can be easily counted, and then assume that they comprise the whole picture. This may not be as significant a problem if you’re dealing with something that has a large volume of relatively clean data to work from, but is a huge problem when good data is sparse. For example, if I want to construct models of human life expectancy in order to profit as a life insurance provider, I can probably find enough good data to do so. Unfortunately, in the infosec realm, good data is harder to come by. As a result, models derived from available hard infosec data are much less likely to be complete/accurate.
What I’ve described above, however, is an inductive approach to modeling (i.e., evaluating data to derive a model). The other approach to modeling is deductive. I’ll spare you a long-winded comparison and let you research these further if you’re interested but, simply stated, a deductive approach constructs a model based on logical relationships between premises that are believed to be true. For example, a model that states “Loss events are predicated on threat events and vulnerability to those threat events” is logical and “true”. We didn’t need data to construct that model; it just makes sense logically.
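As a rough illustration of how a deductive model like that one can be put to work with imprecise inputs, here is a small Python sketch. Everything specific in it is an assumption for the sake of the example: the multiplicative form (loss events = threat events × vulnerability), the uniform distributions, the function names, and the input ranges are all hypothetical, not measured data or a definitive implementation.

```python
import random


def simulate_loss_event_frequency(tef_range, vuln_range, trials=10_000, seed=1):
    """Return sorted simulated loss-event frequencies (events per year).

    tef_range:  (low, high) estimate of threat events per year
    vuln_range: (low, high) estimate of the probability that a threat
                event becomes a loss event
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = []
    for _ in range(trials):
        threat_events = rng.uniform(*tef_range)
        vulnerability = rng.uniform(*vuln_range)
        # Assumed relationship: losses require both a threat event and
        # vulnerability to it, modeled here as a simple product.
        results.append(threat_events * vulnerability)
    results.sort()
    return results


def percentile(sorted_values, pct):
    """Nearest-rank style percentile lookup on an already sorted list."""
    index = min(len(sorted_values) - 1, int(pct / 100 * len(sorted_values)))
    return sorted_values[index]


# Hypothetical input ranges, chosen only to show the mechanics.
lef = simulate_loss_event_frequency(tef_range=(5, 20), vuln_range=(0.05, 0.25))
low, high = percentile(lef, 5), percentile(lef, 95)
print(f"90% of simulated years fall between {low:.2f} and {high:.2f} loss events")
```

The output is a range rather than a single number, which is the point: ranges go in, a range comes out, and the decision-maker sees the imprecision instead of having it hidden.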
Does that mean that deductive models are always accurate? Heck no. They’re subject to potential problems too, but if they’re well thought-out they’re less likely to have the gaps an inductive model built on sparse data is likely to have. Deductive models also act as a guide to knowing what data we need in order to perform analyses.
Keep in mind, though, that “All models are wrong (i.e., imprecise), some are useful”. As I’ve said before, the world is far too complex to model exactly. Nonetheless, we should be looking for accuracy and a useful degree of precision in our models, which is entirely feasible. Also, all models should be flexible enough to be adjusted as data and experience improve.
“Man with a spreadsheet syndrome”
Richard’s basic concern is valid — spreadsheets often connote a sense of validity that isn’t always warranted. It isn’t logical, however, to conclude that because some spreadsheets (or other quantitative tools) are flawed, all must be. Maybe that isn’t what he meant, but Richard tends to use a broad brush, so it’s hard to tell sometimes what he’s really saying.
At the end of the day, the validity of a spreadsheet boils down to model accuracy and the quality of data. Since we’ve already covered models, I guess it’s time to cover data…
“A lot of guessing”
As Alex states, Richard (and others) toss the term “guessing” around like it’s an insult. If what Richard means by “guessing” is “estimates made in the absence of perfectly complete and precise data”, then welcome, Richard, to reality. All measurements in the real world are guesses to some degree. I assume, however, that what Richard is concerned about is whether the estimates (guesses) are accurate. The answer to that, of course, is “it depends”.
If someone asks me what the wingspan of a 747 airliner is, and I answer “Ummmm, I dunno. A hundred feet?”, then maybe we have a problem. For one thing, I’ve given a relatively precise answer (100 ft), but that answer may not be accurate. If, however, I answer “Well, the wingspan is almost certain to be less than the length of a football field (300 ft) but greater than the length of my driveway (80 ft)” then I’ve made an estimate that isn’t precise but is much more likely to be accurate. With a little work, I can probably narrow the range significantly (i.e., get better precision) and still be accurate (especially if I have access to a subject matter expert). The question of whether it’s precise enough (i.e., useful) is a matter of what I need to use the information for.
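The accuracy-versus-precision distinction can be sketched in a few lines of Python. The function names are mine, and the "actual" wingspan of roughly 200 ft is an approximation I’m assuming for illustration (the real figure varies by 747 variant), so treat this as a toy example rather than a reference.

```python
def is_accurate(low, high, actual):
    """An estimate is accurate if the true value falls within the stated range."""
    return low <= actual <= high


def range_width(low, high):
    """A narrower range is a more precise estimate."""
    return high - low


# Assumption for illustration only: approximate 747 wingspan in feet.
ACTUAL_WINGSPAN_FT = 200

point_guess = (100, 100)   # "A hundred feet?" is precise but not accurate
wide_range = (80, 300)     # driveway to football field: imprecise but accurate
narrowed = (150, 250)      # hypothetical range after a little more work

for label, (low, high) in [("point guess", point_guess),
                           ("wide range", wide_range),
                           ("narrowed range", narrowed)]:
    print(label,
          "| width:", range_width(low, high), "ft",
          "| accurate:", is_accurate(low, high, ACTUAL_WINGSPAN_FT))
```

The goal is the third case: tighten the range as far as the available information honestly allows, without giving up accuracy.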
Business decisions of almost any sort are based on imperfect and imprecise estimates of what might happen. That’s reality. As long as decision-makers are aware of and okay with the imprecise nature of the information they’re operating from, then it’s not a problem.
What people seem to forget is that whether we perform formal analysis on a problem or not, a decision is still going to be made. The question then becomes, is the decision-maker going to be using conclusions drawn from:
- Someone’s unstructured, undocumented, mental model and the “guesses” they apply to it, or
- A structured model that has been documented, examined, and evolved through use, and “guesses” that have been given due consideration
Either way, some model will be applied and guesses/estimates will be used. The point of analysis is to give decision-makers better information than they would have had in the absence of analysis.
Bottom line — it seems like Richard has accurately recognized the existence of some of the fundamental challenges in risk analysis, but it feels like he’s drawn some extreme conclusions about their significance and the ability to effectively deal with them. It could be, of course, that the problem is simply the manner in which he described his conclusions. Perhaps he’ll respond and clear it all up.





