Risk Rating Litmus Test

One of the significant challenges the risk profession faces is prioritization.  What I see a lot of in the industry are tools and methods that spit out dozens or even hundreds of “High Risk” or even “Critical” findings from a single evaluation.  As a result, typically one of the following happens:

  • Paranoid organizations cripple their operations and/or burn out their people by trying to aggressively remediate those findings, or
  • Non-paranoid organizations schedule remediation efforts for months or even years out.

In the first case, it’s common to see “committed” closure dates being missed and/or repeatedly pushed out.  This drives auditors nuts (as it should), and sets the organization up for a big fall if a significant loss event occurs.  Unfortunately, in both cases, there may be a handful of issues within the findings that truly are high risk or critical in nature, but because the organization hasn’t differentiated those, they get pushed out with the rest.

Setting aside for a moment the debate over quantitative vs. qualitative assessment, I have a simple “litmus test” I apply to audit or security findings that helps me perform crude prioritization.  This test is based on a recognition that remediation efforts can/should be characterized in very practical terms and applied consistently.  Consider the following descriptions.  For:

  • Critical Risk findings:  All hands on deck.  Efforts extend into evenings and weekends.  High value business objectives may be postponed, extra resources brought in, and “costs be-damned”.
  • High Risk findings:  Remediation efforts begin immediately, bumping existing priorities and stealing existing resources.
  • Medium Risk findings:  Remediation efforts scheduled and prioritized amongst other future work to be done.
  • Low Risk findings:  Either no remediation or “opportunistic” remediation as a part of other activities.

As a risk professional, if I’m going to label a risk issue “Critical” or “High Risk” and cause the organization to react accordingly, I’d better have a REAL good reason — a reason based on loss exposure (the combination of loss likelihood and impact) vs. “exploitability” or “vulnerability”.  Significant loss is either occurring right now, or it’s imminent.  And forget about formal analysis for a moment — if my intuition is telling me that remediation for issue X doesn’t need to be started immediately, then I’m implicitly characterizing it as no more than Medium.
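For those who like to see things written out, here’s a rough sketch of the litmus test expressed as a simple decision rule.  The function and its inputs are purely illustrative framing on my part, not part of any formal method:

    # A rough, illustrative sketch of the litmus test as a decision rule.
    # The questions below are the practical ones described above.

    def litmus_rating(loss_occurring_or_imminent: bool,
                      must_start_immediately: bool,
                      worth_scheduling: bool) -> str:
        """Map practical response questions to a crude risk rating."""
        if loss_occurring_or_imminent:
            return "Critical"   # all hands on deck, costs be-damned
        if must_start_immediately:
            return "High"       # bump existing priorities and steal resources
        if worth_scheduling:
            return "Medium"     # schedule among other future work
        return "Low"            # opportunistic remediation, if any

    # A finding that intuition says can wait is, at most, Medium:
    print(litmus_rating(False, False, True))   # -> Medium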

Some time ago I had a conversation with a friend who was faced with hundreds of “critical” and “high risk” findings from a single security tool.  We spent about 30 minutes categorizing the findings by common traits (e.g., exploitability, frequency of attack, and impact), and then another 30 minutes evaluating which type of response seemed most appropriate.  At the end of the conversation there were zero Critical and just a couple of High Risk findings.  Consider what this means to an organization from a resource utilization and remediation focus perspective.  Also consider what it means in terms of the improved accuracy with which the organization’s risk posture is communicated to management and stakeholders.  Finally, consider what it means regarding the accuracy of industry tools and common methods…

Keep in mind that even though this approach may not require detailed quantitative analysis, it does still require an ability to think numerically in terms of frequency and impact, as well as the critical thinking skills to recognize the difference between what’s possible and what’s probable.

This post reflects my own opinions and positions, and does not necessarily reflect the opinions or position of my employer.

A change in venue

I’m excited to announce that I’ve just accepted a position at Huntington Bank in Columbus, Ohio as Senior Vice President and IT Risk Officer, starting May 1st.  This will be an outstanding opportunity to personally put FAIR through its paces in a way that just isn’t possible as a consultant.  Those of you who have been in the trenches understand that there’s nothing like the ownership and accountability that comes with a position like this to instill practical, effective solutions.

For those of you who aren’t familiar with FAIR’s history, its development began while I was the CISO of Nationwide Insurance.  The genesis was questions from management that I just couldn’t find answers to in what our profession had to offer.  During the early days of its development there were a lot of rabbit trails and dead ends (many of which were laughable looking back at them now), but the environment there was ideal — interested and supportive management, and no shortage of problems to evaluate and deal with.  The best part though, was the fact that I HAD to keep it practical and effective (contrary to the assumptions some people make about a quantitative analysis method).

Since I left Nationwide, FAIR has seen significant improvement and expansion, only a small portion of which has been discussed in public.  Some of these changes have been well-vetted by clients, others less so simply because they’re so new.  In this new role I’ll have the opportunity both to “eat my own dog food” and to face challenges that will undoubtedly result in further improvements to FAIR.

All of this, of course, raises the question of what will happen to RMI and RMI’s clients.  The plan is simple:

  • Existing engagements will be completed as defined.
  • FAIR consulting and onsite training will be available through qualified and licensed channel partners.  The RMI website will add a page that identifies those partners in the coming weeks.
  • Online training will continue to be offered through CXOWARE, which is headed up by Steven Tabacek.
  • A SaaS version of FAIR software is under development and should be available by this Fall.  More information to follow soon.

Bottom line — RMI will continue to be the authoritative source of information and developments regarding FAIR and will leverage strategic partnerships to deliver FAIR products and services.  Of course, if you have any questions about this transition that haven’t been answered here, let me know.

To Be FAIR About It

I came up with something useful to post about the other day, only to wonder whether I’d already posted about it some time ago.  (It turns out I had, mostly.)  But in the search through past posts, three things became clear:

  • I really haven’t had that many posts.  Alex was prolific, and Jack Freund and Ryan have added some excellent ones too
  • I needed a simple reference so that I could avoid repeating myself on topics
  • There doesn’t appear to be a simple way for someone to find my past posts

So, I decided to put all of my past posts (at least the ones I thought were decent) into a single reference.  I’ve posted that reference to the resources page of the RMI website in case someone (my family, maybe) wanted to browse my past posts without digging through an entire blog site.  In case you’re wondering, it contains only twenty-five posts.  Twenty-five out of a total of 388 posts to date on this site.

Don’t, however, think for a moment that these represent the best posts on this site.  They aren’t — not by a long shot.  They’re just the ones I’ve contributed.

CVSS Review

I recently had the privilege of being a guest on the Securabits podcast and, during the session, was asked about other frameworks.  I mentioned CVSS (the Common Vulnerability Scoring System) in my answer and said I thought it had some serious problems as an analysis and measurement tool (though I also said there were good things about it).  Given time constraints, I didn’t go into detail in the podcast about what I thought was good or less-good about CVSS.  That’s what this post is about — to clarify and share my thoughts regarding CVSS (version 2.0).

In the interest of keeping this post to a manageable length I’ll constrain my observations to what I believe are the most important strengths and weaknesses of CVSS.

First, I have to acknowledge that what NIST and CMU have tried to accomplish with CVSS is both admirable and difficult.  I can only imagine the debates that must have taken place during its development regarding tradeoffs that needed to be made in order to come up with a practical result.  I also believe there’s value in CVSS, even as it is today.  That said, like any other model or framework there’s always room for improvement.  More importantly, like any other tool, its limitations should be well understood so that decisions based on it are made with both eyes open.

What CVSS aims to be

The CVSS guide mentions three key benefits the framework is intended to provide:

  • Standardized vulnerability scoring — essentially, a common means of measuring “vulnerabilities”.  I think the framework accomplishes this objective for technical vulnerabilities because it does, in fact, provide a standard against which technical vulnerabilities can be scored.  Enough said.
  • An open framework — i.e., a framework where scoring includes rationale so that the results don’t have to be accepted on blind faith.  As described further on, I think the framework hits this target in some respects, and misses completely in others.
  • Risk prioritization — i.e., a means of understanding the significance of vulnerabilities so that they can be compared and, thus, prioritized.  Here again, in some limited respect CVSS accomplishes this objective.  Overall though, as a CISO or other decision-maker I would not get the information I need from CVSS to make well-informed risk decisions.

An open framework

Great idea — a framework where justification is provided for the scores/measurements being used.  And for the variables a user makes choices about within CVSS (e.g., Exploitability) there is some basic descriptive rationale in the selection matrix.  Unfortunately, CVSS equations are also chock-full of weighted values, none of which appears to have a clearly documented basis.

For example, the Base Equation multiplies Impact by 0.6 and Exploitability by 0.4.  In other words, someone decided that Impact should always carry 60% of the weight and Exploitability 40%.  What’s the rationale for that?  In fact, by my count there are five weighted constants in the base equation alone.  Six more weighted values (eleven total) if you include the fact that each Base metric is given a value that appears to be arbitrarily assigned (e.g., for Confidentiality Impact the score will be 0.0, 0.275, or 0.660 depending on whether the vulnerability is assigned “None”, “Partial”, or “Complete” for that metric).  The other CVSS equations use weighted values in a similar fashion.  Perhaps there is well-documented and thought-through rationale for each of these, but I haven’t found it.
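To make those constants easy to see, here is the v2 base equation written out as code, with the values transcribed from the v2 guide as best I can tell.  The point isn’t the code itself; it’s how many unexplained weights are baked in:

    # The CVSS v2 base equation, written out so the weighted constants are
    # visible.  Each argument is the numeric value the guide assigns to the
    # chosen metric level (e.g., Confidentiality Impact: None = 0.0,
    # Partial = 0.275, Complete = 0.660).

    def cvss_v2_base(conf, integ, avail, access_vector, access_complexity, auth):
        impact = 10.41 * (1 - (1 - conf) * (1 - integ) * (1 - avail))
        exploitability = 20 * access_vector * access_complexity * auth
        f_impact = 0 if impact == 0 else 1.176
        return round(((0.6 * impact) + (0.4 * exploitability) - 1.5) * f_impact, 1)

    # Example: Complete C/I/A impact, Network access vector, Low complexity,
    # no authentication (the guide's values for those levels are 1.0, 0.71,
    # and 0.704):
    print(cvss_v2_base(0.660, 0.660, 0.660, 1.0, 0.71, 0.704))   # -> 10.0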

In my experience weighted values are rarely well-justified.  Furthermore, they tend to be very sensitive to specific conditions/assumptions.  For example, someone might argue that strong authentication is a more important control than logging.  After all, “an ounce of prevention…”   Consequently, it might be tempting to “weight” authentication’s value higher than logging.  Unfortunately, the logic breaks down if the scenario is focused on privileged insiders as the threat community — i.e., people who are supposed to have access.  In that scenario strong authentication isn’t a relevant control at all and logging is much more important.

Unless there’s good rationale for weighted values, they introduce ambiguity, limit the scope of where the analysis can be applied, and can in some cases completely invalidate results.  At the very least, if weighted values are going to be used, some well-reasoned rationale should be provided so that users can make an informed choice about whether they agree with the weighted values.
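To illustrate the authentication-vs-logging example above with made-up numbers of my own (nothing here comes from CVSS), suppose we scored a control environment using fixed weights that assume prevention always matters more than detection:

    # A toy example of how fixed weights bake in assumptions.  The weights
    # and scores are made up purely for illustration.

    def control_score(auth_strength, logging_quality, w_auth=0.7, w_logging=0.3):
        # Fixed weights assume strong authentication always matters more
        # than logging.
        return round(w_auth * auth_strength + w_logging * logging_quality, 2)

    score = control_score(auth_strength=0.9, logging_quality=0.2)
    print(score)   # 0.69 -- the same answer whether the threat is an external
                   # attacker (where auth really matters) or a privileged
                   # insider (where it doesn't), because the weights can't
                   # adapt to the scenario's assumptions.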

Effective risk prioritization

As a decision-maker, two of the fundamental inputs to any decision are “What’s the likelihood/frequency of bad things happening?” and “How bad are they likely to be if they do happen?”.  These are the two values that, taken together, provide me with the loss exposure information I need in order to prioritize effectively.  So, in order for CVSS to be an effective aid in risk-informed prioritization it has to provide useful information on both of those parameters.

CVSS tries to hit both targets, but falls short.  With regard to frequency/probability of loss, CVSS focuses on the likelihood of attacker success from a couple of different angles, but never addresses the frequency/likelihood of an attack occurring in the first place.  Without that metric, the likelihood of attacker success simply does not provide enough information for me to understand the frequency/likelihood of loss.  CVSS may be trying to address the likelihood of attack through its Access Vector metric which, it could be argued, implies that the farther away an attacker is from the target, the less likely an attack might be.  No argument with the logic (if that is in fact what the metric is supposed to represent), but there are a lot of assumptions built into that, including an assumption that the attacker isn’t an insider.

From a loss magnitude perspective, the Base Metrics include Confidentiality, Integrity, and Availability references, but these actually measure the degree to which the asset could be impaired rather than the magnitude of loss the organization would experience.  In a later post I might describe a way in which these CVSS metrics could be used to interesting effect, but that would make this post WAY too long.

CVSS’s Environmental Metrics try to include additional loss magnitude considerations.  Besides being very qualitative, the approach appears to have some significant logic flaws.  For example, the Target Distribution metric is essentially a measure of “surface area” (i.e., how many systems could be affected).  One problem with this is that there are many scenarios where a single critical or highly sensitive system/asset is exposed (i.e., a small Target Distribution) but gross exposure exists.  The way the CVSS math works, this exposure would go unaccounted for.  Something else to keep in mind is that Target Distribution is also a key consideration in loss event frequency (it may be even more important there in many respects), which isn’t accounted for at all in CVSS.
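A deliberately simplified caricature of the problem (this is my own toy arithmetic, not the actual environmental equation): scale any score by the fraction of systems affected and a single, fully exposed crown-jewel system all but disappears.

    # Toy illustration only -- not the CVSS environmental equation.
    def scaled_score(base_score: float, target_distribution: float) -> float:
        # target_distribution: rough fraction of the environment affected
        return round(base_score * target_distribution, 2)

    print(scaled_score(9.0, 1.00))   # 9.0  -- widespread exposure
    print(scaled_score(9.0, 0.01))   # 0.09 -- one critical system: nearly erased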

Setting aside the points above, prioritization of CVSS ratings against anything outside of CVSS isn’t practical because CVSS uses an ordinal scale.  You can’t usefully compare something that was measured on a 1-to-10 ordinal scale against something that was measured in monetary values or, for that matter, in a different 1-to-10 scale.

Math

I’ve blogged before about the problems associated with using math on ordinal scales, so I won’t belabor the point here.  Suffice it to say that it just doesn’t stand up to scrutiny.  That said, if the user recognizes that the results are pretty much meaningless for anything but comparing one CVSS value against another, then I guess no harm, no foul.
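A quick illustration of the point, using made-up numbers of my own:

    # Ordinal scores encode rank, not distance, so arithmetic on them has no
    # defined meaning.  The dollar figures below are hypothetical.

    ordinal_scores = [2, 8]                    # two findings on a 1-to-10 scale
    annualized_loss_usd = [5_000, 2_000_000]   # two exposures in dollars

    print(sum(ordinal_scores) / 2)        # 5.0 -- five of what, exactly?
    print(sum(annualized_loss_usd) / 2)   # 1002500.0 -- a meaningful average

    # An "8" ranks above a "2", but nothing says it's four times worse, so
    # sums, averages, and cross-scale comparisons of the scores don't hold up.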

Bottom line

For all I know, the people who put CVSS together already thought through all of this (and the other problems within CVSS that I haven’t talked about here) and decided that what they came up with was the only practical result given the constraints they faced and their objectives.  Nothing wrong with that.  Trade-offs are inevitable.  It is important though, for users of the tool to have a realistic and accurate understanding of its capabilities and limitations.

CVSS seems like a decent way to measure and compare technical deficiencies (“vulnerabilities”) against one another from a “(Very roughly) how much weakness does each vulnerability introduce relative to all of the other vulnerabilities measured using CVSS?” perspective, which can be useful information.  What it doesn’t provide is meaningful information about how these vulnerabilities stack up in the bigger picture — i.e., “How important are these vulnerabilities relative to the other concerns I have to consider spending resources against?”  In other words — “How much do/should I care about the findings?”  In order to be useful in answering these questions, CVSS would have to evolve considerably.

Speaking of evolution… RMI has on the drawing board a potential alternative to CVSS that we believe will be both practical and more effective in characterizing the risk associated with vulnerabilities.  Stay tuned!

It’s still a choice

This post is prompted by an “enthusiastic debate” about regulatory compliance I had recently with another gentleman in our profession.

I’d love to take a poll of infosec professionals to find out how many of them adhere strictly to speed and other traffic laws when they drive.  Why?  Because many of these are the same people who state with conviction that, when a law/regulation exists regarding information protection, an organization MUST comply.  While we might wish that were true, the fact is that compliance is ALWAYS a choice.  It’s just another risk decision; usually a trade-off of some sort.  Does the organization prefer to accept the risk associated with potentially being caught and facing legal and other losses, or would it prefer to accept the costs and business impact associated with complying?

The other consideration in play is the fact that many laws are open to interpretation.  I’ve been in plenty of meetings where the ambiguity in law is leveraged in decision-making.  Not in a malicious, bwah-ha-ha sort of way, but in a legitimate “How do we best manage the cost and risk associated with running a business?” sort of way.  And for those who’d argue that’s a terrible thing, I’d bet a close look at some of your own decisions will find a little “harmless interpretation” of the law from time to time.

Of course, some people might argue that you can’t compare speeding, tailgating, and rolling through stop signs with the damage that can occur from a breach of credit cards or other PII.  I beg to differ.  I believe the risk associated with automobile accidents resulting from even relatively simple carelessness or thoughtlessness is significant.

The point is, when we adopt the premise that laws/regulations somehow eliminate choice and decision-making, we’re being naive, and this naiveté comes across pretty glaringly to many of the business professionals we serve and support.  It’s just another example to them of the infosec geek lacking perspective and viewing our very grey world in black-and-white terms.

The Certified FAIR Practitioner Forum is now online

Finally, after many suggestions to do so, we’ve developed an online community for certified FAIR practitioners.  This is a place for people to ask questions, share challenges and successes, and recommend improvements.  It is also a source of additional documentation, example analyses, and training that aren’t available to the general public.  If you’ve completed FAIR training and passed the certification exam, this will be an excellent resource for taking you to the next level.

You can go here to register.

If you’ve been using FAIR based on the white paper I published a few years ago, you should seriously consider taking our online training to up your game, learn how FAIR has evolved, gain access to the most recent version of the FAIRLite tool, and take part in the online community.

With that and the holidays in mind, we’re offering a special for our online training.  For anyone who signs up for the December 27th session, the fees are reduced by 20% (to $799).

More than just numbers

Many people believe that FAIR focuses strictly on quantitative risk statements, but that couldn’t be further from the truth.  The numbers simply allow us to recognize conditions and convey information better than we could in any other way.  Sometimes, however, numbers don’t tell the whole story.

In this post I’ll describe two conditions defined within the FAIR framework that help us to ensure management understands the nature of some risk scenarios that would be very difficult to describe quantitatively or qualitatively.

Fragile conditions

Suppose we have a scenario where the threat landscape is very active but, due to a single extremely effective control, we actually have a very low probability of loss.  If we were to plot this condition as a point on an X-Y chart, it might look something like this:

Now, if all we provided management was this point on a chart, there’s a decent chance they’d be fine with it.  After all, the frequency is low and the magnitude isn’t outlandish.  What isn’t conveyed in the chart however, is the fact that if the single control fails, the point moves rapidly to the right — i.e., there is no “grace period” or window of time in which we might avoid compromise.  The threat event frequency is just too high.

In order for a decision-maker to make a well-informed decision about how to manage the risk scenario, they need to understand both the amount of current risk as well as the implications associated with the condition’s fragile nature.  With this information they may decide to introduce another layer of protection (defense-in-depth) and/or apply measures that make the control more robust and less likely to fail.  Or, of course, they may decide to do nothing, but at least it would be an informed choice.

Unstable conditions

Another scenario can exist where threat activity is inherently low but we have few or no resistive measures in place — i.e., our vulnerability is high.  Here again, the point on an X-Y chart would look just like the fragile condition above, and management might not be too concerned.  What the numbers don’t tell us though, is that we’re essentially rolling the dice every day and counting on bad things not happening.  We aren’t actively managing the situation.

Here again, by letting management know about the unstable nature of the scenario, they’re able to make an informed decision about their control options.

Another important aspect of unstable conditions is that in some cases the lack of preventative controls may be construed as an absence of due diligence by external stakeholders — particularly if something bad happens.
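To make the point concrete, here is a rough sketch using made-up numbers and a simplification of FAIR, treating loss event frequency as threat event frequency times vulnerability (the probability that a threat event becomes a loss event):

    # Rough sketch with hypothetical numbers: both conditions produce the same
    # point estimate, which is exactly why the point estimate alone misleads.

    def loss_event_frequency(tef_per_year: float, vulnerability: float) -> float:
        # vulnerability = probability a threat event becomes a loss event
        return tef_per_year * vulnerability

    fragile  = {"tef": 100.0, "vuln": 0.01}  # very active threat, one strong control
    unstable = {"tef": 1.0,   "vuln": 1.0}   # quiet threat landscape, little resistance

    for name, s in (("fragile", fragile), ("unstable", unstable)):
        print(name, "~", loss_event_frequency(s["tef"], s["vuln"]), "loss events/year")

    # Both plot at roughly one loss event per year, yet one collapses the moment
    # its single control fails, and the other is an un-managed roll of the dice.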

Why it matters

Many of us would intuitively recognize the nature of these conditions when evaluating a scenario, so you may be asking what the big deal is about formalizing their definition.  Well, because it’s difficult to convey these conditions quantitatively or qualitatively, what tends to happen is that people “adjust” the assigned risk level for scenarios like these so they’ll land in the high-risk category — essentially equating them to scenarios where the loss event frequency actually is high.  Unfortunately, in doing so they misinform their decision-makers.  The fact is, these conditions are importantly different from scenarios where the frequency/likelihood of loss is high, and management needs to recognize this difference and decide accordingly.

Visibility Analysis Webinar

The Visibility Analysis webinar on Wednesday was very well attended and has received excellent feedback.  My thanks to everyone who showed up.  If you couldn’t make it, you can find the recording here.

Also, those who attended the webinar are eligible for a 20% discount on our online FAIR training.  Please just contact me before you register so that I can give you a discount code to use.

Thanks!

Jack

Visibility – one of the keys to effective risk management

Please join me in a webinar on risk management where I’ll pull back the covers and discuss a component of the FAIR framework that hasn’t been shared publicly before.

Although FAIR is primarily known as a framework for quantifying risk, other parts of the framework focus on understanding how to manage risk more effectively.  In this webinar I’ll describe the Visibility Analysis component of the framework and how it can provide a source of meaningful metrics and intelligence that helps organizations approach risk more strategically.  To learn more and to sign up for the webinar, please go here.

The webinar will take place on November 17th, at 11:30 am Eastern U.S. time.

Hope you can join us.

Flaw of Averages Webinar

If you’re not already familiar with Dr. Sam Savage’s book “The Flaw of Averages”, you should probably look into it.  A great place to start would be the free webinar he’s giving next Tuesday (Oct 19) at 10:00 PT.