Transforming Cybersecurity Risk Management: The Power of Bayesian Statistics in Risk Analysis

Tim Layton
Oct 16, 2023

Image © Tim Layton — https://linkedin.com/in/timlaytoncyber

Hello, I’m Tim Layton. With three decades in the cybersecurity realm, I’ve journeyed from the early days of the Internet — when 2400 baud modems and Unix were standard — to the modern era of cloud computing, AI, and machine learning. Throughout these transformative years, I’ve been at the forefront of evolving enterprise computing and network technologies.

In the 1990s, cybersecurity risk analysis emerged as a topic of interest for businesses, though it initially received minimal attention from corporate leaders. However, as regulators began shaping the discourse for large enterprises, the Big 4 consulting firms introduced a simplified risk communication tool: the risk matrix or heat map. In hindsight, the widespread adoption of this matrix may have been detrimental to cybersecurity. Over the past two decades, it became the standard, with vendors and organizations relying on it to convey cybersecurity risk, potentially oversimplifying complex threats.

In the study titled “The Risk of Using Risk Matrices,” researchers examined 30 methodologies encompassing operational and safety risks, enterprise risks, and more. Their consensus was that “risk matrices should not be used for decisions of any significance.”

Tony Cox, who earned a Ph.D. in risk management from MIT, critiques the method further in his paper “What’s Wrong with Risk Matrices?”, published in the journal Risk Analysis. He asserts that “risk matrices can be worse than useless.” He emphasizes that they not only waste time but also introduce errors into decision-making, going as far as to say, “Risk matrices perform worse than unaided intuition.”

The research cited above indicates that, despite their popularity, risk matrices can perform worse than unaided intuition. Fortunately, there’s a superior approach grounded in validated mathematical techniques that presents cybersecurity risks in clear economic terms anyone can understand.

I am committed to equipping cybersecurity professionals with the robust capabilities of quantitative Bayesian statistical methods. By leveraging these mathematical and statistical tools, we can enhance our current risk assessment techniques and present risks in terms that business leaders can understand. Bayesian methods allow us to not only prioritize cybersecurity risks but also communicate them along with their potential economic impact, ensuring clarity for business professionals.

You can connect with me on LinkedIn and follow my articles here on Medium. Get notified via email every time I publish a new article.

BAYES’ THEOREM PRIMER

Before delving into why I advocate for Bayesian statistics in cybersecurity risk analysis, let’s first get a foundational understanding of Bayes’ Theorem to grasp its broader significance.

Bayesian statistics offers a mathematically validated approach to quantify uncertainties using probabilities. One of its key benefits is the ability to refine these probabilities with new data, a capability not found in traditional risk matrix and heat map methods.

In Bayesian statistics:

Prior probabilities represent our beliefs about parameters or hypotheses before observing data.

Likelihood represents how well the observed data supports the different possible parameter values or hypotheses.

Posterior probabilities combine the prior and the likelihood to give an updated belief after observing the data.

These probabilities are derived using Bayes’ theorem, which mathematically relates the prior, likelihood, and posterior.

Quantitative methods, broadly speaking, involve the systematic investigation of cybersecurity risks using mathematical, statistical, and computational techniques. Bayesian statistics falls squarely within this category because it uses probability to quantify beliefs about risk and to update those beliefs as data arrives.

Image © Tim Layton — https://linkedin.com/in/timlaytoncyber

Bayes’ Theorem provides a way to calculate conditional probabilities.
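
Written out with the same symbols used in the rest of this primer, the theorem reads:

$$
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
$$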

Let’s break down each part so you can understand how it can be used in cybersecurity risk analysis:

P(A|B) — Posterior Probability:
This is the probability of event A occurring, given that event B has occurred. This is what we are trying to calculate.

This is the main output of Bayes’ theorem. Given some evidence B, the posterior probability tells us how likely event A is. It’s an updated belief based on the evidence presented.

Consider this: if industry breach data suggests a 3% probability of a cyber breach from a phishing attack, that figure is our starting estimate, and we can refine it with more specific evidence. For instance, if you have data from your internal phishing campaigns, you can factor in the rate at which users click on malicious links, thereby updating your risk assessment.

P(A) — Prior Probability:
This is the initial or baseline probability of event A occurring without any additional information about B. It captures our initial beliefs or knowledge about A before we factor in the new evidence B. It serves as a starting point for the Bayesian update.

I often use industry-published breach data from the Verizon DBIR or Cyentia IRIS report if I don’t have access to internal empirical data. A published base rate is a more defensible starting point than unaided intuition, and it lets you state where your estimate came from.

P(B|A) — Likelihood
This is the probability that event B occurs, given that event A has occurred.

The likelihood quantifies the evidence provided by the new data regarding how well it supports A. It acts as a weighting factor, showing how consistent the observed evidence B is with event A.

If event A represented a cyber breach, B could represent a user clicking on a phishing email, for example.

P(B) — Evidence or Marginal Likelihood
This is the overall probability of event B occurring, encompassing all the ways it could happen (both with and without A).

It serves as a normalizing constant to ensure probabilities sum to one. Conceptually, it ensures that the updated probability (posterior) is consistent with the total probability of the evidence. It is a scaling factor that adjusts the raw product P(B|A) x P(A) to a valid probability value between 0 and 1.

Staying with the phishing email scenario, B could represent users clicking on links in emails that may or may not be phishing attacks.
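
When P(B) cannot be measured directly, it can be expanded with the law of total probability. Here ¬A is shorthand (not used elsewhere in this article) for "event A did not occur", for example "no breach":

$$
P(B) = P(B \mid A)\, P(A) + P(B \mid \neg A)\, P(\neg A), \qquad P(\neg A) = 1 - P(A)
$$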

Bayes’ theorem offers a systematic way to update probabilities based on new evidence. It starts with an initial belief (prior), factors in how the new evidence aligns with that belief (likelihood), and then scales this according to the overall probability of the evidence. The result is an updated probability (posterior) reflecting the initial belief and the new evidence.
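
To make the update concrete, here is a minimal Python sketch of the calculation for the phishing scenario. The 3% prior is the figure used earlier in this article. The two click probabilities are hypothetical values chosen only for illustration, and the function name `posterior` is my own, not part of any library.

```python
def posterior(prior, likelihood, likelihood_given_not_a):
    """Return P(A|B) from P(A), P(B|A), and P(B|not A) using Bayes' theorem."""
    # P(B), computed with the law of total probability.
    evidence = likelihood * prior + likelihood_given_not_a * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("P(B) is zero, so the posterior is undefined")
    return likelihood * prior / evidence


# P(A): 3% prior probability of a breach (the figure used in the text).
prior_breach = 0.03
# P(B|A): hypothetical probability that a phishing click is observed given a breach.
p_click_given_breach = 0.60
# P(B|not A): hypothetical probability that a click is observed with no breach.
p_click_given_no_breach = 0.10

p_breach_given_click = posterior(
    prior_breach, p_click_given_breach, p_click_given_no_breach
)
print(f"P(breach | click) = {p_breach_given_click:.3f}")  # prints 0.157
```

With these assumed inputs, observing a click moves the estimate from 3% to roughly 16%. The result is only as good as the inputs, so the two conditional probabilities should come from your own campaign data or a published source wherever possible.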

Can the commonly used risk matrix and heat map do this? I don’t think so! Let’s dive in and establish why learning Bayesian statistics is a good investment of your time.

WHY BAYESIAN STATISTICS IS A GOOD FIT FOR CYBERSECURITY RISK ANALYSIS

Bayesian statistics is particularly well-suited for cybersecurity risk analysis for several reasons, as outlined below:

Incorporates Prior Knowledge: One of the hallmarks of Bayes’ theorem is its ability to incorporate prior knowledge or beliefs. In the context of cybersecurity, this means that past data, insights, and experiences about threats and vulnerabilities can be systematically integrated into current risk assessments and analysis scenarios. As new evidence (like threat intelligence, network telemetry, and security logs) becomes available, the risk can be updated rather than completely recalculated from scratch.

Continuous Learning and Adaptability: Cyber threats evolve constantly. Bayes’ theorem allows risk models to be dynamic and adaptable, updating the risk profile as new evidence comes to light. This means that organizations can continually refine their security posture based on the latest information, ensuring they remain agile and responsive to emerging threats.

Addresses Uncertainty: Cybersecurity is fraught with uncertainties, be it from zero-day vulnerabilities, evolving tactics of adversaries, or unseen flaws in systems. Bayesian methods inherently deal with uncertainty by providing probability distributions instead of deterministic values, giving a more nuanced view of risks.

Weighted Importance: Not all pieces of evidence or alerts have the same significance. Some might be more indicative of a genuine threat than others. Through the likelihood component of the theorem, evidence can be weighted according to its significance or relevance to the threat in question.

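Decision Making under Limited Data: In some cases, especially with new systems or applications, there might be limited historical data on cybersecurity threats. Bayesian methods can still be applied in these scenarios because the prior, drawn from industry data or expert judgment, supplies a starting estimate, and the small amount of available data then adjusts it. This makes it possible to perform risk assessments even when data is sparse.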

Model Complexity: The Bayesian approach allows for the incorporation of complex hierarchical models, which can capture intricate relationships and dependencies in the data. This is particularly useful in cybersecurity, where multiple variables might influence the risk associated with a particular system or process.

Quantitative and Qualitative Analysis: While Bayesian methods are fundamentally quantitative, they also allow for the inclusion of expert judgment in cases where hard data might be lacking. This combination of quantitative analysis and qualitative insights can provide a comprehensive view of risks.

Prioritization of Alerts: In the context of threat detection and analysis, Bayes’ theorem can help prioritize alerts. By estimating the probability that an alert corresponds to a genuine threat, security teams can focus on the most pressing issues first.
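
As a rough sketch of the alert prioritization idea, the Python snippet below ranks alert types by the probability that an alert reflects a genuine threat. The alert names, base rates, and detection rates are all hypothetical numbers invented for illustration, and `p_threat_given_alert` is a helper I defined for this example.

```python
def p_threat_given_alert(base_rate, true_positive_rate, false_positive_rate):
    """Return P(threat | alert) using Bayes' theorem."""
    # P(alert) over both cases: a real threat and no threat.
    p_alert = (true_positive_rate * base_rate
               + false_positive_rate * (1.0 - base_rate))
    return true_positive_rate * base_rate / p_alert if p_alert > 0 else 0.0


# Hypothetical alert types:
# (name, P(threat), P(alert | threat), P(alert | no threat))
alert_types = [
    ("impossible-travel login", 0.010, 0.90, 0.050),
    ("malware signature match", 0.020, 0.95, 0.010),
    ("port scan from internet", 0.001, 0.80, 0.200),
]

# Sort so the alerts most likely to be genuine come first.
ranked = sorted(
    ((name, p_threat_given_alert(base, tpr, fpr))
     for name, base, tpr, fpr in alert_types),
    key=lambda item: item[1],
    reverse=True,
)

for name, prob in ranked:
    print(f"{name:<26} P(threat | alert) = {prob:.3f}")
```

Note how the port scan alert ranks last even though it fires on 80% of real threats: its low base rate and high false positive rate mean a single alert carries very little evidence.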

In summary, the inherent flexibility, adaptability, and depth of Bayes’ theorem make it a powerful tool for cybersecurity risk analysis. Its capacity to continually update and refine risk assessments based on new evidence ensures that security teams can remain agile and responsive in a rapidly evolving threat landscape.
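
The continual updating described above can be sketched with a Beta-Binomial model, a standard conjugate pairing for rates such as phishing click rates. This is one possible approach under stated assumptions, not the only way to do it. The prior parameters and campaign results below are hypothetical, and the interval is approximated by sampling with Python's standard library rather than computed exactly.

```python
import random


def update_beta(alpha, beta, clicks, non_clicks):
    """Conjugate update: Beta(alpha, beta) prior plus binomial data."""
    return alpha + clicks, beta + non_clicks


def credible_interval(alpha, beta, level=0.90, draws=20_000, seed=0):
    """Approximate an equal-tailed credible interval by sampling the Beta posterior."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    lo = samples[int((1 - level) / 2 * draws)]
    hi = samples[int((1 + level) / 2 * draws) - 1]
    return lo, hi


# Hypothetical prior: about a 10% click rate, weighted like 20 observations.
alpha, beta = 2.0, 18.0

# Hypothetical internal phishing campaigns: (clicks, emails sent).
campaigns = [(12, 200), (7, 150), (3, 180)]

for clicks, sent in campaigns:
    # Each campaign's posterior becomes the prior for the next one.
    alpha, beta = update_beta(alpha, beta, clicks, sent - clicks)
    mean = alpha / (alpha + beta)
    lo, hi = credible_interval(alpha, beta)
    print(f"after {sent} emails: mean={mean:.3f}, 90% interval=({lo:.3f}, {hi:.3f})")
```

Each pass through the loop reuses the previous posterior as the new prior, so nothing is recalculated from scratch, and the interval narrows as more campaign data accumulates.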

If you’d like to delve deeper into how Bayesian Statistics can elevate your cybersecurity risk analysis, please clap for this article. Your feedback will indicate a desire for more articles and tutorials on the subject.
