The second in a series of three blogs by Grant and Jason Rolleston on the process of identifying actionable insights.
In the last post in this series, we looked at the process by which data is collected from the operating environment and is then processed and distributed in a consumable manner as information. The collection and processing actions are typically automated. However, the last phase, analysis, has been almost exclusively the domain of human analysts until very recently.
And it is that human intervention at the “last mile” for intelligence that presents the challenge when your operating environment is throwing off 1,200, or even 100,000, warning bells a day from a chatty Network IPS.
It would be easy to say that the way forward is to apply artificial intelligence (AI) to this analysis phase and automate our way out of the chokepoint. But the reality is that AI, for the foreseeable future, is still going to be insufficient for the task.
In data science, the true positive rate and the false positive rate are linked: tuning a model to catch more real threats generally means it also flags more benign activity, so no model is 100% accurate. While the execution of machine learning and deep learning is critical in the SOC, it is essential to understand what Receiver Operating Characteristic (ROC) curves show about that trade-off. Assuming that machine learning models and classifiers will work 100% of the time is setting your SOC up to fail. Instead, a better approach is to use different technologies to filter out the noise. Then you can identify signals to gather insights that enable you to make a decision.
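To make the trade-off between true and false positive rates concrete, here is a minimal sketch in Python. The scores and labels are invented for illustration and `roc_points` is a hypothetical helper, not part of any McAfee product. It computes a few points on a ROC curve by sweeping the threshold at which an alert is treated as malicious.

```python
def roc_points(scores, labels, thresholds):
    """Return a (threshold, fpr, tpr) tuple for each threshold.

    An alert is flagged as malicious when its score >= threshold.
    labels use 1 for truly malicious and 0 for benign.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / negatives, tp / positives))
    return points


# Hypothetical classifier scores for eight alerts (assumed data).
scores = [0.95, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for t, fpr, tpr in roc_points(scores, labels, [0.9, 0.5, 0.25]):
    print(f"threshold={t:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
# threshold=0.90  FPR=0.00  TPR=0.25
# threshold=0.50  FPR=0.25  TPR=0.75
# threshold=0.25  FPR=0.50  TPR=1.00
```

In this toy data, lowering the threshold far enough to catch every malicious alert also flags half of the benign ones, which is the kind of noise an analyst then has to wade through.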
What is needed here is a reinforcing loop of education and information between humans and machines: “human-machine teaming” to borrow from our CTO, Steve Grobman. The goal is to augment the person, instead of replacing them.
It’s important to say that there are some things that human analysts can do on their own to get to actionable insights without the assistance of any machine, thank you very much. At McAfee, our security analysts focus on:
- Prevalence – How pertinent is this information to the enterprise? Is it local threat intelligence? Or used in a specialized way? Is it industry-level threat intelligence? Or global threat intelligence?
- Age – Understanding “new” signals, whether they are processes, scripts, or files in the environment.
- Diversity – By leveraging prevalence, we apply diversity from sources like McAfee’s Global Threat Intelligence (GTI), which allows for more context across the globe.
Additionally, these traits are essential to SOC processes:
- Completeness – Do you have sufficient noise collection to capture context and evidence to deliver effective detection?
- Timeliness – Are you acting on the signals quickly?
- Accuracy – Do you understand the relationship between true positives, false positives, true negatives, and false negatives?
- Confidence – Are you aggregating data and models to understand confidence level and importance of the decisions?
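The Accuracy question above can be sketched with a small confusion-matrix calculation. All counts below are hypothetical, chosen only to show why a single "accuracy" figure can mislead when real threats are rare among a large daily alert volume; the `Confusion` class is an illustrative helper, not an existing API.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # malicious, flagged
    fp: int  # benign, flagged
    tn: int  # benign, not flagged
    fn: int  # malicious, not flagged

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total

    def precision(self) -> float:
        # Share of flagged alerts that are truly malicious.
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Share of malicious events that were flagged (true positive rate).
        return self.tp / (self.tp + self.fn)

    def false_positive_rate(self) -> float:
        return self.fp / (self.fp + self.tn)


# Assumed day: 100,000 events, 100 of them truly malicious.
day = Confusion(tp=90, fp=999, tn=98_901, fn=10)

print(f"accuracy  = {day.accuracy():.4f}")
print(f"precision = {day.precision():.4f}")
print(f"recall    = {day.recall():.4f}")
print(f"FPR       = {day.false_positive_rate():.4f}")
```

With these assumed numbers the detector is about 99% accurate and has a 1% false positive rate, yet only about 8% of the alerts it raises are real threats. That gap is why the four outcomes need to be understood together rather than summarized in one number.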
You will always want plenty of signals to investigate, and data science methodologies can generate them, because these signals are often the clues that let you start the triage and investigation process.
So this is where automation and machine learning can help to bridge the human labor gap. As you start down that path, what you realize is you’re going to need tools that are easier to manage. The focus becomes enabling your staff to do more. Learning mechanisms – for humans and machines – become a vital part of the equation. The idea is to put the human in the middle of the self-reinforcing data science capabilities like machine learning, deep learning and AI.
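One way to picture the human in the middle of that loop is a simple routing rule. This is a sketch under stated assumptions, not a description of how any McAfee product works: the thresholds, the `route` and `triage` functions, and the `analyst_verdict` callback are all hypothetical. The machine handles the alerts it is confident about, ambiguous ones go to an analyst, and the analyst's verdicts are stored as labeled examples for later retraining.

```python
def route(score, low=0.2, high=0.9):
    """Decide who handles an alert based on model confidence (assumed thresholds)."""
    if score >= high:
        return "escalate"   # model is confident the alert is malicious
    if score < low:
        return "suppress"   # model is confident the alert is noise
    return "analyst"        # ambiguous: a human decides


def triage(alerts, analyst_verdict, training_set):
    """Route (alert_id, score) pairs and record analyst verdicts as new labels."""
    queues = {"escalate": [], "suppress": [], "analyst": []}
    for alert_id, score in alerts:
        decision = route(score)
        queues[decision].append(alert_id)
        if decision == "analyst":
            # The human verdict becomes a labeled example for the next model.
            training_set.append((alert_id, score, analyst_verdict(alert_id)))
    return queues


# Hypothetical alerts and a stand-in for the analyst's judgment.
alerts = [("a1", 0.95), ("a2", 0.05), ("a3", 0.50), ("a4", 0.35)]
new_labels = []
queues = triage(alerts, lambda alert_id: alert_id == "a3", new_labels)

print(queues)
print(new_labels)
```

The design choice here is that the analyst only sees the middle band of scores, and each decision they make feeds the machine's next round of learning.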
In the final post in this series, we’ll look at how McAfee Product Management, Engineering and the Office of the CISO are collaborating to generate that self-reinforcing learning loop.