Folks in the cybersecurity world are beginning to find practical ways to use machine learning (ML) for good and for bad, for offense and for defense, on both sides of the red/blue and safe/block divides. The first thing most of us think of when ML comes up in cybersecurity is blue teams using ML to detect malicious entities such as files, domain names or packets. But we are beginning to see many other possibilities:
- Red teams using ML to identify deception entities like honey tokens and sandboxes
- Cybercriminals leveraging ML models and tactics published by red teams
- Red teams poisoning data sets to defeat blue teams and criminals leveraging AI
For a real-world example, we’ll look at how red teams (and cybercriminals) employ ML so their malware can intelligently predict whether it is running on a real endpoint, where it should proceed with the attack, or has been detonated in a sandbox, where it should lie still and stay quiet.
There’s a lot of information available to malware for making this decision, such as:
- Hardware resources and characteristics
- System uptime
- Last login
- File access history
- Clipboard contents
- Data about OS installation
- Network configuration
- Desktop usage
- Event log
- Running processes
With ML, all of these can become features. A model is then built by analyzing multiple samples from both real and sandbox systems. In fact, we’ll explore one red teamer’s proof of concept for just such an attack. It’s easy to follow because it uses just a handful of features based on the list of active processes.
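To make the idea of "features plus a model" concrete, here is a minimal sketch in Python. It is not the proof of concept discussed in the session. The two features (process count and distinct user count), the training rows, and every function name are assumptions made up for illustration, and the classifier is a plain logistic regression chosen only because it is short. The code works on process lists passed in as data and does not inspect the machine it runs on.

```python
import math

# Hypothetical, hand-made training rows: [process_count, distinct_user_count].
# These numbers are illustrative only and were not measured on real systems.
SANDBOX_ROWS = [[35, 3], [40, 3], [30, 2], [45, 4]]
REAL_ROWS = [[120, 5], [150, 6], [95, 4], [180, 6]]


def extract_features(processes):
    """Turn a list of (process_name, user) tuples into a feature vector."""
    users = {user for _, user in processes}
    return [len(processes), len(users)]


def fit_scaler(rows):
    """Return per-column mean and standard deviation for standardization."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(columns, means)]
    return means, stds


def scale(row, means, stds):
    """Standardize one feature vector with the fitted means and deviations."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, epochs=500, lr=0.1):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def sandbox_probability(features, model):
    """Estimated probability that the feature vector came from a sandbox."""
    means, stds, weights, bias = model
    x = scale(features, means, stds)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


rows = SANDBOX_ROWS + REAL_ROWS
labels = [1] * len(SANDBOX_ROWS) + [0] * len(REAL_ROWS)  # 1 = sandbox
means, stds = fit_scaler(rows)
weights, bias = train([scale(r, means, stds) for r in rows], labels)
model = (means, stds, weights, bias)

print(sandbox_probability([32, 3], model))   # sparse, sandbox-like host
print(sandbox_probability([160, 6], model))  # busy, workstation-like host
```

The same sketch is useful to defenders: it shows why a sandbox with a sparse, uniform process list is easy to tell apart from a lived-in workstation, and what a more realistic sandbox would need to imitate.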
DomainTools is our sponsor, and its Principal Data Scientist, John “Turbo” Conwell, will join me again as an ML subject matter expert. You might remember Turbo from a popular session we did last year on using ML to predict which domain names are malicious.
Please join us for this real training for free session.