How Is Machine Learning Used for Cybersecurity?
By Tobias Geisler Mesevage
The interconnectedness of technology, coupled with the growing number of mobile devices, quickly evolving technologies, and more prominent use of Wi-Fi, has resulted in a severe uptick in cyber attacks.
To ward off impending threats, we’re increasingly turning to machine learning for help. Machine learning has the potential to offer better, more efficient solutions than what’s currently available on the market to prevent cybercrime.
In this article, we’ll take a deep dive into how machine learning is currently improving cybersecurity.
Machine Learning Applications for Cybersecurity
Machine learning can be used to monitor a network and detect breaches, and can also help generate an automated response to an attack. These applications are being utilized in the following ways:
One of the ways machine learning has improved cybersecurity is through spam detection.
According to Symantec’s 2018 Internet Security Threat Report (ISTR), 54.6% of all email is spam, with spear-phishing emails as the most widely-used type of cybercrime through email.
Luckily, a large portion of these attempts is blocked from reaching inboxes, thanks to robust machine-learning-powered spam filters.
There are different approaches to spam detection. You can classify emails as spam based on a finite set of rules, which is costly, inflexible, and not scalable, or you can use machine learning techniques, which are much more efficient and scalable than knowledge-based methods.
Traditional machine learning algorithms need pre-classified data to train the model. First, you need to build a training and a test dataset. The dataset is built from a set of emails already labeled as spam or not spam, each described by features like:
- Word frequency count
- Types of attachments
- HTML tags
- Email length
- The IP address
- Number of recipients
Part of the resulting dataset is used to train the algorithm, and the remainder is used to test its performance.
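The train-and-test workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production filter: it uses only the word frequency feature from the list, a simple naive Bayes classifier chosen for brevity, and a tiny made-up corpus. The names `EMAILS`, `split_dataset`, `train`, and `predict` are hypothetical.

```python
import math
import random
from collections import Counter

# Tiny hand-made corpus (hypothetical examples, not real data).
EMAILS = [
    ("win a free prize now", "spam"),
    ("claim your free money today", "spam"),
    ("free offer click now", "spam"),
    ("urgent prize claim click here", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the attached report", "ham"),
    ("lunch on monday with the team", "ham"),
    ("project report and meeting notes", "ham"),
]

def split_dataset(rows, train_fraction=0.75, seed=0):
    """Shuffle the labeled rows and split them into train and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def train(rows):
    """Count word frequencies per class (the 'word frequency count' feature)."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in rows:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def predict(model, text):
    """Return the class with the higher naive Bayes log-score."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Add-one smoothing keeps unseen words and classes from producing log(0).
        score = math.log((class_counts[label] + 1) / (total + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

# Train on one part of the dataset, measure performance on the rest.
train_rows, test_rows = split_dataset(EMAILS)
model = train(train_rows)
correct = sum(predict(model, text) == label for text, label in test_rows)
print(f"accuracy on held-out emails: {correct}/{len(test_rows)}")
```

A real filter would add the other features listed above (attachments, HTML tags, sender IP address, and so on) and train on far more labeled emails.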
Machine learning is also used to better monitor, analyze, and respond to cyber attacks and security incidents.
As an example, Austin-based A.I. company SparkCognition partnered with Google Cloud Machine Learning Engine to create a malware detection engine for Windows and Android. The product, called DeepArmor, uses machine learning to detect security threats early and prevent endpoint attacks. Google states the engine can detect zero-day threats with 99.5% accuracy.
The ability to detect threats early is an important one. Data from Microsoft estimates that an attacker resides within a network for an average of 146 days before detection, which is long enough to do business-ending damage.
Phishing is aimed at stealing personally identifiable information such as credit card data, financial information, account details, passwords, and intellectual property. Phishing techniques use technology and social engineering to deceive users and lure them into sharing personal and sensitive data. Specifically, more than 66% of MSPs in Datto’s network say phishing emails are the top method for ransomware delivery to their clients.
The most common types of phishing attacks are deceptive linking, website cloning, and voice and text phishing. There are three main groups of anti-phishing methods:
- Detective (monitoring, content filtering, anti-spam)
- Preventive (authentication, patch and change management)
- Corrective (site takedown, forensics)
Malware is designed to infiltrate or damage a computer system. Datto’s 2018 Ransomware report shows that MSPs find malicious websites, web ads, and clickbait most commonly affect their clients’ systems, as they can deliver malware like ransomware, spyware, adware, or trojans to unsuspecting victims.
Traditional approaches to malware detection focused on identifying features using hashes, code fragments, and file properties. From these, algorithmic rules were created to classify a file as benign or malware. However, the introduction of server-side polymorphism rendered these methods obsolete.
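To see why hash-based signatures break down against polymorphism, consider the following minimal sketch. It assumes a signature database that is just a set of SHA-256 digests; the payload bytes and the names `SIGNATURES` and `is_known_malware` are hypothetical and stand in for a real sample and a real scanner.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical bytes standing in for a previously seen malicious file.
known_sample = b"MZ\x90\x00 pretend malicious payload"

# Signature database: digests of files already classified as malware.
SIGNATURES = {sha256_hex(known_sample)}

def is_known_malware(file_bytes: bytes) -> bool:
    """Rule-based check: flag the file only if its hash matches a signature."""
    return sha256_hex(file_bytes) in SIGNATURES

# A polymorphic variant: a single appended byte yields a different hash,
# even if the file would behave the same when executed.
variant = known_sample + b"\x00"

print(is_known_malware(known_sample))  # True
print(is_known_malware(variant))       # False
```

Because every server-generated variant hashes differently, an exact-match rule only catches files that have already been seen.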
Now, we take a different approach based on the challenges we’ve previously faced.
For example, one of the main challenges of malware detection is the continuous evolution of new malware files and versions. Rule-based approaches can’t adapt to these changes. However, machine learning can now help detect ransomware by analyzing files during the pre-execution phase.
Another historic challenge is the detection of rare attacks like high-profile targeted attacks. Recently, deep learning algorithms have been used to detect these types of attacks and will continue to become an asset in malware detection in the future.
Challenges of Cybersecurity
For all the progress machine learning has made in cybersecurity, it still has challenges to overcome:
- Anomaly detection requires a clear definition of what is considered normal activity, which is challenging to define
- Methods and tactics of cyber attacks constantly change, requiring models to adapt to new patterns and behaviors quickly
- False positives can be costly in terms of infrastructure and data privacy
- Attackers can also use machine learning methods to power their attacks: creating new malware and phishing content, protecting infected nodes, and identifying recurring patterns and possible flags
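The first and third challenges above can be illustrated with a small sketch. It assumes "normal" is defined as a baseline of hourly failed-login counts and flags anything more than three standard deviations from the mean. The data, the threshold, and the names `fit_baseline` and `is_anomalous` are all hypothetical.

```python
import statistics

def fit_baseline(values):
    """Summarize 'normal' activity as a mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value whose z-score against the baseline exceeds the threshold."""
    mean, stdev = baseline
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold

# Hypothetical failed-login counts per hour during a quiet week.
normal_hours = [4, 6, 5, 7, 5, 6, 4, 5]
baseline = fit_baseline(normal_hours)

print(is_anomalous(6, baseline))   # False: within the usual range
print(is_anomalous(40, baseline))  # True: likely a brute-force attempt
print(is_anomalous(9, baseline))   # True: but possibly just a busy hour
```

The last call shows how a narrow definition of normal produces false positives: a mildly busy hour is flagged alongside a genuine attack, so the baseline has to be chosen and updated with care.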
As technology changes quickly, we’re certain reliable protection is on the way. But in the meantime, discover why all small and medium-sized businesses should have a disaster and recovery plan to minimize the damage an attack can have on a business should cybercrime sneak through your doors.