… and how do we know it works?
What’s the difference between artificial intelligence and machine learning? Put simply, artificial intelligence is the area of study dedicated to making machines solve problems that humans find easy, but digital computers find hard. Examples include driving cars, playing chess or recognising sarcasm.
Machine learning is a subset of AI dedicated to developing techniques for making machines learn to solve these and other “human” problems without the insanely complex task of explicitly programming them.
Classifying malware
A machine is said to learn if, with increasing experience, it gets better at solving a problem. Let’s take identifying malware as an example. This is known as a classification problem. Let’s also call into existence a theoretical machine learning program called Mavis. Consistent malware classification is difficult for Mavis because malware is deliberately evasive and subtle.
For Mavis to classify malware successfully, we need to show it a huge number of files that are known to be malicious. Once Mavis has digested several million examples, it should be an expert in what makes a file “smell” like malware.
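To make the idea of learning from labelled examples concrete, here is a minimal sketch of a toy classifier. It is not how any real product works. The two features (a scaled byte-entropy score and the fraction of suspicious imports), the sample values and the nearest-centroid method are all assumptions chosen for illustration. The sketch also assumes Mavis sees known-clean files as well as malicious ones, since a classifier needs both classes to compare.

```python
from statistics import mean

def train(examples):
    """Compute one centroid (average feature vector) per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def classify(centroids, features):
    """Return the label whose centroid is nearest to the given features."""
    def distance(centroid):
        # Squared Euclidean distance is enough for picking the nearest centroid.
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: distance(centroids[label]))

# Hypothetical, hand-made training data:
# (byte entropy scaled to 0..1, fraction of suspicious imports), label
TRAINING = [
    ((0.95, 0.80), "malware"),
    ((0.90, 0.70), "malware"),
    ((0.85, 0.90), "malware"),
    ((0.40, 0.10), "clean"),
    ((0.50, 0.05), "clean"),
    ((0.45, 0.20), "clean"),
]

model = train(TRAINING)
verdict = classify(model, (0.88, 0.75))  # a file Mavis has not seen before
```

With only six examples the “smell” Mavis learns is crude. The point of feeding it millions of files is that the learned boundary between the two classes becomes far more reliable.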
There is a very wide spectrum of ways in which one could program Mavis to learn this task. The options include head-spinning concepts and algorithms, and every suitable approach has its advantages and disadvantages. All that counts, however, is whether Mavis can spot and stop previously unknown malware, even when the “smell” is very faint or deliberately disguised to confuse it into an unfortunate misclassification.
Training a machine to learn about malware
A major problem for developers lies in proving that their implementation of Mavis intelligently detects unknown malware. How much training is enough? What happens when their Mavis encounters a completely new threat that smells clean? Do we need a second, signature-based system until we’re 100% certain it’s getting it right every time? Some vendors prefer a layered approach, while others go all in with their version of Mavis.
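One common way to put numbers on these questions is to score Mavis on files that were withheld from training. The sketch below computes a detection rate and a false positive rate. The function name, the labels and the eight sample verdicts are hypothetical, and real test methodologies are considerably more involved.

```python
def evaluate(verdicts, truths):
    """Return (detection_rate, false_positive_rate) for paired label lists.

    Assumes both classes appear at least once in `truths`.
    """
    pairs = list(zip(verdicts, truths))
    on_malware = [v for v, t in pairs if t == "malware"]
    on_clean = [v for v, t in pairs if t == "clean"]
    # Share of truly malicious files that were flagged.
    detection_rate = on_malware.count("malware") / len(on_malware)
    # Share of truly clean files that were wrongly flagged.
    false_positive_rate = on_clean.count("malware") / len(on_clean)
    return detection_rate, false_positive_rate

# Hypothetical held-out files: ground truth versus what Mavis said.
truths = ["malware", "malware", "malware", "malware",
          "clean", "clean", "clean", "clean"]
verdicts = ["malware", "malware", "malware", "clean",
            "clean", "clean", "malware", "clean"]

detection_rate, false_positive_rate = evaluate(verdicts, truths)
```

In this made-up run Mavis misses one of four malicious files and wrongly flags one of four clean ones. Both figures matter: a product that flags everything would catch all the malware while being useless in practice.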
Every next-generation security product vendor using machine learning says their approach is the best, which is entirely understandable. As with traditional AV products, however, the proof is in the testing. To gain trust in their AI-based products, vendors need to hand them over to independent labs. It’s the best way for businesses and governments to be sure that Mavis, in its many guises, will protect them.