In our earlier posts we started to look at how a Generative Adversarial Network (GAN) can be used to harden cyber-security defenses.
We also touched on how malware authors may use GANs to scale and broaden their range of attacks. In this article we’ll look at how this might happen, and what we can do to defend against such attacks.
Applying GANs as a cyber weapon
To recap, a GAN consists of two neural networks: a generative one and a discriminative one. They work in an adversarial manner to produce new synthetic data that is intended to be indistinguishable from the original.
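To make the recap concrete, here is a minimal sketch of the adversarial loop on one-dimensional toy data. Everything in it is an assumption made for illustration: the "original" data is drawn from a Gaussian, the generator is a straight line, the discriminator is a single logistic unit, and the names (`train_toy_gan`, `sigmoid`) are our own. A real GAN would use deep networks and a machine learning framework.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Train a 1-D toy GAN; returns generator (a, b) and discriminator (w, c)."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: G(z) = a * z + b
    w, c = 0.0, 0.0   # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)   # "original" data (assumed distribution)
        z = rng.gauss(0.0, 1.0)        # random noise fed to the generator
        x_fake = a * z + b

        # Discriminator step: push D(x_real) towards 1 and D(x_fake) towards 0.
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: push D(x_fake) towards 1 (non-saturating loss).
        d_fake = sigmoid(w * x_fake + c)
        grad_x = -(1.0 - d_fake) * w
        a -= lr * grad_x * z
        b -= lr * grad_x
    return (a, b), (w, c)
```

Calling `train_toy_gan()` returns the learned parameters of both players. The point of the sketch is the structure of the loop: each network is updated using only the other's current behavior, which is what "adversarial" means here.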
The focus of many articles on GANs has been on their ability to create synthetic images, video or audio. We’ve seen many successful applications: They’ve produced paintings (sold for nearly half a million USD), composed classical music, and may also have been used to impersonate executives in business. The common factor is that it is relatively easy for the generator to produce an output that is recognizably of the same type. The output is music, or art, or video, or a voice.
What makes this possible is the way our brain interprets information. Applying a GAN to image creation requires the generative network to change pixels. The same applies to video, where it creates the appearance of movement. The important point here is that no matter what the generative network does, the eye will still perceive the result as an image or video. The outcome may not be perfect at first, but as the generator works against the discriminator, the adversarial nature of the network causes the system to home in on the target output.
Another example is the application of a GAN to music. In this case the generative network can change the pitch or the length of notes. Again, no matter what the generator does, the output will always be recognizable as music. Even if it is not particularly good music. However, classical music has structure, and the system can soon determine that certain patterns provide better results.
In each of these cases, it is the human eye, or ear, that is the judge. This is key: the human brain is very good at recognizing similarities and patterns, visual, or audible. Changing a pixel or changing a note doesn’t break this.
Not so for information technology. Changing a bit in a binary will almost certainly corrupt the file and make it no longer usable. This is the challenge behind using a GAN to create malware. The constraint is that the output must still be executable, and still be able to perform the intended malicious act.
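The contrast can be shown with a small sketch. It assumes a hypothetical, deliberately strict toy file format (a magic number, a length field and a CRC32 checksum in front of the payload) as a stand-in for "is this file still usable?". Real executable formats are more forgiving than this, so the sketch illustrates the fragility rather than modelling a real loader.

```python
import struct
import zlib

MAGIC = b"TOYX"  # hypothetical 4-byte magic number


def pack(payload: bytes) -> bytes:
    """Build a toy file: magic, payload length, CRC32 of payload, payload."""
    header = MAGIC + struct.pack("<II", len(payload), zlib.crc32(payload))
    return header + payload


def is_valid(blob: bytes) -> bool:
    """Strict loader check, standing in for 'will this still run?'."""
    if len(blob) < 12 or blob[:4] != MAGIC:
        return False
    length, crc = struct.unpack("<II", blob[4:12])
    payload = blob[12:]
    return len(payload) == length and zlib.crc32(payload) == crc


def flip_bit(blob: bytes, bit_index: int) -> bytes:
    """Return a copy of blob with exactly one bit inverted."""
    data = bytearray(blob)
    data[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(data)


original = pack(b"print('hello')")
# Count how many single-bit changes leave the toy file loadable.
survivors = sum(is_valid(flip_bit(original, i)) for i in range(len(original) * 8))

# An "image" is just a list of 0-255 intensities: a flipped bit is still a pixel.
pixels = [12, 200, 37, 255]
pixels[1] ^= 1 << 3
still_an_image = all(0 <= p <= 255 for p in pixels)
```

In this toy format `survivors` is 0: no single-bit change leaves the file loadable, while the altered pixel list is still a perfectly valid image. A generator that edits raw bytes blindly is therefore far more likely to break a file than to improve it.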
One of the first demonstrations of the production of malware by GANs was described in the 2017 paper, ‘Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN’: MalGAN.
MalGAN was used to generate PE malware which ultimately bypassed a static PE malware detection engine. It was a direct, blunt-force attack on the engine: the system sent exploratory inputs to the engine and observed the verdicts that came back. From these input and output pairs it trained a substitute model within the GAN, which it then used to create synthetic malware capable of bypassing the engine. Importantly, it was able to test directly against the system that it was trying to bypass, and ultimately achieved this 100% of the time.
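The idea of a substitute model can be sketched in a few lines. This is not the MalGAN implementation: the detector below is a hypothetical one-line rule, the substitute is a simple perceptron rather than a neural network, and all names are our own. It only shows that unlimited queries are enough to copy an engine's decisions without ever seeing inside it.

```python
import itertools


def black_box_detector(features):
    """Hypothetical stand-in for the scan engine: only the verdict is visible."""
    return 1 if features[0] and features[1] else 0


def train_substitute(samples, oracle, max_epochs=1000):
    """Fit a perceptron to the verdicts the oracle returns for each sample."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    labels = [oracle(x) for x in samples]  # query the engine once per sample
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(weights, x)) + bias
            error = y - (1 if score > 0 else 0)
            if error:
                mistakes += 1
                weights = [wi + error * xi for wi, xi in zip(weights, x)]
                bias += error
        if mistakes == 0:  # substitute now matches every observed verdict
            break
    return weights, bias


def substitute_verdict(model, features):
    """Verdict of the learned copy for one feature vector."""
    weights, bias = model
    score = sum(wi * xi for wi, xi in zip(weights, features)) + bias
    return 1 if score > 0 else 0


# Exploratory inputs: every combination of four binary features.
probes = [list(bits) for bits in itertools.product([0, 1], repeat=4)]
model = train_substitute(probes, black_box_detector)
agreement = sum(
    substitute_verdict(model, x) == black_box_detector(x) for x in probes
) / len(probes)
```

Here `agreement` reaches 1.0, meaning the copy reproduces every verdict of the hidden rule. A real engine is far harder to copy, but the dependency is the same: the attack needs a large number of unobserved queries.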
A problem ignored
We have established that one of the biggest challenges in applying GANs to executable files is to maintain the constraint that the created files are still executable. Most published work on adversarial learning for malware detection simply ignores this problem. It assumes a white-box model: having complete access to the ML algorithm and the extracted features. In practice, it is not as simple as that. Often the data which is available is just binary data. Therefore, we are much more limited in the type of modifications to the file that can occur without breaking its functionality. For example, padding bytes can be added at the end of the file, or they can be injected into unused parts of the file.
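A short sketch shows why appended padding is one of the few "safe" modifications. It assumes a hypothetical toy format in which a length field tells the loader how many bytes to read; it is not a PE parser, and the names `pack` and `load` are our own.

```python
import hashlib
import struct


def pack(payload: bytes) -> bytes:
    """Toy file: 4-byte little-endian payload length, then the payload."""
    return struct.pack("<I", len(payload)) + payload


def load(blob: bytes) -> bytes:
    """Read only the declared payload; trailing bytes are never looked at."""
    (length,) = struct.unpack("<I", blob[:4])
    return blob[4:4 + length]


original = pack(b"do_something()")
padded = original + b"\x00" * 64  # bytes appended after the declared content

same_behaviour = load(padded) == load(original)
same_hash = hashlib.sha256(padded).digest() == hashlib.sha256(original).digest()
```

`same_behaviour` is `True` while `same_hash` is `False`: the loader sees the same content, but anything computed over the raw bytes has changed. This is why detection that relies only on raw-byte properties is brittle, and why the set of functionality-preserving edits is so small.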
Even if the attacker had access to the feature extraction process and modified the features instead of the original binary data, they would still need access to the ‘inverse feature extraction process’. This is required to find out what changes to the original file would produce the modified features. In the case of MalGAN, the authors consider a grey-box model. They assume that the malware authors do not know the type of machine learning model used, but do know which features it uses. They then restrict themselves to a simple set of features based on API calls, which can be reverse engineered relatively easily given unlimited access to the engine.
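The following sketch shows why API-call features are an easy case. It assumes a binary "is this API present?" vector over a small vocabulary; the four API names are an illustrative subset chosen by us, not the feature set used in the paper, and the helper names are hypothetical.

```python
# Illustrative subset of Windows API names; a real vocabulary would be far larger.
API_VOCABULARY = ["CreateFileW", "WriteFile", "RegSetValueExW", "connect"]


def extract_features(imported_apis):
    """Map a set of API names to a binary vector, one slot per known API."""
    return [1 if name in imported_apis else 0 for name in API_VOCABULARY]


def invert_features(vector):
    """The 'inverse feature extraction': which APIs does this vector imply?"""
    return {name for name, bit in zip(API_VOCABULARY, vector) if bit}


def required_changes(original_apis, target_vector):
    """APIs to add or drop so that a file would produce target_vector."""
    target = invert_features(target_vector)
    known = set(API_VOCABULARY) & set(original_apis)
    return {"add": target - known, "remove": known - target}
```

For a sample importing `{"CreateFileW", "WriteFile"}`, `extract_features` gives `[1, 1, 0, 0]`, and `required_changes(sample, [1, 1, 0, 1])` returns `{"add": {"connect"}, "remove": set()}`. Each feature maps back to exactly one concrete change, which is what makes this feature set easy to work with. Richer features computed over raw bytes or runtime behavior have no such simple inverse.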
A single system to test against
Ultimately, any new malware has to go up against a malware detection system. It does not matter whether malware is hand-crafted by a malware author, or developed at speed and scale by a GAN. It has to be tested against a system. This may be a heuristics-based scanning engine on an endpoint. It may be a scan engine that leverages machine learning. Or it may be as sophisticated as a multi-technology scan engine working with a cloud-based dynamic analysis system.
When attacking a system, fortune favors the malware author who has unlimited, local access to the engine that will scan their malware.
In this scenario, the malware author can repeatedly test against the local scan engine until they bypass it without detection. It is the reason why local scan engines have limited efficacy unless combined with a cloud security system. This challenge was recently demonstrated by researchers in Australia who subverted Cylance Protect’s AI-based anti-virus system. In this attack, the basic weakness was that the Protect engine was in the hands of the attacker. They leveraged this advantage by continuously testing against it until they found a way to bypass it. This was not a vulnerability specific to that system. Irrespective of technology, any local engine could ultimately have been compromised. AI-based engines are not immune to this.
In a second scenario, the malware author may not have access to the engine (because it is in the cloud) but they know what type of engine they face. In this case it may be possible to determine parts of the feature space of the system. If they have unmonitored access to the cloud engine, they can now engineer an attack. They can repeatedly test against the system until they bypass it.
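The word "unmonitored" matters here. As a minimal sketch of what monitoring could look like, the class below counts submissions per client in a sliding time window. The class name, the thresholds and the idea of a client identifier are assumptions made for illustration; they do not describe any particular cloud engine.

```python
from collections import defaultdict, deque


class SubmissionMonitor:
    """Flags clients that submit many samples within a short time window."""

    def __init__(self, max_submissions=5, window_seconds=60.0):
        self.max_submissions = max_submissions
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # client id -> recent timestamps

    def allow(self, client_id, now):
        """Record a submission at time `now`; return False once over budget."""
        recent = self._history[client_id]
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()  # forget submissions that left the window
        if len(recent) >= self.max_submissions:
            return False
        recent.append(now)
        return True
```

With `SubmissionMonitor(max_submissions=3, window_seconds=10)`, a client submitting at times 0, 1, 2 and 3 is accepted three times and refused on the fourth attempt. A repeated test-and-modify loop needs thousands of queries, so even a simple budget like this changes the economics of the attack.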
The real value of a black box
We’ve always maintained that a black box approach to malware detection is optimal. It goes a long way to protecting anti-malware systems from compromise. If a malware author cannot know what system detected their maliciously engineered code, it’s a lot more difficult to engineer around the system. We also believe that the threat landscape is so broad, it is wrong to rely on any one piece of technology to detect and protect users.
We designed the Avira Protection Cloud with this architecture in mind. There is no ‘unlimited access’ which allows an attacker to develop a training model. Because the system is a ‘black box’, it is not possible to know the complete feature space. Consequently, the attacker will not know what aspect of the system detected the attack. They cannot determine if it was a machine learning system (or even what type of model). They cannot know what form of dynamic or behavioral analysis took place, or if the file was detected by one of the powerful generics and heuristics of our cloud-based scan engines.
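The principle of not revealing which component made the detection can be sketched as follows. This is purely illustrative and is not how the Avira Protection Cloud is implemented: the three engines are hypothetical placeholders that match on made-up byte patterns, and the response format is our own assumption.

```python
def ml_engine(sample: bytes) -> bool:
    """Hypothetical stand-in for a machine learning classifier."""
    return b"pattern-a" in sample


def heuristic_engine(sample: bytes) -> bool:
    """Hypothetical stand-in for a generic or heuristic detection."""
    return b"pattern-b" in sample


def behaviour_engine(sample: bytes) -> bool:
    """Hypothetical stand-in for dynamic or behavioral analysis."""
    return b"pattern-c" in sample


ENGINES = (ml_engine, heuristic_engine, behaviour_engine)


def scan(sample: bytes) -> dict:
    """Run every engine, then return one verdict with no attribution."""
    hits = [engine(sample) for engine in ENGINES]  # no short-circuit: all engines run
    return {"verdict": "malicious" if any(hits) else "benign"}
```

A sample caught by `ml_engine` and a sample caught by `behaviour_engine` produce exactly the same response, `{"verdict": "malicious"}`. An attacker who only sees this answer has nothing to tell them which part of the system to engineer around.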
Although we tend to focus on technical safeguards, it is important to recognize that we also employ others: We make Avira technology available only to certain trusted partners. Similarly, beyond a simple binary ‘malicious or benign’ response, our extensive threat intelligence is also controlled in its distribution.
GANs, in the hands of malware authors, present a challenge to the cyber-security industry. But they are also a powerful new tool we can use to continue to improve detection systems and protect our customers.