Threat intelligence feeds have become very popular as a way of gaining near real-time access to the threat data that lies at the heart of some of the world’s leading cyber-security companies. These feeds are available in many forms, including documents written by analysts and data feeds designed for cloud-to-cloud delivery.
This is the first in a series of blog posts on threat intelligence. We will consider how companies looking to build their own security product or service should assess threat intelligence feeds for their application. We will also explore how threat intelligence is collected, analyzed and ultimately integrated into security offerings. Through these posts, we hope to help you match the right type of feed to your need.
Making the unknown, known
Cyber threat intelligence feeds make the unknown, known. They make visible the threats that others have already detected but that your own security infrastructure has not yet seen. But they can also do much more. Better threat intelligence helps you understand whether the unknowns you have already encountered present a threat to your customers. It can provide the information that you may have been missing but need in order to make an informed decision.
When you think you know what’s wrong, but can’t be absolutely sure, threat intelligence feeds enhance your ability to correctly classify files and help you avoid false positives – and false negatives.
There are probably as many applications for threat intelligence feeds as there are approaches to building cyber-security solutions. As a result, there is a huge number of different types of feed to choose from, offered by a wide range of suppliers.
What are threat intelligence feeds?
In their simplest form, threat intelligence feeds help organizations protect their infrastructure and their users or customers. They deliver new intelligence or enhance existing intelligence, and typically comprise either a data set or written analysis.
Data feeds typically consist of a list of attributes and unique indicators that have been extracted and collated by analyzing raw data sources. These sources vary widely, from honeypots or spamtraps to partnerships and alliances, hacktivist sites, commercial sensor networks and even social media. Whether sourced commercially or free of charge, data set feeds vary in quality. However, irrespective of the quality or veracity of the source, processing the data from these feeds with powerful static and dynamic analysis tools, combined with the skill of experienced malware researchers, develops the data into information and ultimately into trusted intelligence. These feeds are usually provisioned as cloud-to-cloud or machine-to-machine services because of the volume and speed of the data delivered.
Written analysis feeds are typically delivered as documents written by experienced threat analysts. They provide operational or strategic insight. Operational feeds provide analysis of threats that are developing or are imminent and deliver intelligence that is immediately actionable, while strategic feeds are more likely to give analysis at the executive level and speak to the likely threats that will affect an organization’s business.
Turning data into intelligence
Threat intelligence feeds start life as raw data – a set of ‘unknown unknowns’. An initial process of filtering and validation removes ‘noise’ and unnecessary data to create valid information – ‘known unknowns’ – from which intelligence can be developed.
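As a rough illustration of that first filtering and validation step, the sketch below drops malformed and duplicate entries from a batch of raw indicators. It is a minimal example under stated assumptions, not a description of any vendor's pipeline: the entry layout, the field names (`sha256`, `source`) and the function name `filter_and_validate` are all hypothetical.

```python
import re

# Hypothetical raw feed entries; the field names are assumptions for illustration.
RAW_ENTRIES = [
    {"sha256": "A" * 64, "source": "honeypot"},
    {"sha256": "a" * 64, "source": "spamtrap"},    # duplicate once normalized
    {"sha256": "not-a-hash", "source": "social"},  # malformed, treated as noise
    {"sha256": "b" * 64, "source": "sensor"},
]

# A SHA-256 digest written in hex is exactly 64 hexadecimal characters.
SHA256_RE = re.compile(r"[0-9a-f]{64}")


def filter_and_validate(entries):
    """Drop malformed and duplicate entries, keeping the first of each hash."""
    seen = set()
    valid = []
    for entry in entries:
        digest = str(entry.get("sha256", "")).strip().lower()
        if not SHA256_RE.fullmatch(digest):
            continue  # noise: not a well-formed hash
        if digest in seen:
            continue  # unnecessary data: already recorded
        seen.add(digest)
        valid.append({**entry, "sha256": digest})
    return valid


candidates = filter_and_validate(RAW_ENTRIES)
print(len(candidates))
```

What survives this step is still only a set of candidates. Whether each one is actually malicious is the question the later analysis has to answer.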
All feeds have value, but that value will vary by application. Some may simply be a blacklist or whitelist, or a list of known malware names or compromised URLs. Others may be more sophisticated and provide information from first-pass clustering. Then there are those that have developed significant intelligence on threats and provide information that helps you understand context. The challenge is matching the feed to the application.
Avira is both a consumer and a provider of threat intelligence feeds. We buy information in the form of crowdsourced and commercially available feeds and combine it with information we develop from our own customers’ systems. We then apply powerful analytical techniques to develop much more valuable information from the data and ultimately turn it into intelligence.
One thing we have learned is that threat intelligence feeds are numerous, but high-value threat intelligence is rare. The way to create value from threat intelligence feeds is to apply expert analytical techniques, which we will discuss in greater detail in a later blog.
To create threat intelligence feeds that provide value, analytical techniques must extract the best possible attributes and indicators from the information. The objective is to create actionable intelligence that can be broadly grouped into several categories of information delivered within the feed:
- File information: provides details on the file such as the size, format and hashes
- Classification information: a decision on how the file or URL is classified (e.g. malware, clean etc.)
- Static information: intelligence developed from static analysis methods applied to the file, such as certificate attributes, association with a particular exploit, its prevalence and the source of the file
- Dynamic information: intelligence developed as the file executes, such as its impact on the file system and registry, its network and API calls and process operations and its injections and mutexes
- Infection information: the attack vectors and methodologies used (e.g. macros)
- Operational information: a summary of the tactics and procedures used by an attacker – typically relevant to a particular application.
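To make the categories above more concrete, here is one way a single feed record could be modelled in code. This is only a sketch: the class names, field names and example values are assumptions made for illustration and do not reflect the schema of any real feed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileInfo:
    """File information: size, format and hash (hypothetical field names)."""
    size_bytes: int
    file_format: str
    sha256: str


@dataclass
class FeedRecord:
    """One record grouping the six categories of information listed above."""
    file: FileInfo
    classification: str                             # e.g. "malware" or "clean"
    static: dict = field(default_factory=dict)      # e.g. certificate, prevalence
    dynamic: dict = field(default_factory=dict)     # e.g. registry, network calls
    infection: list = field(default_factory=list)   # e.g. ["macro"]
    operational: Optional[str] = None               # summary of tactics used

    def populated_categories(self):
        """Return the names of the categories this record actually carries."""
        names = ["file", "classification"]
        if self.static:
            names.append("static")
        if self.dynamic:
            names.append("dynamic")
        if self.infection:
            names.append("infection")
        if self.operational:
            names.append("operational")
        return names


record = FeedRecord(
    file=FileInfo(size_bytes=48128, file_format="PE32", sha256="0" * 64),
    classification="malware",
    static={"prevalence": "low"},
    infection=["macro"],
)
print(record.populated_categories())
```

A check like `populated_categories` is a simple way to compare feeds during an evaluation, since two feeds that both claim to cover a sample can differ widely in how many of these categories they actually fill in.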
Trust, relevance and refinement
The content delivered within a threat intelligence feed is only as good as the analysis performed on the data. If you trust the analysis, you can trust the feed.
Once trust in the analytical techniques used to develop intelligence is established, you can move on to assess whether the data is relevant to your application. Relevance is not just a simple question of whether certain attributes are present or absent within the feed (e.g. the file size or which encryption algorithm was used). It comes from understanding those attributes at a much more refined level. For example, a basic attribute may only tell you whether or not a file is malware. A refined feed with multiple indicators for each attribute may provide detailed intelligence on the probability that the code is malware, along with data on how that assessment was made.
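The difference between a simple yes/no attribute and a refined, probability-based one can be sketched in a few lines. The function names, the three possible actions and the thresholds (0.9 and 0.5) are hypothetical and chosen purely for illustration; a real product would tune them to its own tolerance for false positives and false negatives.

```python
def verdict_from_flag(is_malware):
    """A binary attribute supports only two outcomes."""
    return "block" if is_malware else "allow"


def verdict_from_probability(probability, block_at=0.9, review_at=0.5):
    """A probability lets the consumer choose its own thresholds.

    The default thresholds are assumptions for illustration only.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= block_at:
        return "block"
    if probability >= review_at:
        return "review"  # uncertain: hand over for further analysis
    return "allow"


for p in (0.95, 0.6, 0.1):
    print(p, verdict_from_probability(p))
```

With a probability to work from, the same feed can serve applications with very different risk profiles, because each one decides for itself where "block" and "review" begin.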
We’ll drill down in more detail into the topics touched on in this blog in future posts. In the meantime, for an overview of Avira’s threat intelligence feeds, take a look at our web page on the topic.