1 Introduction
A cast of increasingly persistent and sophisticated threat actors, along with the sheer speed at which cyber attacks unfold, has made timely decision-making imperative for an organisation's security [14]. High-profile incidents such as WannaCry [39] have shown the extent of the damage an attack can cause and the shortened time-window available for an organisation to put appropriate countermeasures in place.
Threat intelligence, or cyber threat intelligence, has become of increasing importance as organisations continue to generate, process, and share forensic data and analytical reports around cyber threats and vulnerabilities [8]. It is also seen as an opportunity for small organisations to benefit from mature organisations' experience, as many of the former do not typically have the resources to develop an independent threat intelligence programme [53, 68]. This has prompted organisations to establish or expand threat intelligence programmes [19] by deploying sharing ontologies and intelligence management solutions [62].
A wide array of sources providing information on emerging threats, attackers' tactics, and indicators of compromise (IoCs) have become available to security teams [51]. These sources range from open and commercial data feeds [51, 52] to threat intelligence service providers [4, 53, 66]. But, as the adoption and diversity of threat intelligence solutions and sources continue to grow, questions about their effectiveness, particularly with regard to the quality of the data they provide, remain unanswered.
Moreover, along with the increasing amount of intelligence available, there is a need to support human analysts with automation where possible. Appropriate structured data will be necessary for this, as will machine-readable representations of quality.
Several studies have highlighted data quality issues as one of the most common barriers to effective threat intelligence exchange [18, 43, 61, 62, 66, 75], and information security processes face problems with the quality of their input data [24]. Nevertheless, academic research has yet to systematically investigate the issues of data quality in threat intelligence. Little is known about the quality requirements and dimensions for threat intelligence data artifacts [38, 53].
In this article, we attempt to empirically investigate these issues by: (1) providing a clear and comprehensive definition of threat data, information, and intelligence and (2) deriving quality dimensions associated with these definitions.
The methodological approach employed in the work described in this article consisted of a systematic literature review (SLR) followed by a modified two-round Delphi study in an attempt to bridge the gap between theory and practice. Through the SLR, we identified widely-used definitions of threat data, information, and intelligence, and derived a set of literature-based quality dimensions. The preliminary definitions and dimensions served as an input to the Delphi study in which a panel of 30 experts refined and validated our results.
The remainder of this article is structured as follows. Section 2 contextualises the research by highlighting the key theoretical concepts and related work in the fields of threat intelligence sharing and data quality. Section 3 describes the methodology employed for this study. The analysis and the results are presented in three sections: Section 4 discusses the participants' understanding of the terms threat data, information, and intelligence, and introduces their revised definitions of the terms; Sections 5 and 6 present and discuss the quality dimensions. Section 7 outlines the key findings of the study. Finally, Section 8 concludes the article and identifies areas for further research.
4 Defining Threat Data, Information, and Intelligence
We noticed that the terms threat data, threat information, and threat intelligence are being used inconsistently in research and practice (e.g., [46] and [35]). Furthermore, the generation of threat intelligence can be seen as an iterative process in which threat data are transformed into information and subsequently into intelligence [17]. Accordingly, there needs to be a clear distinction between these concepts in order to facilitate the discussion around the quality criteria associated with each of them. Therefore, the first goal of our study was to explore the experts' understanding of each of the three terms and create a common understanding.
As shown in Table 2, the majority of the experts either agreed or strongly agreed with the literature-based definitions of threat data, information, and intelligence (73.3%, 66.7%, and 73.3%, respectively). However, during the discussion a number of experts highlighted aspects of the definitions that they did not agree with and suggested some changes.
4.1 Threat Data
The authors of [46] define threat data as: "lower-level raw logs that have been produced by sensors such as payloads hash values, network artifacts, internet protocol (IP) addresses, and uniform resource locators (URLs)."
Participants generally agreed with this definition, albeit with three caveats. First, although threat data are primarily raw, lower-level data points such as indicators of compromise (IoCs), they do not always have to be produced by sensors. Secondly, in order for data to be considered threat data, they have to be related to some sort of malicious activity. Finally, according to the participants, threat data are predominantly machine-readable. Examples of threat data include SHA1 and MD5 hashes that correspond to specific suspicious files or samples of malware; IP addresses of suspected command and control servers; and network artifacts that distinguish malicious activity from that of legitimate users [6].
Based on the participants’ comments and suggestions, we propose the following revised definition of threat data:
Revised definition: Threat data are a machine-readable set of raw recorded facts that can help in identifying or mitigating malicious activity in a system or network. They may include indicators of compromise such as malware signatures, IP and URL addresses, file and domain names, or registry keys.
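To make the machine-readable aspect of this definition concrete, the following minimal sketch shows what a single threat data record might look like when serialised for exchange. The field names and values are purely illustrative and do not follow any particular sharing standard.

```python
import json

# A minimal, hypothetical machine-readable threat data record.
# Field names are illustrative, not taken from any particular standard.
ioc_record = {
    "type": "ipv4",                        # kind of indicator
    "value": "198.51.100.7",               # suspected C2 server (documentation range)
    "first_seen": "2021-03-01T12:00:00Z",  # when the activity was observed
    "context": "suspected command and control server",
}

serialized = json.dumps(ioc_record)        # machine-readable exchange form
print(serialized)
```

A raw record like this carries no interpretation on its own; contextualising and grouping such records is what produces threat information in the sense discussed below.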
4.2 Threat Information
In the same report [46], threat information is defined as: "data that have undergone additional processing to provide enhanced high-level insight that may help decision makers in reaching a well informed decision."
Overall, participants agreed with the notion that threat information is data enriched with context. However, they stated that it does not necessarily provide high-level insights or allow for well-informed decisions. They also challenged the idea that threat information is strictly the result of extra processing of the data, stating that threat data grouped together, or given some context, are usually enough to make the information relevant to the organisation. For example, a series of raw logs (threat data) collated together indicates a spike in suspicious activity (threat information) [12].
A couple of experts stated that it is difficult to differentiate between data and information and therefore did not see value in distinguishing between the two in practice. Instead, they offered a simpler model that only differentiates between intelligence inputs and outputs. In light of the discussion above, we adopt the following definition for threat information:
Revised definition: Threat information is information pertaining to a threat to, or vulnerability of, a system or network. It is the result of contextualising and interpreting threat data.
4.3 Threat Intelligence
A widely-cited definition of threat intelligence in the literature is the one presented in [35], where the term is defined as: "evidence-based knowledge, including context, mechanisms, indicators, implications, and action-oriented advice about an existing or emerging menace or hazard to assets. This intelligence can be used to inform decisions regarding the subject's response to that menace or hazard."
The majority of the experts agreed with this definition, noting that it adequately captures the meaning and function of intelligence in practice. Nevertheless, they stressed the importance of manual analysis and, therefore, the analyst's role in vetting, analysing, interpreting, and applying hypotheses to the information. A number of participants referred to the forward-looking nature of threat intelligence, emphasising that threat intelligence involves not only impact assessments and recommendations but also requires deduction and prediction. One participant raised doubt about using the word "knowledge" in the definition, arguing that intelligence is more about reaching a hypothesis that one could agree or disagree with and that it does not have to be 100% knowledge or truth per se. For example, indication of a suspicious activity, when contextualised with information on prior incidents involving similar activity, could allow for the development and deployment of a mitigation strategy to stop the incident [12].
In reflecting on the experts' feedback, we build on the definition introduced in [35] to formulate the following definition:
Revised definition: Threat intelligence is evidence-based forward-looking assessment including context, implications, and action-oriented advice about a threat to, or vulnerability of, a system or network. This intelligence is produced through the application of individual or collective cognitive methods, and can be used to inform decisions regarding the subject’s response to that threat or vulnerability.
6 Threat Intelligence Quality Dimensions
In this section, we discuss our threat intelligence quality model. The final set of quality dimensions and their definitions are presented in Table 5.
6.1 Accuracy
In an absolute sense, accuracy is the degree to which information has attributes that correctly represent the true value of the intended attribute of a concept or event in a specific context of use. In the context of threat intelligence, inaccurate data and false positives might result in undesired effects and wasted resources [56]. However, the experts stated that, in practice, determining the absolute truth is difficult if not impossible. Accordingly, alternative interpretations of accuracy were elicited.
For threat intelligence producers, accuracy means ensuring the objectivity of the analysis and that the product was communicated in a way that enables the message to arrive and be understood as it was meant to be. Although a low number of false positives in IoCs is one indication of the accuracy of threat data, the panel agreed that consumers do not expect a vendor to provide 100% accurate information in an absolute sense. However, in order to determine the relative accuracy of the intelligence product, they ask how the data and intelligence were collected, whether the analyst came up with the right hypothesis, and what evidence supports the conclusions. Analysts have some latitude based on their own experiences, but it is crucial to ensure that their analysis is logical and that they do not leap to conclusions.
The need to distinguish between assessments and facts is crucial, prompting practitioners to adopt traditional intelligence language that conveys confidence levels, such as NATO or Admiralty coding [23], which is used by some vendors to indicate what they perceive as accurate and reliable information.
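The Admiralty System mentioned above grades a source's reliability on a letter scale (A–F) and the information's credibility on a numeric scale (1–6). The following sketch paraphrases commonly cited versions of the scale descriptions; exact wording varies between publications.

```python
# Admiralty System grades: source reliability (A-F) and information
# credibility (1-6). Descriptions paraphrased from commonly cited
# versions of the scale; exact wording varies between publications.
RELIABILITY = {
    "A": "Completely reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Reliability cannot be judged",
}

CREDIBILITY = {
    1: "Confirmed by other sources",
    2: "Probably true",
    3: "Possibly true",
    4: "Doubtful",
    5: "Improbable",
    6: "Truth cannot be judged",
}

def grade(reliability: str, credibility: int) -> str:
    """Render a combined Admiralty rating such as 'B2'."""
    assert reliability in RELIABILITY and credibility in CREDIBILITY
    return f"{reliability}{credibility}"

print(grade("B", 2))  # a usually reliable source, probably true information
```

A rating such as "B2" attached to a report lets the consumer separate the assessed reliability of the source from the credibility of the specific claim, which is precisely the assessment-versus-fact distinction the experts called for.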
6.2 Actionability
According to the panel of experts, actionability is the extent to which threat data or intelligence allows a decision to be made or action to be taken without the need for further analysis.
The production of actionable intelligence in practice usually requires an analyst to provide some recommendations or courses of action, and there is always at least one course of action: to do nothing. So, in reality, what one is asking intelligence analysts to do is to make calls, regardless of whether they turn out to be right or wrong. This allows the consumer to use or deploy the intelligence directly, before it loses its value. In practice, what makes intelligence actionable can take a range of forms. For high-level analytical reports, it could mean that the consumer is going to use that intelligence the next time they make an important decision about security architecture. For lower-level threat data, it could mean using these indicators to block a threat, prioritise patching, or develop mechanisms to detect specific adversary tactics, techniques, and procedures (TTPs).
However, in reality (according to some experts), if the intelligence producer understands their client’s requirements, then everything they produce should be actionable in one way or another.
6.3 Interoperability
The increasing volumes of threat data and diversity of sources have elevated the importance of interoperability in threat intelligence. For organisations using more than one service or vendor, it is vital to be able to access and process all of the data in one place and in a timely manner. Similarly, vendors realise that offering products that are aligned to particular standards is important to allow consumption across their clients’ different platforms.
The importance of interoperability varies depending on the exact nature of the intelligence. For machine-readable threat data like IoCs, it is crucial to be able to ingest them smoothly and with minimal analyst effort. However, for higher level analytical reports, recommendations, or assessments, the importance of being able to ingest that into a platform is less critical.
The panel also reinforced the importance of standardised formats in ensuring data are exchanged in a consistent and machine-readable manner. The Structured Threat Information Expression (STIX) language [2] continues to gain traction and support.
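For illustration, a simplified STIX 2.1 indicator object for a suspicious IP address might look like the following. The identifier and timestamps here are made up, real objects can carry additional properties, and the OASIS STIX specification remains the authoritative schema.

```python
import json

# Simplified sketch of a STIX 2.1 indicator object. The UUID and
# timestamps are placeholders; consult the OASIS STIX specification
# for the full set of required and optional properties.
indicator = {
    "type": "indicator",
    "spec_version": "2.1",
    "id": "indicator--00000000-0000-4000-8000-000000000000",
    "created": "2021-03-01T12:00:00.000Z",
    "modified": "2021-03-01T12:00:00.000Z",
    "name": "Suspected C2 server",
    "pattern": "[ipv4-addr:value = '198.51.100.7']",
    "pattern_type": "stix",
    "valid_from": "2021-03-01T12:00:00Z",
}

print(json.dumps(indicator, indent=2))
```

Because every producer emits the same object shape, a consumer platform can ingest indicators from multiple vendors without per-vendor parsing logic, which is the interoperability benefit the panel highlighted.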
Threat intelligence platforms and APIs continue to evolve and provide new capabilities including auto-scrubbing and auto-validation of indicators in order to prevent false positives or neutral indicators from being added automatically. However, despite the increasing reliance on platforms and APIs, some of the experts on the panel stated that some of the APIs of commercial vendors are not well documented (or not as well as the consumers would like them to be), which in some cases makes it very complex to pull files and generate reports.
6.4 Provenance
Provenance (or traceability), according to the expert panel, is the extent to which a threat intelligence consumer is able to track the evolution of a piece of threat data or intelligence, including its origins and the process through which it was produced. Moreover, it ensures the integrity of the intelligence during the iterative revisions in collaborative networks. For example, it makes it possible to trace which participant made changes at which point in the production of the intelligence. Good provenance allows the consumer to trace the intelligence in the same way the author did, based on the evidence provided. This includes the source of information at a granular level and detailed documentation of the processes through which it was produced. A provenance chain would also show what analytical models were used and what hypotheses were tested in formulating the conclusion or assessment.
Provenance is also seen as an important factor in establishing trust and determining usefulness of a vendor. It allows the consumer to identify vendors that merely serve as information aggregators giving rise to issues such as circular reporting. However, establishing provenance is a complex problem, and the community is yet to establish a standardised provenance process, which could be the reason why this dimension is often overlooked.
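One way a provenance chain can be realised, sketched here purely as an illustration since (as noted above) no standardised provenance process has been established, is an append-only log in which each production step is hash-linked to the previous one, so that later tampering is evident. All field names are hypothetical.

```python
import hashlib
import json

# Toy sketch of a provenance chain: each production step records who did
# what and is hash-linked to the previous entry, so tampering with an
# earlier step breaks the chain. Field names are illustrative only.
def add_step(chain: list, actor: str, action: str) -> None:
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    entry = {"actor": actor, "action": action, "prev": prev_hash}
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    chain.append(entry)

chain: list = []
add_step(chain, "sensor-17", "recorded raw log (threat data)")
add_step(chain, "analyst-a", "collated logs, noted activity spike (information)")
add_step(chain, "analyst-b", "tested hypothesis, issued assessment (intelligence)")

# Each entry's 'prev' field must match the hash of the entry before it.
assert chain[1]["prev"] == chain[0]["hash"]
```

Walking such a chain backwards lets a consumer retrace the path from a final assessment to the raw data it was built on, which also exposes pure aggregators and circular reporting.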
6.5 Relevance
Overall, the experts agreed that relevance is an important dimension in evaluating the quality of threat intelligence. Noisy and irrelevant data result in wasted resources and time. However, determining what makes a particular piece of data relevant or not depends on the organisation's industry, sector, geography, technologies in use, its assets, and so on.
In addition to the organisation’s business activity, the amount of allocated resources for its threat intelligence programme plays a vital role in determining how tailored the received intelligence needs to be. In big organisations including those operating in multiple sectors, a dedicated internal team collects as much data, information, and intelligence as possible before translating it into intelligence that is relevant for the organisation’s different business entities.
Our experts agreed that irrelevant intelligence can often be attributed to a failure to understand the consumer's requirements. However, intelligence producers sometimes face cases where their clients have an immature threat intelligence programme and are themselves unable to identify their intelligence needs, which further complicates the producers' task.
6.6 Reliability
According to the Delphi experts, determining the reliability of a source is critical in deciding whether or not to rely on the information received. The inferential value of the assessments or conclusions is constantly considered with the reliability of their source in mind. The experts stated that the reliability of a source encompasses other factors including its trustworthiness, authenticity, competence, and objectivity.
Assessing a source’s reliability in the context of threat intelligence is currently a multi-layered process and involves subjective elements. The process, according to the experts, should take into account the following three considerations: the historical reliability of the source for similar incidents or topics including its own confidence in the intelligence; the intelligence’s consistency with known facts or confirmed post-mortem findings; the intelligence’s consistency with information gleaned from other sources. Therefore, perceived reliability in a source is established over time, throughout the progression of the analyst’s experience and encounters with the source, as well as the organisation’s overall experience. A source with a demonstrated track record of accurate and credible reporting is perceived as more reliable than an unknown source. For example, experts reported high levels of perceived reliability of official sources such as national intelligence agencies and national CERTs, as opposed to a new or untested vendor.
It should be noted that, despite the importance of source reliability, the experts pointed to the challenge of ensuring source anonymity in some cases, as highlighted by Murdoch and Leaver [42]. They also noted that the reliability of a source is not constant and therefore continuous re-evaluation might be required. This is in line with the findings of Schaberreiter and colleagues, who present a method that allows an organisation to re-evaluate trust in one or more sources every time it receives new threat intelligence [54].
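As a toy illustration of continuous re-evaluation (this is not the method of Schaberreiter and colleagues, merely a simple stand-in), a consumer could maintain a per-source score as an exponential moving average over report outcomes, nudged up when a report is later confirmed and down when it is refuted:

```python
# Toy illustration of continuously re-evaluating a source's perceived
# reliability: a simple exponential moving average over report outcomes.
# This is NOT the method of [54]; it only shows the re-evaluation idea.
def update_reliability(score: float, confirmed: bool, alpha: float = 0.2) -> float:
    """Move the score towards 1.0 for a confirmed report, 0.0 for a refuted one."""
    outcome = 1.0 if confirmed else 0.0
    return (1 - alpha) * score + alpha * outcome

score = 0.5  # an unknown source starts in the middle
for confirmed in [True, True, False, True]:
    score = update_reliability(score, confirmed)
print(round(score, 3))
```

The weighting parameter `alpha` controls how quickly old history is forgotten, mirroring the experts' point that a demonstrated track record matters but reliability must keep being revisited.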
6.7 Timeliness
Timely decision-making is imperative for an organisation's security [14]. Accordingly, the experts overwhelmingly agreed that timeliness is one of the most important dimensions in evaluating the quality of threat data and intelligence. Organisations are constantly looking for new ways to reduce the delay in receiving important information and in acting upon it.
Although the value of intelligence does not drop to zero if it is late, it can diminish significantly. The shelf life of intelligence also depends on the exact type of information being shared. For example, threat data like malware signatures have a relatively short shelf life, as malware is constantly evolving. An organisation responsible for providing those to a consumer needs to ensure they are delivered in a timely manner. However, a more strategic intelligence assessment that looks at 3–6 months of the threat landscape is still timely even if delivered later. The experts also pointed to the inconsistency in the metadata surrounding the intelligence received from some sources. While some sources state both when the data were observed and when they were reported, others fail to do so.
7 Discussion
The analysis of the results of the systematic literature review and the modified Delphi study provided insights on certain dimensions and metrics used to assess the quality of threat data, information, and intelligence. In this section, we discuss the key findings and reflect on the limitations of our research. In doing so we: (1) provide an experts’ ranking of the quality dimensions; (2) discuss why measuring intelligence quality remains a challenging issue; and (3) argue that the nature of quality is inextricably linked with the task of defining requirements.
7.1 Determining Dimensions Priorities
For each of threat data and threat intelligence, we asked the panel's experts to rank (top-to-bottom) the final set of quality dimensions in order of importance. The results, as depicted in Figure 3, show that the importance of a dimension varies depending on whether it is being used to assess the quality of threat data or threat intelligence.
Threat data, as discussed in Section 4, are used to monitor and detect threats and vulnerabilities. The shelf life of threat data is often short and their value diminishes considerably with time. Data feeds are predominantly machine-readable and are expected to be easily processed and integrated into different platforms. Therefore, it is not surprising that the panel considered timeliness, relevance, and interoperability the three most important quality dimensions.
On the other hand, the three most important threat intelligence quality dimensions are considered to be relevance, actionability, and timeliness. As examined in Section 4, many experts stressed that threat intelligence by definition is forward-looking and involves applying hypotheses and making recommendations. Actionable, relevant, and timely intelligence is therefore understandably key to avoiding wasted resources and overburdened security operation centres.
7.2 Measuring Intelligence Quality Remains a Challenging Issue
As part of the cost-benefit analysis of their threat intelligence programmes, organisations would ultimately like to know how intelligence products are being used and which paid source is providing good-quality intelligence products. Discussions with the experts revealed that, despite growing interest in the idea, stakeholders have yet to identify and develop suitable metrics to measure any of the quality dimensions, and most of them do not have a specific process to filter and evaluate threat intelligence based on these dimensions. Moreover, recent research introduced a first set of formal metrics to assess threat intelligence quality, but pointed out that these metrics might be subject to change as more knowledge is gained about threat intelligence processes, platforms, and stakeholders [55].
Simple quantitative metrics like the number of intelligence reports received or the number of IoCs fed into a platform are considered unhelpful in measuring intelligence quality. They are, however, useful for consumers to know how much data they will receive and the quantity of resources required to process it. Traditional intelligence evaluation frameworks like the Admiralty System are used in some cases to indicate the reliability of the source and the credibility of the intelligence. Factors that can negatively or positively affect the evaluation of a source include whether the intelligence has been validated by other sources, whether the threat has materialised, the presence of inaccurate elements, and the source's reputation and history.
The experts also stated that consumer feedback is essential for vendors in order to identify areas for further improvement. Feedback collection ranges from informal conversations to more rigorous and regular processes such as questionnaires attached to intelligence reports. Through these processes, vendors solicit feedback on the quality of their products, systems, and services. Metrics integrated into the platforms (e.g., the number of times a report was read or downloaded) can also be considered. However, despite being an essential part of the traditional intelligence cycle [27], the feedback loop, according to some experts, barely exists in commercial intelligence since few vendors actually implement it, and, where it does exist, it is slow and inefficient.
From a consumer point of view, measuring the quality of a threat intelligence product is intertwined with determining its effectiveness. This is typically achieved by examining the decisions that were made based on the received intelligence and the impact of these decisions on preventing exploitation or reducing vulnerability. The percentage of intelligence that has led to control changes (e.g., deciding to monitor or stop monitoring a specific part of the network), or the number of incidents detected or foiled because of intelligence received from a specific source, are examples of metrics that could be employed to determine the effectiveness of a threat intelligence service. One way to achieve this is by using a tagging system that enables an organisation to track the number and nature of actions and provide it with a direct link back to the source of the intelligence. This would allow it to identify the most used sources and compare their impact to their cost.
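The tagging system described above can be sketched as follows. The log entries and source names are hypothetical; the point is only that tagging each defensive action with the source of the intelligence that prompted it makes per-source effectiveness countable and comparable against cost.

```python
from collections import Counter

# Hypothetical action log: each control change or detection is tagged
# with the intelligence source that prompted it. Entries are made up.
action_log = [
    {"action": "blocked IP", "source": "vendor_a"},
    {"action": "patched CVE", "source": "vendor_b"},
    {"action": "blocked IP", "source": "vendor_a"},
    {"action": "new detection rule", "source": "national_cert"},
]

actions_per_source = Counter(entry["source"] for entry in action_log)

# Sources driving the most actions can then be compared against their cost.
for source, count in actions_per_source.most_common():
    print(source, count)
```

In practice such counts would be combined with the nature and impact of each action, but even this simple tally gives a direct link from outcomes back to sources.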
The study’s panelists stated that threat intelligence vendors and their clients should work together to develop and implement some of these metrics. However, they warned that issues such as privacy, confidentiality, and willingness can hinder the progress of the collaboration. Metrics development and tracking also require dedicated efforts and resources adding more work to already stretched and understaffed cyber security teams.
7.3 Intelligence Quality Is Influenced by Its Iterative Production Process
According to the authors of [17] and [15], the production of threat intelligence can be defined as an iterative process called the intelligence cycle. This process includes: planning, collection, processing, analysis, production, dissemination, and feedback. The last of these occurs continuously throughout the intelligence process and steers it. This iterative process can be crucial for dimensions like accuracy or completeness, as these quality dimensions might be improved through several iterations. Moreover, as the cycle transforms threat data into intelligence, it clearly shows why a differentiation of quality criteria for threat data and intelligence is necessary.
However, recent research and our investigations showed that the current production of threat intelligence lacks a common understanding and implementation of a formal process like the intelligence cycle. In this context, Oosthoek and Doerr describe current threat intelligence as a product without a process [43].
7.4 Determining Quality Is Intertwined with Defining Requirements
A common view amongst the panel experts is that a high-quality intelligence product is one that aligns closely with the consumer's requirements. An organisation does not necessarily need to know which of the vendors provides the most accurate intelligence, but rather which one of them is the most useful, that is to say, which vendor's services satisfy the organisation's requirements and business area. The experts also argue that, in reality, if the provider knows their client's requirements well, then most intelligence products should be of suitable quality. Consequently, producing an irrelevant or unactionable intelligence product is mostly attributed to a failure at the requirements level.
Requirements are ideally captured at the beginning of the interaction and reviewed regularly through ongoing conversations and requirement exercises. Setting the requirements entails understanding what the business wants from the threat intelligence products. This requires vendors to understand the risks and worries the client has, the industry's threat landscape, the underlying infrastructure, and what they can do to try to alleviate some of these concerns or allow the client to make more informed decisions.
However, given that it is a relatively new area, most organisations still do not fully understand how to consume threat intelligence and are, therefore, unable to accurately identify and convey their needs. Organisations with no existing threat intelligence programme expect the provider to fully understand the organisation’s environment including its critical assets, personnel, systems, third party suppliers, and so on. A number of experts also pointed out that expectations from threat intelligence vendors are high and somehow unrealistic, and cautioned that threat intelligence services are not a panacea.
7.5 Limitations
Although the study has several limitations, the decisions taken during the planning and execution attempted to mitigate or minimise their impact.
Similar to any other literature review, the review described in this article is limited by the search terms used and the selected scientific databases. Given that the terms data and information are sometimes used interchangeably, we have included both of them in the search string along with their disjunctions across eight academic databases. Nevertheless, there still can be some relevant publications that have not matched our search criteria.
The process of literature screening and data extraction is inherently subjective and may suffer from authors' bias or differing interpretations. To minimise the bias, the first two authors independently conducted the screening before the results were compared and adjusted. We should note that, while the articles were thoroughly screened, it could be the case that the criteria were too strict or that an article's abstract did not reflect its relevant contribution. Using the identified articles as a foundation, we conducted a snowball search in order to identify additional relevant publications.
As for the Delphi technique, an inherent limitation of the approach is the generalisability of the findings, as it relies on a limited number of experts and carries the possibility of a selection bias. Although we tried to minimise the impact by recruiting a large panel of 30 experts from around Europe, future research might replicate this study with a different panel or use our results as the basis for a different data collection approach designed to test for generalisability. To minimise the burden on respondents and ensure a high response rate, we kept the questionnaire in the second round short by including only a brief description of the quality dimensions. However, this brevity is at odds with the complex nature of the research area and, in some cases, might have caused misinterpretation of the quality dimensions. For that reason, a comment box was included to allow respondents to report any vagueness.
8 Conclusion and Outlook
As the interest in threat intelligence sharing continues to grow, questions about its quality and effectiveness have surfaced. The current academic literature highlights data quality issues as a concern for threat intelligence producers and consumers. However, it falls short of empirically investigating the issue further. This study set out to develop and validate a set of threat intelligence quality dimensions. A systematic review of the threat intelligence literature followed by a modified Delphi study resulted in identifying seven quality dimensions: accuracy, actionability, interoperability, provenance, relevance, reliability, and timeliness. It also examined the literature definitions of threat data, information, and intelligence, and suggested areas for improvement.

The study has found that practitioners' quality priorities vary depending on the nature of the intelligence. It has also shown that, despite increasing awareness of their potential value, organisations are yet to develop concrete metrics to measure any of the quality dimensions, and that they largely rely on consumer feedback and anecdotal evidence.

The generalisability of the findings is limited by the inherent limitations of the Delphi technique. Nevertheless, the study provides empirical insights into the state of threat intelligence quality evaluation and extends our knowledge of what quality attributes are most relevant to practitioners. We lay the groundwork for further research that might explore the relationships between threat data, information, and intelligence, and the identified quality dimensions, and offer a framework for the development of associated threat intelligence quality metrics. Additionally, future work might explore the extent to which the quality and value of threat intelligence intersect to maximise benefits and how quality affects user satisfaction across threat intelligence sharing platforms.