D1.1 Analysis of Current Practices - VIS

SEVENTH FRAMEWORK PROGRAMME
Area ICT-2009.1.4 (Trustworthy ICT)
Visual Analytic Representation of Large Datasets
for Enhancing Network Security
D1.1 Analysis of Current Practices
Contract No. FP7-ICT-257495-VIS-SENSE
Workpackage: WP 1 – Requirements-Specifications-Architecture
Author: UKON, IGD, SYM, CERTH
Version: 1
Date of delivery: M6
Actual Date of Delivery: M6
Dissemination level: Public
Responsible: UKON
Data included from: IGD, SYM, CERTH
The research leading to these results has received funding from the European Community’s
Seventh Framework Programme (FP7/2007-2013) under grant agreement n°257495.
The VIS-SENSE Consortium consists of:
Fraunhofer IGD (Project coordinator), Germany
Institut Eurecom, France
Institut Telecom, France
Centre for Research and Technology Hellas, Greece
Symantec Ltd., Ireland
Universität Konstanz, Germany
Contact information:
Dr Jörn Kohlhammer
Fraunhofer IGD
Fraunhoferstraße 5
64283 Darmstadt
Germany
e-mail: [email protected]
Phone: +49 6151 155 646
Contents
1 Introduction
2 Network Analytics for Security
2.1 Abnormal Network Traffic and Event Detection
2.1.1 Detecting network anomalies
2.1.2 Behavior-based Network Intrusion Detection
2.1.3 Knowledge-based Network Intrusion Detection
2.1.4 Composite Detection
2.2 Correlation Analysis and Alert Correlation
2.2.1 Alert Correlation
2.2.2 Monitoring from several vantage points
2.3 BGP State-of-the-art
2.3.1 Background
2.3.2 Prefix Hijacking
2.3.3 Securing BGP
2.3.4 BGP monitoring
2.3.5 Methods for detecting prefix hijacking
2.4 Analysis of Spam Campaigns
2.4.1 Introduction
2.4.2 IP reputation analysis
2.4.3 Message content analysis
2.4.4 Network-level spam detection
2.4.5 Analysis of scam infrastructure
2.4.6 Analysis of higher-level behaviour of spammers
2.5 Root Cause Analysis and Attack Attribution
2.5.1 Introduction
2.5.2 Investigative and Security Data Mining
2.5.3 Attack Attribution based on Multi-criteria Decision Analysis
2.5.4 Malicious Traffic Analysis and Cyber-SA
3 Visual Analysis for Network Security
3.1 Introduction
3.1.1 Visualization Techniques
3.1.2 Basic Interaction Techniques
3.1.3 Advanced Interaction Techniques
3.1.4 The Results of an Analysis
3.2 Tools for Generic Data Visualizations
3.3 Tools and Methods for BGP Data
3.4 Tools and Methods for Network Traffic Data
3.5 Tools and Methods for IDS Logs
4 Conclusions and Future Work
Abstract
The VIS-SENSE project aims to use a novel combination of visual analytics approaches
and network security analytics to enhance network security. To support the work and
collaboration in those two research domains and their communities, this document provides an overview of both fields, detailing the state-of-the-art techniques, algorithms and
tools that have been developed and are in use. This survey makes it
possible to identify open questions and gaps, which point to novel ways of
achieving the overall goal. We also show that both research domains are still relatively
separated and lack tight integration. In general, most currently available visual analysis
tools focus on very specialized tasks and problems, do not integrate the most advanced
network security approaches and focus on specific data rather than providing means to
correlate different data sources. Therefore, it can be stated that the VIS-SENSE project
is very relevant and a highly promising direction to solve open questions in network
security.
1 Introduction
Today there are hundreds of thousands of different viruses, worms and other malware
spreading through the Internet and infecting unprotected computers all over the world.
Every day the amount of malicious traffic increases, which makes it difficult to keep a
network safe. In most cases computer users do not even know that their machines
are infected. This can lead to unnoticed data theft, or allow criminals
to use the hijacked computer to send spam and unsolicited bulk e-mail, host
phishing web sites and so on. There is therefore a great need to deal
with the massive amount of malicious traffic circulating through the Internet. The VIS-SENSE project will use visual analytics to create a scalable framework for network
security. The idea is to detect and predict very complex patterns of abnormal traffic
in order to prevent computer networks from being hacked or infected. The ultimate goal of the
project is to improve international network security and to help resolve
cyber crime. To achieve these ambitious aims, it is necessary to combine
novel algorithms from the field of network analytics with novel visual analysis techniques
and methods.
These methods make it possible to deal with the massive amount of network data and
the very special tasks of network administrators. The primary goal of visual analytics is
to turn the information overload into an opportunity [281]. Visual analytics makes it
possible to combine information visualization with automatic data mining methods to produce highly interactive software that couples human and machine analysis.
The idea is to combine the strengths of human visual perception and electronic data
processing, exploiting their respective advantages to achieve the most effective results. In
visual analytics the user is fully integrated in the overall analysis process, which gives
users the chance to better understand automatic algorithms and their results and to steer the process in the most promising direction, thus matching the need of analysts
for data exploration. In order to meet these requirements, visual analytics draws tools
from both the information visualization and the data mining communities.
To achieve the overall goal to eventually enhance network security with visual analytics
within the VIS-SENSE project, it is required for all partners to be up to date on the
most recent developments, state-of-the-art techniques and common practices in the fields
of network analytics and visual analysis with the focus on security applications. It is
also necessary to have a baseline when it comes to the final evaluation of the whole
project and to see the advantages of the prospective developed framework. Therefore,
this deliverable summarizes the latest developments in network analytics and visual
analytics with respect to the scope of the project in a survey.
The rest of the document is structured as follows. Chapter 2 covers the topics of
anomaly detection, correlation and attack attribution. Additionally this chapter describes security issues concerning the border gateway protocol (BGP).
In particular, the existing practices on detecting abnormal network traffic and malicious events are investigated. The survey covers Intrusion Detection Systems following well-known paradigms and techniques such as expert systems and other data mining approaches including different clustering and classification algorithms. In addition,
techniques that compose more than one of such methods are discussed. Moreover, a
coarse-grain classification of the systems is provided together with a short summary for
each of the techniques on the advantages and disadvantages.
However, since correctly detecting abnormalities with the discussed intrusion detection systems remains
a challenging research issue, correlation techniques
have been employed to minimize the probability of false positive alarms and undetected
events.
Four basic categories of correlation methods have been identified: based on similarity
between alert attributes, based on predefined attack scenarios, based on attack preconditions and prerequisites as well as post-conditions and consequences and finally
strategies that utilize several heterogeneous information sources. A very interesting
approach that requires particular attention is that based on honeypots.
A special area for deeper investigation has been the security mechanisms for BGP
traffic. Major secure BGP flavors are also discussed that focus on the integrity of the
exchanged BGP messages.
Section 2.4 then discusses the state-of-the-art for the analysis of spam campaigns
as a concrete application area of the VIS-SENSE project. In particular, methods for
analysis of IP reputation, message content, network-level spam detection, and the hosting infrastructure of scam websites that advertise their products through spam are
discussed. Finally, this section outlines how these particularities are then used for abstracting higher-level behaviour of spammers.
For obtaining a more complete overview of the related work relevant to the project,
Section 2.5 details recent publications about root cause analysis and attack attribution.
The ultimate goal of work in this field is to understand the modus operandi of spammers
and attackers in order to develop better security mechanisms. Since current attack
phenomena are largely distributed across the Internet and their lifetimes vary from a few
days to several months, it is a difficult task to attribute different multi-featured attacks
to the same root source. The review of preliminary research in this subfield gives us a
good starting point for future research and development in the scope of the VIS-SENSE
project.
Chapter 3 then covers the state-of-the-art in the field of visual analytics for network
security. First, an introduction to the general field of visual analytics is provided by
explaining basic concepts for visualizing data and describing the most commonly used
visualization techniques in the field of network security. Furthermore, different interaction techniques are described, which turn visual analytics applications into interactive
data exploration tools.
While there are many tools that are custom-built for a particular data set or analysis
task, we first focus on generic data visualization tools in Section 3.2 that can be quickly
used for different kinds of data since those tools are more likely to be used in practice
by network security analysts and researchers.
The state-of-the-art of current research projects, which enhance network security tasks
with visualization techniques, is then discussed in sections 3.3, 3.4 and 3.5 focussing on
visual analysis tools and methods for BGP data, network traffic data and IDS logs.
Thereby, an overview and classification of the most relevant publications in this field is
given.
The last chapter summarizes the findings of this survey and briefly outlines the implications of this survey and future developments on both the project and the network
security field in general.
2.1 Abnormal Network Traffic and Event Detection
2.1.1 Detecting network anomalies
There are two general complementary approaches for detecting network anomalies [130].
The first defines what normal network operation is and develops techniques
to identify deviations from normal cases, the so-called behavior-based techniques. The
second family of approaches defines the attacks directly and
aims to identify them in the observed data, the so-called knowledge-based techniques.
IDSs may be categorized according to different properties and features such as the detection principle, the behavior on detection, the source location, etc. [75], [20]. A further
classification of the two complementary approaches, based on the fundamental principles
of the analysis and investigation, reveals five basic categories:
1. Statistical-based approaches that generate a profile of the stochastic behavior of the
network traffic.
2. Expert systems that are trained with rules to produce a certain output about the
network state given a particular input.
3. Machine learning approaches that initially require training with input data to
identify normal network operation, building internal structures to represent
such information. Afterwards, they identify intruders by evaluating their actions
and raising alarms when they differ from the range of values on which they have been trained.
4. Pattern matching approaches that employ pattern matching (typically string matching) techniques to identify abnormalities.
5. State-based transition approaches where finite state machines are used to represent the potential states of the network and to identify the set of requirements
for a transition into an abnormal state.
Table 2.1 classifies several state-of-the-art intrusion detection platforms that contain
behavior-based or knowledge-based detection modules. The majority of the investigated event detection methods fit into one of these categories; however, a number of
them combine techniques from several categories. In order to cover the latter methods,
the composite detection class is proposed. Table 2.1 also shows that the surveyed intrusion detection platforms are subdivided into two categories: self-learning and
learning-based. Self-learning refers to systems that learn by example what represents
normal network operations. Learning-based techniques must be taught how to
identify particular abnormal patterns of network traffic.
2.1.2 Behavior-based Network Intrusion Detection
While expert systems have been the initial way of developing behavior-based network
intrusion detection systems, their enormous requirements led to the investigation of
methods based on neural networks, wavelets, Markovian models, Bayesian networks,
genetic algorithms, etc.
Expert Systems
Expert systems were the first approaches to deal with anomaly detection, where
normal operation is described in terms of rules that are stored in databases. The monitored
traffic is processed by the rule-based system and alerts are raised when low-weighted
matches are detected. The three major steps of rule-based traffic classification are identification of attributes and classes, deduction of classification rules and parameters and
audit data classification.
LERAD [199] is a rule-based algorithm for finding rare events in time-series data with
long range dependencies, which has been used for detecting anomalies in network traffic.
LERAD uses association mining to find syntactic associations between attributes.
More particularly, the anomaly score for each record is estimated based on the unsatisfied
rules and the time since each particular rule was last violated. However, LERAD lacks
a mechanism for distinguishing between correct and false alarms, and it lacks the ability
to detect untrained anomalies. Another expert system, which employs fuzzy cognitive
maps (FCM) for network anomaly detection, is described in [217]. FCM is a visual model
for encoding and processing unclear causal reasoning that, via its dynamic properties, is
able to represent time-varying characteristics of network anomalies. A matrix is used for
detecting propagation of events over managed network components, providing a causal
inference representation mechanism.
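The scoring idea behind LERAD can be illustrated with a minimal sketch. It assumes, beyond what is stated above, that each violated rule contributes t·n/r to the score, where t is the time since the rule was last violated, n is the number of training records matching the rule's antecedent and r is the number of distinct values the rule allows. The `Rule` class, its field names and the sample values are hypothetical and serve illustration only.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    antecedent: dict   # attribute -> value that must hold for the rule to apply
    target: str        # attribute constrained by the rule
    allowed: set       # values of `target` seen in training (r = len(allowed))
    n: int             # number of training records matching the antecedent
    last_violation: float = 0.0  # time of the most recent violation


def anomaly_score(record: dict, rules: list, now: float) -> float:
    """Sum t * n / r over all rules that apply to the record but are violated."""
    total = 0.0
    for rule in rules:
        applies = all(record.get(k) == v for k, v in rule.antecedent.items())
        if applies and record.get(rule.target) not in rule.allowed:
            t = now - rule.last_violation
            total += t * rule.n / len(rule.allowed)
            rule.last_violation = now  # a repeated violation scores lower
    return total


rules = [Rule({"port": 80}, "method", {"GET", "POST"}, n=100)]
print(anomaly_score({"port": 80, "method": "GET"}, rules, now=10.0))
print(anomaly_score({"port": 80, "method": "XYZ"}, rules, now=10.0))
print(anomaly_score({"port": 80, "method": "XYZ"}, rules, now=12.0))
```

Under this assumed weighting, a rule that has held for a long time and is backed by many training records produces a high score when it is first broken, while a burst of repeated violations is scored progressively lower.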
While expert systems provide robustness, flexibility and high-quality
knowledge, they typically achieve these through time-consuming and demanding processes. More generally, anomaly detection systems face the challenging requirement of defining normality, which calls for exhaustive training with relevant data.
Table 2.1: Behavior and knowledge based system classification for abnormal event detection.
Class | Learning mode | Technique | References
Behavior-based | Learning based | Expert systems | [199], [217]
Behavior-based | Learning based | Supervised machine learning approaches | [122], [37], [73], [47], [249], [110], [213], [159], [214], [150], [80], [95], [196], [79], [58], [191], [22], [113], [329], [120], [72], [89], [115], [102], [85], [49], [202], [256], [189], [83], [114], [23], [41], [42]
Behavior-based | Self-learning | Statistical-based approaches | [77], [78], [44], [215], [197], [99], [127], [129], [324], [288], [300]
Behavior-based | Self-learning | Unsupervised machine learning approaches | [179], [128], [245], [86], [123], [135], [162], [169], [325], [319], [280], [88], [172]
Knowledge-based | Learning based | Pattern matching approaches | [134], [94], [52], [170], [231]
Knowledge-based | Learning based | Expert Systems | [18], [101], [190]
Knowledge-based | Learning based | State transition and Petri-net modeling | [299], [136], [240], [173], [172]
Composite techniques | Self-learning | Hybrid approaches | [133], [233], [193]
Machine Learning Approaches
Machine learning approaches for network anomaly detection include methods based on
neural networks, support vector machines, fuzzy systems, genetic algorithms, etc. The
common goal of these techniques is establishing an explicit or implicit model enabling
pattern categorization.
Neural Networks The most challenging issue with a Neural Network (NN) is to train
it properly and set the coefficients to their optimal values for the specified input and
output. The general approach for IDSs is to initially train the NN with
normal data as well as attack patterns.
An early attempt to use NNs for IDS development is described in [122], where
a hierarchical approach combines NNs with hidden Markov models (HMMs). An NN-based IDS built on well-known intrusion profiles is provided in [37]. An NN based on
a statistical model, focusing on the architecture design of an expert system, is given in
[73]. An NN based on Self-Organized Maps (SOMs), combined with
a multi-level perception mechanism to detect attack patterns, is described in [47]. Another NN-based system using SOMs is the Integrated Network-Based Ohio University Network Detective
Service (INBOUNDS) [249]. Six relevant parameters are used to characterize network
connections. The SOM uses structures with two-dimensional lattices of neurons, and a
predefined threshold is used to decide whether an activity is an attack. However, it has
limitations in identifying well-covered attacks, and corner-case behavior can give false
positives. According to [110], training NNs with random data gives the best possible
results in detecting unknown attacks; using a recurrent NN, that work generalizes
the results from particular users to categories. Nevertheless, NNs do not provide a
descriptive model clarifying why a certain detection decision has been taken. Finally,
a comparison between NNs and support vector machines (SVMs) is provided in [213],
where it is concluded that SVMs are superior for IDS development since they are more
effective in training and operation and have better accuracy and scalability.
Support Vector Machines As stated above, Support Vector Machines (SVMs) are
promising learning techniques used for categorization and regression in network anomaly
detection. SVMs are a supervised learning method trained with normal data and afterwards used for detecting anomalies [159]. In [214], the superiority of SVMs over NNs
is investigated, as mentioned above. In addition, that work deals with the
feature selection issue for SVMs, demonstrating that SVMs can achieve the same performance as NNs using a smaller number of features. The investigation was based
on the Knowledge Discovery in Data Competitions (KDD Cup) dataset [150], using
five SVMs in total. While the SVM-based method achieved 99% accuracy, the NN-based method
achieved only 87% and required more time. However, if the training data contain traces
of intrusions, SVMs will not be able to detect those intrusions in future attempts since they are
considered normal cases.
Fuzzy Logic Techniques In order to overcome the limitations of deterministic reasoning, fuzzy systems promote a probabilistic framework where reasoning is approximate
and not precise. Such reasoning approaches fit nicely with the fuzzy variables related
to network anomaly detection where normal and abnormal events are distinguished by
the values of the variables (lying in given intervals). The Fuzzy Intrusion Recognition
Engine (FIRE) [80] is a fuzzy logic based IDS using data mining for classification of the
information and representing the discovered metrics as fuzzy sets. Well-known scenarios
are expressed as fuzzy rules that are applied to the collected data in order to provide the
output reasoning and classify them as normal or malicious. An algorithm for calculating fuzzy relationship rules based on Borgelt’s prefix trees is described in [95] involving
feature selection, genetic algorithms based optimization and improved confidence fuzzy
rules. Abnormal events are generated when the similarity between the trained fuzzy
sets and the ones under evaluation is beyond a threshold. Such a system achieves
appealing accuracy by means of modified data mining algorithms. The Intelligent
Intrusion Detection Model (IIDM) [196] is another IDS based on fuzzy logic. The system involves a normalization step for balanced mining of fuzzy associations and is afterwards
applied to learn fuzzy frequency episodes, where the selected similarity function
is continuous and monotonic. Nevertheless, a critical shortcoming of fuzzy systems is the
requirement for costly offline analysis. A system that deals with this limitation is presented in [79], integrating fuzzy logic and genetic algorithms to select the
best possible fuzzy rules.
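To make the notion of fuzzy rules over network variables more concrete, the following minimal sketch evaluates a single hand-written rule. It is not taken from FIRE, IIDM or any other cited system: the variables, the interval bounds and the choice of the minimum operator for the fuzzy AND are assumptions made for illustration.

```python
def ramp_up(x: float, low: float, high: float) -> float:
    """Membership that is 0 below `low`, 1 above `high`, linear in between."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def port_scan_degree(syn_per_sec: float, distinct_ports: int) -> float:
    """Degree to which 'SYN rate is HIGH AND many ports are probed' holds.

    The interval bounds below are hypothetical, not values from the survey.
    """
    high_syn_rate = ramp_up(syn_per_sec, 50, 200)
    many_ports = ramp_up(distinct_ports, 10, 100)
    # The fuzzy AND is modelled with the minimum operator.
    return min(high_syn_rate, many_ports)


print(port_scan_degree(125, 100))
print(port_scan_degree(10, 500))
```

Instead of a hard yes/no decision, the rule yields a degree between 0 and 1, which can then be compared against a threshold or combined with other rules.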
Genetic Algorithms and Immunological techniques Genetic algorithms provide another possibility for developing IDSs. Such evolutionary algorithms are considered global
search heuristics able to detect novel network attacks, capitalizing on features such as
inheritance, mutation, selection and recombination. An approach combining genetic
algorithms with a decision tree is provided in [58] achieving high detection rate and
low false positives over unknown gathered data capitalizing on the fact that malicious
events are inherently different from normal ones. However, the developed system has
scalability limitations. Another approach inspired by genetic algorithms is described in
[191] where both temporal and spatial information is considered for building the fuzzy
rules. The evaluation function gives high weight to source and destination IP addresses,
as well as the duration and less to the utilized communication protocol or source port
number. Crossover and mutation techniques are employed for the natural reproduction and mutation of the species where the fittest chromosomes are selected. A similar
technique is given in [22] where genetic algorithms are applied on TCP sessions where
packets of the same session are considered as sequences. The ROCK algorithm [113]
uses dynamic programming and it is employed to cluster the sequence features of the
TCP sessions. The compact clustered information provides a knowledge space in which
normal and abnormal scenarios can be identified more effectively. Genetic algorithms are an interesting approach to detecting network anomalies since they are able to identify unknown
attack patterns; however, they are considerably resource-demanding. A system that
partly addresses this issue is described in [329]; it combines clustering techniques
with genetic algorithms, resulting in a high detection rate and reduced false positives in a
resource-effective way. Particular attention has also been given to immunological techniques. Such an IDS is presented in [120], where a set of immunological
techniques are employed. More specifically, it includes permutation masks to amplify the
detection of false negatives, activation thresholds to aggregate activity over time, and
adaptive thresholds to integrate patterns from several points. The r-contiguous bits matching
rule is used to compare incoming connections with classified ones, and matched connections are considered anomalous. A negative selection mechanism for distinguishing
foreign patterns in the complement space is presented in [72], where a set of fuzzy rules
is generated for differentiating abnormalities in network traffic using genetic search algorithms. A comparison between Positive Comparison (PC) and Negative Comparison
(NC) approaches showed that the latter, while less accurate, are more effective in terms
of resource requirements. Another immunologically inspired approach is presented in [89],
where it has been observed that introducing synthetic abnormalities into the original
data considerably improved the discovery of malicious anomalies, even unknown ones. However, immunological techniques may miss evident attacks and occasionally produce false
positive alerts, as discussed in the evaluation of the Lightweight Intrusion Detection System
(LISYS), developed over the Artificial Immune System (ARTIS) [115]. Systems based on genetic algorithms may face scalability limitations; however, they are able to
effectively identify previously unobserved attacks.
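The r-contiguous bits matching rule mentioned above can be sketched as follows: two equal-length bit strings match if they agree in at least r consecutive positions. The encoding of connections as bit strings and the detector set are hypothetical; this is an illustration of the matching rule only, not of the full system in [120].

```python
def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """True if `a` and `b` agree in at least `r` contiguous positions."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    if r < 1:
        raise ValueError("r must be at least 1")
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def is_anomalous(connection_bits: str, detectors: list, r: int) -> bool:
    """A connection is flagged if any detector matches it under the rule."""
    return any(r_contiguous_match(connection_bits, d, r) for d in detectors)


# The two strings agree on positions 1..5, i.e. a run of five bits.
print(r_contiguous_match("10110100", "00110111", 5))
print(r_contiguous_match("10110100", "00110111", 6))
```

The parameter r controls how general each detector is: a small r lets one detector cover many connections at the cost of more false matches, while a large r makes detectors more specific.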
Clustering and Outlier Detection Unsupervised techniques such as clustering and outlier detection identify abnormal data considering their deviation from the normal ones.
Such an alert clustering mechanism is presented in [102]; it operates in real time,
considering that different network sensors may produce different reports and that similar reports may be generated by a particular sensor for different network events. A geometric
framework for unsupervised anomaly detection is presented in [85] where events are represented by d-dimensional vectors. Anomalies are identified as points lying in non-dense
areas of that d-dimensional space. A number of mapping methods are evaluated within
that work, one data-dependent for the network connections and one based on a spectrum kernel. The advantage of this technique is that it operates effectively on unlabeled
data. Constant width and k-nearest neighbor clustering algorithms are employed in [49]
investigating connection logs for anomalies in the network traffic. A hybrid approach is
described in [202], where examining user profiles, expert rules are applied to decrease
data dimensionality and afterwards, a clustering mechanism based on Learning Vector
Quantization (LVQ) provides a categorization of the data. Taking advantage of the
fact that LVQ is a nearest neighbor method, abnormal events can be easily identified
without the requirement to train the system a priori with network anomalies. A successful classification rate of about 80% is possible with this system. ADMIT [256] uses
semi-incremental methods to detect non-legitimate users of computer terminals. It
introduces the concepts of dynamic training and dynamic clustering to deal adaptively
with unobserved classes by creating new clusters as new data is captured. fpMAFIA
[189] is a density- and grid-based high-dimensional clustering method for large amounts
of data. fpMAFIA can generate clusters of arbitrary shapes and attains a high
detection rate; however, it suffers from a high false positive rate. In addition, that work
compares fixed-width clustering algorithms with density-based ones. The density-based
clustering algorithm has the built-in limitation that it cannot effectively categorize points that lie in
sparse areas. In general, clustering algorithms may require a long convergence time to reach a
stable categorization. Moreover, statistical dependencies among raw data are not effectively represented using clustering methods, so such correlations may not be revealed. To
achieve local convergence effectively, the SFK-means approach combines fuzzy logic and
swarm intelligence algorithms [83]. The training phase produces improved classification
on each repetition, while Euclidean distances are employed for the anomaly detection
phase. In addition, mixture models are alternative clustering approaches focusing on
modeling aspects. A finite Gaussian mixture model [114] is employed for approximating stochastically the maximum likelihood using the Expectation-Maximization (EM)
method. Anomalous events are identified based on the fact that they demonstrate rapid
mean value changes, while the baseline random variable is stationary having zero mean.
ADAM (Audit Data Analysis and Mining) [23] is a testbed to research which data mining
techniques are appropriate for IDSs.
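The idea that anomalies are points lying in sparse regions of a feature space can be illustrated with a minimal k-nearest-neighbor sketch. It is not the algorithm of any cited work: the score (mean Euclidean distance to the k nearest neighbors) and the sample feature vectors are assumptions chosen for brevity.

```python
import math


def knn_scores(points: list, k: int) -> list:
    """Mean Euclidean distance from each point to its k nearest neighbors.

    A large score means the point lies in a sparse region of the space.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical 2-dimensional feature vectors; the last one is far from the rest.
events = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_scores(events, k=2)
print(scores.index(max(scores)))
```

No labels are required, which mirrors the advantage noted above for unsupervised methods. The quadratic cost of comparing all pairs is acceptable for a sketch but would need indexing or sampling for real traffic volumes.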
A graph-theoretic approach for detecting abnormal network traffic is presented in [41],
where networks are represented as graphs with relevant properties at nodes sampled at
regular time intervals. The states of graph snapshots construct a space where differences
demonstrate the network changes as events occur. If the calculated distance between
two subsequent states is larger than a threshold, an abnormal event alert is generated.
More particularly, graphs with unique node labels are utilized for lower computational
complexity on graph operations. A concept called median graph is used to measure
the similarity of graphs [42]. The median of a set of graphs is a graph that minimizes
the average edit distance to all graphs in the set. The complete set of graph distances
is applied on the graphs using a multidimensional scaling (MDS) method to associate
events on the network, thus providing a scatterplot-based visualization method to present
anomalies.
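The thresholding of distances between consecutive graph snapshots can be sketched as follows. The sketch assumes that, for graphs with unique node labels, the edit distance reduces to the number of nodes and edges present in exactly one of the two graphs; it further assumes directed edges and unit edit costs. The representation of a snapshot as a pair of sets and the sample data are hypothetical.

```python
def edit_distance(g1: tuple, g2: tuple) -> int:
    """Edit distance between two graphs with unique node labels.

    Each graph is a pair (nodes, edges), with edges given as (src, dst) tuples.
    With unit costs, the distance is the number of nodes and edges that occur
    in exactly one of the two graphs (size of the symmetric differences).
    """
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    return len(nodes1 ^ nodes2) + len(edges1 ^ edges2)


def detect_changes(snapshots: list, threshold: int) -> list:
    """Return (index, distance) for snapshots that differ strongly from
    their predecessor."""
    alerts = []
    for i in range(1, len(snapshots)):
        d = edit_distance(snapshots[i - 1], snapshots[i])
        if d > threshold:
            alerts.append((i, d))
    return alerts


g0 = ({"a", "b", "c"}, {("a", "b"), ("b", "c")})
g1 = ({"a", "b", "c"}, {("a", "b"), ("b", "c"), ("a", "c")})
g2 = ({"a", "d", "e", "f"}, {("a", "d"), ("d", "e"), ("e", "f")})
print(detect_changes([g0, g1, g2], threshold=3))
```

Because node labels are unique, no graph matching is needed and the distance is computed with plain set operations, which is the reason for the lower computational complexity noted above.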
Statistical-based Approaches
The aforementioned approaches are heavily dependent on the state of the network where
they are trained or configured. Significant changes on the network state require retraining
of the system to operate effectively. In contrast, involving online learning and statistical
techniques allows constant monitoring of the network state. An important discriminative feature between statistical anomaly detection and machine learning techniques is
that statistical methods mainly focus on the statistical investigation of the gathered
data, whereas machine learning methods focuses on the learning procedure. The general
process of statistical methods for network anomaly detection are first to preprocess and
filter the raw data, afterwards to perform the statistical analysis and the transformation
of the data, and finally, to check whether conditions and thresholds are met to raise an
anomaly alert. A significant amount of research focuses mainly on the second step to
distinguish normal operation from anomalous behaviors and noise. Some early statistical
approaches for network anomalies detection employed univariate models with Gaussian
random variables [77], [78]. An auto-regressive process based technique is presented in
[44] where applying Statistical Tests for Causality on the data from a Management Information Base (MIB) they derive information about the attacks. A similar IDS that
uses Adaptive Regression Splines is described in [215]. More particularly, Multivariate
Adaptive Regression Splines (MARS) are compared with SVMs and NNs. It is reported
that MARS is superior to SVMs for classifying significant attack classes and that SVMs
are superior to NNs regarding scalability, accuracy as well as training and execution
time.
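The three-step process described above (preprocess, analyze, threshold) can be illustrated with a minimal sketch of a univariate Gaussian model. This is not the method of any cited work; the function names, the filter rule, the sample values and the threshold of three standard deviations are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def preprocess(samples):
    """Step 1: drop missing or negative readings (hypothetical filter rule)."""
    return [s for s in samples if s is not None and s >= 0]

def zscores(values, baseline):
    """Step 2: express each value as its distance from the baseline mean,
    measured in baseline standard deviations (assumes a non-constant baseline)."""
    mu, sigma = mean(baseline), stdev(baseline)
    return [(v - mu) / sigma for v in values]

def detect(samples, baseline, threshold=3.0):
    """Step 3: return the indices (into the cleaned samples) whose absolute
    z-score exceeds the threshold."""
    clean = preprocess(samples)
    return [i for i, z in enumerate(zscores(clean, baseline)) if abs(z) > threshold]

# Hypothetical packets-per-second readings from a quiet training period.
baseline = [100, 104, 98, 101, 97, 103, 99, 102]
# New readings, including one missing value and one burst.
observed = [101, None, 99, 180, 100]
anomalies = detect(observed, baseline)
```

In this sketch the burst of 180 is the only reading flagged. A fixed baseline is exactly the kind of static configuration that the online techniques discussed in this section try to avoid, so a practical detector would update the mean and deviation as new data arrive.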
Wavelet approaches
Wavelet analysis has been employed to model non-stationary data
series, taking advantage of its time- and scale-localization abilities to identify abnormal
events in traffic traces. For example, wavelet-based analysis techniques are employed in
[197] and applied to network packets in a MIB to generate time series of traffic statistics.
The system mostly aims at identifying correlations between mis-configured traffic
and Retransmission Time-Out (RTO) events (which account for up to 33% of network
malfunctions) rather than attacks, and it also generates related signatures. Another wavelet-based
decomposition method is presented in [99], aiming at rapid network recovery. Providing
scalability and adaptability, the method transforms the problem into the frequency domain,
where anomalies are detected mainly using medium and high frequencies. A fast wavelet
algorithm allows its application to real-time traffic. Moreover, aiming at
better performance than pattern matching methods, Waveman [127] is a wavelet-based
framework that uses percentage deviation and entropy to evaluate the performance
of various wavelet algorithms. It is concluded that Coiflet and Paul wavelets
based on a five-minute, sixty-sample window are among the best for detecting network
anomalies. Moreover, WIND [129] is a prototype tool for Wavelet-based INference for
Detecting network performance problems. WIND relies solely on passive packet properties
coming from a single observation point, where time, scale and destination-based
inter-relations among packets are detected and structured using wavelet algorithms. A
covariance-matrix modeling and detection method is described in [324], where second
order features are employed. In this approach, statistical covariance matrices are used to
represent normal network traffic conditions, and classification takes place using a threshold
matrix generated from the Chebyshev inequality. Attacks are detected by estimating
the difference to the categorized data. Such a method does not make any assumptions
about the distribution of data. Wavelet approaches provide interesting scalability features;
however, they require complex mathematical models.
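To make the idea of time and scale localization concrete, the following sketch applies a single level of the unnormalized Haar transform to a short traffic series and flags sample pairs whose detail (high-frequency) coefficient is large. It is a deliberately simple stand-in for the Coiflet, Paul and other wavelets used in the cited works; the function names, traffic values and threshold are assumptions.

```python
def haar_step(signal):
    """One level of the unnormalized Haar transform on an even-length signal.
    Returns (approximation, detail): pairwise averages and half-differences."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail

def flag_pairs(signal, threshold):
    """Return the indices of sample pairs whose detail coefficient is large,
    i.e. where the signal changes abruptly within the pair."""
    _, detail = haar_step(signal)
    return [i for i, d in enumerate(detail) if abs(d) > threshold]

# Hypothetical per-interval packet counts with one sudden spike.
traffic = [10, 11, 10, 12, 11, 40, 12, 11]
suspicious = flag_pairs(traffic, threshold=5)
```

A single level only sees changes inside a pair, so a spike that raises both samples of a pair equally would be missed. Repeating `haar_step` on the approximation output examines coarser scales, which is the multi-resolution property that the approaches above exploit.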
PCA methods
An unsupervised statistical-based method that employs Principal Component
Analysis (PCA) over global traffic matrix statistics is presented in [179]; it
utilizes entropy as a metric to explore feature distributions and their structure. It is
observed that such a method achieves an effective classification scheme in an unsupervised
manner, enabling the discovery of known and unknown anomalies. However, it is
assumed that data processing takes place offline, which raises scalability issues for
scenarios that pose real-time requirements. Another PCA-based approach is discussed
in [128], suggesting a scheme that relaxes the need to centralize the available data by using
filtering methods in order to achieve scalability. In addition, a stochastic matrix perturbation
technique is employed to reduce the possibility of false alerts, providing the means to
trade off accuracy against communication effort. Independent Component Analysis
(ICA) is a similar approach presented in [245] that is able to split traffic into normal and
abnormal components based on blind source separation. A scale-space filter is utilized
to reduce the noise and a zero-crossing technique is employed to mine the stochastic
behavior pulse widths in order to select the largest as indicators for the behavior. These
indicators assist in detecting abnormal events. ICA does not require supervised learning.
PCA-based methods are able to extract interesting features from the monitored data;
however, they have some limitations in their scalability.
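The use of entropy as a summary of a feature distribution, mentioned above for [179], can be sketched as follows. The sketch covers only the entropy step, not the PCA stage, and the function name and the address values are assumptions for illustration.

```python
from collections import Counter
from math import log2

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return sum(-(c / total) * log2(c / total) for c in counts.values())

# Hypothetical destination addresses seen in two measurement windows.
normal_window = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
flood_window = ["10.0.0.9"] * 7 + ["10.0.0.1"]

h_normal = shannon_entropy(normal_window)
h_flood = shannon_entropy(flood_window)
```

When traffic concentrates on a single victim the destination-address entropy drops, whereas a scan across many hosts or ports makes the corresponding entropy rise. Time series of such entropy values per feature are the kind of input over which a PCA-based detector could then operate.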
BN methods
In addition, methods that employ Bayesian Networks (BNs) and Hidden
Markov Models (HMMs) are interesting approaches that have been investigated [86]. In
particular, BNs are models able to capture the statistical dependence or causal relations
between variables and abnormalities. The application of BNs to MIBs is presented in
[123], where normal operation is captured in the structure of a BN in order to detect
unknown anomalies when deviations occur. A Web-based automated network
anomaly detection approach is described in [135], aiming to address issues in multi-tier
systems by utilizing a BN solution. More particularly, sequences of graph models
represent the offered Web services and their dependencies as they vary over time. A
feature vector is extracted from the adjacency matrix and the principal eigenvector of
the graph eigenclusters is calculated. Anomalies are detected by observing irregular
changes in the graph sequences. S3 [162] is a BN-based algorithm able to detect network
anomalies. S3 aims to address short-lived anomalies by employing fine-grained
timestamps on inputs such as traffic volumes, correlated packets and session bit rates.
Combined, these signals provide sufficient information to detect anomalies with
higher accuracy compared to time series-based and wavelet-based methods. Furthermore,
the suitability of BNs to reduce false alerts is discussed in [169]. BNs provide
an advanced mechanism compared to unsophisticated aggregation methods that employ
a single threshold to make decisions. Moreover, BNs allow the natural combination
of information originating from different sources. However, the utilized model makes strong
assumptions about the behavior of the target system.
HMM methods
An approach based on multivariate models that consider the correlations
between two or more metrics is discussed in [325]. It capitalizes on HMMs and the
maximum likelihood principle to deal with dynamic features, while for the static ones
it utilizes frequency distributions and minimum cross entropy. The multivariate models
are suitable for experimentally collected data coming from multiple sources since they
demonstrate enhanced discrimination features on them. In addition, using subtractive
clustering and HMMs, normal-anomaly patterns are produced for network traffic [319]
that assist in correlating the observation sequences and network state transitions, thus
detecting intrusion activities. A combination of HMM algorithms with models of user
behavior is presented in [280]. A related piece of work investigating HMMs, presented
in [88], focuses on TCP session sequences. It quantizes and models TCP headers
as Markov chains to represent the dynamics of the protocol; then it distinguishes normal
from abnormal behavior. HMMs improve the network detection accuracy by reducing
false alerts. Nevertheless, HMM-based approaches are complex procedures requiring
long processing times, which makes them appropriate only for offline processing [172].
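The idea of modeling quantized TCP headers as a Markov chain can be sketched with a first-order chain over TCP flag symbols. Note that this is a plain (fully observable) Markov chain rather than a hidden Markov model, and that the alphabet, the training sessions and the additive smoothing are assumptions for illustration rather than the procedure of [88].

```python
from collections import defaultdict
from math import log

def train(sequences, alphabet, smoothing=1.0):
    """Estimate first-order transition probabilities with additive smoothing,
    so that unseen transitions keep a small non-zero probability."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    model = {}
    for prev in alphabet:
        total = sum(counts[prev].values()) + smoothing * len(alphabet)
        model[prev] = {cur: (counts[prev][cur] + smoothing) / total for cur in alphabet}
    return model

def avg_log_likelihood(model, seq):
    """Average log-probability per transition; low values indicate that the
    sequence is unlike the training data. Requires at least two symbols."""
    steps = list(zip(seq, seq[1:]))
    return sum(log(model[p][c]) for p, c in steps) / len(steps)

ALPHABET = ["SYN", "SYN-ACK", "ACK", "FIN", "RST"]
# Hypothetical, heavily simplified "normal" sessions.
normal_sessions = [["SYN", "SYN-ACK", "ACK", "ACK", "FIN", "ACK"]] * 20
model = train(normal_sessions, ALPHABET)

normal_score = avg_log_likelihood(model, ["SYN", "SYN-ACK", "ACK", "FIN", "ACK"])
odd_score = avg_log_likelihood(model, ["SYN", "SYN", "SYN", "RST", "SYN"])
```

A session whose score falls below a threshold chosen on normal traffic would be reported as abnormal. In this sketch the repeated SYN sequence scores clearly lower than the ordinary session.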
Statistical sequential change-point detection methods
A different method of designing IDSs is based on statistical sequential change-point
detection, which estimates the deviation of a measured sample from the normal behavior
using distance metrics based on
L-norm, Hamming or Manhattan distance, etc. More particularly, in [288] a statistical
signal processing method is presented where it is assumed that traffic variables are
quasi-stationary. In this work it has been observed that many false alarms are due to bursts
in traffic. In the utilized MIBs there are abrupt changes in a correlated manner. Fine
granularity sampling provides useful results by means of a "network health" function
that indicates anomalies in the network. A related technique is presented in [300]
for identifying SYN flooding attacks at edge routers. By applying sequential change-point
detection, with a non-parametric cumulative sum method, to the differences between TCP SYN
and FIN pairs, modeled as a stationary ergodic random process, it is possible to
detect irregular behavior. Nevertheless, the aforementioned approaches assume a
quasi-stationary or stationary process to model the network dynamics, an assumption that
does not always hold. Concluding, statistical-based approaches have several advantages for
developing IDSs, such as the relaxed requirement for prior knowledge. However, such approaches
may be trained by attackers to consider attacks as normal. Moreover, fine-tuning them to
minimize false positive or false negative alerts is a complex procedure. Finally, in several cases
it is not possible to model variables and system behaviors with stochastic means.
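The non-parametric cumulative sum (CUSUM) idea mentioned above for SYN flooding detection can be sketched as follows. The input is assumed to be a normalized difference between SYN and FIN counts per observation interval; the function name, the drift, the threshold and the sample values are hypothetical and not taken from [300].

```python
def cusum_alarm(series, drift, threshold):
    """One-sided CUSUM: accumulate the excess of each sample over `drift`,
    never letting the sum fall below zero, and return the index of the first
    sample at which the sum exceeds `threshold` (or None if it never does)."""
    s = 0.0
    for i, x in enumerate(series):
        s = max(0.0, s + x - drift)
        if s > threshold:
            return i
    return None

# Hypothetical normalized (SYN - FIN) differences per interval.
quiet = [0.1, 0.0, 0.2, 0.1, 0.0, 0.1]
flood = quiet + [0.9, 1.1, 1.0, 1.2]

first_alarm = cusum_alarm(flood, drift=0.5, threshold=1.2)
```

Under normal conditions SYNs and FINs roughly balance, so the sum stays at zero. During a flood the unanswered SYNs push the sum upwards, and in this sketch the alarm is raised at the third flooded interval. The drift and threshold trade detection delay against false alarms.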
2.1.3 Knowledge-based Network Intrusion Detection
Pattern Matching Approaches
Pattern matching techniques were among the earliest approaches for knowledge-based network
intrusion detection systems. However, such approaches introduced a number of issues
such as low processing capacity, high rates of false alerts, inability to identify unknown
misuses and the requirement for explicit signatures for each attack. Commercial products
such as ISS [134] match network data to predefined sets of patterns. An IDS that
investigates string matching algorithms to detect security breaches is presented in [94].
The proposed algorithm is compared with other known string matching algorithms such
as Aho-Corasick and Boyer-Moore using the Snort platform [244]. Another pattern
based IDS is described in [52] paying particular attention to reducing the processing
requirements by disabling the pattern matching mechanism in periods where no traffic
changes are noticed. The latter changes are identified by employing a time series analysis.
However, this method does not work on links with high amounts of traffic. A different
approach aiming to reduce the matching processing requirements is described in [170]
where an ID3 clustering method is utilized for that purpose. The generated decision trees
are employed to optimize the rules-to-input comparison avoiding redundant operations to
detect malicious behaviors. Compared with the Snort open source platform
[244], that system is reported to give improved results. Bro [231] is a real-time IDS that employs
an event engine for grouping traffic into high-level events and a policy script interpreter
to define security policies. Attacks are detected using string comparison operations. In
general, while pattern matching approaches are simple, they face difficulties dealing with
evolving networks and traffic conditions in a scalable manner.
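As an illustration of multi-pattern string matching, the sketch below implements the Aho-Corasick algorithm named above: all signatures are compiled into one automaton and the payload is scanned in a single pass. It is a textbook-style sketch, not the implementation used by Snort or by [94], and the signatures and payload are hypothetical.

```python
from collections import deque

def build_automaton(patterns):
    """Build the goto table, failure links and output lists of Aho-Corasick."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognized on reaching each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first traversal so that shallower states are completed first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def search(text, automaton):
    """Return (start_index, pattern) for every occurrence of every pattern."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits

# Hypothetical signatures and payload.
signatures = ["/etc/passwd", "cmd.exe", "passwd"]
payload = "GET /cgi-bin/view?file=../../etc/passwd HTTP/1.0"
matches = search(payload, build_automaton(signatures))
```

The scan time grows with the payload length and the number of matches rather than with the number of signatures, which is why this family of algorithms is attractive for signature-based IDSs with large rule sets.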
Expert Systems
Based on rules that describe abnormal behavior, expert systems are able to generate
alerts for network security breaches when fed with transformed audit events. The Next-generation
Intrusion Detection Expert System (NIDES) [18] has been developed to detect
malicious activities on networks. NIDES has been designed based on a foundation for
anomaly detection as well as signature-based components. The system performs statistical
and rule-based analysis of the audit data and presents the results graphically to users.
The State Transition Analysis Technique (STAT) tool [101] models attacks as sequences of
state changes that move the network from a secure state to a compromised one. CRITTER
is a case-based reasoning (CBR) algorithm that is presented in [190]. CRITTER
combines rules and conditions that lead to abnormalities. CBRs are alternatives to rule-based
reasoning (RBR) techniques and carry fewer constraints than the latter.
RBRs can be easily configured to define attacks. CBRs are more effective than
RBRs for scenarios where the system must capitalize on and learn from past experiences
and be able to cope with novel conditions that occur on the Internet. Expert systems
based on the inductive approach produce if-then rules from provided data representing
normal and abnormal cases, and are able to support mechanisms such as unification at a
higher processing cost. Adaptive techniques are required to make CBRs function
in changing environments. The number of functions that is necessary to address
abnormalities scales linearly with the number of faults.
State Transition and Petri Net Modeling
State-transition diagrams and Petri nets are modeling techniques able to represent the different
states of a network and identify intrusions. Generally speaking, state-transition diagrams
are graphs where nodes are the states and links represent the potential transitions; when
applied to network intrusion detection, nodes represent states of the network and links
are the required activities to move from one state to another. NetSTAT [299] and
its predecessor USTAT [136], [240] are prototype systems providing an environment to
model networks based on state-transition diagrams. The employed hypergraphs provide vital
information about the events to be monitored and the appropriate location for monitoring
them, assisting network administrators in their work. This approach provides a robust solution
against unknown vulnerabilities. However, it adapts poorly to changes in state sequences, and
considerable effort is required to configure it comprehensively. IDIOT [173], [172] is a coloured
Petri net (CPN) based approach aiming to detect components of partially ordered attack
sequences. There are concerns about the scalability of this approach as the number of
states increases and about its ability to operate in real time. Nevertheless, the fact that IDIOT
operates on abstractions of the raw data gives it a performance boost. Moreover, it is
able to detect unknown attacks, exploit temporal relations, reuse modeled concepts and
achieve a reduced false alert ratio. Summarizing, a shortcoming of using finite state
machine (FSM) methods is the fact that some attacks require a very large number of
states to be comprehensively modeled. Therefore, the total number of states and related
parameters grows enormously, to the point that it can only be handled as an offline process. They
also lack adaptability as the network evolves.
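A state-transition view of an attack can be sketched as a small table mapping a (state, event) pair to the next state. The states, events and the single attack path below are hypothetical and are not taken from STAT, USTAT or NetSTAT.

```python
# Hypothetical attack signature: (current state, observed event) -> next state.
TRANSITIONS = {
    ("secure", "port_scan"): "probed",
    ("probed", "login_failure_burst"): "under_attack",
    ("under_attack", "root_shell_opened"): "compromised",
}

def track(events, start="secure", final="compromised"):
    """Follow the diagram over an event stream. Events that do not match a
    transition from the current state leave the state unchanged. Returns the
    index of the event that reaches the final state, or None."""
    state = start
    for i, event in enumerate(events):
        state = TRANSITIONS.get((state, event), state)
        if state == final:
            return i
    return None

stream = ["port_scan", "dns_query", "login_failure_burst", "root_shell_opened"]
reached_at = track(stream)
```

Even this toy example hints at the scalability concern raised above: every additional attack variant or interleaving requires further states and transitions, so realistic models grow quickly.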
2.1.4 Composite Detection
Some IDSs combine both knowledge-based methods and anomalous behavior detection
ones. Such approaches are designed to take advantage of both worlds. They have the
ability both to identify the patterns of intrusive behavior and to associate them with the
normal behavior of the network. The hybrid intrusion detection system (HIDS) [133] combines
the advantages of an IDS and an anomaly detection system (ADS) to identify
unknown attack scenarios. The ADS is developed out of the mined anomalous network
traffic episodes. The integration of the two approaches is achieved by utilizing a weighted
signature generation scheme. Another hybrid approach, able to
detect and visualize network intrusions, is presented in [233]. Agents are employed to act against
intruders in order to protect the network resources. The Production-Based Expert System Toolset
(P-BEST) [193] is a system employed for developing a modern generic signature-analysis
engine for network misuse detection such as SYN flooding and buffer overruns. Well-designed
composite techniques are able to provide a rich set of functionality; however, such
advantages come at the cost of high complexity and in some cases of unnecessary
redundancy.
2.2 Correlation Analysis and Alert Correlation
Intrusion Detection Systems (IDSs) operate in a supplementary manner to other more
traditional security methods, such as network firewalls or certificates and cryptography.
Depending on the configuration and the ability of the deployed IDSs to detect correctly
the intruders and their actions, a very large number of alerts is generated continuously
every day, many of which are false positives. Tuning and properly
configuring IDSs stamps out a large number of trivial false alerts; however, there is
still a significant portion of spurious notifications. Moreover, most IDSs have limited
observation abilities in terms of network space as well as the kind of attacks they can deal
with. Evidence of attacks against network resources can be scattered over several hosts.
It is challenging to design an IDS with properly deployed sensors and analysis
capabilities that is able to detect the attacker's traces at different spots in the network during an
intrusion attempt and to find dependencies among them.
Therefore, collaboration between different IDSs in analyzing and correlating their results
and the relevant triggered alerts leads to improved results with a better description
of the attacks and provides stronger confidence in the raised security issues. Alert
correlation techniques, which gather and identify relationships among alerts from different
sources aiming to spot attack scenarios, are typical tasks of Security Information and
Event Management (SIEM) systems. Such techniques combine alerts with a high probability
of sharing the same root cause, reduce the probability of false positive alerts and
provide rankings of the alerts based on their importance. Using an appropriate
visualization method to show such information can be of great assistance to security
analysts, providing them with a decision support system. Alert correlation is
an important multistep process of IDSs that combines information from heterogeneous
network sensors, improves the ability to identify attacks, enriches the meaning and the
semantics of the attacks with more details and reduces false positives. The
general procedure includes dealing with alerts at multiple granularities, exploiting potential
spatiotemporal relationships (e.g. origin, target, etc.), data fusion and structure
identification for detecting complex intrusion scenarios [296].
The Intrusion Detection Message Exchange Format (IDMEF) [74] has been proposed by the
IETF for standardizing the format of the raw input alerts as well as for defining the alert
exchange protocol. Time and synchronization are critical aspects in the alert correlation
process in order to capture accurately the arrival order and the relevant timestamps
[152]. A significant amount of work has been done on physical and logical
clocks and timestamps in distributed systems.
2.2.1 Alert Correlation
Alert correlation focuses on discovering various relationships between individual alerts.
The existing alert correlation techniques can be roughly divided into four categories
[253], [315]:
1. methods based on similarity between alert attributes (such as start-time, end-time,
source, and target of the attack), which cluster alerts through computing attribute
similarity values;
2. approaches based on predefined attack scenarios, which build such scenarios through
matching alerts to predefined templates;
3. techniques based on attack pre-conditions and prerequisites as well as post-conditions
and consequences, which develop attack scenarios as chains in time, through matching the post-conditions of earlier attacks with the pre-conditions of later attacks;
4. strategies that utilize several heterogeneous information sources integrating different information types and carrying out reasoning based on triggered alerts and
other collected information.
Data clustering techniques form the basic methodology of the approaches based on
similarity between alert attributes, where the definition of appropriate similarity measures
is the most critical issue [186]. Typical records about potentially suspicious events
include information such as source and destination IP addresses and ports as well as
timestamps, which form the attribute values. Then, the similarity measurement algorithms
compute the distance between such events and cluster them accordingly, and
alert correlation techniques are triggered. This methodology results in a lower number of
alerts since similar alerts are clustered together and assigned to the same attack. An
alert clustering approach to perform root-cause analysis is described in [145], where the
root causes are the reasons for the triggered alerts. In such a system, groups of alerts
are identified so that the grouped alerts correspond to the same root cause. In order
to provide meaningful clustering methods, hierarchical generalizations are proposed for
constructing high-level concepts of the alert attributes, e.g. a network address is a
generalization of an IP address. Then, a series of dissimilarity measures is defined that are
relevant to the produced generalizations. Measurements using this technique showed
that the top 13 alert clusters account for 95% of all alerts. A probabilistic framework to
perform alert correlation is described in [295], where the similarity among alerts generated
by different IDSs is calculated. This approach focuses on dealing with IDSs with
heterogeneous alert attributes (e.g. IP addresses, ports, timestamps): it first identifies the
common features and then estimates both the minimum similarity and
the expectation of the similarity. The overall similarity is weighted by the expectation
of similarity having as terms the similarity of the common features. An approach that
performs time series and statistical analysis for detecting attacks by means of statistical
causality analysis is proposed in [246]. Moreover, an aggregation technique is defined for
grouping lower-level alerts into a conceptual higher-level alert called a hyper alert, thereby
leading to a smaller number of alerts and providing the means for alert ranking. Finally, after
completing the aggregation, clustering and prioritization steps, the proposed system
performs a statistical Granger Causality Test to detect the attack scenarios. The Mirador
project [67] developed an alert correlation system based on multiple IDSs, which operates in
three steps. The first step is alert management, where tuples are generated for each alert
and stored in an RDBMS. Tuples are created by transforming IDMEF alert messages
into the specified DB schema. The second step is alert clustering, where alerts
belonging to the same attack are grouped together; a successful result depends
on the correct evaluation of the similarity between alerts. Finally, the third step is
alert merging, where in each cluster a global alert represents the whole group of alerts.
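Attribute-based alert clustering, as described in this paragraph, can be sketched with a simple similarity function and a greedy grouping pass. The attribute set, the equal weighting, the 60-second time window and the 0.7 threshold are assumptions for illustration and do not reproduce the measures of [145], [295] or the Mirador project.

```python
from dataclasses import dataclass

@dataclass
class Alert:
    src: str      # source IP address
    dst: str      # destination IP address
    dport: int    # destination port
    ts: float     # timestamp in seconds

def similarity(a, b, window=60.0):
    """Mean of four per-attribute similarities in [0, 1] (equal weights).
    Addresses and ports match exactly or not at all; timestamps decay
    linearly to zero over `window` seconds."""
    time_sim = max(0.0, 1.0 - abs(a.ts - b.ts) / window)
    parts = [a.src == b.src, a.dst == b.dst, a.dport == b.dport, time_sim]
    return sum(parts) / len(parts)

def cluster(alerts, threshold=0.7):
    """Greedy single pass in time order: an alert joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for alert in sorted(alerts, key=lambda a: a.ts):
        for group in clusters:
            if similarity(group[0], alert) >= threshold:
                group.append(alert)
                break
        else:
            clusters.append([alert])
    return clusters

alerts = [
    Alert("1.1.1.1", "10.0.0.5", 22, 0.0),
    Alert("1.1.1.1", "10.0.0.5", 22, 10.0),
    Alert("2.2.2.2", "10.0.0.9", 80, 12.0),
]
groups = cluster(alerts)
```

Each resulting group could then be merged into a single global alert, in the spirit of the third Mirador step. The hierarchical generalizations of [145] would replace the exact-match comparison of addresses with a comparison at the level of networks or other higher-level concepts.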
The approaches based on predefined attack scenarios require a series of attack steps
to be correlated and aggregated to reveal the big picture of potential attacks. A typical
approach is to define the required attack scenario templates. An example of such
a scenario is a template consisting of an IP scan attack, then a TCP port
scan attack, followed by an application buffer overflow attack. When an attack is identified,
it is matched against the predefined templates as part of an attack scenario. Such
an approach is quite beneficial for detecting already known attack scenarios; however, it
cannot identify unknown attacks. Moreover, in certain cases it is not easy to
exhaustively list all attack sequence templates. Such an approach is described in [76],
where an architecture called ACC is proposed aiming to cluster alerts based on predefined
relationships between them. Both aggregation and correlation relationships are
identified, where the former aggregate alerts based on predefined criteria, while
the latter discover the commonalities between attacks by identifying duplicates and
consequences. Nevertheless, this method produces a large number of false positives. Another
approach utilizes alert correlation based on the chronicles formalism [212], where
a chronicle is a model of temporal event patterns used to monitor security events and to
perform alert correlation. Thus, chronicles aim to reduce the number of raised alerts
and, more importantly, their false alert rate. Each chronicle includes timestamps,
event patterns, time constraints and other related information. If the conditions of a
chronicle are fulfilled, it is considered valid and an alert is produced.
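Matching alerts against a predefined scenario template with a time constraint can be sketched as follows. The sketch is a strongly simplified reading of the template and chronicle ideas above: the pattern is an ordered list of event types, the only time constraint is a maximum gap between consecutive matched steps, and all names and values are hypothetical.

```python
def match_scenario(events, pattern, max_gap):
    """events: list of (timestamp, event_type) sorted by time.
    pattern: ordered list of event types forming the scenario template.
    Returns the timestamps of the matched steps, or None if the template is
    never completed. Unrelated events between steps are ignored; if more than
    `max_gap` seconds pass since the last matched step, matching restarts."""
    matched, step = [], 0
    for ts, etype in events:
        if matched and ts - matched[-1] > max_gap:
            matched, step = [], 0   # time constraint violated
        if etype == pattern[step]:
            matched.append(ts)
            step += 1
            if step == len(pattern):
                return matched
    return None

template = ["ip_scan", "port_scan", "buffer_overflow"]
observed = [(0, "ip_scan"), (30, "port_scan"), (50, "http_get"), (70, "buffer_overflow")]
result = match_scenario(observed, template, max_gap=60)
```

Only when the whole template is matched within its time constraints would a single high-level alert be produced, which is how such formalisms reduce the number of alerts shown to the analyst. A full chronicle recognizer tracks several partial matches in parallel, which this greedy sketch does not.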
The approaches based on prerequisites and consequences of attacks capitalize on the
fact that intruders usually perform attacks in steps where earlier steps perform tasks to
set the conditions for subsequent ones. For example, attackers may first launch
an IP sweep to find live hosts in a network, then scan for open
ports on the discovered live hosts to find vulnerable services, and finally start a buffer
overflow attack on the specific hosts. Among such a sequence of attacks, causal relationships
can be identified that can form attack scenarios providing a more comprehensive view
about security threats. The prerequisites are mandatory conditions for follow-up attack
steps to take place, while the consequences are possible attack results. Attack modeling
languages such as LAMBDA [69], CAML [57] or first order logic methods can be used
for modeling the prerequisites and consequences. More particularly, an approach applying
abductive correlation utilizing pre- and post-conditions is described in [68], where
alert clustering takes place first, followed by a merging process using appropriate
similarity functions. The LAMBDA attack specification is employed to automatically
generate correlation rules both in a directed and undirected manner. Using the produced
rules, alert information such as types, attribute values and timestamps is extracted and
checked against the rules. The identified series of correlated alerts produce a complete
attack scenario. On the other hand, first order logic is employed in different approaches
[218], [219] to describe pre- and post-conditions as well as causal relationships among
alerts. Both pre- and post-conditions are defined for each generated alert by extracting
the relevant alert attribute values, which are processed afterwards for finding their
correlations via possible partial matches. The correlated alerts are grouped for constructing
potential attack scenarios, forming prepare-for relation models. These relations are used
for constructing correlation directed acyclic graphs, where the nodes correspond to alerts
and links indicate "prepare-for" relations. An extension of this technique [221] employs
hypothesis and reasoning methods to further detect unidentified attacks, capitalizing on
the observation that intermediate attacks missed by IDSs may have produced
multiple attack scenarios. The identification of the relevant constraints regarding the
possible multiple attacks can be used in the hypothesis process aiming to discover the
relevant attribute values. The hypothesized attacks are validated using the original data
from the sensors and failed attacks are removed. Using the validated attacks and the
existing alerts, concise attack scenarios are constructed. In a different approach, JIGSAW
[273] describes attack conditions employing capabilities and concepts. Capabilities specify
the information that intruders need to know to perform particular attacks, such as
user names and passwords, as well as the required conditions that clarify the context of
the attack. On the other hand, concepts model fragments of complex attacks utilizing
capabilities to specify the pre- and post-conditions. Complex attack scenarios are then
detected by correlating capabilities included in a particular concept with capabilities of
other concepts, thereby discovering, for instance, a remote shell connection spoofing
that relies on a denial-of-service attack. Nevertheless, there are several disadvantages
related to the pre- and post-condition based approaches. They rely on
strong assumptions that only well-defined alerts exist and that attacks trigger multiple alerts;
therefore, they ignore unrelated and uncorrelated events [220]. However, observations
of collected data demonstrate that such assumptions do not necessarily hold. Moreover,
the conditions for each alert must be specified manually and no automatic
correlation operations are involved. Finally, when the modeling phase involves
only dependencies between alerts, it is challenging to monitor the evolution of an attack
scenario in real time, which makes such approaches hard to use in demanding use cases.
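The "prepare-for" relation described above can be sketched by attaching sets of pre- and post-conditions to each alert and linking an earlier alert to a later one whenever a post-condition of the former appears among the pre-conditions of the latter. The condition names are hypothetical plain strings; the cited approaches use richer logical predicates with attribute matching.

```python
from dataclasses import dataclass, field

@dataclass
class Alert:
    name: str
    ts: float
    pre: set = field(default_factory=set)    # conditions the attack step requires
    post: set = field(default_factory=set)   # conditions the step may establish

def prepare_for_edges(alerts):
    """Return directed edges (a, b) such that alert a is earlier than alert b
    and at least one post-condition of a is a pre-condition of b. Because
    edges only point forward in time, the resulting graph is acyclic."""
    edges = []
    for a in alerts:
        for b in alerts:
            if a.ts < b.ts and a.post & b.pre:
                edges.append((a.name, b.name))
    return edges

alerts = [
    Alert("ip_sweep", 10, post={"host_alive"}),
    Alert("port_scan", 20, pre={"host_alive"}, post={"service_vulnerable"}),
    Alert("buffer_overflow", 30, pre={"service_vulnerable"}, post={"root_access"}),
    Alert("ping_of_death", 25, post={"host_down"}),
]
edges = prepare_for_edges(alerts)
```

The isolated `ping_of_death` alert receives no edge, which illustrates the limitation noted above: alerts that do not fit any pre- or post-condition are simply left out of the constructed scenarios.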
Developing methods that are based on multiple information sources is a promising
approach to provide complementary security assurance to networks. However, the volume
of produced alerts increases heavily, which becomes a challenging issue for the design of such
IDSs, as their users are overwhelmed by the huge number of alerts, making
it hard to detect the critical ones and prioritize them. Moreover, lack of cooperation
and coordination among the considered sources of information hinders the investigation
process. A mission-impact-based technique is proposed in [241], aiming to correlate alerts
coming from several heterogeneous and spatially distributed information sources, such as
network firewalls and deployed IDSs, in an automatic way. The host configurations are
considered when alerts are inspected for system vulnerability. This approach is heavily
based on maintaining two DBs, one for the incident-handling fact base and one for the
topology map of the protected network and hosts. Moreover, a series of processing steps
is specified in order to perform filtering, topology inspection, priority calculation, event
ranking and alert grouping. The DBs are critical sources of information for
the topology inspection, where a relevance score is calculated per raised alert.
This score defines the dependency among the incidents and the associated component
configurations. Finally, the degree to which an event affects the valid operation of a network
is expressed via the generated priorities. The M2D2 model [215] aims at removing several
false positive alerts. The model relies on defining the sensor capabilities in a formal
manner and considering their scope and position in the network to decide whether an
alert is a false positive. M2D2 verifies whether all the pertinent sensors able to detect an
attack confirmed it during the detection phase, having the assumption that inconsistent
reports denote false alarms. However, this method can be compromised by attackers
who participate in the voting process, which is in itself difficult to
prevent. An extension of the M2D2 model, called the M4D4 data model [212], [31], has been
designed to provide reasoning about the security alerts as well as the relevant
context in a cooperative manner. The extended model is a reliable and formal foundation
for reasoning about complementary evidence, providing the means to validate
alerts reported by IDSs. A joint approach to carry out alert correlation from multiple IDSs
is described in [314], where a fundamental concept is the assumption that correlation
is based on triggering events. Thus, clustering incidents that are coming from similar
triggering events allows their partitioning into discriminated groups that may be related
to the same attack attempt. In addition, the consistency between alerts of the same
cluster and the related configuration descriptions can provide further assurance about
the accuracy of the results and rate the severity of alerts and clusters. Furthermore,
a second fundamental concept is the importance of input and output resources in the
derived correlation level, considering that input resources are mandatory resources for an
accomplished attack while output resources are the supplied ones at successful scenarios.
The discovery of common resources between input for one attack and output for another
allows the recognition of causal relationships among grouped alerts for developing attack
scenarios. A decentralized IDS system is described in [171] seeking to both correlate the
gathered events and fuse the relevant data observed among the multiple sensors. The
deployed monitoring points collaborate using the peer-to-peer paradigm and exchanging
events relevant to complex distributed attack scenarios. Afterwards, a distributed misuse
detection algorithm is employed to perform event correlation. An inherent challenge of
this approach is the requirement for a correct temporal ordering of the events, which may be hard
to achieve.
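The input/output resource matching described for [314] can be illustrated with a minimal sketch. It is not the algorithm of [314]: the `Alert` structure, the resource strings and the alert names are hypothetical, and a causal link is simply assumed whenever an earlier alert supplies a resource that a later alert requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    name: str
    inputs: frozenset   # resources the attack step needs (hypothetical labels)
    outputs: frozenset  # resources the step supplies if it succeeds
    time: int           # observation time, used only for ordering


def causal_links(alerts):
    """Return (earlier, later) name pairs where an output feeds a later input."""
    ordered = sorted(alerts, key=lambda a: a.time)
    links = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            # A shared resource suggests that `first` prepared `second`.
            if first.outputs & second.inputs:
                links.append((first.name, second.name))
    return links


alerts = [
    Alert("port_scan", frozenset(), frozenset({"open_port:22"}), 1),
    Alert("ssh_bruteforce", frozenset({"open_port:22"}), frozenset({"shell:user"}), 2),
    Alert("priv_escalation", frozenset({"shell:user"}), frozenset({"shell:root"}), 3),
]
links = causal_links(alerts)
print(links)
```

The resulting chain of links is a candidate attack scenario. The sketch also shows why event ordering matters: if timestamps from different sensors are not comparable, links may be missed or reversed.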
2.2.2 Monitoring from several vantage points
Maintaining the routing tables of a huge and heterogeneous network such as the Internet
is a challenging issue considering the large number of prefixes, ASes and BGP updates.
Therefore, detecting abnormal update events requires advanced data mining methods to
investigate the roots of the problem [260]. Pin-pointing the exact cause behind observed
network routing issues remains a complex problem. A formal framework to represent
and study MOAS events and relevant network management activities is described in
[211]. A learning approach over raw BGP data evaluates and ranks
the possible relevant actions. It has been discovered that although multiple ASes promptly perform
reactive actions before the false BGP updates are corrected, more than 90% of
affected prefixes were routed back to their correct routing path. Another distributed
measurement framework for pin-pointing routing changes is described in [272]. In this
work, each AS maintains an accurate view of occurred routing changes. Then, for each
route modification the involved measurement servers are queried following the path from
source to destination aiming to detect the exact location and the reason for the modification.
A large study of real network control traffic over the Sprint and AT&T backbone
networks has been performed in [271], where the impact of BGP routing modifications
on network traffic is investigated. It has been observed that a small number of routing
modifications have a significant impact on data traffic while the majority of them have
little influence. A formal model of the dynamics of BGP is given in [112], highlighting
the differences among the observations of multiple network monitors and focusing on route
flapping.
Root-cause analysis is a typical technique for detecting the reason and the location
of BGP route modifications. An investigation about detecting the responsible AS for
a routing change is provided in [91]. Correlating BGP update messages for prefixes
collected at several observation locations forms the basis of the method. In particular,
pinpointing the AS at the origin of a change is conducted in two steps. The
first one involves simulations on snapshots of the AS topology, as it can be derived
from the BGP updates under the assumption of properly behaving routers. Then, a number of heuristic
algorithms are suggested to deal with the restrictions of the actual update procedure.
The differences between the simulation and real-world observations give insights about
the deployed observation points.
A VA-based approach combining both computer and human intelligence via properly
selected visualization techniques for BGP anomaly event analysis and correlation is given
in [275]. The work provides interactive means for presenting BGP OASC (Origin AS
Change) events demonstrating the superiority of VA techniques.
A distributed system for real-time IP prefix hijack detection is provided in [332], capitalizing mostly on data plane observations. This work relies on two key assumptions: first,
the hop count of the path from a source to a legitimate prefix is generally stable;
second, the path from a source to a legitimate prefix is almost always a super-path of the path from the same source to a reference point along the previous path,
provided that the reference point is topologically close to that prefix. The appropriate choice of vantage points for
monitoring modifications that do not meet the aforementioned assumptions is critical
for raising valid alerts. A Principal Components Analysis (PCA) based method for root
cause analysis of BGP updates is provided in [316] aiming to develop a set of groups of
prefixes or AS numbers that are affected by the same BGP update message. The method
uses BGP update data from multiple border routers inside the AS to detect BGP routing changes. However, this method has limitations when two distinct events affect the
same prefix or AS during the same observation time. An online tool able to generate a
relatively small number of alerts out of millions of BGP messages is described in [311],
where r-vector data is proposed to detect and capture BGP modifications. Correlating
time and prefix modifications is a critical step to identify unstable routes. The tool has
been used on a Tier-1 ISP backbone with hundreds of border routers with very interesting
results. Another root-cause analysis method is described in [45], which aims to
detect the cause and the origin of a routing change. Correlating routing updates across
different vantage points reduces false or redundant events. This method performs well
on the analysis of events affecting relatively stable prefixes, applied on some use cases
such as multiple BGP session resets during Internet worm attacks [301] and analyzing
the updates generated by BGP Beacons [201] to pinpoint the update sources. A spatiotemporal clustering method that utilizes path vector information for assigning several
related messages to the relevant events is described in [51]. The approach classifies the
effect of routing events and estimates the distances to the originating AS, observing that
more than 45% of path changes are caused by events on transit peerings and that several
path changes are transient indicating short-term path modifications. A pin-pointing
algorithm for the origin of routing changes called MVSChange [178] proposes a simplified BGP model called Simple Path Vector Protocol (SPVP) combined with a graph
model of the Internet to locate the origins of updates using multiple vantage points.
The mechanics of router reactions when there are large routing tables are examined in
[50]. Some routers exhibit table-size fluctuations that may cause
cascading failures. Moreover, it has been found that in some cases an administrator is
needed to recover from failures, and in others BGP mechanisms such as prefix
limits and route flap damping only partially handle the overhead of large routing tables.
The aforementioned methods study streams of BGP messages from multiple observation
sensors aiming to infer the cause and the origin of an unstable route. Root-cause
analysis fits nicely with attack attribution. The aforementioned approaches, which capitalize
on distributed monitoring methods, depend heavily on the selection of the position
of the sensors.
Thonnard et al. [284] employed multi-dimensional data mining techniques for detecting actionable knowledge about network security issues, aiming to build global indicators
of existing malicious activities and to investigate the modus operandi of rising security threats under specialist supervision. For this, a graph-based KDD approach
is applied to evaluate real data from attack traces. More particularly, a clique-based
clustering technique is used to extract the critical knowledge; afterwards, in combination
with a multidimensional synthesis process, a concept lattice is created to describe the
observed phenomena. Moreover, a customizable analysis framework for deeper investigation of raw honeynet data has also been developed [283], providing the means to
discover traces in the network with common activity patterns. A clustering mechanism
applicable on several feature vectors has been designed focusing particularly on the time
series of the attacks. In particular, clique-based analysis can assist honeypot forensics by
stressing correlation patterns, even when they are referring to completely independent
attacks. A result of this work demonstrated that appropriate similarity measures greatly
assist in clustering attack patterns and detecting the probable root causes of attacks.
In addition, a fuzzy inference system [285] has been employed in a knowledge discovery
technique aiming to reproduce experts' reasoning for attack attribution as closely as possible. A multi-criteria decision-making process takes as input the knowledge extracted
from large-scale attacks in order to discover and attribute them. This method is particularly useful against distributed attacks by zombie armies. The aforementioned pieces of
work on attack attribution automatically group together events that are likely to be due
to the same underlying root cause. These techniques offer an automated means to apply
a multi-criteria decision process to cluster groups together. Applications of the method
have shown its usefulness but also its limits when it comes to explaining why events have
been grouped together. Finally, systems such as honeynets [242] may also be set up
specifically to support network attack detection and attribution.
Another interesting approach is that of forward-deployed IDSs. The philosophy of
forward-deployed IDSs differs dramatically from typical IDSs since they are systems
deployed as close as possible to the attackers in order to maximize attribution information [307]. The great benefit of these systems is that they are able to supply faster and
more accurate information about the location of the attackers at reduced cost, by finding the correlations in the information gathered from local log files [255]. However,
false positives and negatives are possible, which requires continuous observation. Moreover,
forward-deployed IDSs require some prior information to be deployed well, and it may not be
possible to place them close enough to the attacker's location. Furthermore, forward-deployed
IDSs need stronger protection since they are more exposed to attacks, in order to avoid being
disabled, controlled by attackers or made to reveal their internal detection policies. Nevertheless, assuming that such policies can be updated fast enough, the forward-deployed IDS
can become an input debugging tool [254] where upstream routers are supplied with a
policy/pattern of the target requested to generate an alert when the pattern is validated
in future attempts. Forward-deployed IDSs are able to identify the initial attack event
without requiring several messages to begin attribution. A number of techniques have
been proposed for Level 3 attribution based on Bayesian networks [123] that are able to
handle incomplete data scenarios, Hidden Markov Models (HMMs) that can unify analysis steps from different perspectives, Self-Organizing Maps (SOMs) and game-theoretic
models [54]. For example, the latter method allows trackers to detect the methods utilized by intruders by comparing the actual evidence with the attack trajectory predicted
by the models. Spoofed message discovery has a significant importance for attribution
because it allows the interaction with protocols attempting to uncover attackers [274].
A relevant commercial tool called eTrust Network Forensics is a widely accepted environment to carry out multiple kinds of automated analysis for attack attribution and
trace-back procedures.
2.3 BGP State-of-the-art
2.3.1 Background
The Internet is partitioned into tens of thousands of independently administered routing domains called Autonomous Systems (ASes), where an AS corresponds to an ISP, a
company, a government body, an academic institution, etc. The Border Gateway Protocol (BGP) is the de facto inter-domain routing protocol that maintains and exchanges
routing information between ASes. Since January 2006, BGP version 4 has been specified in
RFC 4271 [8].
When interconnecting, two ASes must be able to exchange network reachability information. Unlike intra-domain routing protocols that route packets through the shortest
possible network path, BGP lets each AS define its routing policy, which is then enforced
on each BGP-speaking router by filtering on incoming and outgoing update messages
[297, 198].
A BGP update message is exchanged between two BGP-enabled routers to announce
or withdraw network addresses reachable through them. Such a message mainly contains
the destination network address, the AS path to the destination, and preference indicators¹. The AS path is built sequentially: when a router exports a route to a neighbour,
it prepends its unique AS number to the path it has received². The first AS exporting a
route to a given network is called the originating AS: the update message then contains
a single AS number in the AS path field. The AS path is primarily used to avoid routing
loops between ASes. Indeed, a router receiving an update containing its AS number in
the AS path field will not consider the route as it already is in the path to destination.
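As a minimal sketch of the AS path mechanics just described, the following models prepending on export and the loop check on reception. The function names are hypothetical, and the AS numbers are taken from the range reserved for documentation; real BGP implementations carry many more attributes.

```python
def export_route(as_path, own_asn):
    """Prepend the exporting router's AS number to the received AS path."""
    return [own_asn] + list(as_path)


def accepts(as_path, own_asn):
    """Loop prevention: ignore any route whose AS path already contains us."""
    return own_asn not in as_path


# The originating AS exports a path containing only its own number.
origin_update = export_route([], 64496)
via_transit = export_route(origin_update, 64497)
via_second = export_route(via_transit, 64498)

print(origin_update)  # single AS number: the originating AS
print(via_second)     # most recent exporter first, originating AS last
print(accepts(via_second, 64497), accepts(via_second, 64499))
```

Note that the originating AS always ends up as the last element of the path, which is the property that origin-based hijack detection relies on.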
As a result of the existence of the routing policy, unlike intra-domain routing protocols
that use the shortest possible path to the destination address, BGP uses the following
mechanism to select the preferred route. First, when multiple network addresses overlap,
BGP uses the longest prefix match rule. Then, for identical network addresses, BGP
selects the route with:
1. the highest local preference,
2. the shortest AS path,
3. the lowest Multi-Exit Discriminator (MED).
¹ Note that in case of a withdrawal, an update message only contains the network address.
² Public AS numbers are uniquely assigned by Regional Internet Registries (RIRs). Their values range
from 1 to 64511. Private AS numbers (from 64512 to 65535) can be used locally for a connection
between a network and its provider [297, 8].
If multiple routes are still possible, tie-breaking rules are applied [297]. The local preference is a value assigned to a route as part of the AS policy. It is only relevant within an
AS and is not communicated to external networks. The MED value is used to balance
traffic between multiple possible links between two ASes and is only shared between
them [297, 43]. Once the process has successfully selected a route to a prefix, the route
is added to the forwarding table.
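The three-step preference order for routes to an identical network address can be sketched as a single sort key. This is a simplified illustration only: the `Route` type and its default values are assumptions, the final tie-breaking rules are omitted, and the handling of MED is reduced to a plain comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    as_path: tuple
    local_pref: int = 100  # assumed default, set by local AS policy
    med: int = 0           # Multi-Exit Discriminator


def best_route(routes):
    """Pick a route for one prefix: highest local preference first,
    then shortest AS path, then lowest MED."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.med))


candidates = [
    Route(as_path=(64497, 64496), local_pref=100),
    Route(as_path=(64498, 64499, 64496), local_pref=200),
    Route(as_path=(64500, 64496), local_pref=200, med=20),
    Route(as_path=(64501, 64496), local_pref=200, med=10),
]
chosen = best_route(candidates)
print(chosen)
```

Because local preference is evaluated first, a policy decision can override a shorter AS path, which is precisely what distinguishes BGP from shortest-path intra-domain protocols.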
BGP security issues
BGP was designed based on the implicit trust between all participants, and no measure
exists in the protocol itself to authenticate the routes injected into or propagated through
the system. Therefore, virtually any AS can announce any route into the routing system
and sometimes, bogus routes can trigger large-scale anomalies in the Internet. This
intrinsic weakness of the BGP protocol can lead to so-called prefix hijacking attacks, be
it intentional or not (i.e., due to router misconfiguration or because of a real attack).
Prefix hijacking basically consists in redirecting Internet traffic by tampering with the
control plane itself (i.e., the BGP protocol). The problem of prefix hijacking is considered
as a crucial one and has recently received much attention. There are indeed some claims
that the core infrastructure of the Internet may be misused by attackers in one or
another way to surreptitiously perform malicious activities. For example, in [248] the
authors have shown evidence that, in a few limited cases, it is quite likely that attackers
were misusing the BGP routing protocol to hijack blocks of IP addresses during limited
amounts of time, so as to launch spam campaigns from legitimate-looking blocks of IP
addresses. If successful, such techniques would clearly defeat the spam blacklists that
anti-spam tools use as a first layer of defence against spammers.
Since one of the main objectives of VIS-SENSE is to correlate security events with
possible traces of attacks targeting the core of the Internet, we perform an extensive study
of BGP prefix hijacking and its related concepts in Section 2.3.2. We will then briefly
describe a few techniques developed to secure BGP in Section 2.3.3. In Section 2.3.4,
we review some popular tools for the observation of the BGP routing process. Finally,
we finish this state-of-the-art on BGP by reviewing available methods and services for
detecting BGP hijacking attacks in Section 2.3.5.
It is worth noting that, as BGP runs over TCP/IP (BGP listens on TCP port 179), the
protocol is also subject to the same attacks as any other protocol relying on TCP, e.g.,
Denial of Service (DoS) attacks (such as SYN flooding or RST spoofing), eavesdropping,
attacks against packet integrity, replay attacks, etc. [43]. However, as interesting as
these attacks may be, they are out of the scope of this document.
2.3.2 Prefix Hijacking
Prefix hijacking (also known as BGP hijacking or IP hijacking) is the act of absorbing
(a part of) the traffic destined to another AS through the propagation of erroneous
BGP routes. It can be the result of router misconfigurations [198] or of malicious intent
[21, 43, 124, 247, 268].
Regardless of the intentions of the issuer of the incorrect routes, we will refer to it
as the hijacking AS. In the same fashion, the route propagated by the hijacking AS is
the hijacked route. The network whose route has been hijacked will be referred to as the
victim AS. The correct route to the victim AS is referred to as the legitimate route (or
the original route). Finally, any occurrence of prefix hijacking will be considered as an
attack.
Objectives
By hijacking the traffic of another AS, an attacker may [331, 222]:
(i) create a black hole, i.e., perform a complete Denial of Service of an AS/prefix;
(ii) impersonate the victim by stealing its AS’s identity and imitating certain services
(e.g., duplicate a website);
(iii) intercept the traffic to eavesdrop (or record) the exchanged data, and then forward
the data back to the victim AS (i.e., a case of subversion);
(iv) create a network instability by triggering connectivity outages [261].
To achieve these objectives, different types of attack can be employed. These are
explained below. For illustrative purposes, we then describe a few public incidents
of BGP hijacking that have made the headlines in recent years.
Types of Hijacking
IP prefix hijacking can be performed in several ways. Hu et al. present a taxonomy of
hijacking attacks in [124]. Similar work was done by Lad et al. in [176] and Katz-Bassett
et al. in [149].
The attacks are usually based on the following key elements:
• AS ownership: the hijacking AS claims to be the origin AS of the prefix.
Since the hijacker is advertising itself as the origin AS, the AS path is much shorter
than the one of a legitimate route. The hijacked route is then selected – if only by
peers of the hijacking network – to route to the victim network.
This kind of attack can usually be easily detected because it creates a so-called
Multiple Origin AS (MOAS): a single prefix is originated from multiple ASes. Note,
however, that there may be valid reasons for a network to be a MOAS (e.g., multihomed stub networks) [330], so it is not always trivial for an external observer to
differentiate between a legitimate MOAS route and a prefix hijacking attempt.
• Intermediate hop: the hijacking AS claims to be closer to the origin AS than it
really is.
The announced AS path is longer than for an ownership attack, but it is also
harder to detect. Usually, the attacking AS will claim to be second hop, since
being any further down in the path would significantly decrease the amount of
hijacked traffic [21].
Because the victim’s AS number is still the originating AS, it does not create a
MOAS route.
Another approach to this type of attack is described in [203], where the authors study the
amount of traffic that can be stolen with an intermediate hop attack, depending
on how far in the AS path the hijacking AS is. The idea behind it is not to hijack
the whole traffic of the victim, but to attract only a small percentage of the packets
destined to it, enabling the attacker to go undetected for a longer time (i.e.,
a stealthier attack).
• Subprefix hijacking: the hijacking AS propagates update messages containing
a route to a more specific network address than the original announcement.
Because of the longest prefix match rule, this is a very effective attack: any router
that receives (and accepts) the incoming route will automatically forward any
traffic destined to the victim to the hijacking AS.
The victim has only two ways of dealing with this attack:
1. inform the NOC of the hijacking AS that they are misbehaving. Since it is
unlikely they will cooperate if the attack is not the result of a misconfiguration,
the victim will have to get the cooperation of an upstream provider of the
hijacking AS, which can be quite complicated.
2. announce an even more specific prefix for the network. However, this countermeasure may also fail in some cases, since most ASes tend to reject too specific
incoming routes in order to keep the size of the routing table as low as possible
(usually, anything more specific than a /24 is dropped) [297, 124, 43].
• Supernet: the hijacking AS propagates update messages containing a route to
a less specific network address than the original announcement, hoping to receive
the traffic whenever the legitimate AS is unavailable, or to use a range of addresses
not covered in the original announcement.
• Invalid or unassigned prefix: the hijacking AS announces a network prefix that
is not assigned to any entity (e.g., a bogon). In this case, there is no victim AS,
but malicious activities can be easily carried out by using these addresses (e.g.,
spam campaigns).
Finally, it is interesting to note that, depending on the position in the Internet hierarchy of the hijacking AS [100], the probability of a successful hijacking may vary quite
substantially (between 38 and 63% according to [21]).
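As a rough illustration of the taxonomy above, the sketch below compares an observed announcement with a known legitimate one and labels the relationship. The `classify` function, its labels and the example prefixes are hypothetical; the check only considers the prefix and the originating AS, so it cannot by itself reveal an intermediate hop attack, nor tell a legitimate MOAS from an ownership attack.

```python
from ipaddress import ip_network


def classify(legit_prefix, legit_origin, seen_prefix, seen_as_path):
    """Label an observed announcement relative to a known legitimate one."""
    legit = ip_network(legit_prefix)
    seen = ip_network(seen_prefix)
    origin = seen_as_path[-1]  # the originating AS is last in the AS path
    if seen == legit:
        if origin == legit_origin:
            return "same origin"
        return "MOAS (possible ownership attack)"
    if seen.subnet_of(legit):
        return "subprefix"
    if seen.supernet_of(legit):
        return "supernet"
    return "unrelated"


# Hypothetical legitimate announcement: 203.0.113.0/24 originated by AS 64496.
labels = [
    classify("203.0.113.0/24", 64496, "203.0.113.0/24", [64497, 64496]),
    classify("203.0.113.0/24", 64496, "203.0.113.0/24", [64497, 64499]),
    classify("203.0.113.0/24", 64496, "203.0.113.128/25", [64497, 64499]),
    classify("203.0.113.0/24", 64496, "203.0.112.0/23", [64497, 64499]),
]
print(labels)
```

Detecting announcements of unassigned prefixes would additionally require a list of bogons, which is outside the scope of this sketch.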
Some public incidents
In recent years, several cases of BGP hijacking have made the headlines. We briefly
describe some of them to illustrate the concepts explained above.
The AS7007 incident
The first BGP-related incident on the Internet dates back to April 25, 1997 when AS7007
– assigned to MAI Network Services (MAI), a regional ISP in Virginia, USA – started, as
the result of a misconfiguration, announcing highly specific routes to one of its providers:
Sprint (a large backbone network). Sprint didn’t filter out those announcements and
started propagating them. Because of the size of Sprint's network, the erroneous routes completely contaminated the Internet, resulting in a large-scale subprefix attack coupled with
an ownership attack. When MAI noticed what was happening (within 15 minutes), they
disconnected themselves from the Internet. However, the highly specific routes still existed
for a while, resulting in a massive blackhole of the global network that lasted a bit less
than 6 hours [38, 60].
Christmas Eve leak
On December 24, 2004, TTNet (the largest ISP in Turkey) started announcing over
106,000 prefixes to Telecom Italia, which had not set a maximum prefix count on the incoming routes from TTNet and therefore accepted the routes and started propagating them
upwards. Fortunately, the peers receiving them had an upper limit on the number of accepted incoming routes, and it was rapidly reached. Unfortunately, more specific routes were still
propagated, albeit in small numbers, which resulted in a virus-like propagation of erroneous routes (everybody got a little bit infected). The event lasted a little under 12
hours [239].
The YouTube attack
On February 24, 2008, the Pakistani government decided to forbid access to YouTube
[251]. YouTube is announced with an aggregated /22 prefix. Pakistan Telecom decided
to enforce the interdiction by BGP means and announced the /24 prefix of YouTube
that contains YouTube’s DNS and web servers. Somehow, Pakistan Telecom announced
that route outside of its network, including to its provider, PCCW Global, which did
not filter it and propagated the more specific /24 route to the rest of the world. For
approximately 80 minutes, the whole traffic of YouTube was blackholed in Pakistan.
YouTube then reacted by announcing even more specific /25 subnets, which resulted in
getting the traffic redirected to them. Roughly 2 hours after the start of the attack,
PCCW Global withdrew the routes originated by Pakistan Telecom, and YouTube
reaggregated its announcement to the original /22 prefix.
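The sequence of events in this incident follows directly from the longest prefix match rule, which the sketch below replays on a toy forwarding table. The prefixes are illustrative private addresses, not the ones actually involved, and the `lookup` helper and the owner labels are hypothetical.

```python
from ipaddress import ip_address, ip_network


def lookup(table, destination):
    """Return the owner of the most specific prefix covering `destination`."""
    dest = ip_address(destination)
    matches = [
        (ip_network(prefix), owner)
        for prefix, owner in table.items()
        if dest in ip_network(prefix)
    ]
    if not matches:
        return None
    # Longest prefix match: the entry with the largest prefix length wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


server = "10.10.1.20"
table = {"10.10.0.0/22": "victim"}       # aggregated legitimate route
before = lookup(table, server)

table["10.10.1.0/24"] = "hijacker"       # more specific hijacked route
during = lookup(table, server)

table["10.10.1.0/25"] = "victim"         # countermeasure: two /25 subnets
table["10.10.1.128/25"] = "victim"
countered = lookup(table, server)

print(before, during, countered)
```

In practice the countermeasure only works as far as the more specific routes are accepted, since many ASes filter announcements that are more specific than a /24.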
China Telecom
On April 8, 2010, China Telecom announced 37,000 prefixes instead of the normal amount
of 40, affecting networks owned by CNN, Dell, Apple, US DoD, France Telecom, Amazon
Deutschland, and others, for approximately 20 minutes [289, 175, 174, 206, 309]. About
15% of the global routing table was apparently affected [175, 174]. The impact in
North-America and Europe was minimal [175, 160], although the impact in Asia was
certainly not negligible. The incident raised awareness about the fragile security of
Internet routing in the media that started drawing conclusions about “Cyber-War”.
However, there is a consensus among experts that the incident was most likely due to a
misconfiguration.
DEFCON Man-In-The-Middle
While not an incident in its own right, the Man-In-The-Middle BGP attack presented
in [238] is very instructive and probably one of the most dangerous types of hijacking.
Unlike the preceding incidents, the goal here is to silently redirect Internet traffic through
another AS, and then forward it back to its final destination.
While diverting the traffic to another network can actually be simple, the trickiest part
is to be able to forward it afterwards to the legitimate owner. Pilosov and Kapela
demonstrated how to do this at DEFCON 16 in 2008 [238]. First, they identify
a possible path from the attacking AS to the destination. This legitimate route will
be neither modified nor hijacked, as it will be used as the return path to forward the traffic
back to its destination. Secondly, to attract the traffic, the hijacking AS will perform a
regular subprefix attack, but it will prepend the return path to the destination AS in the
announced AS path. As a result, many routers will receive and accept the more specific
routes, except the routers being on the forwarding route that will discard them because
they are already in the AS path.
As such, this can be seen as a combination of a subprefix attack (i.e., regular blackholing) with an intermediate hop attack. The good news, however, is that BGP man-in-the-middle
has not been observed yet in the wild [118].
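The role of the prepended return path can be made concrete with a small sketch. It only models the AS path loop check and is not the tooling presented in [238]; the AS numbers and helper names are hypothetical.

```python
def accepts(as_path, own_asn):
    """BGP loop prevention: a router discards routes containing its own AS."""
    return own_asn not in as_path


def crafted_path(attacker_asn, return_path):
    """AS path announced by the attacker: itself, followed by the return path
    (next hop first, destination AS last)."""
    return [attacker_asn] + list(return_path)


attacker = 64500
return_path = [64501, 64502, 64496]   # ASes used to forward traffic back
other_ases = [64503, 64504]           # rest of the (toy) Internet

announcement = crafted_path(attacker, return_path)
accepting = [asn for asn in return_path + other_ases
             if accepts(announcement, asn)]
print(announcement, accepting)
```

Only the ASes outside the return path accept the more specific route, so the return path keeps using the legitimate route and the attacker can still deliver the intercepted traffic.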
2.3.3 Securing BGP
Like many other protocols in the Internet, BGP was originally designed on the premise
of mutually trusting and well-behaving entities, and thus no measure has been included
in the protocol itself to authenticate the routes propagated by BGP routers. As a result,
there have been many proposals for securing BGP and inter-domain routing.
Current research efforts to secure BGP attempt to secure the confidentiality, integrity and availability of the BGP data. Most of the techniques proposed until now to
secure the protocol are based on cryptographic extensions of the protocol.
Confidentiality in BGP sessions
Regarding confidentiality aspects, a possible method for mitigating attacks on BGP
sessions is to protect the TCP connections. The TCP protection mechanisms include
the generalized TTL security mechanism limiting the effective radius of potential attack on BGP sessions, and providing in parallel host-level defences against TCP SYN
attacks [81]. Another category of TCP protection mechanisms, namely IPsec at the
IP level and the TCP MD5 signature option at the TCP session level [117], protects the
BGP TCP session from external disruption using cryptographic protection techniques
for the underlying TCP connections. On this matter, the MD5 signature option is a
frequently suggested method because it provides a relatively sufficient level of protection
combined with simplicity; however it has some potential weaknesses when compared
with IPsec [27].
IPsec (RFC 4301 [156]) has been suggested as a secure underlying message delivery
protocol that aims at providing security over plain IP. BGP operates on top of IPsec
by utilizing the authentication capability, in particular the Authentication Header (AH)
option that can be used at the IP layer implementing packet level security with differing
guarantees [153]. Additionally, using the Encapsulating Security Payload (ESP) option,
BGP capitalizes on an added layer of protection to encrypt BGP update messages [154].
Nevertheless, despite the higher levels of assurance provided by IPsec and the dynamic
approach of secret sharing, there are several disadvantages when employing IPsec for
BGP communication. Strong encryption imposes a high packet-processing
workload on routers, which can lengthen packet queues and make the routers a DoS attack
target [59]. Moreover, a mechanism for key coordination is necessary.
An approach that exploits the Time-to-Live (TTL) field has been devised by the IETF and is
called the BGP TTL Security Hack (BTSH) [103], which is also known as the Generalized
TTL Security Mechanism (GTSM) (RFC 5082) [104]. This TTL-based security
protection leverages the TTL value of IP packets to ensure that the received BGP packets
are from a directly connected peer. However, such an approach requires cooperation and
mutual acceptance among BGP routers; therefore, it cannot be easily deployed. Each
router receiving BGP packets has to check the TTL value, which must be greater than or
equal to 255 minus the specified hop count; otherwise the packet is considered invalid and
should be discarded. The big advantage of this approach is the lightweight processing
requirements as compared to crypto-based approaches. However, it protects only against
intrusion of external packets into an existing session, assuming that spoofing of the TTL
field in an IP header is a challenging task for remote attackers.
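The TTL check described above amounts to a single comparison, sketched below. The function and parameter names are assumptions made for illustration; the rule itself is the one stated in the text (TTL at least 255 minus the configured hop count).

```python
MAX_TTL = 255  # senders set the TTL of BGP packets to the maximum value


def ttl_check(received_ttl, hop_count=1):
    """Accept a packet only if it cannot have crossed more than `hop_count`
    routers, i.e. its TTL is at least 255 minus the configured hop count."""
    if not 0 <= received_ttl <= MAX_TTL:
        raise ValueError("TTL must fit in one byte")
    return received_ttl >= MAX_TTL - hop_count


print(ttl_check(255), ttl_check(254), ttl_check(253), ttl_check(64))
```

A remote attacker cannot make a packet arrive with a TTL above what the number of traversed routers allows, because every router on the way decrements the field. This is why the check is cheap yet effective against off-path injection.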
Integrity of BGP messages
A number of studies have focused on securing BGP messages themselves and validating
the integrity of a message as it accumulates information along the routers it crosses (e.g., the IP prefixes
in the AS_PATH messages) [132]. Two important candidate solutions in this area are
S-BGP [155] and soBGP [308], which address both the integrity and authenticity of
the BGP data. However, it should be noted that both soBGP and S-BGP, although
developed during 2000-2003, have not been widely deployed yet [155], mainly because
they require substantial modifications to the BGP protocol and its core operation.
S-BGP [155] is a solid piece of work for securing the exchange of BGP messages. S-BGP
enforces both integrity and authentication by employing digital signatures for
both the addresses and the AS path information. For the validation of these signatures,
S-BGP requires a Public Key Infrastructure (PKI). In addition, S-BGP proposes the
use of IPsec to secure the inter-router communication paths. S-BGP defines the correct
operation of a BGP speaker in terms of a set of constraints placed on individual protocol
messages, guaranteeing that
• no protocol update messages have been modified between the BGP routers,
• the updates were sent by the indicated BGP node,
• the updates are destined to this particular BGP node,
• the BGP node is authorized to advertise routing information on behalf of the AS
it represents.
Moreover, there are a number of conditions that should hold: every pair of originating
AS and a related prefix must be valid pairs, the originating AS must be authorized
to advertise the particular prefix and finally, every subsequent advertisement must be
authorized by the AS holder of the prefix.
The security features of S-BGP are based on digital signatures for verifying BGP peer
identities, IP prefix owners and their administrators. Hence, PKI is a key element in
this process, where PKI-signed certificates are used to verify each address assignment
and allocation. Nevertheless, the operation of S-BGP is significantly more costly in
terms of processing workload, required memory and utilized bandwidth as compared to
plain BGP, mainly caused by the attestations and the certificates for signature generation and validation [320]. Moreover, challenging issues are the increased load during
session restarting, the completeness of route attestations, and the requirement that the
BGP UPDATE message has to traverse the same AS sequence as that contained in the
UPDATE message [158].
In order to deal with the challenges introduced by S-BGP, another approach has been
proposed, called Secure Origin BGP (soBGP) [308], which mainly aims to provide a solution
with reduced processing load during attestation validation as well as reduced signing
overhead, mainly by using locally generated RAs [157]. The concept of EntityCert is
introduced for binding an AS to a public key. Instead of capitalizing on hierarchical
infrastructures such as PKI, soBGP involves a reputation mechanism (i.e., web of trust)
for certificate validation. Moreover, a second concept is introduced, called AuthCert
for correlating address prefixes and originating ASes. In order to sign AuthCerts, a
private key bound to an AS is used. A third concept introduced by soBGP is that of
ASPolicyCert that includes a signed list of neighbor ASes that have to appear mutually
in the lists of the two neighbors to be valid. Such an approach deliberately avoids a strong
dependency on the ASes or the address distribution mechanism; however, it leaves open
the issue of how to validate and extract trust relationships between the introduced
objects, which is a considerable shortcoming in the design of soBGP.
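The mutual-listing rule of ASPolicyCert can be illustrated with a minimal sketch. It assumes that signature verification has already happened and that each AS's signed neighbor list is available as a plain set; the `policy` dictionary and both function names are hypothetical and are not part of soBGP itself.

```python
# Hypothetical, simplified stand-in for ASPolicyCert contents:
# each AS number maps to the set of neighbours it claims (signatures omitted).

def link_is_mutually_attested(policy, as_a, as_b):
    """Return True only if both ASes list each other as neighbours."""
    return as_b in policy.get(as_a, set()) and as_a in policy.get(as_b, set())


def path_is_plausible(policy, as_path):
    """Check every adjacent AS pair of an AS path against the neighbour lists."""
    return all(
        link_is_mutually_attested(policy, a, b)
        for a, b in zip(as_path, as_path[1:])
    )


policy = {
    65001: {65002},
    65002: {65001, 65003},
    65003: {65002},
    65004: {65001},  # 65004 claims 65001, but 65001 does not list 65004
}
```

A path containing the one-sided link 65004-65001 is rejected, while a path using only mutually listed links is accepted.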
Aiming to overcome the aforementioned issues on S-BGP and soBGP, Pretty Secure
BGP (pS-BGP) [227] proposed a combination of a centralized trust model for AS number
authentication as well as a decentralized one for IP prefixes verification. In particular,
ASes are equipped with a trusted certificate binding their number to the public key.
Moreover, a lightweight rating mechanism is used for verifying the advertised prefixes
and the relevant AS PATHs. The introduced decentralized model then is used to verify
the constructed AS prefix graph. Therefore, a configurable solution is provided for
each AS that can consider the rating values to give weights to AS PATHs and take
local decisions on whether to accept advertisements. Such a method is preferred over
globally determined ones for countering the wide spread of security threats. However,
the increased design complexity of involving two trust models is a shortcoming of this
approach, and it has not been widely accepted or deployed.
Interdomain Route Validation (IRV) [108] is a proposed query-response protocol operating in parallel with BGP that allows BGP listeners to query the originating ASes about
the validity and authenticity of the received UPDATE messages and the advertised prefixes. However, such an approach introduces new challenges such as how to validate IRV
messages, authenticate and correlate routers, collaboration issues, etc. while additional
workload is introduced.
The performance advantage of symmetric versus asymmetric cryptographic functions
has triggered the interest for deeper investigation [125]. In particular, a tree-based hash
function for authentication has been used to encode sequences of data, thus, fulfilling
the requirement for an ordered relationship among the data that is mandatory for the
application of symmetric functions. Such an approach is particularly useful for preventing malicious manipulation of the ASes, as they are members of a list included in BGP
route UPDATE messages. Another application of symmetric cryptographic functions is
origin authentication [16]. By analyzing the semantics of the address delegation
structure, and in particular its density and static nature, that investigation observed
that delegations are very stable over time; ownership validation can therefore be
implemented effectively using mechanisms such as Merkle hash trees [209].
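To make the Merkle hash tree idea concrete, the following minimal sketch builds a tree over a list of delegation records and verifies that one record belongs to it using only a short list of sibling hashes. The record encoding, the use of SHA-256, the domain-separation bytes and the duplication of the last node on odd levels are assumptions chosen for illustration; they are not taken from [16] or [209].

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _leaf_level(leaves):
    # A leading 0x00 byte separates leaf hashes from inner-node hashes.
    return [_h(b"\x00" + leaf) for leaf in leaves]


def _next_level(level):
    return [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Compute the root hash over a non-empty list of byte strings."""
    level = _leaf_level(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Return the sibling hashes needed to recompute the root from leaves[index]."""
    level = _leaf_level(leaves)
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from one leaf and its proof, then compare."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root
```

Only the root needs to be distributed authentically; a verifier holding it can then check any single delegation with a logarithmic number of hash evaluations.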
Secure Path Vector (SPV) [126] is another BGP security proposal that capitalizes on symmetric hash functions. Although it achieves improved performance
in terms of processing workload, it requires more storage, more synchronization and longer information update times. Moreover, it is based on a complex key distribution mechanism.
Finally, even though ISPs are aware of the weaknesses of BGP, and despite all the protection mechanisms that have been proposed, there have been no important changes so
far. A common practice today is to rely on ingress-filtering techniques at AS level, manually implemented in an ad-hoc way, along with some simple transport-level techniques
to ensure that BGP speakers talk only to their direct neighbours (e.g., with TTL-based
protection techniques, as explained above).
2.3.4 BGP monitoring
Over the years, a variety of tools allowing the observation of BGP routing tables were
developed. This section will briefly cover the most popular ones. Note, however, that
a more comprehensive survey of available information sources (both BGP and attack-related) will be provided in the VIS-SENSE deliverable D2.1.
Looking glasses
A looking glass is a network, somewhere on the Internet, that is “kind enough to show
you their BGP routing table” [297]. For example, Packet Clearing House (PCH) offers
a looking glass web application at [228]. PCH offers archived BGP update messages
from over 30 routers around the world. Each archive contains 5 minutes of update
messages exchanged by a single router.
RouteViews
The RouteViews project [293], run by the University of Oregon, is a network of routers
of AS6447 placed at several locations and peering with different backbones. The idea
is to obtain near real-time information about BGP routing in order to better understand the
relationship between an AS and the rest of the network.
Most of the routers are available directly via Telnet, so that information can be viewed
in real-time. Moreover, every two hours, the data of the BGP table is dumped into a
file and made available from the RouteViews website. Also, every 15 minutes, the BGP update
messages received are saved in an archive file that is also made available at [293].
A lot of tools have been developed based on RouteViews data, most notably Cyclops
(see below). Many analytical works, such as studies of the global dynamics of BGP
routing tables [132, 204], have been based on the very same data.
BGPlay
BGPlay [63] is an application that graphically displays AS-relationships based on RouteViews data. It was developed by the Computer Networks Research Group of Roma Tre
University.
RIPE RIS
RIPE NCC (Réseaux IP Européens - Network Coordination Centre) offers several tools
as part of its Routing Information Service (RIS) project [216].
Visualize
Visualize is a Flash application that graphically displays topology changes, based on
updates and withdrawals seen by RIS, towards a given prefix in a given time frame.
Search RIS
The Search RIS module enables a search, in the last three months, of announcements
and withdrawals for a given prefix (with the option to search for less and/or more specific
prefixes) in a given location in a given time frame.
ASInUse
ASInUse determines the last time an AS appeared in a routing table (in the last three
months), and displays its known peering ASes. Filtering on a specific location is also
possible.
PrefixInUse
Similar to ASInUse, PrefixInUse determines the last time a given prefix appeared in a
routing table (in the last three months). The result also displays the originating AS(es)
for the prefix, or related ones.
Looking glass
RIPE provides a webpage that enables the execution of a command on one of their
routers. The available commands permit querying the BGP and routing tables of the
router, the execution of a traceroute, sending a PING message, etc.
RISwhois
The RISwhois tool returns the matching prefix/origin AS pair for a given host address.
Raw Data
RIPE dumps and archives the data of its collector routers, which is publicly available as
raw data. The entire BGP routing tables are dumped every 8 hours, while the update
messages are saved every 5 minutes. For example, data collected by the router located
in Amsterdam dates back as far as September 1999.
BGPmon: BGP Monitoring System
BGPmon [62] aims at giving real-time access to BGP data, avoiding the update lags inherent
to collector-based systems (e.g., RouteViews, RIPE RIS). Unlike collectors, BGPmon
does not implement a full-blown BGP client, but only the required functions: receiving
and logging routes. As a result, BGPmon is lightweight enough to peer with more neighbours
[318].
BGPmon uses a publish/subscribe architecture. A set of brokers forms
an overlay network that peers with neighbouring BGP routers and exchanges information
among its members. The brokers manage the final stream and compute the best route from the
publisher to the subscriber. Subscribers (applications) can customise the information
they want to receive in their stream (including open, close, update and notification BGP
messages, state changes in BGP, break up and tear down of peering connections, etc.).
Currently, the BGPmon application incorporates the three facets of the system: broker, publish, and subscribe. It is divided into three stages. The first one peers with a
BGP-enabled device and places BGP messages in a queue, creating a stream of events.
The second one labels events from that queue, identifying announcements, withdrawals,
updates, duplicates, etc. As this second stage can be quite costly in terms of memory,
it can be disabled. Disabling it, however, results in losing the ability to simulate a route
refresh without the help of remote sources. The final stage adds status information and
injects route table snapshots into the stream.
Cyclops
Cyclops, the AS-level connectivity observatory, is a monitoring tool developed by the
University of California, Los Angeles. In a nutshell, the goals of this project are i) to
detect anomalies in BGP data (such as misconfigurations and route leakages), ii) to
provide a connectivity map of inter-connected networks, to detect suspicious peerings,
and iii) to correlate these events together.
As of now, Cyclops fetches data from RouteViews devices, RIPE RIS, Abilene, Packet
Clearing House and BGPmon. The data is then preprocessed by extracting AS links
from the AS paths and adding timestamps. Also, a weight is associated with each link, which
represents the number of routes that make use of it. Finally, a relationship inference
is performed and the ASes are classified (e.g., stub AS, transit AS, tier-1, etc.). After
preprocessing, the data is entered in the Cyclops database [225].
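The link-extraction and weighting step can be sketched as follows. This is only an illustration of the idea, not the Cyclops implementation: timestamps and relationship inference are omitted, links are treated as undirected, and the collapsing of repeated AS numbers (AS-path prepending) is an added assumption. The function name `extract_links` is hypothetical.

```python
from collections import Counter


def extract_links(as_paths):
    """Count, for each undirected AS link, how many routes (AS paths) use it."""
    weights = Counter()
    for path in as_paths:
        # Collapse consecutive repeats of the same AS (AS-path prepending).
        dedup = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        # Use a set so that one route contributes at most once per link.
        links = {tuple(sorted(pair)) for pair in zip(dedup, dedup[1:])}
        weights.update(links)
    return weights


weights = extract_links([[3, 2, 1], [4, 2, 2, 1]])
```

Here the link between AS 1 and AS 2 receives weight 2 because both routes traverse it, while the other two links receive weight 1.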
The Cyclops database can then be browsed through the Cyclops website [291]. The
web interface can display AS connectivity (which ASes are peering with a given AS),
prefix origins (which AS announces a given prefix), transient prefix origins (which AS
has announced a given prefix for less than 5 days), anomalous peerings (AS links that
disappear for more than 24 hours), and more. Data can be filtered by date,
by activity, etc. The raw data of the database is also available at [195] for people who
want to build their own tools based on it.
Finally, Cyclops offers the possibility to register and allows the user to define a set of
ASes they are interested in. Cyclops will then show by default information regarding
these networks (neighbours, alerts, etc).
Netviews
NetViews [292] is a joint effort between the University of Oregon, Colorado State University,
the University of Memphis, and the University of California, Los Angeles, aiming at building the
next-generation RouteViews.
The system relies mostly on BGPmon, and provides therefore real-time information.
A central server, called data broker is connected to BGPmon and forwards the data to
its clients.
The NetViews client is a Java application that displays BGP data in real time in
multiple forms. The default view is quite standard: it shows plots of BGP activity,
including the state of the routing table, incoming announcements and withdrawals.
Another view is the visualizer, which displays geographically positioned ASes. ASes
and links are drawn differently depending on selectable factors such as the number of
originating ASes, number of peers, link degree, etc. The map is interactive and dynamic:
it updates in real-time. The live mode can be stopped to observe an event in more detail.
In this case, update messages are queued for further processing. It is therefore always
possible to go back and forth in time to view messages. A complete BGP table can be
downloaded from a source as a base so that the map is populated with correct entries
at startup. Information about ASes (such as WHOIS) is also integrated. Finally, the
user can filter on a given prefix (or AS), or display the routes towards a given location.
The NetViews client is still in beta development, and is not currently available to the
public [292].
Robtex and BGP Toolkit
Because information about ASes is spread across different databases, some projects
have focused on gathering this data in a centralized view. Robtex AS Analysis
[252] gathers data from WHOIS and routing registries, and infers peering relations from
BGP tables based on data from RouteViews and RIPE RIS.
Similarly, Hurricane Electric's BGP Toolkit [131] gathers the same kind of data, and
performs some statistical analysis on it.
2.3.5 Methods for detecting prefix hijacking
This section focuses on existing methods and algorithms used to detect prefix hijacking
attacks and briefly describes some tools, services or implementations for detecting prefix
hijacking.
The Next-Hop anomaly
This method, presented in [21], was designed under the assumption of an ownership or
an intermediate hop attack where the hijacker is the first hop after the legitimate AS³.
It uses information from both the control and data plane.
Detection method
Let p be a prefix originated by AS O. A router belonging to AS S receives in its update
an AS path field containing $N_1, \dots, N_j, O$. Based on this AS path, any packet to p should
be directly forwarded to O once it reaches $N_j$. The authors define a next-hop anomaly
as a data-plane trace where AS $N_j$ forwards packets for p to some AS $I$ ($I \neq N_j$). It
suggests that $N_j$ and O are not interconnected. The next-hop anomaly is used as the signature for detection.
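The check can be sketched in a few lines, under the assumption that the data-plane trace has already been mapped from IP hops to AS numbers. The sketch reads the anomaly as "the AS that follows the last hop of the advertised path is not the advertised origin", which is our interpretation of the definition above; the function name and argument layout are hypothetical.

```python
def next_hop_anomaly(control_path, data_trace):
    """Compare the advertised AS path with an AS-level data-plane trace.

    control_path: AS path from the update, ending with the last hop and the origin.
    data_trace:   AS numbers observed on the data plane, in forwarding order.
    Returns the unexpected AS that the last hop forwarded to, or None.
    """
    if len(control_path) < 2:
        return None
    last_hop, origin = control_path[-2], control_path[-1]
    if last_hop not in data_trace:
        return None  # the trace never reached the last hop: nothing to compare
    idx = data_trace.index(last_hop)
    if idx + 1 >= len(data_trace):
        return None  # the trace stops at the last hop
    following = data_trace[idx + 1]
    return following if following != origin else None
```

In practice the result is only as good as the IP-to-AS mapping used to build `data_trace`, which is exactly where the false positives discussed below originate.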
Limitations
As such, the signature generates a lot of false positives that the authors attribute to
errors in IP-to-AS mappings, including IXPs routers not included in the AS path, sibling ASes that share address space and have routing agreements, and provider address
spaces in which customers use a small part of ISP’s space as their own.
Having removed events attributed to the causes above, the authors are still unable to decide whether the remaining cases are the result of prefix hijacking or traffic
engineering agreements. Basically, “there is no way to verify the data-plane adjacency
of two ASes as claimed by the corresponding control-plane advertisements”.
PHAS: a Prefix Hijack Alert System
The idea behind PHAS [176] is to provide a prefix hijack alert service. Based on the
premise that the prefix owner is the only one that can unambiguously distinguish between a legitimate route change and a hijacking attack, the authors offer network
administrators the possibility to subscribe to monitoring services for a given prefix p, and to be notified
of an origin AS change somewhere on the Internet, in near real-time.
³ The method would work for any intermediate path level attack, but was limited to this case to reduce
the problem to a manageable size.
Detection method
The system builds, over time, a set $O_p(t)$ containing the different origin ASes for prefix p seen at time t on every router where PHAS is deployed⁴. PHAS alerts the users
whenever $O_p(t) \neq O_p(t-1)$.
Obviously, by simply doing this, the system will notify a user each time there is a
change in the set. To avoid notifying users of repeated origin changes, the authors
introduce a time window. The origin set is extended to $O_p(t-k, t)$, which contains every
origin AS seen for prefix p during the interval $[t-k, t]$, on every PHAS-enabled router.
The system then generates an alert when $O_p(t-k-1, t-1) \neq O_p(t-k, t)$. This trick
avoids repeated origin events, but will still generate an alert whenever a new origin AS
appears, or whenever a known origin AS disappears, notifying users only of potentially
wrong origin ASes.
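The effect of the time window can be illustrated with a small sketch. It assumes integer time steps and a flat list of (time, origin AS) observations for a single prefix; the function names and the data layout are hypothetical and do not reflect the PHAS implementation.

```python
def origin_set(observations, start, end):
    """Origin ASes seen in the closed interval [start, end].

    observations is a list of (time, origin_as) pairs for one prefix.
    """
    return {asn for t, asn in observations if start <= t <= end}


def windowed_alerts(observations, k, t_end):
    """Return (t, added, removed) for each step where the windowed set changes."""
    alerts = []
    for t in range(1, t_end + 1):
        previous = origin_set(observations, t - k - 1, t - 1)
        current = origin_set(observations, t - k, t)
        if previous != current:
            alerts.append((t, current - previous, previous - current))
    return alerts


# Origin flaps between AS 100 and AS 666 from t = 3 onwards.
observations = [(0, 100), (1, 100), (2, 100), (3, 666), (4, 100), (5, 666), (6, 100)]
```

With `k = 3`, only the first appearance of AS 666 triggers an alert; with `k = 0` (no window), every flap does.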
Such a detection scheme works relatively well, but users should not have their notification delayed when $O_p$ changes if their network behaves well. To avoid this, the authors
introduce an adaptive window size. On top of a windowed origin set, each prefix is
assigned a penalty $S_p$. When an update message is received for prefix p, $S_p$ is increased
by 1/2. The size of the window for p is then $2^{\lfloor S_p \rfloor}$. $S_p$ decays exponentially, at a rate determined
by a time value.
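A minimal sketch of the adaptive window follows. The increment of 1/2 per update and the power-of-two window size come from the description above; modelling the exponential decay with a half-life, the default value of that half-life, and the class and method names are assumptions made for illustration.

```python
import math


class PrefixPenalty:
    """Per-prefix penalty that grows with updates and decays over time."""

    def __init__(self, half_life=3600.0):  # hypothetical half-life, in seconds
        self.half_life = half_life
        self.value = 0.0
        self.last = 0.0

    def _decay(self, now):
        # Exponential decay: the penalty halves every half_life seconds.
        self.value *= 0.5 ** ((now - self.last) / self.half_life)
        self.last = now

    def on_update(self, now):
        """Register one update message for the prefix at time `now`."""
        self._decay(now)
        self.value += 0.5

    def window(self, now):
        """Current window size: two to the power of the floored penalty."""
        self._decay(now)
        return 2 ** math.floor(self.value)
```

A quiet prefix keeps a window of 1, so its owner is notified without delay, whereas a prefix that receives many updates sees its window grow until the penalty has decayed again.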
Finally, users can also add filters that are applied before alerts are sent to
them.
Extensions
The authors also provide possible extensions to PHAS to deal with types of attacks
other than the origin attack. For subnet attacks, a mechanism based on watching modifications made to the set $SP_p$ that contains the advertised subprefixes of p is proposed. If
no subprefixes are advertised, $SP_p = \{\}$. For last hop attacks, the suggested method is
to watch the set $L_A$ containing the last hops witnessed for prefixes with A as the origin
AS.
Using these two additional sets in PHAS helps to further identify hijacking attempts.
However, the subprefix set (resp. the last hops set) is potentially huge for a network
such as 12.0.0.0/8 (resp. for a tier-1 ISP).
Accuracy
PHAS has successfully detected every known ownership attack. It cannot, however, detect
a stealthy IP hijack, like the one presented in [203]. As a result, PHAS is unlikely
to detect a man-in-the-middle attack such as the DEFCON one [238].
⁴ The authors decided to use data from RouteViews routers [293].
Directed AS topology
The idea behind this method is to build a directed graph of the network topology, and to
use it to verify the AS paths in update messages. It is presented in [247] and is heavily
dependent on previous work by the same authors [100].
Detection method
The authors first observe that the majority of BGP routes are stable and legitimate.
Thus, these routes can be learnt over time. Let’s consider a prefix p. An observer receives
a legitimate update message for p, containing the AS path $a_k, \dots, a_0$. In other words,
ASes $a_i$ and $a_{i-1}$ are neighbours. A directed AS link is a link $a_i \to a_{i-1}$ ($i = 1, \dots, k$).
Moreover, $a_i$ (resp. $a_{i-1}$) is upstream (resp. downstream) of $a_{i-1}$ (resp. $a_i$). The directed
links also indicate the import/export policies of the involved ASes. A downstream (resp.
upstream) AS allows routes to be exported (resp. imported) to an upstream (resp. downstream) AS.
Let’s consider, at time t, the sets A(t − k, t) and L(t − k, t) containing the associations
between a prefix and an origin AS number and the directed AS links, respectively, seen
between time t − k and t (i.e., in a time window of size k).
Whenever an update message reaches the observer, the system verifies that the AS
links given in the AS path of the message are valid (i.e., are part of set L). If the links
are valid, the system then verifies the association between the prefix and the origin AS (i.e., that it
is part of set A).
If a link $a_i \to a_j$ extracted from the AS path does not belong to L but $a_j \to a_i$
does, there is a policy violation, and the link is a redistribution link. An example of such
a behaviour is when a customer having two different ISPs forwards traffic between the
two providers. If $a_j \to a_i \notin L$ either, the link is a fake link: the announced neighbouring ASes
are not really neighbours. It is highly likely that someone tampered with the AS path.
Also, if $(p, a_0) \notin A$, there is prefix hijacking. Furthermore, if $(p, x) \in A$ for $x \neq a_0$,
it is an ownership attack. If $(q, x) \in A$ with $q \subset p$ (i.e., q is more specific than p), it is
a prefix attack. Finally, if $(q, x) \in A$ with $q \supset p$ (i.e., q is less specific than p), it is a
supernet attack.
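The two verification steps can be sketched as follows, assuming that the learnt sets are available as plain Python sets: L as (upstream, downstream) pairs and A as (prefix string, origin AS) pairs. The function names, the returned labels and the order in which the origin cases are tested are illustrative choices, not the authors' implementation; none of the heuristics for suppressing false alerts are included.

```python
import ipaddress


def classify_links(as_path, L):
    """Return (upstream, downstream, kind) for each link of the path not in L.

    as_path is ordered from the observer side down to the origin AS.
    """
    findings = []
    for up, down in zip(as_path, as_path[1:]):
        if (up, down) in L:
            continue
        # Known only in the reverse direction: policy violation.
        kind = "redistribution" if (down, up) in L else "fake"
        findings.append((up, down, kind))
    return findings


def classify_origin(prefix, origin, A):
    """Classify an announced (prefix, origin) pair against the known set A."""
    if (prefix, origin) in A:
        return "ok"
    p = ipaddress.ip_network(prefix)
    known = [ipaddress.ip_network(q) for q, _ in A]
    known = [q for q in known if q.version == p.version]
    if any(q == p for q in known):
        return "ownership attack"  # same prefix known with another origin
    if any(q.subnet_of(p) for q in known):
        return "prefix attack"     # a known prefix is more specific than p
    if any(p.subnet_of(q) for q in known):
        return "supernet attack"   # a known prefix is less specific than p
    return "hijack (no related prefix known)"
```

Both functions only report findings; deciding which of them should raise an alert is left to the calibration discussed in the following paragraphs.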
Of course, the scheme will only work if the model of the network (i.e., the sets A
and L) is close enough to reality. Therefore, the initialisation phase is very important. The authors propose heuristics to remove alerts generated by transient routes,
path extensions (which are the result of address suballocation), usual BGP misconfigurations,
(de)aggregations, sibling AS links, address-sharing peers, and backbone links.
Accuracy
The authors announce a false positive rate as low as 0.02% and an average of 20 alerts
raised per day. They have nearly 100% accurate detection on documented public incidents.
However, the required quality of the calibration data can be a strong limitation. Moreover, the AS relationships on which this method is based are a model of a perfect Internet,
and thus not entirely accurate. Finally, the authors do not provide any detail on how to
set the threshold values used for the different heuristics.
Hop count to a reference point
The technique presented in [331] relies only on the data plane to detect possible hijacking
events. Namely, it uses the distance (expressed in hop count) between a set of N well-placed monitors M and the watched network, based on the assumption that distance
measurements to a destination network are relatively stable over time (which seems to be
confirmed by [330]).
In addition to the N monitors, one (or more) reference points per monitor are needed.
A reference point is a router topologically close to the network under surveillance, but
outside of it.
Detection method
First, periodically, a monitor measures its distance $d_t$ from the network (at time t). It
keeps in memory a moving average window of size k that contains, at time t, the average
value of the distances between $t - k$ and $t$, called $A_t$. Because a prefix hijack is likely to
have serious consequences on the topological location of the victim network, whenever an
attack occurs, $d_t$ will significantly differ from $A_t$, thus raising a red flag⁵. This step is
known as the network location monitoring.
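The network location monitoring step can be sketched as a moving average over the last k hop-count measurements. The fixed tolerance (in hops), the decision to keep flagged samples out of the baseline, and the class name are assumptions made for this illustration; the additional smoothing window mentioned in the footnote is not modelled.

```python
from collections import deque


class LocationMonitor:
    """Flags hop-count measurements that deviate from a moving average."""

    def __init__(self, k, tolerance):
        self.history = deque(maxlen=k)  # last k accepted distances
        self.tolerance = tolerance      # hypothetical threshold, in hops

    def observe(self, distance):
        """Return True (red flag) if the distance deviates from the baseline."""
        flag = False
        if len(self.history) == self.history.maxlen:
            average = sum(self.history) / len(self.history)
            flag = abs(distance - average) > self.tolerance
        if not flag:
            # Keep suspicious samples out of the baseline.
            self.history.append(distance)
        return flag
```

A red flag from this step is only a trigger: it is the path disagreement detection described next that decides whether an alert is raised.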
Secondly, when a red flag is raised, the path disagreement detection is called. Its
goal is to compute the path similarity between the (supposedly affected) AS path to
the network and the (normally unaffected) AS path to a reference point. Because the
authors rely only on the data plane, they chose to use iPlane [294] to map the hop IP
addresses to their (supposedly correct) AS number. Once the similarity $s_t$ between these
paths has been computed, its value is compared with $s_h$, the path similarity value that
had been computed prior to the hijacking alert. If $s_t / s_h > T$ for a threshold T (i.e., the
similarity has decreased dangerously), an alert is raised by the monitor.
⁵ To be complete, the authors use another window to smooth the instantaneous measurement $d_t$, as
transient problems lead to noise.
Obviously, if multiple monitors raise an alert, the probability of being under attack
increases.
Limitations
The detection accuracy highly depends on the choice of the monitors. To be effective,
monitors have to be largely distributed and use different routes to the network. It may
not be easy to locate such positions.
Also, the method relies solely on the data plane. An attacker using a tool like Fakeroute
[203] will make the detection system blind. Moreover, a MITM-attack such as [238] also
makes the scheme useless.
The path disagreement detection might not be accurate because of the policy of one
AS along the way between the monitor and the network/reference point. An AS radically
changing its policy could even trigger an alarm.
Fingerprinting the network
The fingerprinting technique [124] is based on the hypothesis that the hijacking network is different from the legitimate one. Consequently, it is possible to compare the
fingerprint properties of the hosts on these networks to infer if they are identical or not.
Multiple fingerprinting techniques are used, both network based and end-host based.
Network based fingerprints include firewall policies, bandwidth information, characteristics of routers, etc. End-host fingerprints include OS, IP identifier probing, TCP/ICMP
timestamp probing, uptime, etc. It is essential to select multiple discriminative properties to ensure that the hijacker cannot fake the responses.
Detection method
To detect ownership attacks, the system looks for MOAS. For each prefix in a MOAS
conflict, the method then builds an AS path tree rooted at the prefix. Then, it tries to
find a live host to use as probing target. Multiple probing locations are selected such
that packets traverse every possible AS to the destination, and fingerprints are then
acquired. Finally, the results are analysed and compared.
To detect intermediate hop attacks, the authors use an AS-level traceroute to detect
fake edges in the path. They limit the number of false positives with a few heuristics:
popularity constraints (i.e., if an edge of the network is only used by a few prefixes, it
is more suspicious than a route used by a lot of prefixes), geographic constraints (i.e.,
a network edge corresponding to two geographically distant points is suspicious), and
relationship constraints (partially based on AS relationships [100]).
When a potential subnet attack is detected, the method first removes all networks with
a provider-customer relationship. This is based on the assumption that a provider has no
reason to hijack the traffic of one of its customers, and that a customer cannot steal the
traffic of its provider. For the remaining routes, a reflect-scan is used for fingerprinting.
The reflect-scan is similar to the TCP idle-scan technique. An additional step for the
reflect scan is to find a live host that is not inside the attacked subnet to perform the test.
Limitations
Fingerprinting results are highly dependent on the OS installed on the machine. Also,
it is not always possible to find a live host to perform those tests (or even two hosts in
the case of reflect-scans). Moreover, devices on the path (e.g., firewalls) can hinder the
search for probe-able hosts.
Using idle scan
Detecting BGP hijacking attacks through idle scan is presented in [121]. However, this
technique relies on a single vantage point to probe networks, which makes it even more
complex to detect an attack.
Detection method
The system watches BGP update messages and, whenever it detects a MOAS conflict,
it starts an idle scan to find out whether the MOAS is legitimate or the result of an
attack.
The probing technique is similar to fingerprinting’s reflect-scan; however, instead of
using a machine outside of the hijacked subnet (but still inside the original network) to
perform the test, it makes use of a host that is part of the legitimate last-hop network.
Limitations
The limitations are the same as the ones presented before for the fingerprinting technique.
Using PING tests
This method [268] focuses on the sole use of PING tests to differentiate between a legitimate MOAS
and an ownership attack.
Detection method
When a monitor receives a suspicious update containing a prefix to be observed, that
monitor executes ping tests for every host address of that prefix. At the same time, it
notifies another monitor that did not receive the update yet to perform the same test
on the original route. The two ping results are compared. If the results are “similar
enough”, the system concludes that there is no hijacking.
Limitations
A preliminary experiment showed good results, but a large-scale test remains to be
done. However, pinging a whole network range may result in substantial network load,
although the authors suggest that for larger networks, only a set of distributed subnets
need to be checked.
iSPY
In [328], the authors present a method for detecting prefix hijacking without relying on
external infrastructure (vantage points, monitors, etc).
The method relies on the ability of the victim AS to find its vPath. The vPath is the
set of AS-level forward paths from the network to the other ASes on the Internet⁶. It
can easily be obtained from tools such as traceroute.
Detection method
Considering two forward paths $P$ and $P'$ to destination d, obtained at times $t$ and $t'$
($t < t'$), if $P = P'$, then everything is fine. If $P' \neq P$ but $P'$ is complete (i.e., traceroute
receives every response up to the destination), the route change was legitimate. If $P'$ is incomplete and $P' \subset P$ (i.e., every AS number in $P'$ is in $P$, up until $P'$ receives no more
data), then a cut exists between the last router of $P'$ and the next one in $P$. Finally, if $P'$
is incomplete and $P' \not\subset P$, there is a cut between the last hop in $P'$ and the (unknown)
next one.
Defining $\Omega$ as the set of all existing cuts, the cardinality $|\Omega|$ of $\Omega$ is the detection signature: if it is bigger than a threshold value, there is hijacking.
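The case analysis above can be sketched as follows, under the assumption that each forward path is an AS-level list and that a flag tells whether the new trace reached its destination. Treating "the new path is contained in the old one" as "the new path is a leading portion of the old one" is our reading of the description; the function names and the dictionary layout are hypothetical.

```python
def find_cut(old_path, new_path, complete):
    """Return the cut as (last reachable AS, next AS or None), or None if no cut."""
    if new_path == old_path or complete:
        return None  # unchanged path, or a legitimate route change
    last = new_path[-1] if new_path else None
    n = len(new_path)
    if n < len(old_path) and old_path[:n] == new_path:
        return (last, old_path[n])  # cut on a link of the old path
    return (last, None)             # cut towards an unknown next hop


def hijack_suspected(old_paths, new_paths, threshold):
    """old_paths: destination -> path; new_paths: destination -> (path, complete)."""
    cuts = set()
    for destination, (path, complete) in new_paths.items():
        cut = find_cut(old_paths[destination], path, complete)
        if cut is not None:
            cuts.add(cut)
    # The number of distinct cuts is the detection signature.
    return len(cuts) > threshold, cuts
```

Collecting the cuts in a set means that many destinations becoming unreachable behind the same link count only once, so a single failed link does not by itself exceed the threshold.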
Limitation
First and most importantly, the detection scheme only works if the hijacker blackholes
the traffic.
Also, iSPY is likely to mistake a stealthy attack for a legitimate cut link, and is blinded
by a tool such as Fakeroute [203].
⁶ Actually, only to the transit ASes of the Internet (i.e., without stub ASes).
PGBGP: Pretty Good BGP
PGBGP’s goal is not only to detect hijacking events, but to improve overall routing quality and reliability. The core idea behind PGBGP, presented in [148], is that “unfamiliar
routes should be treated cautiously when forwarding data traffic”.
PGBGP defines a set of normal data containing the prefix, its origin AS, and a timestamp
of the last received update. The normal data set and the router’s Routing Information
Base (RIB) are used to create a history for known prefixes and origins. Obviously, at
startup, there is no known history, and all routing updates are accepted for h days.
Afterwards, incoming routes that would alter the state of the normal behaviour are
quarantined for s days. The quarantined routes are considered as suspicious. After that
time, they are accepted, if still in the routing table. This quarantine mechanism prevents
short-term erroneous announcements from disrupting routing. Finally, PGBGP removes
data from the history if it has not been announced for h days.
As any incoming route is tested against the history, hijacking attempts, arriving with a
new origin AS, do not match known history for that prefix and are therefore quarantined.
While suspicious, the old, trusted route is used for packet forwarding.
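The learning, quarantine and expiry rules can be sketched as a small state machine. This is a simplified illustration and not the PGBGP implementation: it tracks only (prefix, origin) pairs, it approximates "still in the routing table after s days" by "announced again once s days have passed", and the class and method names are hypothetical.

```python
DAY = 86400  # seconds


class PGBGPHistory:
    """Tracks trusted and quarantined (prefix, origin) pairs."""

    def __init__(self, h_days, s_days, start=0):
        self.h = h_days * DAY
        self.s = s_days * DAY
        self.start = start
        self.trusted = {}     # (prefix, origin) -> time of last announcement
        self.quarantine = {}  # (prefix, origin) -> time first seen

    def on_announce(self, prefix, origin, now):
        """Return 'accept' or 'quarantine' for an incoming route."""
        key = (prefix, origin)
        # Forget trusted entries that have not been announced for h days.
        self.trusted = {k: t for k, t in self.trusted.items() if now - t <= self.h}
        learning = now - self.start < self.h
        if learning or key in self.trusted:
            self.trusted[key] = now
            return "accept"
        first_seen = self.quarantine.setdefault(key, now)
        if now - first_seen >= self.s:
            # Quarantine period is over: promote the route to the history.
            del self.quarantine[key]
            self.trusted[key] = now
            return "accept"
        return "quarantine"
```

While a route is quarantined, a router following this logic would keep forwarding along the previously trusted route for that prefix.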
To avoid subnet attacks, PGBGP checks if the new incoming prefix is a subnet of a
known one. If it is, and the route to the subnet does not traverse the AS of the larger prefix,
it is suspicious. However, forwarding packets along the trusted route may be useless
if routers along that route have been compromised. Therefore, PGBGP tries to avoid
forwarding packets to neighbour routers that have announced the suspicious route.
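The subnet check can be sketched with Python's standard `ipaddress` module. The sketch assumes that the trusted history is available as a mapping from prefix to origin AS; the function name and that data layout are hypothetical.

```python
import ipaddress


def subnet_is_suspicious(new_prefix, as_path, known):
    """Return True if new_prefix is a subnet of a known prefix whose origin
    AS does not appear on the announced AS path.

    known maps trusted prefixes (strings) to their origin AS numbers.
    """
    new = ipaddress.ip_network(new_prefix)
    for prefix, origin in known.items():
        net = ipaddress.ip_network(prefix)
        if net.version != new.version or net == new:
            continue
        if new.subnet_of(net) and origin not in as_path:
            return True
    return False


known = {"10.0.0.0/8": 100}
```

An announcement for 10.1.0.0/16 whose AS path contains AS 100 is accepted as a legitimate de-aggregation, whereas the same prefix announced over a path that avoids AS 100 is flagged.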
Super-prefixes of known prefixes are always accepted by PGBGP, as the authors believe
that they result from a new network destination rather than from a hijack: traffic destined
to the original network will still use the legitimate, more specific route.
2.4 Analysis of Spam Campaigns
2.4.1 Introduction
In [248], the authors show that cybercriminals are able to misuse the BGP routing
protocol to hijack blocks of IP addresses for limited periods of time during which they
could launch spam campaigns from apparently legitimate blocks of IP addresses. To
the best of our knowledge, nobody else has demonstrated, until now, to what extent
this assumption can be verified. However, if this claim is true, such techniques would
clearly defeat the spam blacklists that anti-spam tools use as a first layer of defence
against spammers.
One of the main objectives of VIS-SENSE is to provide sound scientific rationales in
favour, or against, the idea that the core infrastructure can be misused by cybercriminals
to carry on malicious activities, such as launching spam campaigns. To do this, we need
first to have a clear view on the inner workings of such spam campaigns, and how we
can observe them.
As suggested in [333], there are two main approaches that can be used to get insights
into spammers' activities:
• passive observation, which consists in observing the visible effects of spammers'
activities, e.g., analyzing the content of spam messages received for a given domain
(to create spam filters), or looking at the IP addresses of the machines used for
sending spam messages (IP reputation analysis).
• active observation: as spammers are moving to more sophisticated techniques, and
because they increasingly rely on botnets to send spam campaigns, it is
sometimes necessary to infiltrate the infrastructure of spam gangs to better
understand their modus operandi. Examples of active observation techniques include
the execution of a malware sample in a sandbox to observe its behaviour, the
manipulation of C&C servers to uncover botnet communication protocols, etc.
In addition, spam detection and mitigation techniques can be categorized
according to where they are applied along the path between the spam source and the
destination. We can usually distinguish between:
• pre-acceptance detection techniques, by which spam emails are detected before they
actually reach the destination mail server. These techniques take advantage of low-level
network features to detect and block spam traffic as soon as possible, so as to
reduce the load on SMTP servers. Examples of techniques include fingerprinting
spam bots at the SMTP layer, IP reputation filtering, etc.
• post-acceptance detection techniques, by which spam emails are identified after
they have reached the destination mail server. These techniques take advantage
of features extracted from the whole spam message, including the analysis of the
content of the email message, which obviously involves heavier processing.
In Sections 2.4.2, 2.4.3 and 2.4.4, we first detail a few classes of techniques that are
commonly used to detect, block or analyze spam messages. Then, in Sections 2.4.5 and
2.4.6, we describe previous work that has focused on studying the higher-level
behaviour of spammers, as well as the scam infrastructure they use.
2.4.2 IP reputation analysis
The idea behind this technique is simply to build a database of IP addresses associated
with spamming activities. Upon reception of an email, this knowledge base can be
queried to help determine if this is spam or not. Basically, three types of list can be
built:
- blacklist: contains IP addresses of hosts from which all emails should be blocked;
- greylist: contains IP addresses of hosts from which all emails should be first rejected
temporarily and then accepted on retry. This technique works because spamming hosts
usually don’t resend emails;
- whitelist: contains IP addresses of hosts from which all emails should be accepted.
Relying on an IP reputation database is relatively effective since one just needs to accept,
delay or reject emails from already known hosts. However, it can sometimes be hard
to identify the correct source IP address, depending on the data collection infrastructure [109].
Whitelists are often implemented in mail servers to automatically accept emails from
specific IP addresses regardless of the presence of that address in any bad hosts list.
Greylists take advantage of the often poorly implemented software running on spamming
hosts, which does not resend an email that is not accepted by the server. For every email
coming from an unknown source IP address, the message is first rejected and then
accepted when it is resent. This makes it possible to record potential spam bots while
adding only a small delay when receiving legitimate emails from an unknown source for the first time.
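The interplay of the three lists can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Greylist`, the `min_delay` parameter and the decision labels are hypothetical, and the sketch keys on the source IP address alone, as in the description above, whereas real deployments often key on more than that.

```python
class Greylist:
    """Toy accept/delay/reject decision based on the source IP address."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay  # assumed minimum retry delay, in seconds
        self.first_seen = {}        # source IP -> time of first attempt
        self.whitelist = set()
        self.blacklist = set()

    def decide(self, ip, now):
        """Return 'accept', 'reject' or 'tempfail' for a delivery attempt."""
        if ip in self.whitelist:
            # Whitelisted hosts are accepted regardless of any bad-host list.
            return "accept"
        if ip in self.blacklist:
            return "reject"
        first = self.first_seen.get(ip)
        if first is None:
            # Unknown source: remember it and ask the sender to retry later.
            self.first_seen[ip] = now
            return "tempfail"
        # A well-behaved mail server retries; most spam bots never do.
        return "accept" if now - first >= self.min_delay else "tempfail"
```

The `first_seen` table is what makes greylisting stateful: every unknown source has to be recorded, which is why it needs more resources than a plain blacklist or whitelist lookup.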
Blacklists contain IP addresses of hosts that have participated in spam sending operations. They are widely used by the research community [39] and implemented in many
commercial anti-spam systems ([267, 1, 13]) to help classify input messages as legitimate or spam according to the source IP of the spamming host. The different blacklists
independently maintain records of IP addresses that have been involved in spamming
activities (e.g., spam sending, open-relay, member of a botnet, etc). Examples of popular
blacklists include Spamhaus (PBL, SBL, XBL) [12], SORBS [10], Spamcop [11], DSBL
[3], NJABL [6], and Composite Blocking List [2], among others.
However, IP blacklists are often said to be ineffective because spammers’ techniques
have evolved. According to [39, 167, 147, 230], blacklists have actually pushed spammers
to build networks of compromised hosts (a.k.a. botnets): many zombie machines
use dynamic IP addresses, which makes blacklisting less effective. This finding is reinforced
by the fact that most of today’s spam comes from botnets [265, 266, 248, 230].
Moreover, [17] claims that blacklists are often updated too slowly to allow the real-time
detection of spamming hosts. It is also claimed in [230] that bots tend to send low volumes of spam in order to avoid being blacklisted. However, [19] concludes that blacklists
are still quite effective in identifying spam sending hosts.
MessageLabs [267] makes use of IP reputation analysis in their Traffic Management
Layer. They analyze the source IP address and look at its past activities to decide
whether or not they should proceed with email processing. IP reputation is also used in
the Skeptic(TM) Anti-Spam Layer by means of DNSBL lookups. In [39], the authors develop a distributed system called Trinity based on IP reputation to detect spam sources.
The system is based on the assumption that bots are sending a lot of spam in short
amounts of time, so the reputation of these sources must be shared as soon as possible.
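As a brief illustration of the DNSBL lookups mentioned above: such lists are conventionally queried by reversing the octets of the IPv4 address and appending the zone of the list, and a successful DNS resolution means the address is listed. The sketch below assumes this convention; the zone `dnsbl.example.org` is a placeholder and not a real list, and the `resolve` parameter is a hypothetical hook that allows the lookup to be replaced in tests.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on invalid input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the query name resolves, i.e. the address is listed."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # Resolution failure: the address is not on the list.
        return False
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` builds the name `99.2.0.192.dnsbl.example.org`.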
In [264], the authors develop a system called FIRE (FInding Rogue nEtworks) to
identify and expose organizations and ISPs that demonstrate persistent, malicious behavior.
The goal is to isolate the networks that are consistently implicated in malicious
activity from those that are victims of compromise. To this end, FIRE actively monitors
botnet communication channels, spam traps, drive-by-download servers, and phishing
web sites. A malscore is then computed for each AS, based on the observed activity of all
IPs belonging to that AS, which to some extent also reflects the reputation of each network.
IP reputation analysis is a pre-acceptance technique which can be used in both active
and passive observation of spamming activities. IP testing can be performed before
messages are actually received by the SMTP server. Although blacklists and whitelists
can be queried by low-level hosts, greylists require recording each new connection from
unknown sources, and thus they should be deployed on powerful hosts (but before the
SMTP server).
2.4.3 Message content analysis
The analysis of the email content is widely used by the research community [313] and in
many commercial spam filters [267, 13]. This approach basically consists in extracting
every piece of information from the content of email messages to determine whether a
message is spam or not. By analyzing the message content, we can extract patterns that allow
us to detect future instances of similar spam messages. This technique is very popular
because it only requires access to email messages. However, spammers now rely
on more sophisticated message generation techniques that take advantage of message
polymorphism [168, 147, 248, 230].
Another very popular technique is Bayesian spam filtering, like the system used in [13],
which is based on a two-step process including a learning phase and a filtering phase.
During learning, the filter analyses message words and computes the probability that
a message is spam based on word occurrences. In the filtering phase, the Bayesian
formula is used to compute the probability that the message is spam or ham based on
the individual probabilities computed in the first place. The advantage of this technique
is that it can adapt itself to each user’s email content. However, spammers have learned in the
meantime how to fool Bayesian filters by, for instance, including legitimate words or URLs
in spam messages [262].
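The two phases can be illustrated with a small naive Bayes sketch. It is not the filter used in [13]: the class name, the whitespace tokenisation, the add-one smoothing and the toy training messages in the usage note are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Minimal word-based naive Bayes spam filter (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Learning phase: count word occurrences per class."""
        self.msg_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        """Filtering phase: P(spam | words), assuming independent words.

        Requires at least one learned message of each class.
        """
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_msgs = sum(self.msg_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.msg_counts[label] / total_msgs)  # prior
            for word in text.lower().split():
                # Add-one smoothing avoids zero probabilities.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            log_score[label] = score
        # Normalise the two log scores into a probability.
        top = max(log_score.values())
        weight = {k: math.exp(v - top) for k, v in log_score.items()}
        return weight["spam"] / (weight["spam"] + weight["ham"])
```

After training on a few spam messages about cheap pills and a few ham messages about meetings and reports, the text "cheap pills" receives a high spam probability, while padding it with words from the ham messages lowers the score. This is the evasion strategy described above.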
MessageLabs applies content filtering in the Skeptic(TM) Anti-Spam layer by using
BrightMail in conjunction with different heuristics applied to message content and headers.
New heuristics are developed and existing ones are constantly improved to reflect
newly uncovered spam patterns.
Message content analysis falls into the post-acceptance class of techniques. Most
works involving message content analysis use passively collected spam data. However,
some works make use of more active techniques (e.g., infiltration), such as [230, 46, 147],
although message content analysis is usually not the final objective of those studies,
but rather a means to perform some higher-level analysis (e.g., studying characteristics
of spam campaigns, spam marketing conversion, etc).
2.4.4 Network-level spam detection
Network-level spam detection techniques leverage features of spam traffic below the
application layer to detect and stop spam from entering mailboxes. Advocates of this
technique claim that network-level features vary less from spammer to spammer and
from campaign to campaign. They also argue that this approach allows
spam to be filtered closer to the source and, as a consequence, prevents network resources
from being wasted. However, this approach is limited by the type of data
available (tcpdump logs, BGP routing information, etc). In [87], the authors deploy TCP
signatures of identified spamming hosts in routers. They claim that good signatures
exhibit low false-positive and low false-negative rates. However, they also admit that this
kind of signature can currently only complement other spam detection techniques. In
[36], the authors suggest that transport-level features like the RTT, the time between
each SMTP packet in a flow and the TCP flow termination process can be leveraged in
order to differentiate traffic that carries spam from traffic carrying legitimate email. In
[248], the authors study some network-level characteristics of email traffic and attempt
to infer features that can help detect spam. They look at different features like the distribution across the IP space, across ASes and by country. They also look at the volume
of spam sent and the time during which spamming hosts are active. This paper makes
an important contribution to the field of spam analysis: they have indeed witnessed
a few spammers advertising short-lived hijacked BGP routes used to send spam in a
stealthy way. However, this paper is currently the only one that found some evidence
that spammers could take advantage of BGP-hijacking to send spam.
Some other network-level techniques leverage features of the SMTP protocol used between
mail servers and mail clients to detect connections from spamming botnets. This
kind of method is based on two assumptions. First, spam bots are assumed to implement a
customized version of the SMTP protocol. Second, these customized implementations are assumed
to exhibit little polymorphism compared to headers and message contents. By extracting
features at the level of SMTP communications with spam bots, it is possible to use them
to detect future instances of those spamming hosts, but also instances of spamming hosts
using the same spam engine. SecureWorks [9] and MessageLabs [267] both take advantage of that technique to detect spam originating from very large spamming botnets.
MessageLabs uses regular expression-based signatures provided by CBL [2] contributors
which describe the SMTP sessions of different spamming botnets.
Since these techniques take advantage of low-level characteristics of spam traffic, they
can be applied before messages are received by the mail server (pre-acceptance). The
works described in [87, 36] rely on spam data collected passively. However, in [263], the
authors hijack a botnet C&C server to discover the IP addresses of bots, and they correlate
them with passively observed spamming activities to study spam originating from the
botnet. As a result, both passive and active observation techniques can be used to
analyze network-level features of spammers.
2.4.5 Analysis of scam infrastructure
The scam hosting infrastructure refers to the Internet infrastructure used to host web
sites advertised in spam. By analyzing the characteristics of such an infrastructure, one
might be able to identify all spam messages advertising the same web sites or the same
product. Discovering the different advertised topics and products can help to study spam
campaigns and how they are carried out. It also helps to learn how spammers manage
their scam hosts (e.g., multiple spammers may share a common infrastructure). In [304],
the authors extract embedded URLs and retrieve the domains and IP addresses of hosting
servers. They observe that domains are associated with several IPs, probably to increase
the resilience of the infrastructure when certain servers are banned or blacklisted.
Conversely, IP addresses also match several domains.
Most hosting infrastructures seem to be widely distributed. IP addresses seem to be
rotated less frequently than domains are changed, which suggests
that the IP addresses of the advertised servers can be leveraged to analyze spam.
In [142], the authors study the web scam infrastructure related to the spam they receive.
They find that, while spammers employ sophisticated methods to generate polymorphic
spam content, advertised web content is more static. They further cluster spam messages
they receive based on the IP addresses of the advertised web domains. They also
find that many spam campaigns share common scam infrastructures making it difficult
to characterize individual botnets from that kind of data.
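One simple way to group messages by the hosting IP addresses of their advertised domains is to treat any shared IP address as a link and take connected components. The sketch below illustrates that idea only; it is not necessarily the clustering procedure used in [142], and the function name, the input format (message identifier mapped to a set of already-resolved IP addresses) and the sample data in the usage note are hypothetical.

```python
def cluster_by_hosting_ip(message_ips):
    """Group messages that share, directly or transitively, a hosting IP.

    message_ips maps a message identifier to the set of IP addresses
    hosting the web sites advertised in that message.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        parent[find(a)] = find(b)

    for msg, ips in message_ips.items():
        find(("msg", msg))  # register messages that have no IP at all
        for ip in ips:
            union(("msg", msg), ("ip", ip))

    clusters = {}
    for msg in message_ips:
        clusters.setdefault(find(("msg", msg)), set()).add(msg)
    return sorted(sorted(members) for members in clusters.values())
```

With the input `{"m1": {"198.51.100.1"}, "m2": {"198.51.100.1", "198.51.100.2"}, "m3": {"198.51.100.2"}, "m4": {"203.0.113.9"}}`, messages m1, m2 and m3 end up in one cluster and m4 in another. The transitive linking also shows why shared scam infrastructure makes it hard to separate individual botnets: a single common server merges otherwise unrelated campaigns.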
In [19], Anderson et al. characterize scam infrastructure and use data related to
scam to better understand the dynamics and business pressures exerted on spammers.
They designed an opportunistic measurement technique called spamscatter that mines
emails in real-time, follows the embedded link structure, and automatically clusters the
destination Web sites using image shingling to capture graphical similarity between
rendered sites.
Another work by Kanich et al. [147] analyzes the conversion rate of spam, i.e., the
probability that a spam message will ultimately elicit a sale. To that end, they infiltrate
a spamming botnet and swap the malicious advertised web pages with innocuous web
pages under their control. This sheds light on the real benefits spammers get from
sending spam. The results show that running a spamming botnet is costly and that
sometimes, the spammers and the advertisers may be the same. This work highlights
the importance of the scam hosting infrastructure in conveying spammers’ message that
incites users to buy products.
Finally, in [66] Cova et al. have conducted a large-scale analysis of rogue AV campaigns
and have studied the distribution infrastructure (i.e., the rogue AV websites) used for
such campaigns. Rogue AV software is a type of misleading application that pretends to
be legitimate security software, such as an anti-virus scanner, but which actually provides
the user with little or no protection. Much like spam campaigns, rogue AVs
typically find their way into victim machines by relying on social engineering techniques
to convince inexperienced users that a rogue tool is legitimate and that its use is necessary
to protect their computer. It is worth noting that [66] is one of the very first works to have
demonstrated the usefulness of attack attribution approaches to the problem of mining
large security datasets. By using multi-criteria decision analysis techniques (MCDA),
the authors were able to discover specific campaigns likely to be associated with the action
of a specific individual or group. Prior to this work, a preliminary, high-level overview
of some of the results obtained with the very same attribution method was presented in
the Symantec Report on Rogue Security Software [97].
Studying scam hosting infrastructures often involves extracting URLs from spam
messages. Most techniques used for this purpose are thus post-acceptance and passive
techniques.
2.4.6 Analysis of higher-level behaviour of spammers
The majority of studies on spam detection and mitigation techniques concentrate on
the core-business activity of spammers, that is, sending spam. However, spammers
must also perform many other activities before being able to flood users’ mailboxes
with spam. For example, [243] describes how spammers can find email addresses to use as
new targets. Moreover, as most of today’s spam comes from botnets, spammers have
to manage these large networks to ensure that they can work properly without being
detected. For instance, some bots may send spam to recruit new members to make the
botnet grow, whereas others may be responsible for relaying spammers’ orders to other
bots [147, 168]. In fact, although these secondary activities are critical for spammers,
both the research community and commercial spam filter vendors pay little
attention to them. However, studying these activities can help us understand more about
spammers’ behaviours.
Another important task that spammers have to perform is called email harvesting.
This consists in collecting email addresses from websites or infected computers to further
use them as recipient addresses of spam emails. In [243], the authors describe the
Project Honey Pot [7] which studies email harvesting by setting up honeypots recording
any attempt to harvest email addresses on web sites by providing fake email addresses
associated with spamtraps. This way, they are able to associate spam senders with
email harvesters. They identify two classes of spammers: those sending spam only a few
hours after email addresses have been harvested and those sending spam a few weeks
after email addresses have been harvested. They also show that hosts harvesting email
addresses tend to be associated with static IP addresses and that they are less likely to
be blacklisted than spamming hosts. Finally, they find that, quite surprisingly, many
email harvesters can be fooled by means of simple email address obfuscation techniques.
Spam campaigns have also become an important research topic over the past few
years. Although the concept of “spam campaign” is not clearly defined in the research
community, a spam campaign is often considered as a group of spam messages advertising the same product, and likely due to the same spammer or spam organization.
Characteristics of spam campaigns can be uncovered by studying them using different
techniques [313, 230]. One such technique is the analysis of similarity in the content of
the spam messages, or of other specific features available in the message itself. For
instance, one can leverage URLs to detect spam campaigns [313, 230]. In [313], the
authors assume that spam campaigns are bursty and design a detection system based
on the automated generation of URL regular expression signatures. On the other hand,
another study of spam campaigns in [230], which uses URLs extracted from spam messages
collected at an open relay, states that campaigns can be long-lasting and are not
necessarily bursty. The authors also found that a bot may participate in different campaigns
while targeting different recipients. In [46], the authors define a spam campaign as a set of
spam messages advertising the same product and using similar obfuscation and dissemination
strategies. By leveraging frequent pattern trees and a set of extracted features
(i.e., the source and destination of the messages, the type of abuse and the content obfuscation
strategy), they can group spam messages into campaigns. In [333], the authors
take advantage of text shingling to identify nearly duplicate messages in order to cluster
them into campaigns. They find that half of the campaigns stay active for only a few
hours and that the amount of spam sent by a botnet primarily depends on its size.
Finally, in [167, 168], the authors study the way bots receive email address lists from
C&C servers, how bots build spam messages from given templates, how bots report spam
sending errors and the fact that spammers use email accounts to test their campaigns
against different filters.
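The text-shingling idea mentioned above for [333] can be sketched as follows. This is a generic illustration and not the exact procedure of that work: the shingle length `k`, the Jaccard similarity measure, the threshold value and the sample messages in the usage note are assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (contiguous word windows)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def near_duplicates(msg1, msg2, k=3, threshold=0.5):
    """Decide whether two messages are near-duplicates of each other."""
    return jaccard(shingles(msg1, k), shingles(msg2, k)) >= threshold
```

Two messages that differ only in their last word, such as "buy cheap pills online today at our store" and "buy cheap pills online today at our shop", share five of their seven distinct 3-word shingles and are treated as near-duplicates, whereas an unrelated message shares none. Messages linked in this way can then be grouped into candidate campaigns.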
All these activities are also part of the spamming process, and thus studying them
really helps to gain insights into the spam phenomenon as a whole.
2.5 Root Cause Analysis and Attack Attribution
2.5.1 Introduction
In the context of cyber-attacks, a fundamental aspect is how to address the problem
of attribution. Note that there is currently no universally agreed definition for “attack
attribution”. If one looks at the definition of the term attribution in a dictionary, one
will find something similar to: “explain by indicating a cause”7. However, most previous
works related to that field tend to use the term “attribution” as a synonym for traceback,
which consists in “determining the identity or location of an attacker or an attacker’s
intermediary” [307].
In the context of a cyber-attack, the obtained identity can refer to a person’s name,
an account, an alias, or similar information associated with a person, a computer or an
organisation. The location may include physical (geographic) location, or any virtual
address such as an IP address. In other words, IP traceback is a process that begins with
the defending computer and tries to recursively step backwards in the attack path toward
the attacker so as to identify her, and to subsequently enable appropriate protection
measures. The rationale for developing such attribution techniques lies in the
nature of the IP protocol, in which the source IP address is not authenticated and
can thus be easily falsified. For this reason, most existing approaches dealing with IP
traceback have been tailored toward (D)DoS attack detection, or possibly to some
specific cases of targeted attacks performed by a human attacker who uses stepping
stones or intermediaries in order to hide her true identity.
7 Definition given by Merriam-Webster. http://www.merriam-webster.com/dictionary/attribute
In this project, we will refer to “attack attribution” as something quite different from
what is described above, both in terms of techniques and objectives. Although
tracing back to an ordinary, isolated hacker is an important issue, we are primarily
concerned with larger-scale attacks that could be mounted by criminal organizations,
dissident groups, rogue corporations, and profit-oriented underground organizations.
Consequently, we are rather looking at analysis methods that can help security analysts
to determine the root cause of global attack phenomena (which usually involve a
large number of sources or events), and to easily derive their modus operandi. These
attack phenomena can be observed through many different means (e.g., honeypots, IDSs,
sandboxes, web crawlers, malware collection systems, spamtraps, etc). Typical examples
of phenomena that we may want to identify and study range from malware families that
propagate via code injection attacks [188] and botnets controlled by underground groups and
targeting machines in the IP space [286, 71], to spam campaigns, or even to certain client-side
threats such as rogue software campaigns run by the same organization, which aims
at deploying numerous malicious websites (or compromising legitimate ones) in order to
host and sell rogue software [66].
Attack phenomena are often widely distributed across the Internet, and their lifetime can
vary from a few days to several months. They typically involve a considerable number
of features that sometimes interact in non-obvious ways, which makes them inherently
complex to identify. That is, due to their changing nature, the attribution of distinct
events having the same root phenomenon can be a challenging task, since several attack
features may evolve over time. As noted by Richard Bejtlich on his TaoSecurity blog:
“Attribution means identifying the threat, meaning the party perpetrating the attack.
Attribution is not just malware analysis. There are multiple factors that can be evaluated to try to attribute an attack. [...]” [29, 28]. Bejtlich suggests that those factors
are very diverse and should include, e.g., the timing of the attack, information on the
targets, delivery mechanism, vulnerability or exposure, propagation method, command
and control mechanisms, and several other contextual features.
Finally, Tim Bass suggested in [25] that “Next-generation cyberspace intrusion detection (ID) systems will require the fusion of data from myriad heterogeneous distributed
network sensors to effectively create cyberspace situational awareness [...] Multisensor
data fusion is a multifaceted engineering approach requiring the integration of numerous diverse disciplines such as statistics, artificial intelligence, signal processing, pattern
recognition, cognitive theory, detection theory, and decision theory. The art and science
of data fusion is directly applicable in cyberspace for intrusion and attack detection”.
Hence, it is not surprising to observe that emerging methods in the field of attack attribution are at the crossroads of several research domains, which we can try to categorize
as follows:
i) investigative and security data mining, i.e., knowledge discovery and data mining
(KDD) techniques that are specifically tailored to problems related to computer
security or intelligence analysis;
ii) problems related to multi-criteria decision analysis (MCDA), and multisensor data
fusion;
iii) general techniques for malicious traffic analysis on the Internet, with an emphasis
on methods that aim to improve the “cyber situational awareness” (Cyber-SA).
In the next paragraphs, we give an overview of some key contributions in each research
area.
2.5.2 Investigative and Security Data Mining
In the last ten years, considerable efforts have been devoted to applying data mining
techniques to problems related to computer security. However, a great deal of those
efforts has been exclusively focused on the improvement of intrusion detection systems
(IDS) via data mining techniques, rather than on the discovery of new fundamental
insights into the nature of attacks or their underlying root causes [144]. Furthermore,
only a subset of common data mining techniques (e.g., association rules, frequent episode
rules or classification algorithms) have been applied to intrusion detection, either on
raw network data (such as ADAM [23], MADAM ID [184, 185] and MINDS [84]), or
on intrusion alert streams [76, 146]. A comprehensive survey of Data Mining (DM)
techniques applied to Intrusion Detection (ID) can be found in [24, 40].
We note that most of these previous approaches aim at improving alert classification
or intrusion detection capabilities, or at constructing better detection models thanks
to the automatic generation of new rules (e.g., using some inductive rule generation
mechanism). Only recently, namely in the context of the WOMBAT Project [71],
some emerging work has been done regarding the application of novel data mining approaches to different security data sets, and with different purposes. More precisely,
in [284, 283, 285, 282, 287], the authors have developed a graph-based, unsupervised
data mining technique to discover unknown attack patterns performed by groups or
communities of attackers by mining data sets containing only malicious activities. The
final objective is not to generate new detection signatures to protect a single network,
but rather to understand the root causes of large-scale attack phenomena,
and get insights into their long-term behavior, i.e.: how long do they stay active, what
is their average size, their spatial distribution, and how do they evolve over time with
respect to their origins, or the type of activities performed.
During the VIS-SENSE project, we want to pursue these efforts by further developing
and enhancing those graph-based clustering techniques. For example, the clustering
techniques employed in [282, 287] to create some sort of viewpoints for each attack
feature separately, rely on graph-based techniques that require a full similarity matrix
(n × n) as input. Quite obviously, such techniques do not scale very well with the size of
data sets. Hence, we could investigate the possible application of other, more scalable
clustering techniques, such as BIRCH [327] or BUBBLE [98], which are able to find
clusters in very large databases in a single pass. However, for these techniques to be
applicable to honeypot data (and the like), we still need to research and find the most
appropriate representation for attack features extracted from such data sets.
Crime Data Mining
There are also many similarities between the tasks performed by analysts in computer security and in crime investigations or in law-enforcement domains. As a result, several researchers have studied the potential of data mining techniques to assist law-enforcement
professionals. In [205], McCue provides real-world examples showing how data mining
has identified crime trends and helped crime investigators in refining their analysis and
decisions. Prior to that work, Jesus Mena described and illustrated the usefulness
of data mining as an investigative tool by showing how link analysis, text mining,
neural networks and other machine learning techniques can be applied to security and
crime detection [208]. More recently, Westphal provides additional examples of real-world
applications in the field of crime data mining, such as border protection, money
laundering, financial crimes or fraud analytics, and elaborates also on the advantages of
using information-sharing protocols and systems in combination with those analytical
methods [305].
We observe, however, that most previous work in the crime data mining field has
primarily focused on “off-the-shelf” software implementing traditional data mining techniques (such as clustering, classification based on neural networks and Kohonen maps,
or link analysis). Although very useful, those techniques are generally not very appropriate for modeling complex behaviors for the kind of attack phenomena that we want
to identify on the Internet.
2.5.3 Attack Attribution based on Multi-criteria Decision Analysis
As mentioned above, a new approach towards attack attribution in cyberspace has recently
been proposed by Thonnard et al. [282, 287]. The idea is to combine
multi-criteria decision analysis (MCDA [30, 290]) with clustering techniques, in order to
identify groups of security events that are likely due to the same root cause (i.e., the
same underlying phenomenon). This method can be applied to a broad range of security
data sets, such as intrusion detection alerts, honeypot events, malware samples, rogue
AV domains, spam messages, and more. Examples of real-world applications include
the analysis of Rogue AV campaigns likely run by the same group of people [97, 66],
the analysis of honeypot attacks [285, 71], or potentially also the analysis of large spam
campaigns run by gangs of spammers.
Despite their great flexibility in combining features or evidence, we note that rather
few previous works have used MCDA approaches in order to address security-related
problems. Still, in [53] the authors consider the problem of discovering anomalies in a
large-scale network based on the data fusion of heterogeneous monitors. The authors
evaluate the usability of two different approaches for multisensor data fusion: one based
on the Dempster-Shafer Theory of Evidence and one based on Principal Component
Analysis. The Dempster-Shafer theory is a mathematical theory of evidence based on
belief functions and plausible reasoning [257]. It allows one to combine evidence from
different sources and to obtain a certain degree of belief (represented by a belief function)
that takes into account all the available evidence. It can be seen as a generalization of
Bayesian inference where probability distributions are replaced by belief functions. When
used as a method for sensor fusion, different degrees of belief are combined using Dempster’s rule, which can be viewed as a generalization of the special case of Bayes’ theorem
where events are independent. In our attribution method, we prefer using aggregation
functions as described previously, for the greater flexibility they offer in defining how we
want to model interactions among criteria (e.g., a positive or negative synergy between
a pair of criteria). Moreover, in Dempster-Shafer theory all criteria are considered independent of each other, which is usually not the case with features used in attack attribution.
Interestingly, it has been shown that there is a direct connection between fuzzy measures used in MCDA and belief or plausibility functions used in Dempster-Shafer theory
([111, 302]).
During VIS-SENSE, we will thus further investigate the MCDA techniques that are
best suited for attack attribution purposes. More precisely, the aggregation of attack
features performed in [282, 287] deals mainly with a limited set of aggregation functions,
such as the Ordered Weighted Average (OWA) operator [317], the Weighted OWA and
the Choquet integral [30, 290]. However, more real-world experiments need to be carried
out to fine-tune the integration of these aggregation techniques into an attack attribution
framework, in particular with respect to the determination of appropriate weighting
vectors and fuzzy measures. Furthermore, we need to define an aggregation function
that is able to model a decision scheme matching as closely as possible the phenomena
under study. In many cases, the aggregation process can be modelled using a sort of
averaging function, like a simple weighted mean or an OWA-based operator. However,
one could prefer to use another form of conjunctive or disjunctive function (such as t-norms and t-conorms), or mixed functions (such as uninorms and nullnorms) to model
the aggregation of criteria in more complex scenarios.
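To make the difference between these aggregation functions concrete, the following minimal sketch implements a weighted mean, an OWA operator and a discrete Choquet integral. It is an illustration only, not the implementation used in [282, 287]; the scores, weights and fuzzy measure are hypothetical values chosen for the example.

```python
def weighted_mean(scores, weights):
    """Weighted mean: each weight belongs to a fixed criterion."""
    return sum(w * s for w, s in zip(weights, scores))


def owa(scores, weights):
    """Ordered Weighted Average: each weight belongs to a rank.

    Scores are sorted in descending order before weighting, so weights[0]
    applies to the highest score regardless of which criterion produced it.
    """
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def choquet(scores, measure):
    """Discrete Choquet integral with respect to a fuzzy measure.

    `measure` maps every non-empty frozenset of criterion indices to a value
    in [0, 1]; it is assumed to be monotone with measure(all criteria) == 1.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # ascending
    total, previous = 0.0, 0.0
    for position, index in enumerate(order):
        # Coalition of all criteria whose score is at least scores[index].
        coalition = frozenset(order[position:])
        total += (scores[index] - previous) * measure[coalition]
        previous = scores[index]
    return total


# Hypothetical similarity scores of two events on two criteria.
scores = [0.8, 0.4]

# Hypothetical fuzzy measure with positive synergy: each criterion alone
# counts little, both together count fully (0.3 + 0.3 < 1.0).
synergy = {frozenset({0}): 0.3, frozenset({1}): 0.3, frozenset({0, 1}): 1.0}

print(weighted_mean(scores, [0.5, 0.5]))
print(owa(scores, [0.0, 1.0]))  # all weight on the lowest score (acts like min)
print(choquet(scores, synergy))
```

With an additive measure (0.5 for each criterion alone) the Choquet integral reduces to the weighted mean; the synergy measure pulls the result towards the lower score, which is how an interaction between criteria enters the aggregation.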
Finally, it is worth noting that MCDA has been ranked in the top 5 intelligence
analysis methods by K. Wheaton, assistant professor of intelligence studies at Mercyhurst
College [306].
2.5.4 Malicious Traffic Analysis and Cyber-SA
This research will also build on prior work in malicious traffic analysis, a field with a substantial literature. For example, in [321], Yegneswaran et al. have
studied the global characteristics and prevalence of Internet intrusions by systematically
analyzing a set of firewall logs (from DShield) collected from a wide perspective (over
four months of data collected from many different networks worldwide). Their study is
a general analysis that focused on the issues of volume, distribution (e.g., spatial and
temporal), categorization and prevalence of intrusions. Then, in [229] Pang et al. characterize the incessant non-productive network traffic (which they term Internet background
radiation) that can be monitored on unused IP subnets when deploying network telescopes or more active responders such as honeypots. They analyzed temporal patterns
and correlated activity within this unsolicited traffic, and they found that probes from
worms heavily dominate. More recently, similar research has been conducted by Chen
et al. in [56]. While all these previous works provide meaningful results and have
contributed much to advancing malicious traffic analysis, the traffic correlation and
analysis techniques used by these authors stay at a fairly basic level. Indeed, they basically break down the components of background radiation by protocol, by application
and sometimes by specific exploit, and then apply some statistics across each component.
In [283], Dacier et al. developed a more elaborate clique-based clustering method to
extract groups of correlated attack clusters from a large honeynet dataset. In [284, 285]
the same authors explored two different approaches to combine attack knowledge obtained through these means. Then, they also presented in [234, 235] different signal
processing techniques that can be used to extract, systematically, so-called attack events
from a large set of honeynet traces. More recently, Leita et al. offer in [187] an empirical
study of an extensive data set collected by the SGNET honeypot deployment. In
particular, they show the value of combining clustering techniques based on static and
behavioral characteristics of the malware samples, and show how this combination helps
in detecting clustering anomalies but also in underlining relationships among different
code variants. Finally, they highlight the importance of using contextual information related to malware propagation in order to get a better understanding of the malware threat
ecosystem.
It would be incomplete to discuss attack attribution without mentioning some active
research carried out in Cyber Situational Awareness (or Cyber-SA). We acknowledge
the seminal work of Yegneswaran and colleagues in this field, such as in [322] where
they explore ways to integrate honeypot data into daily network security monitoring,
with the purpose of effectively classifying and summarizing the data to provide ongoing
situational awareness on Internet threats. However, their approach aims at providing
tactical information, usable for the day to day operations, whereas Dacier et al. are
interested in strategic information that reveals long-term trends and the modus operandi
of the attackers. Closer to their research, Li et al. have described in [192] a framework
for automating the analysis of large-scale botnet probing events and worm outbreaks
using different statistical techniques applied to aggregated traffic flows. They also design
schemes to extrapolate the global properties of the observed scanning events (e.g., total
population and target scope) as inferred from the limited local view of a honeynet.
A first compilation of scientific approaches for Cyber-SA has also recently been
published in [140], in which a multidisciplinary group of leading researchers (from cybersecurity, cognitive science, and decision science areas) tries to establish the state of the
art in cyber situational awareness and to set the course for future research. The goal of
this pioneering book is to explore ways to elevate the situation awareness in the Cyber
domain.
Finally, another interesting project is Cyber-Threat Analytics (Cyber-TA), founded
by SRI International [70]. Cyber-TA is an initiative that gathers several reputed security researchers. It aims at accelerating the ability of organizations to defend against
Internet-scale threats by delivering technology that will enable the next-generation of
privacy-preserving digital threat analysis centers. According to Cyber-TA, these analysis centers must be fully automatic, scalable to alert volumes and data sources that
characterize attack phenomena across millions of IP addresses, and give higher fidelity
in their ability to recognize attack commonalities, prioritize, and isolate the most critical
threats. However, very little information is available at [70] on which scientific techniques
could enable organizations to achieve such goals or to elevate their cyber-situational
awareness.
3.1 Introduction
As described earlier, visual analytics is the combination of automatic analysis methods
and visual approaches to investigate huge datasets. After introducing the automatic
algorithmic methods in network analysis in the second chapter, the focus of Chapter 3
will be the visual approaches, which emphasize interactive visualizations with the human
in the loop.
There are many fields where visual analysis has successfully improved automatic analysis or has even been able to provide new insights into complex relationships, occurring
patterns or anomalies which were not known before. To get a better understanding of
the techniques used in visualization applications, the following sections provide a brief
overview of the most common visualization and interaction techniques.
3.1.1 Visualization Techniques
When we talk of datasets, we mean datasets of no particular format. In the case of
data tables, we call the columns attributes. The term attributes can be applied to
datasets of other paradigms. In this case an attribute is any clearly defined property
of the dataset. Additional attributes can be derived during the analysis process. These
generated attributes either summarize other attributes or are the result of automated
processing. The individual values of each attribute are known as data items.
A visualization is simply a mapping of sets of data items onto marks. In the case of
interactive computer visualizations, marks are connected groups of pixels, together with
their color specifications. Marks have a dimension (point, line, area), a shape, a color,
a size and a texture; all of which can be used to represent different attributes. A very
detailed discussion of marks can be found in the monograph Sémiologie Graphique by
the French cartographer Jacques Bertin [34]. The set of marks displayed on a computer
screen at any moment is the current view of the visualization. Besides the common
visualization techniques there are some more specialized ones for particular data types.
In the following we will introduce and briefly describe common visualization techniques. We focus only on those visualizations that are commonly used in most network
security prototypes and tools as discussed in detail later (Section 3.2). Most of the implementations use simple timeline or graph visualizations to represent data. Sometimes
a 3D display visualizes the data in a 3D space to gain a further axis, but runs the risk
of losing the overview or having some overlap among data points. Pixel visualizations
try to represent as many data objects as possible on the screen by mapping each data
value to a pixel and arranging the pixels adequately [151]. The best-known representations of data are tables and different kinds of charts. They are easy to create and to
understand, but it is difficult to see relations or correlations between massive amounts
of multi-dimensional data. To get a better understanding of interrelations between certain parameters, parallel coordinates [137] and scatterplots might be the right choice.
So-called glyphs [303] can be used to map different attributes on a single data representation. Glyphs change their appearance depending on the characteristics of certain
parameters. They can be combined with different kinds of layout algorithms to create
visual patterns. Matrices are a way to arrange data points in a two dimensional way. For
hierarchical datasets a treemap visualization [143] is a good choice. It arranges the data
in nested rectangles representing the hierarchical structure. The area of each rectangle is
mapped to the value of a specific attribute (e.g., the number of attackers or number of
IP addresses). To easily compare different aspects of a dataset, small multiples arrange
several instances of the same kind of visualization technique next to each other to make
the differences between the representations visibly salient.
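As an illustration of how a treemap maps attribute values to rectangle areas, the following sketch implements the simple slice-and-dice layout, which alternates the split direction at each hierarchy level. This is only one possible layout and not necessarily the one used by the tools discussed later; the tree, its field names and the attack counts are hypothetical, and all values are assumed to be positive.

```python
def total(node):
    """Sum of the leaf values below a node."""
    if "children" in node:
        return sum(total(child) for child in node["children"])
    return node["value"]


def slice_and_dice(node, x, y, width, height, depth=0):
    """Return (name, x, y, width, height) rectangles for all leaves.

    The rectangle of a node is split among its children in proportion to
    their totals, horizontally on even levels and vertically on odd levels.
    """
    if "children" not in node:
        return [(node["name"], x, y, width, height)]
    rectangles = []
    node_total = total(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in node["children"]:
        share = total(child) / node_total
        if depth % 2 == 0:
            rectangles += slice_and_dice(
                child, x + offset * width, y, share * width, height, depth + 1)
        else:
            rectangles += slice_and_dice(
                child, x, y + offset * height, width, share * height, depth + 1)
        offset += share
    return rectangles


# Hypothetical hierarchy: subnets containing hosts, sized by attack count.
tree = {"name": "network", "children": [
    {"name": "10.0.0.0/24", "children": [
        {"name": "10.0.0.1", "value": 30},
        {"name": "10.0.0.2", "value": 10},
    ]},
    {"name": "10.0.1.0/24", "children": [
        {"name": "10.0.1.1", "value": 60},
    ]},
]}

for rectangle in slice_and_dice(tree, 0.0, 0.0, 100.0, 100.0):
    print(rectangle)
```

Each leaf rectangle has an area proportional to its value, and the rectangles of hosts in the same subnet are adjacent, so the nesting reflects the hierarchy.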
Of course there are many more ways to display or arrange different types of datasets,
but the above mentioned ones are most commonly used in the field of network security
and thus the most important ones to know. While some visualizations are static, modern
computer-based visualizations are highly interactive and provide the analyst with a
number of ways to modify the current view. The most common basic and some more
advanced interactive techniques will be introduced in the following subsections.
3.1.2 Basic Interaction Techniques
The visualizations provide the analyst with a variety of ways to interact with the marks of
a view and thus with the data itself. The most basic forms of interaction are filtering,
zooming, panning and brushing.
Filtering restricts the marks in the view to the data subset that fulfills the chosen
criteria. Filters usually take the form of drop-down lists, sliders and check boxes, but
can also involve complex graphic or textual query formulation.
In its most basic form, zooming means an increase in the size of marks in the current
view. Knowledge of the dataset may be used to add and remove information at different
zoom levels. This technique is known as level-of-detail (LOD) zooming. A good example
of LOD zooming is an interactive map: at the highest level only continents are shown in
the view; after zooming in, the boundaries appear; zooming in further causes cities and
important roads to appear. Closely related to zooming is panning. When the current
view is enlarged it may not fit into the available screen space. In this case, panning
becomes necessary to see all of the marks. Thus, while zooming modifies the granularity
of the information displayed in the view, panning simply involves shifting the view to
see different parts of it. Panning has no effect on view granularity.
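The distinction between zooming and panning can be sketched with a small viewport model. The layer names follow the map example above; the zoom thresholds and the class itself are assumptions made for this illustration.

```python
from dataclasses import dataclass

# Hypothetical level-of-detail table: minimum zoom factor at which a layer
# of the interactive map becomes visible.
LAYERS = [
    (1.0, "continents"),
    (2.0, "boundaries"),
    (4.0, "cities and important roads"),
]


@dataclass
class Viewport:
    scale: float = 1.0     # zoom factor
    offset_x: float = 0.0  # pan offset in data units
    offset_y: float = 0.0

    def zoom(self, factor):
        """Zooming changes the scale and therefore the level of detail."""
        self.scale *= factor

    def pan(self, dx, dy):
        """Panning only shifts the view; the scale stays untouched."""
        self.offset_x += dx
        self.offset_y += dy

    def to_screen(self, x, y):
        """Map a data position to a screen position."""
        return ((x - self.offset_x) * self.scale,
                (y - self.offset_y) * self.scale)

    def visible_layers(self):
        """LOD zooming: show every layer whose threshold has been reached."""
        return [name for min_scale, name in LAYERS if self.scale >= min_scale]


view = Viewport()
print(view.visible_layers())
view.zoom(2.0)
view.pan(10.0, 5.0)
print(view.visible_layers(), view.to_screen(10.0, 5.0))
```

Only `zoom` changes the result of `visible_layers`, which mirrors the statement that panning has no effect on view granularity.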
The final basic interaction is brushing. To see how a particular mark (or group of
marks) changes as the view changes, it can be brushed or highlighted to make visual
tracing of the mark easier. In many cases, more than one view (often even more than one
visualization) is used to see different parts or aspects of the same dataset. To enable
the visual tracking of data items across views and visualizations, these can be linked.
Items brushed in one view are then brushed in all other views as well. This technique is
frequently referred to as brushing and linking.
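Brushing and linking is commonly realized with a selection that is shared between views. The sketch below shows this idea in its simplest form; the class names and the item identifiers are hypothetical and do not refer to any particular toolkit.

```python
class View:
    """A view that displays a subset of the data items."""

    def __init__(self, name, item_ids):
        self.name = name
        self.item_ids = set(item_ids)
        self.highlighted = set()

    def on_brush(self, brushed):
        # Highlight only those brushed items that this view displays.
        self.highlighted = self.item_ids & brushed


class SharedSelection:
    """Set of brushed data items that is propagated to all linked views."""

    def __init__(self):
        self.items = set()
        self._views = []

    def link(self, view):
        self._views.append(view)

    def brush(self, item_ids):
        self.items = set(item_ids)
        for view in self._views:
            view.on_brush(self.items)


# Two hypothetical views on the same set of flow records.
timeline = View("timeline", ["flow1", "flow2", "flow3"])
scatterplot = View("scatterplot", ["flow2", "flow3", "flow4"])

selection = SharedSelection()
selection.link(timeline)
selection.link(scatterplot)

# Brushing in one view highlights the same items in every linked view.
selection.brush(["flow2", "flow4"])
print(timeline.highlighted, scatterplot.highlighted)
```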
3.1.3 Advanced Interaction Techniques
The basic interactions are complemented by interactions involving some form of automated data processing. Most of the techniques used in visual analytics are borrowed
from the fields of statistical analysis and information retrieval. We will discuss three
categories of interaction: sorting, searching and aggregation.
Sorting (or ordering) can be applied to the marks in a visualization (e.g., the bars in a
categorical bar chart) based on a displayed or non-visible attribute. Some visualizations
consist of a number of views displayed in a list or matrix format. In this case, the views
themselves can be sorted based on a chosen attribute.
Aggregation also involves data processing. It is related to zooming, except that aggregation is applied to the data items themselves and not to the marks in a view. The
most basic forms of aggregation involve basic operations, such as averaging, summing
(also known as rolling up) and linear regression. More advanced aggregation techniques
include clustering. Clustering uses statistical techniques and artificial intelligence to partition datasets into meaningful subsets. Some representation of the subsets themselves
can then be used as an aggregated view of the data. There are numerous clustering algorithms, each applicable to different datasets and problems. A review of these algorithms
is beyond the scope of this document. The appropriate use of clustering algorithms usually requires the adjustment of certain parameters; clustering interactions therefore involve
the entry of information in a dialog box of some sort.
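The basic aggregation operations named above (summing and averaging) can be sketched as a roll-up of data items along one attribute. The record layout and the attribute names `dst_port` and `bytes` are assumptions made for this example.

```python
from collections import defaultdict


def roll_up(records, key, value):
    """Aggregate the `value` attribute of all records sharing the same `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[value])
    return {
        group: {
            "count": len(values),
            "sum": sum(values),
            "mean": sum(values) / len(values),
        }
        for group, values in groups.items()
    }


# Hypothetical flow records.
flows = [
    {"dst_port": 80, "bytes": 1200},
    {"dst_port": 80, "bytes": 800},
    {"dst_port": 22, "bytes": 300},
]

print(roll_up(flows, key="dst_port", value="bytes"))
```

The aggregation works on the data items themselves; a view would then draw one mark per group instead of one mark per flow.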
When conducting an exploratory analysis of a large dataset an analyst may wish to
single out a very particular subset of data. Providing some facility for searching the data
set makes finding specific data items much easier. In some cases, preprocessing may be
necessary to enable fast searching. The input for a search query may be textual or take
the form of a selection of marks in a view. This graphical search mode is also known as
a similarity search, since the search query is for data items similar to those represented by the
selected marks.
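A similarity search can be sketched as a nearest-neighbor query around the selected items. The following example assumes that every data item is described by a numeric feature vector and that the centroid of the selection serves as the query; both the Euclidean distance and the centroid are illustrative choices, not a prescribed method.

```python
import math


def similarity_search(items, selected_ids, k=3):
    """Return the ids of the k items closest to the selected items.

    `items` maps an item id to a numeric feature vector.
    """
    selected = [items[item_id] for item_id in selected_ids]
    dimensions = len(selected[0])
    # The query point is the centroid of the selected items.
    centroid = [
        sum(vector[d] for vector in selected) / len(selected)
        for d in range(dimensions)
    ]
    candidates = [
        (math.dist(vector, centroid), item_id)
        for item_id, vector in items.items()
        if item_id not in selected_ids
    ]
    return [item_id for _, item_id in sorted(candidates)[:k]]


# Hypothetical hosts described by (normalized packet rate, normalized port count).
hosts = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (10.0, 10.0),
    "d": (0.5, 0.2),
}

print(similarity_search(hosts, {"a"}, k=2))
```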
3.1.4 The Results of an Analysis
When an analyst has used a visualization to find something interesting or significant
it may be necessary to access the raw data represented by a mark. The ability to
interactively bring up a detailed display or open a document from the visualization is
known as details on demand.
If the analysis leads to new discoveries or a better understanding of the given problem
then the analyst will probably want to make a note of the discovery. Recording such
data in an orderly fashion makes it available to other analysts and for future reference.
Storing data in this way is known as a feedback loop.
3.2 Tools for Generic Data Visualizations
To get a first impression of a given data set it is possible to use free software tools, which
try to visualize the data in an easy and insightful way. The idea is to quickly display
different kinds of datasets without the need to have special programming skills.
Graphviz [82], for example, is a software package for viewing and manipulating abstract graphs, such as
those found in the database domain or in computer networks. The algorithms used in
Graphviz concentrate on static layouts. However, it is possible to choose between different kinds
of layouts, for example a hierarchical, a force-based or a radial one. Each layout
provides the user with certain attributes, which can be changed to improve the visual
representation of the graph. Besides changing the layout algorithm it is also possible to
modify the appearance of the nodes, to label nodes and edges or to change the color.
The software is available under an open source license and can be downloaded from the
tool’s website.
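Graphviz reads graphs in its plain-text DOT language, so a graph description can be generated with a few lines of code. The sketch below writes an undirected graph and selects the layout through the `layout` graph attribute (`dot` for a hierarchical, `neato` or `fdp` for a force-based and `twopi` for a radial layout). The helper function, the node names and the edge labels are hypothetical.

```python
def to_dot(edges, layout="dot", node_color="lightblue"):
    """Build a DOT description of an undirected graph.

    `edges` is a list of (source, target, label) tuples.
    """
    lines = [
        "graph network {",
        f"  layout={layout};",
        f'  node [style=filled, fillcolor="{node_color}"];',
    ]
    for source, target, label in edges:
        lines.append(f'  "{source}" -- "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical small network topology.
edges = [
    ("router1", "router2", "10 Mbit/s"),
    ("router1", "host1", "1 Mbit/s"),
]

print(to_dot(edges, layout="twopi"))
```

Saved to a file such as `network.dot`, the output can be rendered with the Graphviz command-line tools, for example `dot -Tpng network.dot -o network.png`.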
Another graph visualization tool is Gephi [26]. It focuses on graph data and therefore
provides techniques for filtering, navigating, manipulating and even clustering.
The open source software is available on the homepage and can display large networks
with about 20,000 nodes. Every node can be individually designed with textures, photos,
etc. Other attributes, such as layout algorithms, size adjustments or node repulsion, can be configured in real time. With a little programming skill the tool can be extended
with filters or other kinds of algorithms. A big plus is the dynamic module which allows
the user to send data to the visualization while running. This means the results are
immediately visible in the graph so changes in the network structure can be examined.
To facilitate data analysis on large volumes of data, the software Knime [33] offers
easy access to these tasks with the ability to dig deeper into the material and to program
individual data mining algorithms. The idea is to provide the user with a visual pipeline
to which different modules can be added to preprocess, analyze and visualize the data. Each
module (or, visually speaking, node) processes the arriving data and produces results on
its output. Typical tasks are filtering or merging, statistical functions like mean
calculation, or more intensive algorithms like clustering. Some nodes additionally produce
a view as output, which can be displayed in a separate window. These views
range from simple tables to more complex ones like scatterplots, histograms or parallel
coordinates. The software can be downloaded free of charge from its website.
Nearly the same idea is realized by RapidMiner [210]. The software also offers a
graphical user interface to design the analytical process with the possibility to define
certain display modules. An advantage is the use of the standardized XML format to
exchange information in the pipeline itself. Because of the well-known format the pipeline
can easily be extended with external tools or algorithms. Additionally, RapidMiner
integrates Weka [14] with its machine learning algorithms to perform data preprocessing,
clustering, regression or other data mining tasks. While Weka is an open source tool
that can be used as a stand-alone version, RapidMiner is available for free in
an unsupported light community version and in a supported commercial version.
ManyEyes [298] is a free-to-use web tool. Datasets can be uploaded and stored on the
server, or an already existing dataset can be used. After choosing the data, the
user can decide which visualization technique to use. The different kinds of
visualization techniques are categorized by their main purpose, such as analyzing a text or
showing relations. The visualization is only created if the representation supports the
uploaded dataset, so the user has to choose the representation type wisely. Unfortunately,
it is not possible to automatically analyze the data.
Another visualization tool without any automatic analysis functions is Gnuplot [4].
The tool has to be used via the command line and is therefore not as easy to use as
other tools. The original idea was to plot mathematical functions, but the functionality
was extended to visualize nearly any kind of data in many different ways, for example
heatmaps, vector fields or datastrings. The InfoVis toolkit [90] provides the user with
almost the same functionality, but offers a graphical user interface for the data import.
Additionally the toolkit can be extended with some programming skills.
For statistical and mathematical computing, the R project [161] offers software to perform these calculations and visualize the results via the command line.
The tool supports different kinds of visualization techniques like scatterplots, bar charts
or parallel coordinates. To facilitate access to the tool, third-party software
can provide a graphical user interface for it. One example is
RStudio [5], which is free to use.
Of course there is a tradeoff between how easy a software tool is to use, how well
it visualizes an individual dataset and how well certain parameters can be
adjusted. In the case of network security it is not sufficient to use only these
common visualization tools, because the tasks of a network analyst are very specific and
must be supported with powerful, feature-rich and preferably highly interactive software solutions.
That is why the following sections introduce different specialized tools for the different
types of network data and tasks. The results of this survey are also summarized in
Table 3.1 for a better overview and comparison.
[Table 3.1: the per-tool marks (x) are not recoverable from the transcription; only the labels are listed here.]
Row groups: BGP, Network Traffic, IDS Logs
Column groups: Visualization, Interaction, Analytics, Data, Online Available
Column labels: Timeline, Graph, 3D Display, Tables, Pixel Visualization, Charts, TreeMap, Parallel Coordinates, Scatterplot, Glyph, Geographic Map, Matrix, Small Multiples, Other (Tag Cloud etc.), Filtering, Ordering, Zoom, Linking & Brushing, Clustering, Feedback Loop, Ranking, Similarity Search, historical, (near) real-time, time-varying, BGP, IDS, Traffic
Tools: BGPlay [61], BGPlay++ [65], TAMP [310], BGPEye [277], VAST [224], Elisha [276], Link-Rank [177], BGPeep [258], Teoh:2004 [278], Flamingo [223], NetBytes viewer [269], FloVis [270], DNVS [226], Spinning Cube [181], InetVis [139], VIAssist [107], NVisionIP [180], NFlowVis [93], RUMINT [64], Krasser:05 [166], Pearlman:08 [232], Nfsight [32], Xiao:06 [312], Chen:07 [55], OverFlow [105], Mansmann:08 [200], PortVis [207], Existence Plots [141], Portall [92], Flowtag [182], VisFlowConnect [326], IDGraphs [250], Isis [236], TNV [106], Irwin:08 [138], NUANCE [237], Harrop:06 [116], IDS Rainstorm [15], Snort View [163], Idtk [165], Visual Firewall [183], Yelizarov:10 [323], IP Matrix [164], VisAlert [96], SnortSnarf [119], Avisa [259], SpiralView [35]
Table 3.1: Overview Table of Network Security Tools
3.3 Tools and Methods for BGP Data
Much research has been done in the area of visualizing traffic flows, while there are only
a few approaches for BGP-related data. However, BGP is a very vulnerable part of the
Internet infrastructure and could be the main target for criminals in the future.
A well-known tool to visualize routing information is BGPlay [61] in which an animated graph is used to visualize the autonomous systems (AS) and their connections
with each other (see Figure 3.1). To enable this, BGPlay uses the routing information
that is made available by the Route Views [293] system. BGPlay is capable of showing
changes over time in an animation and enabling the user to shift the time interval in any
direction to examine changes of routing events like “route withdrawal” or “new route”.
If an AS path does not change in a given time interval the connections between the vertices are dashed, combined in a set and merged in a tree. Each path and each tree has
its own color. The animation of routing events is only shown on solid paths starting at
the collector peer and ending in the target AS. There is an online version of the system
where the user can enter the IP prefix and the time interval to be analyzed.
Figure 3.1: The BGPlay tool [61].
The tool was improved one year later with an underlying topological map [65]. The
idea is to show the AS paths of the BGP and the levels of the AS hierarchy to see whether
some ASes have to “climb” to a higher hierarchy level to reach a place in the Internet, or whether
they can use a path on the same level.
Another animated graph visualization for BGP data is provided in TAMP [310]. It
displays a pruned graph for the network topology, an animated clock with controls to
show and manipulate the time of the current state of the graph and another plot to
present the events belonging to a selected edge. TAMP tracks the routing changes
expressed by the events to generate frames of TAMP pictures to form an animation.
BGPEye uses two different visualizations to satisfy the goal of tracking the health
of BGP activity [277]. The Internet-Centric View shows the activity among different
ASes with a graph. The Home-Centric View uses a panel display to visualize the prefix
status from a single border router perspective. With the different visualization techniques and the underlying Route Views data it is possible to watch real-time routing
activity like the moving average of the total number of BGP events or the deviation
from historical trends.
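The two indicators mentioned above, a moving average of the number of BGP events and the deviation from historical trends, can be sketched as follows. How BGPEye computes them is not described here, so the trailing window and the z-score used below are assumptions made for illustration, as are the event counts.

```python
import statistics


def moving_average(counts, window):
    """Trailing moving average over at most `window` most recent values."""
    averages = []
    for i in range(len(counts)):
        recent = counts[max(0, i - window + 1): i + 1]
        averages.append(sum(recent) / len(recent))
    return averages


def deviation_from_history(current, history):
    """Number of standard deviations `current` lies above the historical mean."""
    mean = statistics.mean(history)
    spread = statistics.pstdev(history)
    return (current - mean) / spread if spread else 0.0


# Hypothetical number of BGP events per time interval.
events_per_interval = [10, 20, 30, 40]
print(moving_average(events_per_interval, window=2))

# Hypothetical historical counts for the same time of day.
history = [10, 10, 14, 14]
print(deviation_from_history(20, history))
```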
Another tool showing the overall topology of the Internet as well as individual AS
behavior is VAST [224]. VAST uses a quadtree visualization for a single and a 3D OctoTree visualization for multiple ASes to display the topology and the BGP behavior. Color
coding, node and link size help to display further information. Interaction possibilities
like rotating, zooming or panning help to explore the 3D information space.
Furthermore, different filter techniques provide the possibility to focus on certain aspects
of the data. The tool mainly serves to visualize routing anomalies and sensitive points.
The tool Elisha also uses a quadtree visualization in combination with a pixel view
[276] as shown in Figure 3.2. All paths from the observation point AS to the origin AS of
the IP prefix are plotted. Three detail windows help the analyst to examine certain areas
of the quadtree in more detail. Additionally, the tool offers different kinds of perspectives
on the data, such as a rotatable 3D display or a fish-eye view. Further options
allow filtering or changes in the visualization itself like additional projection planes in
the 3D view, etc. To understand changes in the dataset the system uses animation over
time, which can be displayed like a video or with single steps. Color coding is used to
represent the time the path was used within the currently displayed time window. For
a closer look at the tool there is a free-to-use version downloadable from the Internet.
To visualize routing changes at a global scale the LinkRank tool was developed [177].
A graph layout in combination with an overview plot helps to focus while maintaining
scalability. The graphs only show changes triggered by BGP updates. Two connected
nodes gaining one or more links are shown in green, nodes losing one or more links are
displayed in red. The other links without any changes are invisible. An activity plot
as overview visualization shows all the changes as green and red bars in the context of
time. To get an impression it is possible to download the system from the Internet.
Figure 3.2: The Elisha tool [276].
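The coloring rule described for the LinkRank graphs (gained links in green, lost links in red, unchanged links hidden) amounts to a set difference between two snapshots of the AS-level links. The following sketch is a simplification for illustration and not the LinkRank implementation; the AS numbers are taken from the range reserved for documentation.

```python
def normalize(link):
    """Treat a link as undirected: (a, b) and (b, a) are the same link."""
    return tuple(sorted(link))


def color_links(before, after):
    """Return a color for every link that changed between two snapshots."""
    before = {normalize(link) for link in before}
    after = {normalize(link) for link in after}
    colors = {link: "green" for link in after - before}      # gained links
    colors.update({link: "red" for link in before - after})  # lost links
    return colors  # unchanged links are omitted, i.e. invisible


# Hypothetical AS-level links before and after a batch of BGP updates.
before = {(64496, 64497), (64497, 64498)}
after = {(64497, 64496), (64496, 64499)}

print(color_links(before, after))
```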
The BGPeep tool [258] offers a tag-cloud and a parallel coordinates visualization to
gain more insight into BGP traffic. The tag-cloud view is used to represent results of
different queries like “ASes originating prefixes”. After selecting the interesting tags (up
to four) the user can gain more information about the tags in the prefix viewer. The
prefix viewer contains a parallel coordinates visualization, which consists of five axes.
The first one represents the AS associated with the update message. Each of the others displays
a different octet of the IP address. Because all updates are rendered simultaneously it is
necessary to use opacity. With the help of color coding the user is able to spot frequently
announced prefixes, route flapping or prefix hijacking at once. A timeline is implemented
to investigate only specific periods of time.
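The five-axis mapping of the prefix viewer can be sketched by turning each update into one polyline with five normalized positions. The function below is an illustration under assumptions: it handles IPv4 prefixes only, and it normalizes the AS number by the largest 16-bit AS number, which may differ from the scaling BGPeep uses.

```python
import ipaddress


def to_axes(asn, prefix, max_asn=65535):
    """Map a BGP update to positions in [0, 1] on five parallel axes.

    Axis 0 is the AS number, axes 1 to 4 are the octets of the prefix.
    """
    network = ipaddress.ip_network(prefix)
    octets = network.network_address.packed  # four bytes for IPv4
    return [asn / max_asn] + [octet / 255 for octet in octets]


# Hypothetical updates using documentation AS numbers and prefixes.
updates = [(64500, "192.0.2.0/24"), (64501, "198.51.100.0/24")]

for asn, prefix in updates:
    # Each list describes one polyline across the five axes; a renderer
    # would draw it with low opacity so that frequent prefixes stand out.
    print(to_axes(asn, prefix))
```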
Teoh et al. [279] presented a combination of statistical and visual methods to analyze
BGP update messages. The collected data is thereby filtered and processed to obtain
statistical measures for each BGP update message, which is also the reason for the tool
being only applicable in near real-time. With the help of the visualization the user
can select the prefix and time period to be displayed to detect clusters of BGP update
messages and compare them to their associated statistical anomaly measures.
3.4 Tools and Methods for Network Traffic Data
Traffic flows are captured primarily on routers or switches. They are one layer above
packet captures, which is why some information is lost. However, it is possible
to gain new information, namely the AS, the next hop of a packet’s path through the
network, and the number of packets in a flow. Parts of the information are useful to
understand the topology of the network. The tools introduced in this section handle
either traffic flows (e.g., NetFlow data) or packet captures.
Flamingo [223] is a software tool that enables 3D Internet traffic data exploration in
real-time. It provides a series of different visualization methods to illustrate different
aspects of the data which is collected from NetFlow records. To visualize the traffic
between two IPs, for example, a quadtree algorithm is used to place the source IP
address space on one side of a cube and the destination address space on the other side.
The traffic between the source and the destination IP is displayed with a line connecting
the two. The thickness of the line codes the amount of traffic. With zooming, rotating
and panning options the user is able to navigate through the information space and
extract the necessary information. The cube can be reordered and filtered in different
ways to take into account other aspects of the NetFlow data like for example the ports
used.
Another interactive 3D visualization tool of NetFlow data is the NetBytes viewer
[269]. It deals with historical flow data leaving or entering a single entity and displays
the volume of traffic flows as well. For that purpose a 3D impulse graph with a time
dimension, a port or protocol dimension and a volume dimension is used. To diminish
the disadvantages of a 3D display the tool provides important interaction methods like
rotating and zooming the 3D display or highlighting data to gain further information in
separate 2D graphs.
FloVis [270] combines the NetBytes viewer with other visualizations like a flow bundle
diagram or an existence graph. With the added flow bundle diagram a user can additionally investigate host to host or network to network interactions while the existence
graph is useful to spot role-based host information. The data source is provided by the
SiLK toolkit [48] which filters the raw flow data. The tool can be downloaded from the
homepage.
A 3D tool monitoring network traffic in real-time is DNVS [226]. The tool is still
under development and therefore lacks important features such as interaction possibilities or
filtering options. Nevertheless the system provides two visualizations namely the Service
Behavior View and the Category View which help, even in this early state, to discover
possible anomalies of the network such as DoS types or probing attacks.
The Spinning Cube of Potential Doom [181] visualizes darknet data in a 3D cube. The
axes represent the source IP address, the destination IP address and the destination port
number. With this coding scan attacks are quickly visible to the analyst. Vertical lines
represent port scans, flat two-dimensional planes appear for port scans across multiple
contiguous host IP addresses, and port scans trying to avoid detection produce spiral-like
patterns. To get a better understanding of the functionality of the tool it is possible to
download a video.
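The axis mapping described above can be sketched in a few lines. This is only an illustration, not the tool's code: the function name cube_point, the normalization to [0, 1] and the assignment of the destination port to the vertical axis are assumptions made for the example.

```python
import ipaddress

def cube_point(src_ip: str, dst_ip: str, dst_port: int) -> tuple[float, float, float]:
    """Map one connection attempt to normalized cube coordinates in [0, 1].

    Assumed layout: x = source IP, y (vertical) = destination port,
    z = destination IP.
    """
    x = int(ipaddress.IPv4Address(src_ip)) / 0xFFFFFFFF
    y = dst_port / 65535
    z = int(ipaddress.IPv4Address(dst_ip)) / 0xFFFFFFFF
    return (x, y, z)

# A port scan from one source against one host: x and z stay constant and
# only the port coordinate changes, so the points form a vertical line.
scan = [cube_point("198.51.100.7", "192.0.2.10", port) for port in range(1, 1025)]
```

Scanning the same port range on neighbouring destination addresses would vary z as well, which yields the flat planes mentioned above.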
Focusing on scan detection InetVis [139] captures and visualizes live traffic in a 3D
scatterplot. A time window offers the possibility to change the time scale or show certain
states in the past. To investigate the data further and avoid over-plotting the display
can be split into sub-networks or smaller port ranges. Additionally the traffic can be
filtered with the help of the Berkeley Packet Filter (BPF). With this syntax the packets
can be filtered on any parameter. To enhance the overview the user can color certain
aspects of the data like for example different ports. A fully functional version is available
on the homepage.
Leaving the area of 3D displays VIAssist [107] provides an intuitive, customizable
2D dashboard to provide a big-picture overview of network flow data to enhance situational awareness. Different kinds of visualizations like scatterplots, parallel coordinates
or charts are provided to analyse network activity. Users can zoom into the data by
increasing the accuracy of the data. Filtering options are provided by sliders and checkboxes. Additionally, all of the visualization views in VIAssist are linked, so LOD zooming
and filtering in one view is reflected in the others. The different views can be relocated
and resized within the workspace. The state of each workspace can be saved and exchanged to help other cyber defenders fulfil their tasks. To increase the number of
linked views and to provide a better overview the analyst can use multiple displays.
NVisionIP [180] also uses different visualizations to support the analyst in detecting
network attacks. The first view, named the Galaxy View, displays high level data about
the entire network. In this pixel visualization each point represents one IP address.
Every point is color coded to represent the number of unique ports used by that IP
address. The second view, the Small Multiple View, is a more detailed representation
of the galaxy view. Every IP address is visualized via two bar graphs. Both of these
bar graphs show traffic over ports which are colored for a better understanding. The
third view, the Machine View, visualizes only a single IP address with different charts
to provide the most detailed information at a single glance.
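The quantity that the Galaxy View maps to color, the number of unique ports per IP address, can be computed as follows. This is a minimal sketch; the input format of (IP address, port) pairs and the function name unique_port_counts are assumptions for illustration and do not describe NVisionIP's internals.

```python
from collections import defaultdict

def unique_port_counts(observations):
    """Count the distinct ports seen for each IP address.

    observations: iterable of (ip, port) pairs (hypothetical input format).
    """
    ports = defaultdict(set)
    for ip, port in observations:
        ports[ip].add(port)
    return {ip: len(seen) for ip, seen in ports.items()}

counts = unique_port_counts([
    ("10.0.0.1", 80), ("10.0.0.1", 80), ("10.0.0.1", 443), ("10.0.0.2", 22),
])
```

Each resulting count would then be mapped to a color scale, one pixel per IP address.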
Figure 3.3: The NFlowVis tool [93].
The NFlowVis [93] system combines alerts from intrusion detection systems with
NetFlow data of a whole company network. To enhance network security and to assess
the impact of current attackers the system provides several views to support the workflow
of the analyst. This workflow starts with an overview based on several timeline and
pixel visualizations, followed by an intrusion detection view, which shows the current
IDS alerts. The flow visualization detailed in Figure 3.3 combines attacking external
hosts with affected hosts within the internal network using novel visualizations based
on treemaps, splines and graphs. Applying visual data analysis to traditional IDS data
allows the analyst to gain deeper insight into current threat situations.
Another dashboard providing the user with many different types of visualizations is
the tool RUMINT [64]. As a starting point the system shows a real-time thumbnail
visualization. Each thumbnail represents one of the seven different visualizations which
can be enlarged after clicking on it. The user has the option to choose between a parallel
coordinate plot, a scatterplot, a glyph based animation and many more. The user can
investigate the data and choose the degree of detail in an explorative way. The tool as
well as the source code can be downloaded from the project’s webpage.
Like one of RUMINT's views, two other systems use a glyph based visualization to
deal with network security. The first is from Krasser [166], who introduced a parallel
coordinate plot in combination with glyphs. Each glyph represents a packet and can
be clicked to retrieve more information. Additionally the analyst can choose between
different time scales and zoom into interesting areas of the data. The second is provided
by Pearlman [232]. He combines glyphs with a graph layout. Each glyph represents a
node on the network and codes the amount of traffic on a particular port where each
port is a slice in a circle. The size of the slice depends on the relative amount of traffic
on the corresponding port. Different inner circles show changes over time for a single
node. Two nodes are connected when their services communicate with each other. To
maintain the overview it is possible to zoom and pan the visualization to change the
point of interest.
Nfsight [32] uses unidirectional NetFlow data provided by Nfdump or Nfsen to monitor
client-server activity. One major part of the tool is the service detector which converts
these flows into bidirectional flows. This detector identifies the client and the server.
Additionally event alerts generated by the self-implemented Intrusion Detection System
are stored in a database. The visualization consists of a search engine, a dashboard
and a network activity visualization table. The search engine enables the analyst to
filter or query for specific parameters like IP address, etc. The dashboard displays the
latest alerts, the top 20 servers, services, scanned services, and internal scanner. The
visualization table provides statistical information and displays the network activity as
a time series using a heat map. Color is used to distinguish between client and server
and to identify invalid flows. Because the tool cannot be used with real-time data, its
main purpose is forensic analysis.
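The conversion of unidirectional flows into bidirectional ones can be illustrated with a small sketch. The record layout, the function name merge_flows and, in particular, the rule that the endpoint with the lower port number is the server are simplifying assumptions for this example; they are not Nfsight's actual service detector.

```python
from collections import defaultdict

def merge_flows(flows):
    """Pair unidirectional flow records into bidirectional ones.

    Each flow is (src_ip, src_port, dst_ip, dst_port, byte_count).
    Returns a dict keyed by (client_endpoint, server_endpoint).
    """
    merged = defaultdict(lambda: {"to_server": 0, "to_client": 0})
    for src_ip, src_port, dst_ip, dst_port, nbytes in flows:
        # Assumed heuristic: the endpoint with the lower port is the server.
        if dst_port <= src_port:
            key = ((src_ip, src_port), (dst_ip, dst_port))
            merged[key]["to_server"] += nbytes
        else:
            key = ((dst_ip, dst_port), (src_ip, src_port))
            merged[key]["to_client"] += nbytes
    return dict(merged)

pairs = merge_flows([
    ("10.0.0.5", 51000, "10.0.0.1", 80, 300),    # request
    ("10.0.0.1", 80, "10.0.0.5", 51000, 1200),   # response
])
```

Both records end up under one key, with the request and response volumes kept apart.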
Xiao [312] stores the network flow data in a database and visualizes them with scatterplots or event diagrams. The database is used to support the use of different clauses
to filter the data and to store them for later reuse. The analyst can select patterns in the
different visualizations for which he gets a list of predicates. This additional information is necessary to construct clauses which are currently only limited to conjunctions.
After the analyst has found a clause which describes a certain pattern, he can name the
pattern and commit it to the knowledge base for later use.
The tool invented by Chen [55] uses a machine learning method in combination with
visualizations to reconstruct and classify network scan patterns. A training set of controlled scan patterns is needed to reconstruct a noisy or incomplete pattern. This pattern
can be used for later comparison or clustering to find correlations in malicious network
activities. When dealing with large numbers of network scans, using visual representation
in combination with machine learning methods is a great advantage.
The tool OverFlow [105] focuses on different types of overview visualizations. The
system aggregates flow level data to provide analysts with a starting point for their
network traffic investigation. The idea is to show traffic between different subnets in
such a way that the analyst can spot interesting areas and focus especially on those.
Therefore, the analyst is able to quickly determine if there is traffic between subnets
that should not exist, or if the characteristics of that traffic have changed.
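The aggregation of flow-level data to subnet-level traffic can be sketched as follows. The /24 prefix length, the (source, destination, bytes) record layout and the function name subnet_traffic are assumptions for illustration; OverFlow's own aggregation hierarchy may differ.

```python
import ipaddress
from collections import Counter

def subnet_traffic(flows, prefix_len=24):
    """Sum the bytes exchanged between each ordered pair of subnets.

    flows: iterable of (src_ip, dst_ip, byte_count) (hypothetical format).
    """
    totals = Counter()
    for src, dst, nbytes in flows:
        # strict=False masks the host bits, giving the enclosing network.
        src_net = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
        dst_net = ipaddress.ip_network(f"{dst}/{prefix_len}", strict=False)
        totals[(str(src_net), str(dst_net))] += nbytes
    return totals

totals = subnet_traffic([
    ("10.0.1.5", "10.0.2.9", 100),
    ("10.0.1.77", "10.0.2.1", 50),
    ("10.0.3.4", "10.0.1.8", 10),
])
```

A subnet pair that appears in the result but is not expected by policy is exactly the kind of starting point the overview is meant to reveal.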
Mansmann [200] introduced a graph based metaphor to satisfy the goal of discovering
anomalies in the behavior of hosts or higher level network entities as shown in Figure 3.4.
Therefore, the nodes of the graph represent the hosts which are placed close to each other
if they have similar traffic proportions. This layout algorithm can be influenced by the
analyst by changing the attractor level of the nodes. Further interaction possibilities are
integrated to allow the explorative investigation of the graph like highlighting different
nodes or more detailed information. Additionally the analyst can combine the graph
with a treemap visualization to gain further information about the network and the host
behavior.
PortVis [207] displays three visualizations to present high level information as well
as low level semantic constructs. The first view, the timeline, shows the number of
sessions on the port range in combination with the time. Different time units can be
selected to be shown in the second visualization, the main view. It consists of a pixel
visualization displaying each port. Color is used to code a user selected attribute for each
port. Such a port can be selected to receive additional information in the third view,
the port visualization. The port visualization displays details over time for a selected
port to identify if the activity on the port is anomalous. The tool helps to detect port
scans and suspicious traffic patterns on individual ports.
The same goals can be achieved by using existence plots as introduced by Janies [141].
Figure 3.4: The system developed by Mansmann [200].
The system uses a low-resolution visualization to represent the port usage of individual
hosts over time. The display maps time on the x-axis and the port range on the y-axis.
Color is used to represent the magnitude of traffic. The time scale can be changed to
receive either an overview or a more detailed presentation. In both cases, the existence
plot provides useful insight into the host's activities by concurrently representing port
usage.
For a more detailed analysis of application ports the system Portall [92] was invented.
The tool gives analysts an end-to-end visualization of the host processes correlated with
the network traffic in which the processes participate. Apart from the main window
which displays the communicating applications the tool provides additional detail windows to gain further information. A timeline allows the analyst to investigate traffic and
processes at some point in the past. Further interaction possibilities like highlighting
help the user to keep an overview if there are many occluding lines.
Changing the time scale and other parameters is also possible with FlowTag [182].
The tool uses double-ended sliders to manage the filtering process, different tables and a
parallel coordinate plot for the visualization task. It is possible to share the attack data
to enhance the possibility of a collaborative analysis of Honeynet researchers. Network
flows can be tagged and later queried by a user interface to select only the interesting
flows. The graphical interface removes the need for textual queries, since
the flows are represented as lines and can be selected using rectangles. A video of the
functionality and the program itself is available on the corresponding homepage.
VisFlowConnect [326] provides an animated parallel coordinate plot as the main view
and a detailed host statistics table. With this combination the tool supports a high level
overview but with the possibility to drill down into interesting or anomalous regions of
the data. The animation is used to show changes over time. Additionally the user can
manipulate the time in a way that he can go backwards and replay a certain event. With
the different visualization techniques and interaction possibilities the tool satisfies the
goal of displaying relationships between internal hosts and external machines, including
the direction and volume of traffic.
Another interactive visualization system for NetFlow data streams is IDGraphs [250].
The tool uses a Histograph visualization together with a correlation matrix to reveal
network anomalies and attacks like port scans or SYN flooding. For the Histograph
presentation time is mapped to the horizontal axis and SYN-SYN/ACK values to the
vertical axis. This mapping is useful because high SYN-SYN/ACK values are suspicious
and easy to spot. In the linked correlation matrix each row and column represents one
stream. To display the correlation, every cell is additionally color coded from green
(positive) to red (negative). To receive better results out of the matrix the streams are
clustered to provide a better ordering. Furthermore, the matrix and the Histograph view
are connected via linking and brushing interactions.
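The value plotted on the vertical axis of the Histograph, the difference between SYN and SYN/ACK counts per time interval, can be computed as in the following sketch. The event format, the 60-second bin width and the function name syn_imbalance are assumptions for illustration.

```python
from collections import Counter

def syn_imbalance(events, bin_seconds=60):
    """Return SYN count minus SYN/ACK count for each time bin.

    events: iterable of (timestamp_seconds, kind), where kind is
    "SYN" or "SYN/ACK" (hypothetical input format).
    """
    diff = Counter()
    for ts, kind in events:
        time_bin = int(ts // bin_seconds)
        if kind == "SYN":
            diff[time_bin] += 1
        elif kind == "SYN/ACK":
            diff[time_bin] -= 1
    return dict(diff)

imbalance = syn_imbalance([
    (0, "SYN"), (1, "SYN/ACK"),             # answered connection attempt
    (61, "SYN"), (62, "SYN"), (63, "SYN"),  # unanswered attempts
])
```

A bin with many unanswered SYN packets receives a high value, which is the pattern expected during a port scan or a SYN flood.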
Isis [236] supports the analysis of network flows through two visualization methods,
progressive multiples of timelines and event plots. The system uses a matrix-like metaphor to show the IP addresses one host has traffic with on the y-axis and the time on the
x-axis. Glyphs are used to code different events and are placed corresponding to their
time and the source IP address. An interesting approach is the combination of visual
affordances with structured query language (SQL) to minimize user error and maximize
flexibility. To enable a feedback loop Isis keeps a history of a user’s investigation, easily
allowing a user to revisit a query and change a hypothesis. A MySQL database is used
to store the flows, which provides the analyst with a flexible and familiar interface for
specifying queries.
To preserve the big picture while performing packet-level analysis TNV [106] was
developed. As an overview the system includes a histogram of the relative network
traffic activity of the entire dataset. The main view combines a matrix, displaying
the time and the host IP addresses, with a link display to explicitly show connectivity
between hosts. Connected to the main visualization are a port activity view and a table
of the textual network packet details. The port activity view provides a visual overview
of relative port activity and connections for selected hosts, while the details table offers
access to the raw packet-level details required for the analysis task. Several filtering and
highlighting mechanisms help to explore link patterns and activity. A Java version of the
tool is available online.
Irwin [138] invented a tool to display a very large amount of network telescope traffic
and in particular to compare data collected from multiple telescope sources. To achieve
these goals the author invented a visualization using a Hilbert curve to lay out data
points. This layout algorithm aims to place similar IP addresses close to each other.
Color can be used to code different parameters like networks with unique hosts or, as
future work, geographical information.
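The locality property of a Hilbert curve layout can be illustrated with the standard index-to-coordinate conversion. This is a generic sketch and not the code of the tool described above; the helper ip_to_cell and its choice of indexing by the leading address bits are assumptions for the example.

```python
import ipaddress

def hilbert_d2xy(order: int, d: int) -> tuple[int, int]:
    """Convert position d along a Hilbert curve into (x, y) coordinates
    on a grid of 2**order by 2**order cells."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def ip_to_cell(ip: str, order: int = 8) -> tuple[int, int]:
    """Place an IPv4 address on the grid using its leading 2*order bits
    (assumed mapping: with order=8 each cell covers one /16 network)."""
    d = int(ipaddress.IPv4Address(ip)) >> (32 - 2 * order)
    return hilbert_d2xy(order, d)
```

Consecutive positions along the curve always land in neighbouring cells, so numerically adjacent networks stay visually adjacent, which a simple row-by-row layout does not guarantee at the end of each row.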
Besides the more common visualization techniques the system NUANCE [237] tries
to gain context information about network attacks by building clusters of actors. Every actor is represented via an IP address and has its own profile which describes the
traffic over time. These actor models are clustered with k-means to represent similar
behavioral profiles. After this clustering process NUANCE constructs a text vocabulary
by performing an automated web search to describe each group. Of course the system
also creates visualizations like histograms or a geographic map to display the clusters of
actors, but more exceptional is the additional text information gained from news feeds.
A very unconventional approach to visualize network security is a system invented
by Harrop [116]. A 3D game engine is used to display hosts with their traffic and port
information along with different analysts collaboratively supervising the network. The
analysts interact with the environment like computer game players. They can move or
jump in any direction and even shoot with their weapon in order to initiate an action.
3.5 Tools and Methods for IDS Logs
The massive amount of textual alarm logs generated from intrusion detection systems
makes it difficult to analyze each of them or to get an overall picture of what is occurring
in the network. Therefore, visualizations are important to display alarm activity in a
clearly arranged way.
It is important for the analyst to get an overview of the alarms to obtain an idea about
the general network activity and to easily detect anomalies. IDS Rainstorm [15] provides
this overview with the possibility to gain additional information via zooming and drill
down options. For the overall representation a matrix-like visualization is used. A
couple of rectangular regions are used to split the view into different sections to gain
multiple y-axes. These sections provide the IP addresses on the y-axis and the time
on the x-axis. Color is used to code the alarm severity. To gain further information
the analyst can use the cursor to select a focus point. After clicking on the area of
interest a secondary window opens and shows additional information in a zoomed view.
In this visualization colored glyphs are used to represent the alarms. A mouse-over
reveals detailed information in a popup window while double clicking allows time scaling.
Currently, the tool can only be used for forensic analysis. A download link for the tool
can be found on Christopher Lee’s homepage.
SnortView [163] uses nearly identical visualization techniques as IDS Rainstorm. In
the main view the x-axis codes the IP addresses, the y-axis the time, and colored icons
represent the alarms. One difference between the two systems is the shape of these alarm
icons which codes additional information about the types of attack. To avoid overlapping
or redundant repainting of icons a vertical red bar is used to display consecutive alarms.
A detail view at the bottom of the screen shows further information about the alarms.
The next difference is the additional Source-Destination Matrix frame which is displayed
on the left of the main screen. The source IP address can be seen on the y-axis and
the destination IP address at the bottom. A red circle represents the communication
between source and destination. When a user clicks a symbol in the alert frame, the
communication path is highlighted and further information is shown in the detail view.
Unlike IDS Rainstorm, SnortView monitors Snort alarm logs in real time.
A similar two dimensional mapping of time and source IP addresses with color coding
and glyph representations is used in IDtk [165]. The main difference of this tool is the
possibility for the analyst to change the mappings of different data variables, for example
of the glyphs (e.g., size, opacity, ...) or the axes. It is even possible to introduce a third axis
to create a 3D display. This flexibility allows the analysts to develop their own style of
work for their own unique networks. With different filtering and interaction techniques
the analysts can easily handle the massive amount of data.
A few tools even combine IDS alerts with other information like traffic or other log
files. Visual Firewall [183] uses four different views to handle traffic data as well as
IDS logs. The Real-Time Traffic View displays packets in motion to show if a packet
is rejected by the firewall or not. Colored glyphs are used for every packet to code
the different kinds of traffic like UDP or TCP. The Visual Signature View is a parallel
coordinate plot with two axes. The one on the left displays the local host port and the
one on the right shows the foreign host IP address. A connection is drawn if there is
traffic between a port and a foreign host. After some time the lines fade out to avoid
occlusion and to give the analyst a feeling of time. The Statistics View uses a line chart
to illustrate the overall throughput of the network over time. The last view, the IDS
Alarm View, displays IDS alerts in a quad-axes diagram. The time is displayed at the
bottom, the left axis shows different categories of Snort rules, the right axis represents all
possible subnets where attacks originate and the top displays all the hosts on the local
machine’s subnet (the victims). Faded lines are drawn to visualize connections between
the parameters. The fading animation codes the time. Colored dots are used to display
IDS alarms, with color coding the severity level. The combination of the different views
allows an analyst to form a coherent illustration of the network state.
The tool invented by Yelizarov [323] uses cylinder like glyphs to code the severity level
by height and the type of attack by color. Every glyph is linked in respect to a previously
discovered attack to reveal relations of attacks within a complex event and the duration
as well. The glyphs are placed on a 3D matrix. The y-axis represents the attacked IP
address and the x-axis the time. The source IP addresses are displayed on a single line
in the 3D space. Connections between this line and a cylinder visualize attacks from the
source to the destination IP address.
To visualize similar IP addresses close to each other the tool IP-Matrix [164] uses
two 2D matrices. The first matrix is for an analysis on the Internet-level and displays
the first eight bits of the IP address on the vertical axis and the second eight bits on
the horizontal axis. The second matrix is meant for monitoring the local network and
displays the last 16 bits in the same way. To handle the large amount of different
alert types the system summarizes them into eight categories and colors every category.
Since it is very difficult to analyze single pixels the tool builds grids which are colored
according to the most frequent alert type occurring in this grid. To visualize the amount
of attacks for each single pixel two histograms are displayed on the bottom and the
left of the matrix. Changes in the temporal behavior are displayed with the help of
animation which can be controlled by the analyst. He can play certain events again or
can change the update interval. Further interaction possibilities allow him to filter for
certain attributes like for example different protocols or to gain additional information
about an attack by clicking on the corresponding pixel.
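The address-to-cell mapping of the two matrices follows directly from the octets of an IPv4 address, as the following sketch shows. The function names internet_cell and local_cell and the (row, column) return convention are assumptions for illustration.

```python
import ipaddress

def internet_cell(ip: str) -> tuple[int, int]:
    """Cell in the Internet-level matrix: the first eight bits select the
    row (vertical axis), the second eight bits the column."""
    octets = ipaddress.IPv4Address(ip).packed
    return octets[0], octets[1]

def local_cell(ip: str) -> tuple[int, int]:
    """Cell in the local-network matrix, built from the last 16 bits
    in the same way."""
    octets = ipaddress.IPv4Address(ip).packed
    return octets[2], octets[3]
```

For the address 192.0.2.17, for example, the Internet-level cell is (192, 0) and the local cell is (2, 17), so all hosts of one /16 network share a single Internet-level cell.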
VisAlert [194] aims to enhance the situational awareness via visual correlation of
existing alerts. This goal is achieved via a topological map in a multiple circle layout.
The topological map is shown in the inner circle, while the time is represented as the
radial coordinate of a polar coordinate system. The shape and the size of the nodes
code different parameters like the uniqueness of alerts. When multiple alerts of the same
type are triggered with regard to the same node, the alert lines will be replaced by a
beam which encodes the additional information by its width and color. The different
cells on the outer rings represent particular types of alerts which are colored according
to their number of instances in this time slot. The system performs no analysis itself
but provides a visual alert representation with which the analyst can manually explore
noticeable events.
SnortSnarf [119] is not interesting in the way it displays IDS alarms, but in the way it
preprocesses data and the interaction possibilities it provides. For visualizing the alarm
logs different HTML pages display simple tables and text sections. Links connect
the pages, from which the user can obtain different information such as an ordering
of the log files or a filtered subset. But the most important feature of the
tool is the possibility to divide up the alerts into a hierarchy of groups and to view only
the representatives. With this method the analyst will still be able to retain the overview
over the log files even if there is a massive amount of alarms occurring. The tool can be
downloaded from SourceForge.
In order to avoid an occluded, overdrawn and hard to perceive display the tool
Avisa [259] offers an automatic as well as a user directed prioritization of alarms and
hosts. This enables the analyst to identify the hosts with interesting and often irregular
behavior and discard the other ones. With this preprocessing step the main display gets
clearly arranged showing a radial visualization with interior arcs and an inner and outer
ring. The inner ring shows the IDS alert types in different colors while the outer ring
is used for categorizing the alert types. Beside the colored IDS alert types the internal
hosts of the network are displayed. The inner arcs represent the alarms starting at the
alert type panel and ending at the host panel. To gain an even better overview the
analyst can apply some filtering methods while interacting with the data and animation
is used to understand changes over time.
SpiralView [35] uses IDS alerts which are visualized in a spiral layout. It is a real-time
tool keeping history of already known alerts. Older ones are located near the center of
the spiral while newer alarms are on the outer ring. This kind of ordering has different
advantages. More recent alarms have more space than older ones, the data is presented
sequentially and it displays periodic behavior. One circle represents all alarms of the
last 24 hours. The alarms are color coded to visualize different types of alerts and their
size codes the severity. An additional histogram above the spiral is used to show the
aggregated data over time. The analyst can select a time interval on the histogram to
zoom on the corresponding ring of the spiral and investigate the result further. With the
help of filtering options the user can reduce the visible data points to avoid overlapping.
The tool also supports collaborative work because it is possible to label certain alarms
to make other analysts aware of the discovery.
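A spiral layout with one revolution per 24 hours can be sketched as follows. The linear growth of the radius, the parameter ring_spacing and the function name spiral_position are assumptions for illustration and do not describe SpiralView's implementation.

```python
import math

def spiral_position(alert_ts, oldest_ts, ring_spacing=1.0, period=86400.0):
    """Return the (x, y) position of an alert on the spiral.

    One full revolution corresponds to `period` seconds (24 hours by
    default). The oldest alert sits at the center; newer alerts move outward.
    """
    turns = (alert_ts - oldest_ts) / period
    angle = 2 * math.pi * (turns % 1.0)   # time of day
    radius = ring_spacing * turns         # age: larger means more recent
    return radius * math.cos(angle), radius * math.sin(angle)

day1 = spiral_position(86400, 0)    # 24 hours after the oldest alert
day2 = spiral_position(172800, 0)   # 48 hours after the oldest alert
```

Alerts that occur at the same time of day on different days share an angle and differ only in radius, so periodic behavior shows up as a radial line.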
4 Conclusions and Future Work
The presented state-of-the-art techniques make it very clear that there has been much
research in each of the VIS-SENSE relevant fields of network and visual analytics. There
are numerous approaches for network abnormalities detection described in the relevant
literature. As also discussed in Chapter 2 each approach has advantages and disadvantages on the ability to detect occurred abnormalities at a high success rate and avoid
raising false positive alarms. Some of these approaches face difficulties operating in real-time and are subject to scalability limitations. A promising direction to get solutions
with safe and rich functionality is the combination of several techniques into integral
approaches. Moreover, introducing effective correlation mechanisms for Intrusion Detection Systems helps to improve the network analysts’ ability to promptly identify the
occurred abnormal events. A very interesting architecture for designing IDSs is that of
honeynets and honeypots. Such an architecture effectively attracts and monitors the
attacks and is able to locate the attackers.
Most visual analytics applications focus on the visualization part, but often do not rely
on the most advanced network analytics approaches. The survey of visual analysis tools
for network security in Chapter 3, which provide interactive visual exploration for flow
data, BGP and intrusion detection data also showed that most tools are very specific
and in many cases only suitable for particular tasks. In addition to this the presented
summary table reveals that most of those custom-built tools are rarely able to combine
different data sources, but focus only on single data types. Because of these limited
capabilities and scalability issues the deployment to real world operational scenarios or
analyzing large datasets is often not possible.
Therefore, it has to be pointed out that there is still a substantial gap between those
fields of research. Especially when there are very complex algorithms involved, visual
analysis could actually help to gain more insight into the data. In the field of attack
attribution, for example, there are algorithms which are able to automatically group
those events together which probably have the same underlying root cause. However,
because of the large number of dimensions it is not obvious to the analyst any more why
these events are in the same group or not. In a system where interactive visual analysis is
combined with analytics algorithms, the analyst would be better equipped to gain insight
into the groups of attacks. This means that we need to tightly couple network security
algorithms with directly integrated visual analysis methods in the future. Besides such
integrative aspects, further improvements of the scalability of both the data analysis
algorithms and visualizations are necessary to eventually reach this goal. Moreover, it is
important to bring this research to an operational level. The solid knowledge of the VIS-SENSE partners in network algorithmics and visual analysis will enable us to close this
gap by providing a visual analytics framework that combines the respective strengths of
both worlds.
Bibliography
[1] Cisco IronPort SenderBase Security Network. http://www.senderbase.org. [Online; accessed 22-Jan-2011].
[2] Composite Blocking List (DNSBL). http://cbl.abuseat.org. [Online; accessed
22-Feb-2011].
[3] Distributed Sender Blackhole List (DNSBL). http://dsbl.org. [Online; accessed
11-Jan-2011].
[4] Gnuplot Homepage. http://www.gnuplot.info/. [Online; accessed 24-Feb-2011].
[5] Introducing RStudio. http://www.rstudio.org/. [Online; accessed 24-Feb-2011].
[6] Not Just Another Bogus List (DNSBL). http://www.njabl.org. [Online; accessed
22-Jan-2011].
[7] Project Honey Pot. http://www.projecthoneypot.org. [Online; accessed 22-Jan-2011].
[8] RFC4271. A Border Gateway Protocol 4 (BGP-4). http://tools.ietf.org/html/rfc4271. [Online; accessed 22-Jan-2011].
[9] SecureWorks. http://www.secureworks.com. [Online; accessed 22-Jan-2011].
[10] Spam and Open-Relay Blocking System (DNSBL). http://www.au.sorbs.net.
[Online; accessed 25-Jan-2011].
[11] Spamcop Blocking List (DNSBL). http://www.spamcop.net/bl.shtml. [Online;
accessed 25-Jan-2011].
[12] Spamhaus (DNSBL). http://www.spamhaus.org. [Online; accessed 24-Jan-2011].
[13] The Apache SpamAssassin Project. http://spamassassin.apache.org. [Online;
[14] Weka 3: Data Mining Software for Java. http://www.cs.waikato.ac.nz/ml/weka/. [Online; accessed 24-Feb-2011].
[15] K. Abdullah, C. Lee, G. Conti, J. Copeland, and J. Stasko. Ids rainstorm: Visualizing ids alarms. Visualization for Computer Security, IEEE Workshops on, 2005.
http://chrislee.dhs.org/projects/rainstorm.html.
[16] W. Aiello, J. Ioannidis, and P. D. McDaniel. Origin authentication in interdomain
routing. In S. Jajodia, V. Atluri, and T. Jaeger, editors, ACM Conference on
Computer and Communications Security, pages 165–178. ACM, 2003.
[17] A. Al-Bataineh and G. White. Detection and Prevention Methods of Botnet-generated Spam. In MIT Spam Conference, 2009.
[18] D. Anderson, T. Lunt, H. Javitz, A. Tamaru, and A. Valdes. Next-generation
intrusion detection expert system (nides): A summary. Technical report, SRI
International, 1995.
[19] D. S. Anderson, C. Fleizach, S. Savage, and G. M. Voelker. Spamscatter: characterizing internet scam hosting infrastructure. In SS’07: Proceedings of 16th USENIX
Security Symposium on USENIX Security Symposium, pages 1–14, Berkeley, CA,
USA, 2007. USENIX Association.
[20] S. Axelsson. Intrusion detection systems: A survey and taxonomy. Technical Report 99-15, Department of Computer Engineering, Chalmers University of Technology, Goteborg, Sweden, 2000.
[21] H. Ballani, P. Francis, and X. Zhang. A Study of Prefix Hijacking and Interception
in the Internet. In SIGCOMM ’07: Proceedings of the 2007 conference on Applications, technologies, architectures, and protocols for computer communications,
pages 265–276, New York, NY, USA, 2007. ACM.
[22] Z. Bankovic, S. Bojanic, O. Nieto-Taladriz, and A. Badii. Unsupervised genetic
algorithm deployed for intrusion detection. In E. Corchado, A. Abraham, and
W. Pedrycz, editors, HAIS, volume 5271 of Lecture Notes in Computer Science,
pages 132–139. Springer, 2008.
[23] D. Barbara, J. Couto, S. Jajodia, and N. Wu. Adam: A testbed for exploring the
use of data mining in intrusion detection. SIGMOD Record, 30(4):15–24, 2001.
[24] D. Barbara and S. J. (Eds), editors. Applications of Data Mining in Computer
Security, volume 6 of Advances in Information Security. Springer, 2002.
[25] T. Bass. Intrusion detection systems and multisensor data fusion. Communications
of the ACM, 43(4):99–105, 2000.
[26] M. Bastian, S. Heymann, and M. Jacomy. Gephi: An Open Source Software
for Exploring and Manipulating Networks. In International AAAI Conference on
Weblogs and Social Media, pages 361–362. AAAI, 2009. http://gephi.org/.
[27] M. Behringer. Bgp session security requirements. Internet Draft, draft-ietf-rpsec-bgp-session-sec-req-01.txt, July 2008.
[28] R. Bejtlich. Attribution Is Not Just Malware Analysis. http://taosecurity.blogspot.com/2010/01/attribution-is-not-just-malware.html. [Online; accessed 22-Jan-2011].
[29] R. Bejtlich. Attribution Using 20 Characteristics. http://taosecurity.blogspot.com/2010/01/attribution-using-20-characteristics.html. [Online; accessed 22-Jan-2011].
[30] G. Beliakov, A. Pradera, and T. Calvo. Aggregation Functions: A Guide for
Practitioners. Springer, Berlin, New York, 2007.
[31] B. Morin, L. Mé, H. Debar, and M. Ducassé. M4D4: a logical framework
to support alert correlation in intrusion detection. Information Fusion, 10(4):285–299,
October 2009.
[32] R. Berthier, M. Cukier, M. Hiltunen, D. Kormann, G. Vesonder, and D. Sheleheda.
Nfsight: NetFlow-based Network Awareness Tool. In Proceedings of the 24th Large
Installation System Administration Conference (LISA ’10), November 2010.
[33] M. Berthold, N. Cebron, F. Dill, T. Gabriel, T. Kötter, T. Meinl, P. Ohl,
C. Sieb, K. Thiel, and B. Wiswedel. KNIME: The
Konstanz information miner. Data Analysis, Machine Learning and Applications,
pages 319–326, 2008. http://www.knime.org/downloads-overview.
[34] J. Bertin. Sémiologie Graphique. Les diagrammes, les réseaux, les cartes. Gauthier-Villars, Paris, France, 1967.
[35] E. Bertini, P. Hertzog, and D. Lalanne. SpiralView: towards security policies assessment through visual correlation of network resources with evolution of alarms.
In Visual Analytics Science and Technology, 2007. VAST 2007. IEEE Symposium
on, pages 139–146. IEEE, 2007.
[36] R. Beverly and K. Sollins. Exploiting transport-level characteristics of spam. Technical Report MIT-CSAIL-TR-2008-008, 2008.
[37] J. Bonifacio, A. Cansian, A. de Carvalho, and E. E. Moreira. Neural networks
applied in intrusion detection. In Proceedings of the International Joint Conference
on Neural Networks, 1998.
[38] V. J. Bono. 7007 explanation and apology. NANOG mailing list, msg00444, 1997.
[39] A. Brodsky and D. Brodsky. A distributed content independent method for spam
detection. In HotBots’07: Proceedings of the first conference on First Workshop
on Hot Topics in Understanding Botnets, pages 3–3, Berkeley, CA, USA, 2007.
USENIX Association.
[40] S. T. Brugger. Data Mining Methods for Network Intrusion Detection. Dissertation proposal, submitted to ACM Computer Surveys (under revision), 2009.
[41] H. Bunke, P. Dickinson, A. Humm, C. Irniger, and M. Kraetzl. Computer network
monitoring and abnormal event detection using graph matching and multidimensional scaling. In P. Perner, editor, Industrial Conference on Data Mining, volume
4065 of Lecture Notes in Computer Science, pages 576–590. Springer, 2006.
[42] H. Bunke and M. M. Kraetzl. Classification and detection of abnormal events in
time series of graphs, volume Data Mining in Time Series Databases, chapter 6,
pages 127–148. World Scientific, 2004.
[43] K. Butler, T. Farley, P. McDaniel, and J. Rexford. A Survey of BGP Security Issues
and Solutions. In Proceedings of the IEEE, volume 98, pages 100–122, January
2010.
[44] J. Cabrera, L. Lewis, X. Qin, W. Lee, R. Prasanth, B. Ravichandran, and
R. Mehra. Proactive detection of distributed denial of service attacks using mib
traffic variables - a feasibility study. In Integrated Network Management Proceedings, 2001 IEEE/IFIP International Symposium on, pages 609–622, 2001.
[45] M. Caesar, L. Subramanian, and R. H. Katz. Towards localizing root causes of bgp
dynamics. Technical Report UCB/CSD-03-1292, EECS Department, University
of California, Berkeley, 2003.
[46] P. H. Calais, D. E. V. Pires, D. O. Guedes, W. Meira, C. Hoepers, and K. Steding-Jessen. A campaign-based characterization of spamming strategies. In CEAS,
2008.
[47] J. Cannady and J. Mahaffey. The application of artificial neural networks to misuse
detection. In Proceedings of the International Workshop on the Recent Advances
in Intrusion Detection (RAID1998), 1998.
[48] CERT/NetSA at Carnegie Mellon University. SiLK (System for Internet-Level
Knowledge). http://tools.netsa.cert.org/silk. [Online; accessed 24-Feb-2011].
[49] P. Chan, M. Mahoney, and M. Arshad. Managing Cyber Threats: Issues, Approaches and Challenges, chapter Learning Rules and Clusters for Anomaly Detection in Network Traffic, pages 81–100. Springer, 2005.
[50] D.-F. Chang, R. Govindan, and J. S. Heidemann. An empirical study of router
response to large bgp routing table load. In Internet Measurement Workshop,
pages 203–208. ACM, 2002.
[51] D.-F. Chang, R. Govindan, and J. S. Heidemann. The temporal and topological
characteristics of bgp path changes. In ICNP, pages 190–199. IEEE Computer
Society, 2003.
[52] C.-S. Chao, Y.-X. Chen, and A.-C. Liu. Abnormal event detection for network
flooding attacks. J. Inf. Sci. Eng., 20(6):1079–1091, 2004.
[53] V. Chatzigiannakis, G. Androulidakis, K. Pelechrinis, S. Papavassiliou, and
V. Maglaris. Data fusion algorithms for network anomaly detection: classification
and evaluation. In IEEE International Conference on Networking and Services,
ICNS’07, Athens, Greece, June 2007.
[54] L. Chen and J. Leneutre. A game theoretical framework on intrusion detection in
heterogeneous networks. Information Forensics and Security, IEEE Transactions
on, 4(2):165–178, June 2009.
[55] L. Chen, C. Muelder, K. Ma, and A. Bartoletti. Intelligent Classification and
Visualization of Network Scans. Technical report, Lawrence Livermore National
Laboratory (LLNL), Livermore, CA, 2007.
[56] Z. Chen, C. Ji, and P. Barford. Spatial-temporal characteristics of internet malicious sources. In Proceedings of INFOCOM, 2008.
[57] S. Cheung, U. Lindqvist, and M. W. Fong. Modeling multistep cyber attacks for
scenario recognition. In DISCEX (1), pages 284–292. IEEE Computer Society,
2003.
[58] A. Chittur. Model generation for an intrusion detection system using genetic algorithms. PhD thesis, Ossining High School. In cooperation with Columbia University, 2001.
[59] B. Christian and T. Tauber. Bgp security requirements. Internet Draft, draft-ietf-rpsec-bgpsecrec-10.txt, November 2008.
[60] CNET News. Router glitch cuts Net access. http://news.cnet.com/2100-1033-279235.html. [Online; accessed 22-Apr-1997].
[61] L. Colitti, G. Di Battista, F. Mariani, M. Patrignani, and M. Pizzonia. Visualizing
Interdomain Routing with BGPlay. Journal of Graph Algorithms and Applications,
9(1):117–148, 2005. http://bgplay.routeviews.org/.
[62] Colorado State University. BGP Monitoring System: BGPmon. http://bgpmon.netsec.colostate.edu/. [Online; accessed 22-Jan-2011].
[63] Computer Networks Research Group – Roma Tre University. BGPlay. http://bgplay.routeviews.org/. [Online; accessed 24-Jan-2011].
[64] G. Conti, K. Abdullah, J. Grizzard, J. Stasko, J. Copeland, M. Ahamad, H. Owen,
and C. Lee. Countering security information overload through alert and packet
visualization. IEEE Computer Graphics and Applications, pages 60–70, 2006.
http://rumint.org/.
[65] P. Cortese, G. Di Battista, A. Moneta, M. Patrignani, and M. Pizzonia. Topographic visualization of prefix propagation in the internet. IEEE Transactions on
Visualization and Computer Graphics, pages 725–732, 2006.
[66] M. Cova, C. Leita, O. Thonnard, A. D. Keromytis, and M. Dacier. An analysis of
rogue av campaigns. In Proceedings of the 13th international conference on Recent
advances in intrusion detection, RAID’10, pages 442–463, Berlin, Heidelberg, 2010.
Springer-Verlag.
[67] F. Cuppens. Managing alerts in a multi-intrusion detection environment. In Computer Security Applications Conference, 2001. ACSAC 2001. Proceedings 17th Annual, pages 22–31, December 2001.
[68] F. Cuppens and A. Miege. Alert correlation in a cooperative intrusion detection
framework. In Security and Privacy, 2002. Proceedings. 2002 IEEE Symposium
on, pages 202–215, 2002.
[69] F. Cuppens and R. Ortalo. Lambda: A language to model a database for detection
of attacks. In H. Debar, L. Me, and S. F. Wu, editors, Recent Advances in Intrusion
Detection, volume 1907 of Lecture Notes in Computer Science, pages 197–216.
Springer, 2000.
[70] Cyber-TA. Cyber-threat analytics (cyber-ta), sri international. Available online
at http://www.cyber-ta.org/. [Online; accessed 24-Jan-2011].
[71] M. Dacier, V. Pham, and O. Thonnard. The WOMBAT Attack Attribution
method: some results. In 5th International Conference on Information Systems
Security (ICISS 2009), 14-18 December 2009, Kolkata, India, Dec 2009.
[72] D. Dasgupta. An immunity-based technique to characterize intrusions in computer
networks. IEEE Transactions on Evolutionary Computation, 6:1081–1088, 2002.
[73] H. Debar, M. Becker, and D. Siboni. A neural network component for an intrusion
detection system. In Proceedings of the 1992 IEEE Computer Society Symposium
on Research in Computer Security and Privacy, pages 240–250, 1992.
[74] H. Debar, D. Curry, and B. Feinstein. The Intrusion Detection Message Exchange
Format (IDMEF). RFC 4765 (Experimental), March 2007.
[75] H. Debar, M. Dacier, and A. Wespi. A revised taxonomy for intrusion-detection
systems. Annals of Telecommunications, 55:361–378, 2000. 10.1007/BF02994844.
[76] H. Debar and A. Wespi. Aggregation and correlation of intrusion-detection alerts.
In W. Lee, L. Me, and A. Wespi, editors, Recent Advances in Intrusion Detection,
volume 2212 of Lecture Notes in Computer Science, pages 85–103. Springer, 2001.
[77] D. Denning and P. Neumann. Requirements and model for ides a real-time intrusion detection system. Technical Report 83F83-01-00, Computer Science Laboratory, SRI International, 1985.
[78] D. E. Denning. An intrusion detection model. IEEE Transactions on Software
Engineering, SE-13:222–232, 1987.
[79] Y. Dhanalakshmi and R. Babu. Intrusion detection using data mining along fuzzy
logic and genetic algorithms. IJCSNS International Journal of Computer Science
and Network Security, 8(2):27–32, 2008.
[80] J. Dickerson and J. Dickerson. Fuzzy network profiling for intrusion detection. In
Fuzzy Information Processing Society, 2000. NAFIPS. 19th International Conference of the North American, pages 301–306, 2000.
[81] W. Eddy. TCP SYN Flooding Attacks and Common Mitigations. RFC 4987
(Informational), August 2007.
[82] J. Ellson, E. Gansner, L. Koutsofios, S. North, and G. Woodhull. Graphviz: Open
source graph drawing tools. Lecture notes in computer science, pages 483–484,
2002. http://www.graphviz.org/Download..php.
[83] R. Ensafi, S. Dehghanzadeh, and M. R. Akbarzadeh-Totonchi. Optimizing fuzzy
k-means for network anomaly detection using pso. In AICCSA, pages 686–693.
IEEE, 2008.
[84] Ertoz, Eilertson, Lazarevic, Tan, Kumar, Srivastava, and Dokas. MINDS - Minnesota Intrusion Detection System. In Next Generation Data Mining. MIT Press,
2004.
[85] E. Eskin, A. Arnold, M. Prerau, L. Portnoy, and S. Stolfo. A geometric framework
for unsupervised anomaly detection: Detecting intrusions in unlabeled data. In
D. Barbara and S. Jajodia, editors, Applications of Data Mining in Computer
Security. Kluwer, 2002.
[86] F. Esponda, S. Forrest, and P. Helman. A formal framework for positive and
negative detection schemes. Systems, Man, and Cybernetics, Part B: Cybernetics,
IEEE Transactions on, 34(1):357–373, February 2004.
[87] H. Esquivel, T. Mori, and A. Akella. Router-Level Spam Filtering Using TCP
Fingerprints: Architecture and Measurement-Based Evaluation. In Conference on
E-Mail and Anti-Spam (CEAS), 2009.
[88] J. M. Estevez-Tapiador, P. Garcia-Teodoro, and J. E. Diaz-Verdejo. Stochastic
protocol modeling for anomaly based network intrusion detection. In IWIA, pages
3–12, 2003.
[89] W. Fan. Cost-Sensitive, Scalable and Adaptive Learning Using Ensemble-based
Methods. PhD thesis, Columbia University, 2001.
[90] J. Fekete. The InfoVis Toolkit. In IEEE Symposium on Information Visualization,
INFOVIS 2004, pages 167–174. IEEE, 2004.
[91] A. Feldmann, O. Maennel, Z. M. Mao, A. Berger, and B. Maggs. Locating internet
routing instabilities. In SIGCOMM ’04: Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications,
[92] G. Fink, P. Muessig, and C. North. Visual correlation of host processes and
network traffic. In Visualization for Computer Security, 2005 (VizSEC 05). IEEE
Workshop on, pages 11–19. IEEE, 2005.
[93] F. Fischer, F. Mansmann, D. Keim, S. Pietzko, and M. Waldvogel. Large-scale network monitoring for visual analysis of attacks. In Visualization for Computer Security: 5th International Workshop, VizSec 2008, Cambridge, MA, USA, September
15, 2008, Proceedings, page 111, 2008.
[94] M. Fisk and G. Varghese. Fast content-based packet handling for intrusion detection. Technical Report CS2001-0670, UCSD, 2001.
[95] G. Florez, S. Bridges, and R. Vaughn. An improved algorithm for fuzzy data
mining for intrusion detection. In Fuzzy Information Processing Society, 2002.
Proceedings. NAFIPS. 2002 Annual Meeting of the North American, pages 457–462,
2002.
[96] S. Foresti and J. Agutter. VisAlert: From Idea to Product. In VizSEC 2007, pages
159–174. Springer, 2008.
[97] M. Fossi, D. Turner, E. Johnson, T. Mack, T. Adams, J. Blackbird, M. K. Low,
D. McKinney, M. Dacier, A. Keromytis, C. Leita, M. Cova, J. Overton, and
O. Thonnard. Symantec report on rogue security software. Whitepaper, Symantec,
October 2009.
[98] V. Ganti, R. Ramakrishnan, J. Gehrke, and A. Powell. Clustering large datasets
in arbitrary metric spaces. In Proceedings of the 15th International Conference
on Data Engineering, ICDE’99, pages 502–, Washington, DC, USA, 1999. IEEE
Computer Society.
[99] J. Gao, G. Hu, X. Yao, and R. Chang. Anomaly detection of network traffic based
on wavelet packet. In Communications, 2006. APCC ’06. Asia-Pacific Conference
on, pages 1–5, 2006.
[100] L. Gao. On inferring autonomous system relationships in the Internet. IEEE/ACM
Trans. Netw., 9(6):733–745, 2001.
[101] T. D. Garvey and T. F. Lunt. Model-based intrusion detection. In Proceedings of
the 14th National Computer Security Conference, 1991.
[102] G. Giacinto, R. Perdisci, and F. Roli. Alarm clustering for intrusion detection
systems in computer networks. In P. Perner and A. Imiya, editors, MLDM, volume
3587 of Lecture Notes in Computer Science, pages 184–193. Springer, 2005.
[103] V. Gill, J. Heasley, and D. Meyer. The bgp ttl security hack (btsh). Presentation
at NANOG-27 meeting, October 2001.
[104] V. Gill, J. Heasley, D. Meyer, P. Savola, and C. Pignataro. The Generalized TTL
Security Mechanism (GTSM). RFC 5082, Internet Engineering Task Force, October
2007.
[105] J. Glanfield, S. Brooks, T. Taylor, D. Paterson, C. Smith, C. Gates, and
J. McHugh. OverFlow: An overview visualization for network analysis. In Visualization for Cyber Security, 2009. VizSec 2009. 6th International Workshop on,
pages 11–19. IEEE, 2010.
[106] J. Goodall, W. Lutters, P. Rheingans, and A. Komlodi. Preserving the big picture:
Visual network traffic analysis with tnv. In Visualization for Computer Security,
2005 (VizSEC 05). IEEE Workshop on, pages 47–54. IEEE, 2005.
http://tnv.sourceforge.net/.
[107] J. Goodall and M. Sowul. VIAssist: Visual analytics for cyber defense. In Technologies for Homeland Security, 2009. HST’09. IEEE Conference on, pages 143–150.
IEEE, 2009.
[108] G. Goodell, W. Aiello, T. Griffin, J. Ioannidis, and P. McDaniel. Working around
bgp: An incremental approach to improving security and accuracy of interdomain
routing. In Proc. of Internet Society Symposium on Network and Distributed System Security (NDSS03), February 2003.
[109] J. Goodman. IP Addresses in Email Clients. In First Conference on Email and
Anti-Spam, Mountain View, CA, 2004.
[110] A. K. Ghosh, J. Wanken, and F. Charron. Detecting anomalous and unknown
intrusions against programs. In ACSAC, pages 259–267. IEEE Computer Society,
1998.
[111] M. Grabisch, T. Murofushi, M. Sugeno, and J. Kacprzyk. Fuzzy Measures and
Integrals. Theory and Applications. Physica Verlag, Berlin, 2000.
[112] T. Griffin. What is the sound of one route flapping? Presentation at the Network
Modeling and Simulation Summer Workshop, 2002.
[113] S. Guha, R. Rastogi, and K. Shim. ROCK: A robust clustering algorithm for
categorical attributes. Information Systems, 25(5):345–366, 2000.
[114] H. Hajji. Statistical analysis of network traffic for adaptive faults detection. Neural
Networks, IEEE Transactions on, 16(5):1053–1063, September 2005.
[115] J. M. Hall. Isnids, a network intrusion detection system inspired by the human
immune system. Technical Report CSDS-DF-TR-03-12, CSDS, 2002.
[116] W. Harrop and G. Armitage. Real-time collaborative network monitoring and
control using 3D game engines for representation and interaction. In Proceedings
of the 3rd international workshop on Visualization for computer security, pages
31–40. ACM, 2006.
[117] A. Heffernan. Protection of BGP Sessions via the TCP MD5 Signature Option.
RFC 2385 (Proposed Standard), August 1998.
[118] C. Hepner and E. Zmijewski. Defending Against BGP Man-In-The-Middle Attacks.
Slides, February 2009. Black Hat DC. Arlington, VA. Renesys Corporation.
http://www.renesys.com/tech/presentations/pdf/blackhat-09.pdf.
[119] J. A. Hoagland and S. Staniford. Viewing ids alerts: Lessons from snortsnarf. DARPA Information Survivability Conference and Exposition, 1:0374, 2001.
http://sourceforge.net/projects/snortsnarf/.
[120] S. A. Hofmeyr and S. Forrest. Immunizing computer networks: Getting all the
machines in your network to fight the hacker disease. In Proc. of the 1999 IEEE
Symp. on Security and Privacy, pages 9–12. IEEE Computer Society Press, 1998.
[121] S.-C. Hong, H.-T. Ju, and J. W. Hong. IP prefix hijacking detection using idle
scan. In APNOMS’09: Proceedings of the 12th Asia-Pacific network operations and
management conference on Management enabling the future internet for changing
business and new computing services, pages 395–404, Berlin, Heidelberg, 2009.
Springer-Verlag.
[122] C. Hood and C. Ji. Intelligent network monitoring. In Neural Networks for Signal
Processing [1995] V. Proceedings of the 1995 IEEE Workshop, pages 521–530,
August 1995.
[123] C. S. Hood and C. Ji. Proactive network fault detection. In INFOCOM, pages
1147–1155, 1997.
[124] X. Hu and Z. M. Mao. Accurate Real-time Identification of IP Prefix Hijacking. In
SP ’07: Proceedings of the 2007 IEEE Symposium on Security and Privacy, pages
3–17, Washington, DC, USA, 2007. IEEE Computer Society.
[125] Y.-C. Hu, A. Perrig, and D. B. Johnson. Efficient security mechanisms for routing
protocols. In Proc. NDSS03, pages 57–73, 2003.
[126] Y.-C. Hu, A. Perrig, and M. A. Sirbu. Spv: secure path vector routing for securing
bgp. In R. Yavatkar, E. W. Zegura, and J. Rexford, editors, SIGCOMM, pages
179–192. ACM, 2004.
[127] C.-T. Huang, S. Thareja, and Y.-J. Shin. Wavelet-based real time detection of
network traffic anomalies. I. J. Network Security, 6(3):309–320, 2008.
[128] L. Huang, X. Nguyen, M. N. Garofalakis, J. M. Hellerstein, M. I. Jordan, A. D.
Joseph, and N. Taft. Communication-efficient online detection of network-wide
anomalies. In INFOCOM, pages 134–142. IEEE, 2007.
[129] P. Huang, A. Feldmann, and W. Willinger. A non-intrusive, wavelet-based approach to detecting network performance problems. In Proceedings of ACM SIGCOMM Internet Measurement Workshop, November 2001.
[130] Y.-A. Huang, W. Fan, W. Lee, and P. S. Yu. Cross-feature analysis for detecting
ad-hoc routing anomalies. In ICDCS, pages 478–. IEEE Computer Society, 2003.
[131] Hurricane Electric. BGP Toolkit. http://bgp.he.net/. [Online; accessed 22-Jan-2011].
[132] G. Huston, M. Rossi, and G. Armitage. Securing bgp - a literature survey. Communications Surveys Tutorials, IEEE, PP(99):1–24, 2010.
[133] K. Hwang, M. Cai, Y. Chen, and M. Qin. Hybrid intrusion detection with weighted
signature generation over anomalous internet episodes. IEEE Trans. Dependable
Sec. Comput., 4(1):41–55, 2007.
[134] IBM. Iss, realsecure. http://www.iss.net, 2010.
[135] T. Ide and H. Kashima. Eigenspace-based anomaly detection in computer systems.
In W. Kim, R. Kohavi, J. Gehrke, and W. DuMouchel, editors, KDD, pages 440–
449. ACM, 2004.
[136] K. Ilgun. Ustat - a real-time intrusion detection system for unix. Master thesis,
University of California at Santa Barbara, November 1992.
[137] A. Inselberg and B. Dimsdale. Parallel coordinates: a tool for visualizing multidimensional geometry. In Proceedings of the 1st conference on Visualization’90,
pages 361–378. IEEE Computer Society Press, 1990.
[138] B. Irwin and N. Pilkington. High level internet scale traffic visualization using
hilbert curve mapping. In VizSEC 2007, pages 147–158. Springer, 2008.
[139] B. Irwin and J. Riel. Using inetvis to evaluate snort and bro scan detection on a
network telescope. VizSEC 2007, pages 255–273, 2008.
http://www.cs.ru.ac.za/research/g02v2468/inetvis.html.
[140] S. Jajodia, P. Liu, V. Swarup, and C. Wang, editors. Cyber Situational Awareness:
Issues and Research, volume 46 of Advances in Information Security. Springer, Nov
2009.
[141] J. Janies. Existence plots: A low-resolution time series for port behavior analysis.
Visualization for Computer Security, pages 161–168, 2008.
[142] J. P. John, A. Moshchuk, S. D. Gribble, and A. Krishnamurthy. Studying spamming botnets using Botlab. In NSDI’09: Proceedings of the 6th USENIX symposium on Networked systems design and implementation, pages 291–306, Berkeley,
CA, USA, 2009. USENIX Association.
[143] B. Johnson and B. Shneiderman. Tree-maps: a space-filling approach to the visualization of hierarchical information structures. In Proceedings of the 2nd conference
on Visualization ’91, VIS ’91, pages 284–291, Los Alamitos, CA, USA, 1991. IEEE
Computer Society Press.
[144] K. Julisch. Applications of Data Mining in Computer Security, volume 6 of Advances in Information Security, chapter Data Mining For Intrusion Detection - A
Critical Review. Springer, 2002.
[145] K. Julisch. Clustering intrusion detection alarms to support root cause analysis.
ACM Trans. Inf. Syst. Secur., 6(4):443–471, 2003.
[146] K. Julisch and M. Dacier. Mining intrusion detection alarms for actionable knowledge. In Proceedings of the 8th ACM International Conference on Knowledge
Discovery and Data Mining, 2002.
[147] C. Kanich, C. Kreibich, K. Levchenko, B. Enright, G. M. Voelker, V. Paxson,
and S. Savage. Spamalytics: an empirical analysis of spam marketing conversion.
In Proceedings of the 15th ACM conference on Computer and communications
security, CCS ’08, pages 3–14, New York, NY, USA, 2008. ACM.
[148] J. Karlin, S. Forrest, and J. Rexford. Pretty good bgp: Improving bgp by cautiously adopting routes. In Network Protocols, 2006. ICNP’06. Proceedings of the
2006 14th IEEE International Conference on, pages 290–299, 2006.
[149] E. Katz-Bassett, H. V. Madhyastha, J. P. John, A. Krishnamurthy, D. Wetherall,
and T. Anderson. Studying black holes in the internet with hubble. In Proceedings
of the 5th USENIX Symposium on Networked Systems Design and Implementation,
NSDI’08, pages 247–262, Berkeley, CA, USA, 2008. USENIX Association.
[150] KDD. The third international knowledge discovery and data mining tools competition dataset (kdd99 cup). http://kdd.ics.uci.edu/databases/kddcup99.html.
[151] D. Keim. Designing pixel-oriented visualization techniques: Theory and applications. Visualization and Computer Graphics, IEEE Transactions on, 6(1):59–78,
2000.
[152] R. Kemmerer and G. Vigna. Intrusion detection: A brief history and overview.
IEEE Computer, 35(4):27–30, April 2002.
[153] S. Kent. IP Authentication Header. RFC 4302 (Proposed Standard), December
2005.
[154] S. Kent. Ip encapsulating security payload (esp). RFC 4303 (Proposed Standard),
Dec. 2005.
[155] S. Kent, C. Lynn, and K. Seo. Secure border gateway protocol (s-bgp). Selected
Areas in Communications, IEEE Journal on, 18(4):582–592, Apr. 2000.
[156] S. Kent and K. Seo. Security Architecture for the Internet Protocol. RFC 4301
(Proposed Standard), December 2005.
[157] S. T. Kent. Securing the border gateway protocol: A status update. In A. Lioy and
D. Mazzocchi, editors, Communications and Multimedia Security, volume 2828 of
Lecture Notes in Computer Science, pages 40–53. Springer, 2003.
[158] S. T. Kent, C. Lynn, J. Mikkelson, and K. Seo. Secure border gateway protocol
(s-bgp) - real world performance and deployment issues. In NDSS. The Internet
Society, 2000.
[159] L. Khan, M. Awad, and B. M. Thuraisingham. A new intrusion detection system
using support vector machines and hierarchical clustering. VLDB J., 16(4):507–521,
2007.
[160] R. Kisteleki. Filtering After Recent Chinese “BGP Hijack” Does not Affect RIPE Region. http://labs.ripe.net/Members/kistel/content-recent-chinese-bgp-hijack-does-not-affect-ripe. [Online; accessed 10-Apr-2010].
[161] C. Kleiber and A. Zeileis. Applied Econometrics with R. Springer, 2008.
http://www.r-project.org/index.html.
[162] J. Kline, S. Nam, P. Barford, D. Plonka, and A. Ron. Traffic anomaly detection
at fine time scales with bayes nets. In Proceedings of the International Conference
on Internet Monitoring and Protection (ICIMP ’08), June 2008.
[163] H. Koike and K. Ohno. SnortView: visualization system of snort logs. In Proceedings of the 2004 ACM workshop on Visualization and data mining for computer
security, pages 143–147. ACM, 2004.
[164] H. Koike, K. Ohno, and K. Koizumi. Visualizing cyber attacks using ip matrix.
Visualization for Computer Security, IEEE Workshops on, 0:11, 2005.
[165] A. Komlodi, P. Rheingans, U. Ayachit, J. Goodall, and A. Joshi. A user-centered
look at glyph-based security visualization. Visualization for Computer Security,
IEEE Workshops on, 2005.
[166] S. Krasser, G. Conti, J. Grizzard, J. Gribschaw, and H. Owen. Real-time and
forensic network data analysis using animated and coordinated visualization. In
Proceedings of the 6th IEEE Information Assurance Workshop, volume 142. Citeseer, 2005.
[167] C. Kreibich, C. Kanich, K. Levchenko, B. Enright, G. M. Voelker, V. Paxson,
and S. Savage. On the spam campaign trail. In LEET’08: Proceedings of the
1st Usenix Workshop on Large-Scale Exploits and Emergent Threats, pages 1–9,
Berkeley, CA, USA, 2008. USENIX Association.
[168] C. Kreibich, C. Kanich, K. Levchenko, B. Enright, G. M. Voelker, V. Paxson,
and S. Savage. Spamcraft: an inside look at spam campaign orchestration. In
Proceedings of the 2nd USENIX conference on Large-scale exploits and emergent
threats: botnets, spyware, worms, and more, LEET’09, pages 4–4, Berkeley, CA,
USA, 2009. USENIX Association.
[169] C. Kruegel, D. Mutz, W. K. Robertson, and F. Valeur. Bayesian event classification
for intrusion detection. In ACSAC, pages 14–23. IEEE Computer Society, 2003.
[170] C. Kruegel and T. Toth. Using decision trees to improve signature-based intrusion
detection. In G. Vigna, E. Jonsson, and C. Kruegel, editors, RAID, volume 2820
of Lecture Notes in Computer Science, pages 173–191. Springer, 2003.
[171] C. Kruegel, T. Toth, and C. Kerer. Decentralized event correlation for intrusion
detection. In K. Kim, editor, ICISC, volume 2288 of Lecture Notes in Computer
Science, pages 114–131. Springer, 2001.
[172] S. Kumar and E. H. Spafford. A Pattern Matching Model for Misuse Intrusion
Detection. In Proceedings of the 17th National Computer Security Conference,
pages 11–21, 1994.
[173] S. Kumar and E. H. Spafford. An application of pattern matching in intrusion
detection. Technical Report CSD-TR-94-013, Purdue University, 1994.
[174] C. Labovitz. Additional discussion of the April China BGP hijack incident. http://asert.arbornetworks.com/2010/11/additional-discussion-of-the-april-china-bgp-hijack-incident/. [Online; accessed 10-Apr-2010].
[175] C. Labovitz. China Hijacks 15% of Internet Traffic? http://asert.arbornetworks.com/2010/11/china-hijacks-15-of-internet-traffic/.
[176] M. Lad, D. Massey, D. Pei, Y. Wu, B. Zhang, and L. Zhang. PHAS: A Prefix
Hijack Alert System. In USENIX-SS’06: Proceedings of the 15th conference on
USENIX Security Symposium, Berkeley, CA, USA, 2006. USENIX Association.
[177] M. Lad, D. Massey, and L. Zhang. Visualizing internet routing changes. IEEE
Transactions on Visualization and Computer Graphics, pages 1450–1460, 2006.
http://linkrank.cs.ucla.edu/.
[178] M. Lad, A. Nanavati, D. Massey, and L. Zhang. An algorithmic approach to
identifying link failures. In PRDC, pages 25–34. IEEE Computer Society, 2004.
[179] A. Lakhina, M. Crovella, and C. Diot. Mining anomalies using traffic feature
distributions. In R. Guerin, R. Govindan, and G. Minshall, editors, SIGCOMM,
pages 217–228. ACM, 2005.
[180] K. Lakkaraju, W. Yurcik, R. Bearavolu, and A. Lee. NVisionIP: an interactive
network flow visualization tool for security. In Systems, Man and Cybernetics,
2004 IEEE International Conference on, volume 3, pages 2675–2680. IEEE, 2005.
[181] S. Lau. The spinning cube of potential doom. Communications of the ACM, 47(6):25–26, 2004. http://www.nersc.gov/nusers/security/TheSpinningCube.php.
[182] C. Lee and J. Copeland. Flowtag: a collaborative attack-analysis, reporting,
and sharing tool for security researchers. In Proceedings of the 3rd international
workshop on Visualization for computer security, pages 103–108. ACM, 2006.
http://chrislee.dhs.org/projects/flowtag.html.
[183] C. Lee, J. Trost, N. Gibbs, R. Beyah, and J. Copeland. Visual firewall: real-time
network security monitor. In Visualization for Computer Security, 2005 (VizSEC 05).
IEEE Workshop on, pages 129–136. IEEE, 2005.
[184] W. Lee, S. Stolfo, and K. Mok. A data mining framework for building intrusion
detection models. In Proceedings of the 1999 IEEE Symposium on Security and
Privacy, pages 120–132, 1999.
[185] W. Lee and S. J. Stolfo. Combining knowledge discovery and knowledge engineering to build IDSs. In RAID ’99: Proceedings of the 3th International Symposium
on Recent Advances in Intrusion Detection, 1999.
[186] W. Lee, S. J. Stolfo, and K. W. Mok. A data mining framework for building
intrusion detection models. In IEEE Symposium on Security and Privacy, pages
120–132, 1999.
[187] C. Leita, U. Bayer, and E. Kirda. Exploiting diverse observation perspectives to
get insights on the malware landscape. In DSN 2010, 40th Annual IEEE/IFIP
International Conference on Dependable Systems and Networks, June 28-July 1,
2010, Fairmont Chicago, USA, 06 2010.
[188] C. Leita and M. Dacier. Sgnet: A worldwide deployable framework to support the
analysis of malware threat models. In Seventh European Dependable Computing
Conference, EDCC 2008, pages 99–109, 2008.
[189] K. Leung and C. Leckie. Unsupervised anomaly detection in network intrusion
detection using clusters. In V. Estivill-Castro, editor, ACSC, volume 38 of CRPIT,
pages 333–342. Australian Computer Society, 2005.
[190] L. Lewis. A case-based reasoning approach to the management of faults in communication networks. In INFOCOM ’93. Proceedings.Twelfth Annual Joint Conference of the IEEE Computer and Communications Societies. Networking: Foundation for the Future. IEEE, pages 1422 –1429 vol.3, 1993.
[191] W. Li. Using genetic algorithm for network intrusion detection. In Proceedings
of the United States Department of Energy Cyber Security Group 2004 Training
Conference, pages 24–27, 2004.
[192] Z. Li, A. Goyal, Y. Chen, and V. Paxson. Automating analysis of large-scale
botnet probing events. In Proc. of ASIACCS, March 2009.
[193] U. Lindqvist and P. Porras. Detecting computer and network misuse through the
production-based expert system toolset (p-best). In Security and Privacy, 1999.
Proceedings of the 1999 IEEE Symposium on, pages 146 –161, 1999.
[194] Y. Livnat, J. Agutter, S. Moon, R. Erbacher, and S. Foresti. A visualization
paradigm for network intrusion detection. In Information Assurance Workshop,
2005. IAW’05. Proceedings from the Sixth Annual IEEE SMC, pages 92–99. IEEE,
2005.
[195] Los Angeles, University of California. Internet topology collection.
http://irl.cs.ucla.edu/. [Online; accessed 13-Jan-2010].
[196] J. Luo. Integrating fuzzy logic with data mining methods for intrusion detection.
Master’s thesis, Mississippi State University, 1999.
[197] A. Magnaghi, T. Hamada, and T. Katsuyama. A wavelet-based framework for
proactive detection of network misconfigurations. In Proceedings of SIGCOMM
2004, 2004.
[198] R. Mahajan, D. Wetherall, and T. Anderson. Understanding BGP Misconfiguration. In SIGCOMM ’02: Proceedings of the 2002 conference on Applications,
Technologies, Architectures, and Protocols for Computer Communications, pages
3–16, New York, NY, USA, 2002. ACM.
[199] M. V. Mahoney and P. K. Chan. Learning rules for anomaly detection of hostile
network traffic. In ICDM, pages 601–604. IEEE Computer Society, 2003.
[200] F. Mansmann, L. Meier, and D. Keim. Visualization of host behavior for network
security. VizSEC 2007, pages 187–202, 2008.
[201] Z. M. Mao, R. Bush, T. Griffin, and M. Roughan. BGP beacons. In Internet
Measurement Conference, pages 1–14. ACM, 2003.
[202] J. Marin, D. Ragsdale, and J. Sirdu. A hybrid approach to the profile creation and
intrusion detection. In DARPA Information Survivability Conference Exposition
II, 2001. DISCEX ’01. Proceedings, volume 1, pages 69 –76 vol.1, 2001.
[203] C. McArthur and M. Guirguis. Stealthy IP Prefix Hijacking: Don’t Bite Off More
Than You Can Chew. In Global Telecommunications Conference, GLOBECOM
2009, pages 1–6. IEEE, 2009.
[204] S. Mccreary. BGP Core Routing Table Size. http://www.routeviews.org/dynamics/.
[Online; accessed 13-Jan-2011].
[205] C. McCue. Data Mining and Predictive Analysis: Intelligence Gathering and
Crime Analysis. Butterworth-Heinemann (Elsevier), May 2007, 2007.
[206] R. McMillan. A Chinese ISP Momentarily Hijacks the Internet.
http://www.nytimes.com/external/idg/2010/04/08/08idg-a-chinese-isp-momentarily-hijacks-the-internet-33717.html.
[207] J. McPherson, K. Ma, P. Krystosk, T. Bartoletti, and M. Christensen. Portvis: a
tool for port-based detection of security events. In Proceedings of the 2004 ACM
workshop on Visualization and data mining for computer security, pages 73–81.
ACM, 2004.
[208] J. Mena. Investigative Data Mining for Security and Criminal Detection.
Butterworth-Heinemann (Elsevier), April 2003.
[209] R. C. Merkle. Protocols for public key cryptosystems. In IEEE Symposium on
Security and Privacy, pages 122–134, 1980.
[210] I. Mierswa, M. Wurst, R. Klinkenberg, M. Scholz, and T. Euler. Yale: Rapid
prototyping for complex data mining tasks. In L. Ungar, M. Craven, D. Gunopulos,
and T. Eliassi-Rad, editors, KDD ’06: Proceedings of the 12th ACM SIGKDD
international conference on Knowledge discovery and data mining, pages 935–940,
New York, NY, USA, August 2006. ACM. http://rapid-i.com/content/view/181/196/.
[211] S. Ming, S. Wu, X. Zhao, and K. Zhang. On reverse engineering the management
actions from observed bgp data. In INFOCOM Workshops 2008, IEEE, pages 1–6,
April 2008.
[212] B. Morin and H. Debar. Correlation of intrusion symptoms: An application of
chronicles. In G. Vigna, E. Jonsson, and C. Kruegel, editors, RAID, volume 2820
of Lecture Notes in Computer Science, pages 94–112. Springer, 2003.
[213] S. Mukkamala and A. Sung. Feature selection for intrusion detection using neural
networks and support vector machines. Journal of the Transportation Research
Board, 2003:33–39, 2003.
[214] S. Mukkamala and A. H. Sung. Identifying key features for intrusion detection
using neural networks. In Proceedings of the 15th international conference on
Computer communication, ICCC ’02, pages 1132–1138, Washington, DC, USA,
2002. International Council for Computer Communication.
[215] S. Mukkamala, A. H. Sung, and A. Abraham. Intrusion detection systems using
adaptive regression splines. In 6th International Conference on Enterprise Information Systems, ICEIS’04, pages 26–33. Kluwer Academic Press, 2004.
[216] RIPE NCC. Routing Information Service. http://www.ripe.net/ris/. [Online;
[217] T. Ndousse and T. Okuda. Computational intelligence for distributed fault management in networks using fuzzy cognitive maps. In Communications, 1996. ICC
96, Conference Record, Converging Technologies for Tomorrow’s Applications.
1996 IEEE International Conference on, volume 3, pages 1558 –1562 vol.3, June
1996.
[218] P. Ning, Y. Cui, and D. S. Reeves. Constructing attack scenarios through correlation of intrusion alerts. In V. Atluri, editor, ACM Conference on Computer and
Communications Security, pages 245–254. ACM, 2002.
[219] P. Ning, Y. Cui, D. S. Reeves, and D. Xu. Techniques and tools for analyzing
intrusion alerts. ACM Trans. Inf. Syst. Secur., 7(2):274–318, 2004.
[220] P. Ning and D. Xu. Learning attack strategies from intrusion alerts. In Proceedings of the 10th ACM Conference on Computer and Communications Security
(CCS ’03), pages 200–209. ACM Press, 2003.
[221] P. Ning and D. Xu. Hypothesizing and reasoning about attacks missed by intrusion
detection systems. ACM Trans. Inf. Syst. Secur., 7(4):591–627, 2004.
[222] O. Nordstrom and C. Dovrolis. Beware of BGP attacks. Computer Communication
Review, 34(2):1–8, 2004.
[223] J. Oberheide, M. Goff, and M. Karir. Flamingo: Visualizing internet traffic. In Network Operations and Management Symposium, 2006. NOMS 2006. 10th IEEE/IFIP, pages 150–161. IEEE, 2006.
[224] J. Oberheide, M. Karir, and D. Blazakis. VAST: visualizing autonomous system
topology. In Proceedings of the 3rd international workshop on Visualization for
computer security, pages 71–80. ACM, 2006.
[225] R. Oliveira. Cyclops: The internet as-level observatory. Slides and video:
http://www.nanog.org/meetings/nanog43/abstracts.php?pt=NTkmbmFub2c0Mw==&nm=nanog43.
[Online; accessed 13-June-2008].
[226] I. Onut, B. Zhu, and A. Ghorbani. A novel visualization technique for network
anomaly detection. In Proc. 2nd Annual Conf. on Privacy, Security and Trust, pages
167–174. Citeseer, 2004.
[227] P. v. Oorschot, T. Wan, and E. Kranakis. On interdomain routing security and
pretty secure bgp (psbgp). ACM Trans. Inf. Syst. Secur., 10, July 2007.
[228] Packet Clearing House. http://www.pch.net/home/index.php. [Online; accessed
13-Jan-2011].
[229] R. Pang, V. Yegneswaran, P. Barford, V. Paxson, and L. Peterson. Characteristics
of Internet Background Radiation. In Proceedings of the 4th ACM SIGCOMM
conference on the Internet Measurement, 2004.
[230] A. Pathak, F. Qian, Y. C. Hu, Z. M. Mao, and S. Ranjan. Botnet spam campaigns
can be long lasting: evidence, implications, and analysis. In Proceedings of the
eleventh international joint conference on Measurement and modeling of computer
systems, SIGMETRICS ’09, pages 13–24, New York, NY, USA, 2009. ACM.
[231] V. Paxson. Bro: a system for detecting network intruders in real-time. Computer
Networks, 31(23-24):2435–2463, 1999.
[232] J. Pearlman and P. Rheingans. Visualizing network security events using compound glyphs from a service-oriented perspective. VizSEC 2007, pages 131–146,
2008.
[233] J. Peng, C. Feng, and J. W. Rozenblit. A hybrid intrusion detection and visualization system. In ECBS, pages 505–506. IEEE Computer Society, 2006.
[234] V.-H. Pham and M. Dacier. Honeypot traces forensics : the observation view
point matters. In NSS 2009, 3rd International Conference on Network and System
Security, October 19-21, 2009, Gold Coast, Australia, Dec 2009.
[235] V.-H. Pham, M. Dacier, G. Urvoy Keller, and T. En Najjary. The quest for multiheaded worms. In DIMVA 2008, 5th Conference on Detection of Intrusions and
Malware & Vulnerability Assessment, July 10-11th, 2008, Paris, France, Jul 2008.
[236] D. Phan, J. Gerth, M. Lee, A. Paepcke, and T. Winograd. Visual analysis of
network flow data with timelines and event plots. VizSEC 2007, pages 85–99,
2008.
[237] W. Pike, C. Scherrer, and S. Zabriskie. Putting security in context: Visual correlation of network activity with real-world information. VizSEC 2007, pages 203–220,
2008.
[238] A. Pilosov and T. Kapela. Stealing The Internet: An Internet-Scale Man In The
Middle Attack.
http://www.defcon.org/images/defcon-16/dc16-presentations/defcon-16-pilosov-kapela.pdf.
[Online; accessed 20-Aug-2008].
[239] A. C. Popescu, B. J. Premore, and T. Underwood. The Anatomy of a Leak: AS9121.
http://www.renesys.com/tech/presentations/pdf/renesys-nanog34.pdf.
[Online; accessed 13-May-2005].
[240] P. Porras and R. Kemmerer. Penetration state transition analysis: A rule-based intrusion detection approach. In Computer Security Applications Conference, 1992.
Proceedings., Eighth Annual, pages 220 –229, November 1992.
[241] P. A. Porras, M. W. Fong, and A. Valdes. A mission-impact-based approach to
infosec alarm correlation. In RAID, pages 95–114, 2002.
[242] F. Pouget and M. Dacier. Honeypot-based forensics. In Proceedings of AusCERT
Asia Pacific Information Technology Security Conference, May 2004.
[243] M. B. Prince, B. M. Dahl, L. Holloway, A. M. Keller, and E. Langheinrich. Understanding How Spammers Steal Your E-Mail Address: An Analysis of the First
Six Months of Data from Project Honey Pot. In CEAS 2005 - Second Conference
on Email and Anti-Spam, July 21-22, 2005, Stanford University, California, USA,
2005.
[244] T. S. Project. Snort 2.0, open source network intrusion detection system.
http://www.snort.org.
[245] T. Qin, X. Guan, W. Li, and P. Wang. Monitoring abnormal traffic flows based
on independent component analysis. In ICC, pages 1–5. IEEE, 2009.
[246] X. Qin and W. Lee. Statistical causality analysis of infosec alert data. In G. Vigna, E. Jonsson, and C. Kruegel, editors, RAID, volume 2820 of Lecture Notes in
Computer Science, pages 73–93. Springer, 2003.
[247] J. Qiu, L. Gao, S. Ranjan, and A. Nucci. Detecting bogus BGP route information: Going beyond prefix hijacking. In Security and Privacy in Communications
Networks and the Workshops, 2007. SecureComm 2007., pages 381–390, 2007.
[248] A. Ramachandran and N. Feamster. Understanding the network-level behavior of
spammers. In SIGCOMM ’06: Proceedings of the 2006 conference on Applications,
technologies, architectures, and protocols for computer communications, pages 291–
302, New York, NY, USA, 2006. ACM.
[249] M. Ramadas, S. Ostermann, and B. C. Tjaden. Detecting anomalous network
traffic with self-organizing maps. In G. Vigna, E. Jonsson, and C. Kruegel, editors,
RAID, volume 2820 of Lecture Notes in Computer Science, pages 36–54. Springer,
2003.
[250] P. Ren, Y. Gao, Z. Li, Y. Chen, and B. Watson. IDGraphs: intrusion detection and
analysis using histographs. Visualization for Computer Security, IEEE Workshops
on, 2005.
[251] RIPE. YouTube Hijacking: A RIPE NCC RIS case study.
http://www.ripe.net/news/study-youtube-hijacking.html. [Online; accessed 13-Jan-2011].
[252] Robtex. AS Analysis. http://www.robtex.com/as/. [Online; accessed 13-Jan-2011].
[253] L. F. Salim and A. Mezrioui. Improving the quality of alerts with correlation
in intrusion detection. IJCSNS International Journal of Computer Science and
Network Security, 7(12):210–215, 2007.
[254] D. Schnackenberg, K. Djahandari, and D. Sterne. Infrastructure for intrusion
detection and response. In DARPA Information Survivability Conference and Exposition, 2000. DISCEX ’00. Proceedings, volume 2, pages 3 –11 vol.2, 2000.
[255] D. Schnackenberg, H. Holliday, R. Smith, K. Djahandari, and D. Sterne. Cooperative intrusion traceback and response architecture (CITRA). In DARPA Information
Survivability Conference Exposition II, 2001. DISCEX ’01. Proceedings, volume 1,
pages 56 –68 vol.1, 2001.
[256] K. Sequeira and M. Zaki. ADMIT: Anomaly-based Data Mining for Intrusions. In
SIGKDD Conference, 2002.
[257] G. Shafer. A mathematical theory of evidence. Princeton university press, 1976.
[258] J. Shearer, K. Ma, and T. Kohlenberg. BGPeep: An IP-Space Centered View for
Internet Routing Data. Visualization for Computer Security, pages 95–110, 2008.
[259] H. Shiravi, A. Shiravi, and A. Ghorbani. IDS Alert Visualization and Monitoring through Heuristic Host Selection. Information and Communications Security,
pages 445–458, 2010.
[260] N. Spring, R. Mahajan, and T. Anderson. Quantifying the causes of path inflation.
In SIGCOMM ’03: Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, pages 113–124,
New York, NY, USA, 2003. ACM.
[261] K. Sriram, D. Montgomery, O. Borchert, O. Kim, and D. R. Kuhn. Study of BGP
peering session attacks and their impacts on routing performance. IEEE Journal
on Selected Areas in Communications, 24(10):1901–1915, 2006.
[262] H. Stern. A survey of modern spam tools. In Fifth Conference on Email and
Anti-Spam, Mountain View, CA, 2008.
[263] B. Stone-Gross, M. Cova, L. Cavallaro, B. Gilbert, M. Szydlowski, R. Kemmerer,
C. Kruegel, and G. Vigna. Your botnet is my botnet: analysis of a botnet takeover.
In CCS ’09: Proceedings of the 16th ACM conference on Computer and communications security, pages 635–647, New York, NY, USA, 2009. ACM.
[264] B. Stone-Gross, A. Moser, C. Kruegel, E. Kirda, and K. Almeroth. FIRE: FInding
Rogue nEtworks. In Proceedings of the Annual Computer Security Applications
Conference (ACSAC), Honolulu, HI, December 2009.
[265] Symantec MessageLabs Intelligence. In the battle of the botnets rustock remains
dominant. Monthly report, August 2010.
[266] Symantec MessageLabs Intelligence. Survival of the fittest: Selfish botnets dominate the spam landscape as rustock becomes the largest botnet; linux takes a share
of spam from windows. Monthly report, April 2010.
[267] Symantec.cloud. Messagelabs hosted email antispam. http://www.messagelabs.com.
[Online; accessed 13-Jan-2011].
[268] M. Tahara, N. Tateishi, T. Oimatsu, and S. Majima. A Method to Detect Prefix Hijacking by Using Ping Tests. In APNOMS ’08: Proceedings of the 11th
Asia-Pacific Symposium on Network Operations and Management, pages 390–398,
Berlin, Heidelberg, 2008. Springer-Verlag.
[269] T. Taylor, S. Brooks, and J. McHugh. NetBytes viewer: An entity-based netflow
visualization utility for identifying intrusive behavior. VizSEC 2007, pages 101–
114, 2008.
[270] T. Taylor, D. Paterson, J. Glanfield, C. Gates, S. Brooks, and J. McHugh. Flovis: Flow visualization system. In Conference For Homeland Security, 2009.
CATCH’09. Cybersecurity Applications & Technology, pages 186–198. IEEE, 2009.
http://projects.cs.dal.ca/flovis/download.html.
[271] R. Teixeira, S. Agarwal, and J. Rexford. Bgp routing changes: merging views from
two isps. SIGCOMM Comput. Commun. Rev., 35(5):79–82, 2005.
[272] R. Teixeira and J. Rexford. A measurement framework for pin-pointing routing
changes. In NetT ’04: Proceedings of the ACM SIGCOMM workshop on Network
troubleshooting, pages 313–318, New York, NY, USA, 2004. ACM.
[273] S. J. Templeton and K. Levitt. A requires/provides model for computer attacks.
In Proceedings of New Security Paradigms Workshop, pages 31–38. ACM Press,
2000.
[274] S. J. Templeton and K. E. Levitt. Detecting spoofed packets. In DISCEX (1),
pages 164–. IEEE Computer Society, 2003.
[275] S. T. Teoh, K.-L. Ma, S. F. Wu, D. Massey, X. Zhao, D. Pei, L. Wang, L. Zhang,
and R. Bush. Visual-based anomaly detection for bgp origin as change (oasc)
events. In M. Brunner and A. Keller, editors, DSOM, volume 2867 of Lecture
Notes in Computer Science, pages 155–168. Springer, 2003.
[276] S. T. Teoh, K. L. Ma, S. F. Wu, and X. Zhao. Case study: interactive visualization
for internet security. In Proceedings of the conference on Visualization ’02, VIS
’02, pages 505–508, Washington, DC, USA, 2002. IEEE Computer Society.
http://www.cs.ucdavis.edu/~ma/SecVis/.
[277] S. T. Teoh, S. Ranjan, A. Nucci, and C.-N. Chuah. Bgp eye: a new visualization
tool for real-time detection and analysis of bgp anomalies. In VizSEC ’06: Proceedings of the 3rd international workshop on Visualization for computer security,
[278] S. T. Teoh, K. Zhang, S.-M. Tseng, K.-L. Ma, and S. F. Wu. Combining visual
and automated data mining for near-real-time anomaly detection and analysis in
bgp. In C. E. Brodley, P. Chan, R. Lippman, and W. Yurcik, editors, VizSEC,
pages 35–44. ACM, 2004.
[279] S. T. Teoh, K. Zhang, S.-M. Tseng, K.-L. Ma, and S. F. Wu. Combining visual and
automated data mining for near-real-time anomaly detection and analysis in bgp.
In VizSEC/DMSEC ’04: Proceedings of the 2004 ACM workshop on Visualization
and data mining for computer security, pages 35–44, New York, NY, USA, 2004.
ACM.
[280] L. Terran. Machine Learning Techniques for the Domain of Anomaly Detection
for Computer Security. PhD thesis, Purdue University, 2000.
[281] J. Thomas and K. Cook, editors. Illuminating the Path: the Research and Development Agenda for Visual Analytics. IEEE, 2005.
[282] O. Thonnard. A multi-criteria clustering approach to support attack attribution in
cyberspace. PhD thesis, École Doctorale d’Informatique, Télécommunications et
Électronique de Paris, March 2010.
[283] O. Thonnard and M. Dacier. A framework for attack patterns’ discovery in honeynet data. Digital Investigation, 5:S128–S139, 2008.
[284] O. Thonnard and M. Dacier. Actionable knowledge discovery for threats intelligence support using a multi-dimensional data mining methodology. In Data Mining Workshops, 2008. ICDMW ’08. IEEE International Conference on, pages 154–163,
2008.
[285] O. Thonnard, W. Mees, and M. Dacier. Addressing the attack attribution problem
using knowledge discovery and multi-criteria fuzzy decision-making. In Proceedings
of the ACM SIGKDD Workshop on CyberSecurity and Intelligence Informatics,
CSI-KDD ’09, pages 11–21, New York, NY, USA, 2009. ACM.
[286] O. Thonnard, W. Mees, and M. Dacier. Behavioral Analysis of Zombie Armies. In
C. Czossek and K. Geers, editors, The Virtual Battlefield: Perspectives on Cyber
Warfare, volume 3 of Cryptology and Information Security Series, pages 191–210,
Amsterdam, The Netherlands, 2009. IOS Press.
[287] O. Thonnard, W. Mees, and M. Dacier. On a multicriteria clustering approach for
attack attribution. SIGKDD Explor. Newsl., 12:11–20, November 2010.
[288] M. Thottan and C. Ji. Anomaly detection in ip networks. IEEE Trans. Signal
Processing, 51(8):2191–2204, 2003. Special Issue of Signal Processing in Networking.
[289] A. Toonk. Chinese ISP hijacks the Internet. http://bgpmon.net/blog/?p=282.
[290] V. Torra and Y. Narukawa. Modeling Decisions: Information Fusion and Aggregation Operators. Springer, Berlin, 2007.
[291] University of California Los Angeles. Cyclops. http://cyclops.cs.ucla.edu/.
[Online; accessed 13-Jan-2011].
[292] University of Memphis. NetViews.
http://netlab.cs.memphis.edu/projects_netviews.html. [Online; accessed 13-Jan-2011].
[293] University of Oregon. Route Views Project. http://www.routeviews.org/. [Online; accessed 13-Jan-2011].
[294] University of Washington. iPlane. http://iplane.cs.washington.edu/. [Online;
[295] A. Valdes and K. Skinner. Probabilistic alert correlation. In W. Lee, L. Me, and
A. Wespi, editors, Recent Advances in Intrusion Detection, volume 2212 of Lecture
Notes in Computer Science, pages 54–68. Springer, 2001.
[296] F. Valeur, G. Vigna, C. Kruegel, and R. A. Kemmerer. A comprehensive approach
to intrusion detection alert correlation. IEEE Trans. Dependable Sec. Comput.,
1(3):146–169, 2004.
[297] I. van Beijnum. BGP. O’Reilly Media, Inc., Sebastopol, CA, USA, September
2002.
[298] F. Viegas, M. Wattenberg, F. Van Ham, J. Kriss, and M. McKeon. Manyeyes: a
site for visualization at internet scale. IEEE Transactions on Visualization and
Computer Graphics, pages 1121–1128, 2007.
[299] G. Vigna and R. A. Kemmerer. Netstat: A network-based intrusion detection
system. Journal of Computer Security, 7(1), 1999.
[300] H. Wang, D. Zhang, and K. G. Shin. Detecting syn flooding attacks. In INFOCOM,
2002.
[301] L. Wang, X. Zhao, D. Pei, R. Bush, D. Massey, A. Mankin, S. F. Wu, and L. Zhang.
Observation and analysis of bgp behavior under stress. In Internet Measurement
Workshop, pages 183–195. ACM, 2002.
[302] Z. Wang and G. Klir. Fuzzy Measure Theory. Plenum Press, New York, 1992.
[303] M. O. Ward. Multivariate data glyphs: Principles and practice. In Handbook of
Data Visualization, Springer Handbooks Comp.Statistics, pages 179–198. Springer
Berlin Heidelberg, 2008.
[304] C. Wei, A. Sprague, G. Warner, and A. Skjellum. Characterization of Spam Advertised Website Hosting Strategy. In Sixth Conference on Email and Anti-Spam,
Mountain View, CA, 2009.
[305] C. Westphal. Data Mining for Intelligence, Fraud & Criminal Detection: Advanced
Analytics & Information Sharing Technologies. CRC Press, 1st edition (December
22, 2008), 2008.
[306] K. J. Wheaton. Top 5 intelligence analysis methods,
http://sourcesandmethods.blogspot.com, [sep 2009].
[307] D. Wheeler and G. Larson. Techniques for cyber attack attribution. IDA Paper
P-3792, Institute for Defense Analyses, Alexandria, Virginia, 2003.
[308] R. White. Securing bgp through secure origin bgp. Internet Protocol Journal, 6(3),
September 2003.
[309] J. Wolf. Pentagon says “aware” of China Internet rerouting.
http://www.reuters.com/article/idUSTRE6AI4HJ20101119?pageNumber=1.
[Online; accessed 22-Nov-2010].
[310] T. Wong and C. Alaettinoglu. Internet routing anomaly detection and visualization. In Dependable Systems and Networks, 2005. DSN 2005. Proceedings. International Conference on, pages 172–181. IEEE, 2005.
[311] J. Wu, Z. M. Mao, J. Rexford, and J. Wang. Finding a needle in a haystack:
Pinpointing significant bgp routing changes in an ip network. In NSDI. USENIX,
2005.
[312] L. Xiao, J. Gerth, and P. Hanrahan. Enhancing visual analysis of network traffic
using a knowledge representation. In Visual Analytics Science And Technology,
2006 IEEE Symposium On, pages 107–114. IEEE, 2006.
[313] Y. Xie, F. Yu, K. Achan, R. Panigrahy, G. Hulten, and I. Osipkov. Spamming
botnets: signatures and characteristics. In SIGCOMM ’08: Proceedings of the
ACM SIGCOMM 2008 conference on Data communication, pages 171–182, New
York, NY, USA, 2008. ACM.
[314] D. Xu and P. Ning. Alert correlation through triggering events and common
resources. In ACSAC, pages 360–369. IEEE Computer Society, 2004.
[315] D. Xu and P. Ning. Correlation Analysis of Intrusion Alerts. Springer, 2008.
[316] K. Xu, J. Chandrashekhar, and Z. Zhang. A first step towards understanding
inter-domain routing dynamics. In Proceedings of ACM SIGCOMM MINENET,
Philadelphia, PA, August 2005.
[317] R. Yager. On ordered weighted averaging aggregation operators in multicriteria
decision-making. IEEE Trans. Syst. Man Cybern., 18(1):183–190, 1988.
[318] H. Yan, R. Oliveira, K. Burnett, D. Matthews, L. Zhang, and D. Massey. BGPmon: A real-time, scalable, extensible monitoring system. CATCH2009,
http://bgpmon.netsec.colostate.edu/download/publications/catch09.pdf. [Online; accessed 13-May-2009].
[319] Y. Yang, F. Deng, and H. Yang. An unsupervised anomaly detection approach
using subtractive clustering and hidden markov model. In Proceedings of Communications and Networking in China, pages 313–316, 2007.
[320] M. Yannuzzi, X. Masip-Bruin, and O. Bonaventure. Open issues in interdomain
routing: a survey. IEEE Network, 19(6):49–56, 2005.
[321] V. Yegneswaran, P. Barford, and U. Johannes. Internet intrusions: global characteristics and prevalence. In SIGMETRICS, pages 138–147, 2003.
[322] V. Yegneswaran, P. Barford, and V. Paxson. Using honeynets for internet situational awareness. In Fourth ACM Sigcomm Workshop on Hot Topics in Networking
(Hotnets IV), 2005.
[323] A. Yelizarov and D. Gamayunov. Visualization of complex attacks and state of
attacked network. In Visualization for Cyber Security, 2009. VizSec 2009. 6th
International Workshop on, pages 1–9. IEEE, 2010.
[324] D. S. Yeung, S. Jin, and X. Wang. Covariance-matrix modeling and detecting
various flooding attacks. IEEE Transactions on Systems, Man, and Cybernetics,
Part A, 37(2):157–169, 2007.
[325] D.-Y. Yeung and Y. Ding. Host-based intrusion detection using dynamic and static
behavioral models. Pattern Recognition, 36(1):229–243, 2003.
[326] X. Yin, W. Yurcik, M. Treaster, Y. Li, and K. Lakkaraju. VisFlowConnect: netflow
visualizations of link relationships for security situational awareness. In Proceedings of the 2004 ACM workshop on Visualization and data mining for computer
security, pages 26–34. ACM, 2004.
[327] T. Zhang, R. Ramakrishnan, and M. Livny. Birch: An efficient data clustering
method for very large databases. In H. V. Jagadish and I. S. Mumick, editors,
Proceedings of the 1996 ACM SIGMOD International Conference on Management
of Data, Montreal, Quebec, Canada, June 4-6, 1996, pages 103–114. ACM Press,
1996.
[328] Z. Zhang, Y. Zhang, Y. C. Hu, Z. M. Mao, and R. Bush. iSPY: Detecting IP Prefix
Hijacking on My Own. In Proceedings of the ACM SIGCOMM 2008 conference
on Data communication, SIGCOMM ’08, pages 327–338, New York, NY, USA,
August 2008. ACM.
[329] J.-L. Zhao, J.-F. Zhao, and J.-J. Li. Intrusion detection based on clustering genetic
algorithm. In Proceedings of 2005 International Conference on Machine Learning
and Cybernetics, volume 6, pages 3911–3914, 2005.
[330] X. Zhao, D. Pei, L. Wang, D. Massey, A. Mankin, S. F. Wu, and L. Zhang. An
analysis of BGP multiple origin AS (MOAS) conflicts. In Proceedings of the 1st
ACM SIGCOMM Workshop on Internet Measurement, IMW ’01, pages 31–35,
New York, NY, USA, 2001. ACM.
[331] C. Zheng, L. Ji, D. Pei, J. Wang, and P. Francis. A Light-Weight Distributed
Scheme for Detecting IP Prefix Hijacks in Real-Time. SIGCOMM Comput. Commun. Rev., 37(4):277–288, 2007.
[332] C. Zheng, L. Ji, D. Pei, J. Wang, and P. Francis. A light-weight distributed
scheme for detecting ip prefix hijacks in realtime. In J. Murai and K. Cho, editors,
SIGCOMM, pages 277–288. ACM, 2007.
[333] L. Zhuang, J. Dunagan, D. R. Simon, H. J. Wang, and J. D. Tygar. Characterizing
botnets from email spam records. In LEET’08: Proceedings of the 1st Usenix
Workshop on Large-Scale Exploits and Emergent Threats, pages 1–9, Berkeley,
CA, USA, 2008. USENIX Association.