Proceedings of the Conference on Digital Forensics, Security and Law 2015
Daytona Beach, Florida, May 19-21, 2015

Conference Chairs

Gary Kessler ([email protected]), Conference Chair, Embry-Riddle Aeronautical University, Florida, USA
Diane Barrett ([email protected]), Program Chair, Bloomsburg University, Pennsylvania, USA
Jigang Liu ([email protected]), Program Co-Chair, Metropolitan State University, Minnesota, USA

Association of Digital Forensics, Security and Law

Copyright © 2015 ADFSL, the Association of Digital Forensics, Security and Law. Permission to make digital or printed copies of all or any part of this journal is granted without fee for personal or classroom use only, provided that such copies are not made or distributed for profit or commercial use. All copies must be accompanied by this copyright notice and a full citation. Permission from the ADFSL is required to make digital or printed copies of all or any part of this journal for profit or commercial use. Permission requests should be sent to Dr. Glenn S. Dardick, Association of Digital Forensics, Security and Law, 1642 Horsepen Hills Road, Maidens, Virginia 23102, or emailed to [email protected].

ISSN 1931-7379

Thank You to Our Sponsors

Contents

Committee ................................................. 4
Schedule .................................................. 5
Keynote Speaker: Jeff Salyards, Executive Director of the Defense Forensic Science Center ......... 9
Keynote Speaker: Craig Ball, Board Certified trial lawyer, certified computer forensic examiner, law professor and electronic evidence expert ......... 9
Invited Speaker: Mohamed Chawki, Chief Judge of the Council of State, Egypt ......... 11
Invited Speaker: Gareth Davies, Senior Lecturer at the University of South Wales, UK ......... 11
Invited Speaker: Philip Craiger, Daytona State College ......... 11
Invited Paper: Potential Changes to eDiscovery Rules in Federal Court: A Discussion of the Process, Substantive Changes and Their Applicability and Impact on Virginia Practice ......... 13
    Joseph J. Schwerha, IV, Attorney at Law, Pennsylvania, USA
Invited Paper: A Profile of Prolonged, Persistent SSH Attack on a Kippo Honeynet ......... 23
    Craig Valli, Director of the Security Research Institute at Edith Cowan University, Australia
Two Challenges of Stealthy Hypervisors Detection: Time Cheating and Data Fluctuations ......... 33
    Igor Korkin*
An Empirical Comparison of Widely Adopted Hash Functions in Digital Forensics: Does the Programming Language and Operating System Make a Difference? ......... 57
    Satyendra Gurjar, Ibrahim Baggili*, Frank Breitinger and Alice Fischer
Investigating Forensics Values of Windows Jump Lists Data ......... 69
    Ahmad Ghafarian*
A Survey of Software-based String Matching Algorithms for Forensic Analysis ......... 77
    Yi-Ching Liao*
A New Cyber Forensic Philosophy for Digital Watermarks in the Context of Copyright Laws ......... 87
    Vinod Polpaya Bhattathiripad, Sneha Sudhakaran* and Roshna Khalid Thalayaniyil
A Review of Recent Case Law Related to Digital Forensics: The Current Issues ......... 95
    Kelly Anne Cole, Shruti Gupta, Dheeraj Gurugubelli* and Marcus K. Rogers
On the Network Performance of Digital Evidence Acquisition of Small Scale Devices over Public Networks ......... 105
    Irvin Homem* and Spyridon Dosis
Measuring Hacking Ability Using a Conceptual Expertise Task ......... 123
    Justin Giboney*, Jeffrey Gainer Proudfoot, Sanjay Goel* and Joseph S. Valacich
HTML5 Zero Configuration Covert Channels: Security Risks and Challenges ......... 135
    Jason Farina*, Mark Scanlon, Stephen Kohlmann, Nhien-An Le-Khac and Tahar Kechadi
Continuous Monitoring System Based on Systems' Environment ......... 151
    Eli Weintraub* and Yuval Cohen
Identifying Common Characteristics of Malicious Insiders ......... 161
    Nan Liang* and David Biros*
Phishing Intelligence Using The Simple Set Comparison Tool ......... 177
    Jason Britt*, Dr. Alan Sprague and Gary Warner
Towards a Digital Forensics Competency-Based Program: Making Assessment Count ......... 193
    Rose Shumba*
Case Study: 2014 Digital Forensics REU Program at the University of Alabama at Birmingham ......... 205
    Daniel Weiss* and Gary Warner

* Author Presenting and/or Attending

Committee

The 2015 ADFSL Conference on Digital Forensics, Security and Law is pleased to have the following as co-chairs of the conference, chairs of the conference committee, and administrators of the conference:

Gary Kessler ([email protected]), Conference Chair, Embry-Riddle Aeronautical University, Florida, USA
Diane Barrett ([email protected]), Program Chair, Bloomsburg University, Pennsylvania, USA
Jigang Liu ([email protected]), Program Co-Chair, Metropolitan State University, Minnesota, USA

The 2015 ADFSL Conference on Digital Forensics, Security and Law is pleased to have the following as members of the program committee:

Mamoun Alazab, Australian National University, Canberra, ACT, Australia
David Biros, Oklahoma State University, Oklahoma, USA
Frank Breitinger, University of New Haven, Connecticut, USA
Raymond Choo, University of South Australia, Australia
Fred Cohen, Management Analytics, California, USA
Philip Craiger, Daytona State College, Florida, USA
David Dampier, Mississippi State University, Mississippi, USA
Gareth Davies, University of South Wales, Wales, UK
Doug Ferguson, AT&T, North Carolina, USA
R. L. "Doc" Gardner, University of South Florida, Florida, USA
Sanjay Goel, University at Albany, SUNY, New York, USA
Gaurav Gupta, Ministry of Communications & Information Technology, India
Ezhil S. Kalaimannan, University of West Florida, Florida, USA
Frederick S. Lane, Author, Attorney, New York, USA
Rick Mislan, Rochester Institute of Technology, New York, USA
Emilio Raymond Mumba, EPI-USE South Africa (Pty) Ltd, Pretoria, South Africa
John Riley, Bloomsburg University, Pennsylvania, USA
John Sammons, Marshall University, West Virginia, USA
Joseph J. Schwerha IV, California University of Pennsylvania, USA
Remzi Seker, Embry-Riddle Aeronautical University, Florida, USA
Iain Sutherland, Noroff University College, Norway
Frank Thornton, Blackthorn Information Security, Vermont, USA
Michael Tu, Purdue University Calumet, Indiana, USA

Schedule

TUESDAY - May 19

8:00 AM - 9:00 AM: Registration & Continental Breakfast
Welcoming Remarks - Gary C. Kessler, Conference Chair
Welcoming Remarks - Richard Heist, Embry-Riddle Aeronautical University, Conference Host
Introductions - Diane Barrett, Conference Program Chair
9:30 AM: Keynote Speaker - Jeff Salyards
BREAK
Session Chair: Jigang Liu, Metropolitan State University
10:45 AM: Two Challenges of Stealthy Hypervisors Detection: Time Cheating and Data Fluctuations - Igor Korkin
11:45 AM: An Empirical Comparison of Widely Adopted Hash Functions in Digital Forensics: Does the Programming Language and Operating System Make a Difference?
- Ibrahim Baggili
LUNCH: 12:30 PM
1:30 PM: Invited Speaker: Firmware Forensics - Gareth Davies
Session Chair: LeGrand Gardner, USF-Florida Center for Cybersecurity
2:30 PM: Investigating Forensics Values of Windows Jump Lists Data - Ahmad Ghafarian
BREAK
Session Chair: LeGrand Gardner, USF-Florida Center for Cybersecurity
3:30 PM: A Survey of Software-based String Matching Algorithms for Forensic Analysis - Yi-Ching Liao
4:15 PM: A New Cyber Forensic Philosophy for Digital Watermarks in the Context of Copyright Laws - Sneha Sudhakaran

WEDNESDAY - May 20

8:00 AM - 9:00 AM: Registration & Continental Breakfast
Session Chair: David Dampier, Mississippi State University
9:00 AM: A Review of Recent Case Law Related to Digital Forensics: The Current Issues - Dheeraj Gurugubelli
9:45 AM: On the Network Performance of Digital Evidence Acquisition of Small Scale Devices over Public Networks - Irvin Homem
BREAK
10:45 AM: Invited Paper: Potential Changes to eDiscovery Rules in Federal Court: A Discussion of the Process, Substantive Changes and Their Applicability and Impact on Virginia Practice - Joseph J. Schwerha IV
LUNCH: 12:00 Noon
1:00 PM: Keynote Speaker - Craig Ball
2:00 PM: Invited Paper: SSH Honeypot Attack - Craig Valli
BREAK
Session Chair: Gareth Davies, University of South Wales
3:00 PM: Measuring Hacking Ability Using a Conceptual Expertise Task - Justin Giboney and Sanjay Goel
3:40 PM: HTML5 Zero Configuration Covert Channels: Security Risks and Challenges - Jason Farina
4:20 PM: Continuous Monitoring System Based on Systems' Environment - Eli Weintraub
Evening Reception

THURSDAY - May 21

7:30 AM - 8:30 AM: Continental Breakfast
Session Chair: Ezhil S. Kalaimannan, University of West Florida
8:30 AM: Identifying Common Characteristics of Malicious Insiders - Nan Liang
9:10 AM: Phishing Intelligence Using The Simple Set Comparison Tool - Jason Britt
9:50 AM: Invited Speaker: Tackling online child pornography: An Egyptian case study - Mohamed Chawki
BREAK
Session Chair: Ezhil S. Kalaimannan, University of West Florida
10:50 AM: Towards a Digital Forensics Competency-Based Program: Making Assessment Count - Rose Shumba
11:30 AM: Case Study: 2014 Digital Forensics REU Program at the University of Alabama at Birmingham - Daniel Weiss
LUNCH: 12:30 PM
1:30 PM: Invited Speaker: ACE Consortium - Philip Craiger
2:10 PM: Invited Speaker: To License or Not to License Reexamined: An Updated Report on State Statutes Regarding Private Investigators and Digital Examiners - Doug White
3:30 PM: Conference Close

Map

Conference Venue: Jim W. Henderson Administration and Welcome Center, Embry-Riddle Aeronautical University, 600 S Clyde Morris Blvd, Daytona Beach, FL 32114

Keynote Speakers

Jeff Salyards - Tuesday

Dr Salyards is the Executive Director of the Defense Forensic Science Center (DFSC), Forest Park, Georgia. Prior to his current position, he served as the Chief Scientist for the US Army Criminal Investigation Laboratory (USACIL). Before coming to USACIL, he was a Principal Analyst with Analytical Services and authored a study on the best methods for training military operators in material collection during the conduct of operations. He holds a PhD in Chemistry from Montana State University and a Master of Forensic Sciences from The George Washington University, and has completed a Fellowship in Forensic Medicine at the Armed Forces Institute of Pathology. A former Director of the Defense Computer Forensic Laboratory and AFOSI Special Agent, he has 27 years of combined experience in scientific leadership, investigations, forensic consulting and teaching.
He served as the Deputy for Operations and Assistant Professor in the Air Force Academy Chemistry Department and was honored with the Outstanding Academy Educator Award. Dr Salyards has served on the Board of Directors of the American Society of Crime Laboratory Directors/Laboratory Accreditation Board, the Department of Justice National Steering Committee for Regional Computer Forensic Laboratories, the Council of Federal Forensic Laboratory Directors, the ASCLD Board of Directors, and as a Commissioner for the Forensic Education Programs Accreditation Commission; he is a current member of the National Commission on Forensic Science. Dr Salyards is a Fellow of the American Academy of Forensic Sciences and has an impressive list of publications and presentations. He is also a retired commissioned officer in the United States Air Force. He has been married for 24 years and has three daughters.

Craig Ball - Wednesday

Craig Ball is a Board Certified trial lawyer, certified computer forensic examiner, law professor and electronic evidence expert. He has dedicated his career to teaching the bench and bar about forensic technology and trial tactics. After decades trying lawsuits, Craig limits his practice to service as a court-appointed special master and consultant in computer forensics and e-discovery. A prolific contributor to educational programs worldwide, having delivered over 1,600 presentations and papers, Craig's articles on forensic technology and electronic discovery frequently appear in the national media. For nine years, he wrote the award-winning column on computer forensics and e-discovery for American Lawyer Media called "Ball in Your Court." Craig Ball has served as the Special Master or testifying expert on computer forensics and electronic discovery in some of the most challenging, front-page cases in the U.S.
Invited Speakers

Mohamed Chawki, Chief Judge of the Council of State, Egypt

"Tackling online child pornography: An Egyptian case study" - Mohamed Chawki, LL.B, BA, LL.M, PhD, FRSA, Chief Judge of the Council of State, Lecturer in Law, Founder and Chairman, AILCC

Mohamed Chawki holds a PhD in law from the University of Lyon III in France for a dissertation on the French, British and American cybercrime legal systems. This was followed by three years of post-doctoral research at the Faculty of Law, University of Aix-Marseille III, France. He is a senior judge and former advisor to the Chairman of the Capital Market Authority (CMA) and to the Chairman of the Egyptian Financial Supervisory Authority (EFSA). Dr. Chawki is the Founder and Chairman of the International Association of Cybercrime Prevention (AILCC), located in Paris, France, an association of international IT experts and legal scholars specializing in cyber law, privacy and security. He is also the founder and co-director of the African Center for Cyberlaw (ACCP) in Kampala, founded in collaboration with the United Nations (UN). Dr. Chawki has extensive knowledge of high-tech criminality, cybercrime, cyber terrorism and IT, including countermeasures and prevention. As part of his research, he carried out an internship at Interpol's Financial and High Tech Crime Unit. He also conducted legal analysis for the CyberAngels organisation in NYC and advised cybercrime victims on various issues related to countermeasures and prevention. Dr. Chawki is the co-drafter of the African Union Convention on Cybersecurity. He is also a member of the International Scientific and Professional Advisory Council of the United Nations Crime Prevention and Criminal Justice Program (ISPAC), a member of the European Society of Criminal Law, and a board member of the Computer Crime Research Center (CCRC) in Ukraine. He teaches law at private and public universities in Egypt and holds a number of visiting posts abroad.
His research interests cover national security, cybercrime and data protection. Dr. Chawki has received over 25 prizes for his academic achievements; he was awarded the Medal of Excellence by the President of the Arab Republic of Egypt in 1998, the international Claire l'Heureux-Dubé prize from Canada in 2007, and the distinguished services medal from the government of Brazil in 2009.

Gareth Davies, University of South Wales, UK

"Hard Disk Firmware Forensics - Recovering data from protected hardware storage areas"

Mr. Gareth Davies is a senior lecturer (digital forensics) and researcher at the University of South Wales. The main focus of his research is the data recovery and forensic analysis of hard disk and flash memory technologies. He has been involved in a variety of research projects in the area of information security and digital forensics. Mr. Davies acts as a consultant and investigator on digital forensic and evidential recovery technology cases. Clients include UK police hi-tech crime units, national law enforcement agencies, the Ministry of Defence and other government bodies, as well as large commercial organizations. He holds an adjunct lecturer position at the SecAU Security Research Centre at Edith Cowan University, Western Australia. Mr. Davies also serves as a committee member of the UK First Forensic Forum (F3).

Doug White, Roger Williams University, Rhode Island, USA

"To License or Not to License Reexamined: An Updated Report on State Statutes Regarding Private Investigators and Digital Examiners"

Doug White is Chair of Cybersecurity and Networking at Roger Williams University and President of Secure Technology, LLC. He holds a PhD in Computer Information Systems and Quantitative Analysis from the University of Arkansas. Dr. White is Certified Computer Examiner number 30, holds a Certified Information Systems Security Professional certificate, is a Cisco Certified Network Administrator, and is a licensed Private Investigator in the State of Rhode Island. Dr. White has worked in the technology field since 1977, including work for the Federal Reserve System, the Department of Energy, Lockheed Martin, and hundreds of clients.

INVITED PAPER

Potential Changes to eDiscovery Rules in Federal Court: A Discussion of the Process, Substantive Changes and Their Applicability and Impact on Virginia Practice

Joseph J. Schwerha, California University of Pennsylvania, United States
Susan L. Mitchell, Altman, Spence, Mitchell & Brown, P.C., United States
John W. Bagby, Penn State University, College of Info. Sci. & Tech., United States

Revision of the Federal Rules

The Federal Rules of Civil Procedure (FRCP) are subject to a unique revision process, one also once used in revising the Federal Rules of Evidence (FRE). Today, this process is followed in revisions of the FRCP, the Federal Rules of Criminal Procedure and the Federal Bankruptcy Rules. This unique rulemaking process differs significantly from the traditional notice-and-comment rulemaking required of most federal regulatory agencies under the Administrative Procedure Act (APA).1 Most notably, rule-making for the federal courts' procedural matters remains unaffected by the invalidation of the legislative veto. It is still widely, but wrongly, believed that the legislative veto was completely invalidated by INS v. Chadha.2

The 2013-2014 FRCP Revisions3

Pre-trial discovery is perpetually controversial. Parties advantaged by strict privacy can often avoid justice that disadvantages their interests.
Contrariwise, parties advantaged by relaxed litigation privacy can achieve justice when all facts are accessible irrespective of the information's repositories, ownership or control. Controversy stems from this justice dilemma, but the advocacy largely concerns the injustice resulting from the overtly threatened and implicit risk of wasteful harassment. American-style pre-trial discovery in civil and regulatory enforcement is relatively rare around the globe. Elsewhere, corporate privacy remains a durable feudal legacy. U.S. discovery rules open nearly all relevant and non-privileged data for use by opposing parties, non-disclosure agreements (NDAs) and non-privileged confidentiality notwithstanding. In the "old world," where most information was embodied as tangible paper data, the traditional discovery process was costly and time consuming. Despite the efficiencies claimed for computerized telecommunications, the individual and societal burdens of pre-trial discovery have actually increased, rather than diminished, as most data migrates to electronically stored information (ESI). This article provides a mid-stream assessment of the second major revision effort to accommodate U.S. discovery processes to the broad and deep problems arising during the past 20 years of document discovery experience with the increasing displacement of paper and micro-film/fiche by ESI data sources.

The Historical Context

Political pressures to reform the discovery process are decades old. The current environment is witnessing yet another chapter in this unfolding controversy. Two major factions are at play: transparent justice vs. confidential privacy. This is a classic tension in law, politics, business, personal liberty and all information sciences and technologies.

1 5 U.S.C. § 551, et seq.
2 Immigration and Naturalization Service v. Chadha, 462 U.S. 919 (1983) (invalidating one-house legislative veto on bases of separation of powers and presentment, limited to non-judiciary rule-makings).
3 Portions of this discussion are adapted and updated from the Pennsylvania State University IST 453 instructor's integrative contributions to the IST 453 class paper: author, Byron Granda, Emily Benoit, Alexander Logan, Ryan Snell & author, The Federal Rules of Civil Procedure: Politics in the 2013-2014 Revision, CONFERENCE ON DIGITAL FORENSICS, SECURITY AND LAW, Richmond, VA, May 29, 2014.

Transparency Leads to Justice

First, the forces of transparency seek to improve justice by requiring professionalism among litigators from the practicing bar. Transparency inculcates the values of honesty and forthrightness by requiring that evidence be provided to the opposition when relevant to the dispute, irrespective of its source, current residence or ultimate implications. The natural incentive is to hide and destroy deservedly discrediting information.4 Throughout most of the 20th and now the 21st centuries, the U.S. has witnessed progress by these forces seeking transparency to secure a gradual opening of data sources, now predominately electronically stored information (ESI), achieving a transparency that contributes to justice.

Privacy and Confidentiality - Essential to Liberty

Second, the forces of confidentiality and privacy are largely fighting a rear-guard action by resisting openness, touting the historic advantages of confidentiality and lauding the liberty inherent in strong privacy rights. This opposition may be one side of a critical balance in the modern era, with access to full information essential to justice in litigation and the setting of just public policies hanging in the balance. The forces of confidentiality and privacy have some potent arguments about how the excessive costs of regulatory and criminal investigations and civil pre-trial discovery actually divert scarce societal resources.
Furthermore, the mere threat of costly discovery is coercive; the threat of debilitating litigation compels settlements that are sometimes unjust.5

Past Reforms of Investigatory Practices

Investigatory practices have varied widely throughout history. Most despotic governments have coerced the production of evidence by deploying torture and threats against person, family and property to induce confession and testimony. Fabrication, forgery, fraud and perjury persist independently. As the U.S. colonies moved towards independence, many of these practices were regularly used by the British Crown, thus inspiring the liberty reforms inherent in the U.S. Constitution, and particularly in the Bill of Rights. The U.S. system balances privacy and confidentiality against the public interest. Indeed, most lay persons understand that: (1) the 3rd Amendment protects the home as castle from quartering of occupying or domestic armed forces; (2) the 4th Amendment protects from unlawful searches and seizures by requiring probable cause before enforcing warrants and prohibiting general warrants; (3) the 5th Amendment protects personal intimacy from self-incrimination and holds private property paramount to government interests; (4) the 6th Amendment overcomes secrecy and individual privacy in trials (requiring a public record and cross-examination); (5) the 9th and 10th Amendments allow for privacy rights to be inferred from the Constitution or enacted by the states;6 and (6) the 14th Amendment provides a strong basis to withhold or limit access to information, including the freedom of personal choice.7

Few other nations have accepted the degree of openness that exists in the U.S., which arguably has contributed greatly to both the justice and the liberty essential to the U.S.'s phenomenal success over its short existence of the past two centuries. Indeed, nations of the civil law tradition provide for few civil lawsuits among private parties to access information held by opposing parties.8 The U.S. nearly stands alone in opening up the files and records of opposing parties in regulatory and private civil litigation for the opposition to "mine" for smoking-gun correspondence, meeting records and other documents. Of course, criminal investigatory practices in many nations still take the "inquisitorial" path, but justice there is ultimately limited by government resources and political cronyism/favoritism. In the 20th century, pre-trial discovery grew significantly in the U.S., creating unique default rules under which all parties in regulatory and private civil litigation must search for, find, produce and disclose to the opposition nearly all relevant data in their possession. Only attorney-client privilege, attorney work product privilege, grand jury secrecy and various protective orders (e.g., to maintain trade secrecy) are exceptions to the broad discovery rules the U.S. developed in the 20th century. The major 1938 amendments to the FRCP provided the first major watershed in opening up the files and personnel of opposing parties to much deeper revelation. This development was met with joy by opposing parties now able to prove their cases. It caused consternation and dissonance among opposing parties who appeared to lose cases from what felt like self-incrimination.

4 Posner, Richard, LAW AND ECONOMICS (2d ed. Little Brown 1977).
5 See, e.g., In re Bridgestone/Firestone, Inc. Tires Prods. Liab. Litig., 288 F.3d 1012, 1015-16 (7th Cir. 2002): Both Ford and Firestone petitioned for interlocutory review under Fed. R. Civ. P. 23(f)… Aggregating millions of claims on account of multiple products manufactured and sold across more than 10 years makes the case so unwieldy, and the stakes so large, that settlement becomes almost inevitable–and at a price that reflects the risk of a catastrophic judgment as much as, if not more than, the actual merit of the claims.
6 U.S. CONST. amend. 3, 4, 5, 6, 9, 10.
7 U.S. CONST. amend. 14.
The states eventually followed the FRCP lead, but with some major exceptions.

The FRCP 2006 Revisions Directly Address ESI

As corporate records have become highly valuable for proving and defending against civil and regulatory claims, their migration in the 1990s into electronic forms (ESI) became a watershed. Both the volume and complexity of their production, as well as their high probative value, have been transformative. After a difficult decade of transition from predominately paper-based records in the early 1990s to predominately ESI in the early 21st century,9 the first major reassessment of pre-trial discovery practices produced the 2006 revision of the FRCP. This watershed most clearly signaled the establishment of a new field of endeavor for litigating attorneys, law office staffs, the information technology (IT) industry, and a fast-growing cottage industry in litigation support that employs various experts in investigation, computer forensics and electronic discovery (ED) support. Indeed, interdisciplinary electronic discovery teams are necessary to navigate this field successfully.10 Initially, the federal courts adapted existing paper-based discovery practice to the huge new costs and potential successes of electronic discovery, first by precedent at trial and eventually with the adoption of the 2006 revision to the FRCP. Again, the states have generally followed the federal rules, with some states expanding the FRCP's intrusiveness with special treatments.

Current FRCP Reform Pressures

The FRCP, as interpreted by courts, both enables and constrains the electronic discovery aspects of CyberForensics. The FRCP was revised in December 2006, effective in 2007, to more closely address the predominance of electronic evidence.
While these 2006 FRCP revisions have worked much better than the piecemeal adaptations made (largely) by a few tech-savvy federal judges in the decade prior,11 there are continuing complaints that eDiscovery is still too costly, too time consuming and susceptible to misuse in civil and regulatory litigation. This FRCP revision witnessed a convergence between electronic discovery and information governance.

8 In many civil law nations, the criminal prosecutors have inquisitorial powers and also serve as judges. While this provides a modicum of arguable political independence, it clearly violates U.S.-style separation of powers. Civil law nations lack the additional discipline of private attorneys general from private rights of action, thereby providing wrongdoers with very considerable "cover." This may explain most other nations' reticence to expand U.S.-style private rights of action (civil damages, equitable remedies).
9 Ponemon Study (1995).
10 See generally, Ruhnka, John C. & author, Using ESI Discovery Teams to Manage Discovery of Electronic Data, COMM. OF THE ACM (July 2010) at 142-44.
11 Eight related Zubulake decisions were issued between 2003 and 2005: Zubulake v. UBS Warburg, 217 F.R.D. 309 (S.D.N.Y. 2003) (Zubulake I: allocating discovery costs for email production from backup tapes); Zubulake v. UBS Warburg, No. 02 Civ. 1243, 2003 WL 21087136 (S.D.N.Y. May 13, 2003) (Zubulake II: Zubulake's reporting obligations); Zubulake v. UBS Warburg, 216 F.R.D. 280 (S.D.N.Y. 2003) (Zubulake III: allocating costs between parties for restoration of email backup tapes); Zubulake v. UBS Warburg, 220 F.R.D. 212 (S.D.N.Y. 2003) (Zubulake IV: duty to preserve emails; defendant bears plaintiff's re-deposition costs); Zubulake v. UBS Warburg,
Indeed, according to the influential Gartner Group:

Information Governance (IG) [is] the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archival, and deletion of information. It includes the processes, roles, standards, and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals.12

Information governance is an expanding new discipline that takes in pieces of enterprise integration, enterprise architecture, organizational design, cyber-forensics, digital rights management and most other IT fields as components. The security and privacy aspects of these disciplines are increasingly important.

FRCP Revision - a Non-Standard Political Process

The following succinctly captures how the unique Federal Rules revision process attempts to avoid politicization:

The Judicial Conference's Advisory Committee on Civil Rules recently forwarded to the Standing Committee on Rules of Practice and Procedure several proposed amendments to the Federal Rules of Civil Procedure that would, if adopted, modify the parameters and procedures governing discovery in civil litigation. … The development and approval process for court rules differs from traditional legislation by Congress and the President in some significant ways.13

According to one writer, "The proposed amendments are intended to reduce expense and delay in federal litigation by promoting cooperation among attorneys, proportionality in discovery, and judicial case management."14 The Judicial Conference Advisory Committees on Bankruptcy and Civil Rules of the United States Courts published a preliminary draft of proposed FRCP revisions.15 Public hearings were held in Washington, DC on November 7, 2013, and in Phoenix, Arizona on January 9, 2014.
Transcripts of these public meetings reveal many arguments and counter-arguments as well as the special interests behind the revision effort and its opposition. Op-ed writers and expert commentary on the pressures for reform as well as assessments of the revision drafts proliferated in 2013 – 2014, created an interesting political duel over this essential public policy matter as resolved in this non-standard policy-making procedure. 2004 WL 1620866 (S.D.N.Y. July 20, 2004) (Zubulake V: sanctions granted; UBS ordered to pay costs; defense counsel ordered to monitor compliance and preserve with a litigation hold); Zubulake v. UBS Warburg, 231 F.R.D. 159 (S.D.N.Y. Feb.2, 2005) (Zubulake Va); Zubulake v. UBS Warburg, 382 F.Supp.2d 536 (S.D.N.Y. March 20, 2005) (Zubulake VI: preventing admission of various evidence); and Zubulake v. UBS Warburg, 02-CV-1243 (April 6, 2005) (Zubulake jury verdict: $29.3 million in damages of which $9.1 million compensatory, nearly $20.2 million punitive discovery sanctions). 12 See e.g., Logan, Debra, What is Information Governance? And Why is it So Hard?, GARTNER BLOG NETWORK (Jan.11, 2010) accessible at: http://blogs.gartner.com/debra_logan/2010/01/11/what-is-information-governance-and-why-is-it-so-hard/ 13 Shaffer, Craig B. & Ryan T. Shaffer, Looking Past The Debate: Proposed Revisions To The Federal Rules Of Civil Procedure, 7 FED.CT.L.REV. 178 (2013) accessible at: http://www.fclr.org/fclr/articles/html/2010/Shaffer2.pdf 14 Goldich, Marc A., Differing Opinions From Pa. On FRCP Amendments, LAW 360 (Feb.19, 2014) accessible at: http://www.law360.com/articles/511107/differing-opinions-from-pa-on-frcp-amendments 15 Proposed Amendments to the Federal Rules of Bankruptcy and Civil Procedure, Committee on Rules of Practice and Procedure of the Judicial Conference of the United States (Aug. 
2013 - Preliminary Draft) accessible at: http://www.uscourts.gov/uscourts/rules/preliminary-draft-proposed-amendments.pdf The Major Issues in the 2014 Revision Effort The process by which the FRCP are changed is slow and deliberate. In June 2013, the Judicial Conference Advisory Committee on Civil Rules published its proposal. The public comment period opened August 15, 2013 and remained open until February 18, 2014. After three public hearings were held, and 2,359 comments were received, the Committee on Rules of Practice and Procedure approved the revisions at its May 29-30, 2014 meeting. These were then approved by the Judicial Conference of the United States on September 16, 2014. At this time, the revisions are pending before the Supreme Court and, if adopted, could go into effect on December 1, 2015.16 Some of the more relevant proposals which will affect day-to-day litigation practices are as follows: • Reduction of the time to serve summons and complaint from 120 to 60 days; • Modification of the time within which the scheduling order must issue to within 60 days after any defendant is served (from 120), or within 45 days after any defendant appears (from 90); • Requiring an actual conference (not email, etc.) if a scheduling order is to be issued without a Rule 26(f) report; • Modifications to Rule 37 to make sanctions harder to impose for good faith preservation/destruction; • Allowing Rule 34 requests to be made before the Rule 26(f) conference but without starting the time to respond until after the Rule 26(f) conference; • Changing proportionality from something the court limits to only allowing discovery proportional to the needs of the case.
(scope under Rule 26(b)(1)); • Making allocation of expenses explicit (FRCP 26(c)(1)(B)); and • Decreasing the limits on depositions and interrogatories, and imposing a limit on requests for admission.17 Several major issues are presented by the FRCP revision effort, including at least the following, some of which are explored in greater detail in the next sections: • Political Pressures: the interests and influences of constituencies; • Mandating Proportionality; • Effectiveness of Limiting Discovery Tools; and • Changes in Sanctions for Failure to Preserve ESI. Political Pressure From 2007 to 2013, experience with the 2006 FRCP revisions precipitated critique and support that evolved into political pressure for and against reform. Reform advocates advanced several arguments, including that the high cost of electronic discovery produces injustice; that the increasing volume of discovery is not effectively or efficiently processed, leading to increasing litigation delay and also producing injustice; that these costs and delays accumulate into settlement pressure that makes electronic discovery highly susceptible to misuse; and that the scope of electronic discovery has grown too broad. Countervailing political pressures seek to preserve the status quo or expand the scope of electronic discovery. Predictably, the plaintiff’s bar and some academics18 are among those challenging 16 See generally, Lathrop, Tony, Proposed Amendments to Federal Rules of Civil Procedure Governing Discovery and Preservation of Electronically Stored Information One Step Closer to Approval, JD Supra Bus.Advisor (Oct.27, 2014) accessible at: http://www.jdsupra.com/legalnews/proposed-amendments-to-federal-rules-of-11494/ and Lange, Michele, Part III – FRCP Amendments: The Long and Winding Road, JDSupra Bus.Advisor (Oct.8, 2014) accessible at: http://www.jdsupra.com/legalnews/part-iii-frcp-amendments-the-long-and-77997/.
Here are the two most recent Judicial Conference Reports; of course, Rule 26 is only a small part of these: http://www.uscourts.gov/uscourts/RulesAndPolicies/rules/Reports/ST09-2014.pdf http://www.uscourts.gov/uscourts/RulesAndPolicies/rules/Reports/ST09-2014-add.pdf 17 Id. 18 See generally Comment by Hershkoff, Helen, Lonny Hoffman, Alexander A. Reinert, Elizabeth M. the reform effort.19 First, they argue that electronic discovery costs are overstated. Second, they argue that the misuse of discovery is a recurring but generally unsubstantiated allegation. Furthermore, adequate disincentives arise from existing rules addressing discovery sanctions and the accumulating experience in discovery sanctions. Finally, the sweeping reforms proposed would impose constraints that would actually increase the costs of achieving justice. Predictably, the defense bar, the insurance lobby and deep-pocket defendants are the most active in filing pro-reform comments with the Judicial Conference. Consistent with decades-old arguments for regulatory reform, litigation reform, product liability reform and tort reform, the 2013-2014 FRCP revision pressures promise direct and opportunity cost savings.20 According to the RAND Institute for Civil Justice, document review consumes an inordinate proportion of electronic discovery costs, although innovations such as predictive coding show promise to control these costs.21 Mandating Proportionality The proposals reduce the scope of discovery at several levels, including what may be obtained (incorporating proportionality explicitly) and how it may be obtained (limits on discovery tools). Proportionality was addressed through FRCP 26(b)(1). The changes make explicit that discovery shall only be “proportional to needs of the case,” thereby limiting the scope of discovery.
More specifically, it further refines the definition of “proportional” by considering “the amount in controversy, the importance of the discovery in resolving the issues, and whether the burden or expense of the proposed discovery outweighs its likely benefit.” The changes also specify that expenses are to be allocated more equitably. Critics, of course, dislike these changes. Plaintiffs believe it will now be harder to obtain discovery. The defense bar generally welcomes the changes, as some believe that discovery, especially eDiscovery, is abused by the plaintiff’s bar, causing defendants unnecessary expense. Changes to make discovery proportional have been coming for quite some time. Experience tells us that the costs of eDiscovery had become so high that courts were sua sponte limiting it to decrease litigation costs and to retain the integrity of the process. With a single hard drive capable of containing more than 400,000 elements, there is an incredible amount of data to be examined in routine cases. Changes in the proportionality concepts within the FRCP are believed to be necessary so as to avoid rendering eDiscovery so expensive that the merits of a case become secondary to eDiscovery costs. Effectiveness of Limiting Discovery Tools The changes propose decreasing the limits on depositions and interrogatories, as well as setting a limit on requests for admission. This would be accomplished by changing Rules 30, 31, 33 and 36. The proposals would reduce the number of depositions taken in an action from 10 to 5, and deposition time limits would be reduced from 7 hours each to 6. Under the proposed changes, interrogatories would decrease from 25 to 15. Requests for admission would be limited to 25. The proposals themselves were not meant to decrease discovery but, rather, to make it more efficient. The Committee, in formulating the proposed changes, specifically referenced the idea that a court always may allow more discovery upon “good cause” shown.
Thus, it appears the discovery tool limitations are meant to shift the start of the discussion. Again, critics and proponents split along plaintiff and defense lines. Critics argue that these limitations will make it harder for a plaintiff to prove his case. The defense bar welcomes them, believing that discovery currently is too expansive. The truth likely lies somewhere in between. Nonetheless, there is a compelling argument to be made that litigation may not be more efficient in the face of these limitations if the result is an increase in discovery disputes and motions. Changes in Sanctions For Failing to Preserve ESI There is significant change coming if the proposal with regard to the safe harbor provision of FRCP 37 stands. Schneider, David L. Shapiro, & Adam N. Steinman, accessible at Regulation.gov 19 See generally, Fax, Charles X, Proposed Changes to Federal Rules Prompt Pushback, LITIGATION NEWS (Am. Bar Assn.), 2014. Web. 28 Apr. 2014 accessible at: http://apps.americanbar.org/litigation/litigationnews/civil_procedure/041614-pushback-federal-rules.html 20 See generally, Kyl, Jon, A Rare Chance to Lower Litigation Costs, WALL ST. J (Jan. 20 2014) accessible at: http://online.wsj.com/news/articles/SB10001424052702304049704579321003417505882 21 Pace, Nicholas & M. Laura Zakaras, WHERE THE MONEY GOES: UNDERSTANDING LITIGANT EXPENDITURES FOR PRODUCING ELECTRONIC DISCOVERY (RAND Institute for Civil Justice, 2012) accessible at: http://www.rand.org/pubs/monographs/MG1208.html
Currently, Rule 37(e) reads, “Absent exceptional circumstances, a court may not impose sanctions under these rules on a party for failing to provide electronically stored information lost as a result of the routine, good-faith operation of an electronic information system.”22 The Committee now proposes that sanctions may be imposed for failing to preserve discoverable information “only if the court finds that the party’s actions: (i) caused substantial prejudice in the litigation and were willful or in bad faith; or (ii) irreparably deprived a party of any meaningful opportunity to present or defend against the claims in the litigation.”23 There are numerous critics of these particular changes. Some believe that the new language is so ambiguous and undefined that it fails to provide meaningful standards for judges considering imposition of sanctions. Noted eDiscovery scholar and United States District Court Judge Shira Scheindlin of the Southern District of New York stated her objections to these changes in a letter to the Committee: “in order to impose a sanction listed in Rule 37, the court must find that the spoliating party’s action caused ‘substantial prejudice’ and was ‘willful’ or in ‘bad faith.’ This language is fraught with problems.”24 In her testimony, Judge Scheindlin also stated that the vagueness will not encourage parties to preserve ESI. The defense bar fears that it may be sanctioned more often because of ambiguity regarding when preservation of ESI is required. That fear also stems from concern that plaintiffs more frequently will seek sanctions, even in instances where there is no clear evidence of malicious intent to destroy ESI, with hopes that the court will impose same in light of the ambiguous Rule language. Reflections on the Virginia Perspective The Virginia Supreme Court is tasked with promulgating the rules of procedure in the Virginia courts. See Va.
Code § 8.01-3(A) (“The Supreme Court, subject to §§ 17.1-503 and 16.1-69.32, may, from time to time, prescribe the forms of writs and make general regulations for the practice in all courts of the Commonwealth; and may prepare a system of rules of practice and a system of pleading and the forms of process and may prepare rules of evidence to be used in all such courts.”); see also Va. Const. art. VI, § 6 (“The Supreme Court shall have the authority to make rules governing the course of appeals and the practice and procedures to be used in the courts of the Commonwealth, but such rules shall not be in conflict with the general law as the same shall, from time to time, be established by the General Assembly.”). The emergence of civil discovery issues involving ESI was intended to be addressed, in part, by the 2013 amendments to the FRCP. The Virginia Supreme Court previously had addressed some of these issues in its civil rule additions and amendments that went into effect January 1, 2009.25 Specifically, the Virginia Supreme Court enacted Rules 4:1(b)(6)(ii), 4:1(b)(7), and 4:9A and amended Rule 4:9 in part to address ESI discovery issues. A discussion of these additions and amendments and a comparison with the analogous Federal Rules follows. Virginia Rule 4:1(b)(6)(ii) Subpart (ii) of this Rule provides a protocol for handling the inadvertent production of materials, including ESI, that a producing party believes to be confidential or privileged. Pursuant to this protocol, 22 Fed. R. Civ. P. 37(e). See infra at n. 15. 24 Scheindlin, Shira, Comments on Proposed Rules-Letter to Committee on Rules of Practice and Procedure, Regulations.gov (Jan. 13, 2014). 25 Notably, Bob Goodlatte, a member of the United States House of Representatives for Virginia’s 6th Congressional District, is the Chair of the 113th Congressional Committee on the Judiciary, which promulgated the subject Federal Rule amendments. 
The similarities in the language of the amended Virginia and Federal Rules discussed herein suggest that the 2013 Federal Rule amendments may have been influenced, at least in part, by the 2009 changes in the analogous Virginia Rules. the producing party is required to notify any other party of the production of said materials and to identify the basis for the claimed privilege or protection. Thereafter, the receiving party must sequester or destroy the materials and shall not duplicate or disseminate the materials pending disposition of the claim of privilege or protection, either upon agreement of the parties or by motion. If the receiving party disseminated the subject materials prior to receiving notice from the producing party, the receiving party must take reasonable steps to retrieve the designated material. The Rule requires the producing party to preserve the materials until the claim of privilege or other protection is resolved. Although this Rule essentially mirrors FRCP 26(b)(5)(B), albeit with a few minor exceptions, the Federal Rule does not specifically reference ESI as does the Virginia Rule. FRCP 45(e)(2)(B), which was subject to the 2013 amendments, contains a similar provision; however, this Rule pertains specifically to materials produced in response to subpoena, as opposed to materials produced in the ordinary course of discovery, such as through a request for production.
FRCP 26(b)(5)(B) is substantively analogous to Virginia Rule 4:1(b)(6)(ii) with the exception of three relatively minor distinctions: (1) The Federal Rule requires that, upon notice of the inadvertent production, the receiving party must “return, sequester, or destroy” the materials, whereas the Virginia Rule mandates that the receiving party must either “sequester or destroy” the materials (there is no option to “return” the materials); (2) the Federal Rule prohibits the receiving party from using or disclosing the subject materials until the dispute is resolved, whereas the Virginia Rule prohibits the receiving party only from duplicating or disseminating the materials (there is no prohibition on otherwise “using” the materials); and (3) the Federal Rule specifies that the receiving party may present the subject materials under seal to the court for the district where compliance is required for a determination of the claim, whereas the Virginia Rule does not contain an analogous provision. Virginia Rule 4:1(b)(7) Subpart (b)(7) of this Rule governs the procedure by which a party may refuse to produce ESI that the party has identified as not reasonably accessible due to undue burden or cost. Pursuant to this procedure, a party is not required to produce any such identified ESI, although the party bears the burden of establishing that the ESI is not accessible due to undue burden or cost, either in support of the party’s motion for protective order or in response to an opposing party’s motion to compel. In addition, even if the party objecting to production of ESI meets its burden of inaccessibility, the court may still order the discovery if the requesting party shows good cause, considering the limits on the scope of discovery as provided for in Rule 4:1(b)(1).26 If the court compels production of ESI, it may specify the conditions for the production, including allocation of the reasonable costs thereof.
This Rule mirrors FRCP 26(b)(2)(B), and there are no significant distinctions between the two rules. FRCP 45(e)(1)(D), which was subject to the 2013 amendments, contains a similar provision; however, this Rule pertains specifically to materials produced in response to subpoena, as opposed to materials produced in the ordinary course of discovery, such as through a request for production. Virginia Rule 4:9 This Rule governs the production of documents during discovery, including the production of ESI. Subpart (b)(iii)(B) pertains specifically to ESI production. Rule 4:9(b)(iii)(B)(1) mandates that responses to a request for production of ESI shall be subject to the provisions of Rules 4:1(b)(7) (allowing nonproduction of ESI upon a showing of inaccessibility due to undue burden or cost) and 4:1(b)(8) (requiring a pre-motion certification of good faith attempts to resolve matter without court intervention). In addition, subpart (b)(iii)(B)(2) provides that in the event a request does not specify the form for production of the ESI, or if the requested form is objected to by the producing party, the producing party must produce the information as it is ordinarily maintained if it is reasonably usable in such form, or must produce it in another form that is reasonably usable. The producing party is not required to produce requested ESI in more than one form. This Rule is similar to FRCP 34, specifically subparts (b)(2)(D) and (b)(2)(E). Subpart (b)(2)(D) of FRCP 34 provides that the producing party may object to the requesting party’s specified form of ESI production. In the event the producing party so objects, or in the event the requesting party fails to specify a form of production, the producing party must state the form it intends to use for the production. 26 This Rule governs the discovery methods and general scope of discovery in civil actions.
This provision is not specifically set forth in the analogous Virginia rule, although the Virginia rule does allow a producing party to object to the form of production specified by the requesting party. See Va. Sup. Ct. R. 4:9(b)(iii)(B)(2). Subpart (b)(2)(E)(i) of FRCP 34 provides that a producing party must produce documents, including ESI, as they are kept in the usual course of business, or, alternatively, must organize and label the documents to correspond to the categories indicated in the request. This provision does not appear in the analogous Virginia rule pertaining specifically to production of ESI. Rather, this provision is set forth in Virginia Rule 4:9(b)(iii)(A), which governs production of documents generally and does not reference production of ESI. Subpart (b)(2)(E)(ii) of FRCP 34 provides that, if the requesting party fails to specify a form for ESI production, the producing party must produce it in a form in which it is reasonably maintained or in a reasonably usable form. This is similar to the Virginia rule, although the Virginia rule includes the caveat that the producing party must produce the ESI in the form in which it is ordinarily maintained if it is reasonably usable in such form. See Va. Sup. Ct. R. 4:9(b)(iii)(B)(2). There is no such caveat in the Federal Rule. Finally, subpart (b)(2)(E)(iii) of Federal Rule 34 provides that a producing party is not required to produce ESI in more than one form. This mirrors the analogous provision of the Virginia rule. See Va. Sup. Ct. R. 4:9(b)(iii)(B)(2). Virginia Rule 4:9A This Rule governs the production of documents and ESI from non-parties. Specifically, subpart (c)(2) pertains to the production of ESI by non-parties pursuant to subpoena. Rule 4:9A(c)(2)(A) mirrors the provisions of Rule 4:1(b)(7), discussed above, regarding the non-production of ESI that a producing party establishes to be inaccessible due to undue burden or cost. 
Rule 4:9A(c)(2)(B) mirrors the provisions of Rule 4:9(b)(iii)(B)(2), discussed above, regarding the form of ESI production by the producing party when the requesting party fails to specify any such form or when the form specified is objectionable. Rule 4:9A(c)(3) provides that, upon motion, the court may quash or modify the subpoena or the method or form of ESI production if the subpoena would otherwise be unduly burdensome or expensive. This Rule also permits the court to condition its denial of a motion to quash or modify a subpoena on the advancement of production costs by the appropriate party. This Rule further permits the court to order that the materials subpoenaed, including ESI, be submitted to the office of the court clerk and only be withdrawn upon request for a reasonable period of time to allow for inspection, photographing or copying of the materials. Rule 4:9A(c)(4) requires a party bringing a motion under this Rule to include a certification that the party attempted in good faith to resolve the matter with the other affected parties without court intervention. Finally, Rule 4:9A(f)(2) pertains to the production of ESI in response to subpoena, and it provides that when a party to a civil proceeding receives ESI in response to a subpoena, that party shall, if requested, provide complete and accurate copies of the ESI to the requesting parties in the same form it was received and upon reimbursement of the proportionate costs of obtaining the ESI. This Rule is similar to FRCP 45, specifically subparts (d)(3) and (e) governing the procedures for having a subpoena quashed or modified and the requirements for responding to a subpoena. 
Subpart (d)(3) of the Federal Rule is similar to subpart (c)(3) of the Virginia Rule and requires a court, upon motion, to quash or modify a subpoena that meets any of the following criteria: (1) fails to allow a reasonable time for compliance; (2) requires a person to respond outside the geographical limits set forth in FRCP 45(c); (3) requires disclosure of privileged or otherwise protected information, assuming no applicable exception or waiver; or (4) subjects a person to undue burden. Subpart (d)(3) of the Federal Rule also sets forth the criteria under which a court may quash or modify a subpoena: (1) requires disclosure of a trade secret or other confidential research, development or commercial information; or (2) requires disclosure of an unretained expert’s opinion or information that does not describe specific occurrences in dispute and results from the expert’s study that was not requested by a party. FRCP 45 is more detailed than the Virginia Rule with respect to the circumstances under which a court must quash or modify a subpoena, as well as with respect to the circumstances under which a court has discretionary authority to quash or modify a subpoena. As with Virginia Rule 4:9A(c)(3), FRCP 45(d)(3) also provides courts with the ability to fashion alternative remedies for resolving subpoena disputes and provides that a court may, instead of quashing or modifying a subpoena, order appearance or production under specified conditions if the following criteria are met by the serving party: (1) a showing of substantial need for the testimony or material that cannot be met without undue hardship; and (2) assurance that the subpoenaed individual will be reasonably compensated. Again, this provision of the Federal Rule is more detailed and specific than its Virginia counterpart.
Subparts (e)(1)(B) and (e)(1)(C) of FRCP 45 essentially mirror Virginia Rule 4:9A(c)(2)(B), discussed above, regarding the form for producing ESI when the form is not specified by the requesting party, as well as providing that the producing party is not required to produce ESI in more than one form. Subpart (e)(1)(D) of FRCP 45 mirrors Virginia Rule 4:9A(c)(2)(A), discussed above, regarding the nonproduction of ESI that a producing party establishes to be inaccessible due to undue burden or cost. Conclusion It is never easy to digest potential changes while the process of revising them continues. However, some modifications have such implications that a wait-and-see approach will not suffice. This piece discusses significant changes to the FRCP that have great potential effect not only on the Federal litigation process, but also on corresponding Virginia Supreme Court Rules, which parallel most of the FRCP. First, cooperation is heavily incentivized. Changes to FRCP 1, which were not subsequently revised, direct cooperation among the parties and the court to encourage “just, speedy, and inexpensive” resolution.27 Second, proportionality principles are reinforced, consistent with case law and the stronger principled advocacy leading to and expressed throughout the revision process. Under revised Rule 26(b)(1), judges are encouraged to balance several factors: Parties may obtain discovery regarding any nonprivileged matter that is relevant to any party’s claim or defense proportional to the needs of the case, considering the importance of the issues at stake in the action, the amount in controversy, the parties’ relative access to relevant information, the parties’ resources, the importance of the discovery in resolving the issues, and whether the burden or expense of the proposed discovery outweighs its likely benefit. Information within this scope of discovery need not be admissible in evidence to be discoverable.
Third, responding to overwhelming public pressures, the presumptive limits on depositions, interrogatories and admissions were withdrawn. This strongly illustrates the responsiveness of the Standing Committee, as well as its parent, the Judicial Conference, to the comments of the public and to rulemaking witnesses. Finally, spoliation and ESI preservation case law are now better reconciled under a national standard while preserving judicial discretion to address an ever expanding volume of ESI. Sanctions are generally proper on findings that the party acted willfully or in bad faith and that the ESI destruction caused “substantial prejudice” to the opposing party. Sanctions for negligent ESI destruction (absent willfulness or bad faith) are permissible only when the loss irreparably deprived a party of any meaningful opportunity to present or defend against the claims in the litigation and only if the affected claim or defense was central to the litigation. At this stage of the proceedings, it appears that those suggesting changes are intent on implementing new and different procedures in the name of judicial efficiency. The proposed changes appear likely to impose greater consequences on plaintiffs than on defendants. However, the process itself is ongoing. Only time will tell what changes will be part of the final package, and thereby, what implications will arise from their inclusion. 27 Reikes, John, Why Even the Most Cutthroat Lawyer Should Want to Cooperate, Ediscovery Blog (Sept. 25, 2014) accessible at: http://www.theediscoveryblog.com/2014/09/25/even-cutthroat-lawyer-want-cooperate/ INVITED PAPER A Profile of Prolonged, Persistent SSH Attack on a Kippo Based Honeynet Craig Valli, Priya Rabadia and Andrew Woodward Security Research Institute Edith Cowan University [email protected] ABSTRACT This paper is an investigation focusing on activities detected by SSH honeypots that utilised the kippo honeypot software.
The honeypots were located across a variety of geographical locations and operational platforms. The honeynet has suffered prolonged, persistent attack from a /24 network which appears to be of Chinese geographical origin. In addition to these attacks, other attackers have been successful in compromising real hosts in a wide range of other countries that were subsequently involved in attacking the honeypot machines in the honeynet. Keywords: Cyber Security, SSH, Secure Shell, Honeypots, Kippo INTRODUCTION This paper is an investigation focusing on activities detected by Secure Shell (SSH) honeypots that utilise the kippo honeypot software (desaster, 2015). This paper is part of an ongoing investigation, with initial work conducted in 2012 and 2013 (Valli, 2012; Valli, Rabadia, & Woodward, 2013). All SSH honeypots were configured identically using kippo source code. The focus of this particular research is primarily to identify evidence of automated attacks using password wordlists being implemented to login and gain access to three kippo SSH honeypots. All honeypots have the same username and password databases that contain multiple valid login password combinations. These valid combinations are part of the deception that is presented to the attacking entity by the kippo SSH honeypot. The passwords in these lists are drawn from well-known weak password lists. The honeypots are configured in kippo to present as different hostnames. The machines are further differentiated by manipulating some of the files in the fake filesystem used by kippo. This paper examines a specific attack that has propagated since November 2014 and continues as of the time of writing. What is unique about the attack is that all previous attempts to attack the honeypots were detected as originating from UNIX based systems utilising SSH clients. The SSH attacks are now appearing to be coming from machines that utilise the PuTTY SSH suite of tools on Windows platform operating systems.
Furthermore, the volume of SSH login attempts evinced on the honeynet in the past four months has increased at a rate approaching exponential. This significant increase in attempts is likely due to Windows operating system based computers comprising a significant share of the market, reportedly in excess of 97% in China (Popa, 2015). OVERVIEW OF THE SETUP OF THE KIPPO SSH HONEYNET A honeynet can readily be described as a controlled and centrally administered collection of honeypots. The kippo SSH honeypot is a medium interaction honeypot, meaning that the honeypot imitates some functions that would be exhibited by a live system (Hosting, 2013; Stevens & Pohl, 2004). The kippo honeypot is designed to effectively mimic an SSH server to an attacking entity. The SSH protocol is designed to securely transmit data using a point-to-point encryption tunnel (Ciampa, 2010), provides high grade encryption, and is a secure replacement for plaintext terminal programs such as telnet or rsh on UNIX or UNIX-like operating systems (Linux, OpenBSD, FreeBSD). Most network connected UNIX or UNIX-like operating systems have SSH installed as a client, and it is often included as a server (daemon) to help protect systems by providing a platform for encrypted communications. There are also many SSH clients available to run from Windows operating system based computers, with PuTTY being a commonly used Windows client (Tatham, 2015). kippo honeypots are designed to collect data from attacker interaction with an emulated SSH service. The emulated SSH service is provided by an open-source, Python based event-driven program called Twisted (TwistedMatrixLabs, 2013). Twisted provides the libraries that are utilised and deployed by kippo to imitate a valid encrypted SSH session to an entity. Relevant SSH artefacts are also extracted, including the SSH banner or string that the daemon or client presents to connecting SSH entities.
Each of these banners or strings is typically unique and in many cases can reliably fingerprint the connecting operating system and device. Fingerprinting is a term used in network security to describe the data which is sent by a computer when it is connected to over a network; this data is considered to be unique to each operating system, and in some cases to different versions of a given operating system. Kippo allows the honeypot user to change the SSH banner to any known valid fingerprint for SSH. The honeypot also emulates a logically correct, but manufactured, file system to present to the user who successfully gains access to the honeypot. The system also presents falsified system reporting and allows interaction with artefacts such as /proc/cpuinfo or the .bash_history logfile. While the level of deception in the default setting is limited, this functionality is able to be expanded and modified at will. For this experiment, key elements such as /proc entries and different bash entries were modified to create a difference in each of the kippo hosts presented in the honeypots. The kippo SSH honeypots are written in Python, and installed using the recommended process. Source code was obtained from https://github.com/ikoniaris/kippo, which is a modified version of the kippo code. The setup for these particular systems used in the data collection was conducted as specified by the BruteForce Lab Guide (Labs, 2011) and further enhanced to send data to various database stores: PostgreSQL, MySQL and an ElasticSearch server. This setup deviates from the original kippo SSH documentation in that it uses the authbind daemon instead of twistd as the initial connecting daemon for the service. This configuration lets authbind handle the binding of twistd as a non-root user to a low numbered TCP port and then passes the connection to the twistd daemon. This configuration was found to be more consistent, reliable and secure during the conduct of the research project.
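The banner-based fingerprinting described above can be sketched in a few lines of Python. This is an illustrative helper, not kippo code; `parse_ssh_banner` is a hypothetical name, and it assumes the conventional SSH identification-string layout `SSH-protoversion-softwareversion [SP comments]`.

```python
def parse_ssh_banner(banner: str) -> dict:
    """Split an SSH identification string into protocol version,
    software version and optional comments (illustrative sketch only)."""
    banner = banner.strip()
    if not banner.startswith("SSH-"):
        raise ValueError("not an SSH identification string")
    # Comments, if present, follow the first space.
    ident, _, comments = banner.partition(" ")
    # ident looks like "SSH-2.0-OpenSSH_6.6.1p1".
    _, protoversion, softwareversion = ident.split("-", 2)
    return {"proto": protoversion,
            "software": softwareversion,
            "comments": comments or None}

# A server banner loosely fingerprints the remote side:
info = parse_ssh_banner("SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2")
```

A client string such as `SSH-2.0-PuTTY_Release_0.63` would parse the same way, which is how PuTTY-originated attempts like those discussed in this paper can be distinguished from UNIX SSH clients.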
During the installation process, a local MySQL database was configured and secured to record all interactions with the kippo honeypots. Figure 1, reproduced from (Valli, 2012) and originally sourced from the kippo documentation, shows the MySQL database structure used in the kippo honeypots to record all the interaction data.

TABLE auth
  id int(11) PK
  session char(32) NOT NULL
  success tinyint(1) NOT NULL
  username varchar(100) NOT NULL
  password varchar(100) NOT NULL
  timestamp datetime NOT NULL

TABLE input
  id int(11) NOT NULL PK
  session char(32) NOT NULL
  timestamp datetime NOT NULL
  realm varchar(50) default NULL
  success tinyint(1) default NULL
  input text NOT NULL
  KEY session (session, timestamp, realm)

TABLE clients
  id int(4) PK
  version varchar(50) NOT NULL

TABLE sensors
  id int(11) NOT NULL PK
  ip varchar(15) NOT NULL

TABLE sessions
  id char(32) NOT NULL PK
  starttime datetime NOT NULL
  endtime datetime default NULL
  sensor int(4) NOT NULL
  ip varchar(15) NOT NULL default ''
  termsize varchar(7) default NULL
  client int(4) default NULL
  KEY starttime (starttime, sensor)

TABLE ttylog
  id int(11) NOT NULL PK
  session char(32) NOT NULL
  ttylog mediumblob NOT NULL

Figure 1 - MySQL database structure for kippo honeypot

After recording to the local MySQL database, these data are transmitted to a centralised PostgreSQL server (Valli et al., 2013). Communication is achieved using a Python extension that uses a PostgreSQL driver to connect to the SURFIDS logging server (IDS, 2013). The centralised logging server uses the SURFIDS system to store the data from the honeypots in an aggregated PostgreSQL database, which has functions and tables specifically for the kippo honeypot data. In addition, the honeypots running kippo also operate Dionaea and Glastopf, which in turn report to the SURFIDS instance; these data are not used in the analysis reported here.
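To illustrate how the auth table above is typically queried, the following sketch rebuilds it in an in-memory SQLite database (a stand-in for MySQL, purely for illustration; the attempt records are invented) and extracts the most frequently tried credential pairs:

```python
import sqlite3

# In-memory stand-in for the kippo MySQL `auth` table shown in Figure 1.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE auth (
        id INTEGER PRIMARY KEY,
        session CHAR(32) NOT NULL,
        success TINYINT NOT NULL,
        username VARCHAR(100) NOT NULL,
        password VARCHAR(100) NOT NULL,
        timestamp DATETIME NOT NULL
    )
""")

# Hypothetical attempt records of the kind kippo logs.
attempts = [
    ("s1", 0, "root", "123456", "2014-11-22 10:00:00"),
    ("s1", 0, "root", "password", "2014-11-22 10:00:01"),
    ("s2", 1, "root", "123456", "2014-11-22 10:05:00"),
    ("s3", 0, "admin", "admin", "2014-11-22 10:06:00"),
]
conn.executemany(
    "INSERT INTO auth (session, success, username, password, timestamp) "
    "VALUES (?, ?, ?, ?, ?)", attempts)

# Most frequently tried credential pairs across all sessions.
top = conn.execute("""
    SELECT username, password, COUNT(*) AS n
    FROM auth GROUP BY username, password
    ORDER BY n DESC
""").fetchall()
print(top[0])  # ('root', '123456', 2)
```

The same GROUP BY aggregation, run against the real MySQL store, is how top-username and top-password views of honeypot activity are produced.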
The entire honeynet had its kippo code modified to support transmission of attack data to an Elasticsearch instance. In addition to storing the data in local MySQL databases, this code allows the researchers to concurrently transmit it to an ElasticSearch engine ("Elasticsearch," 2015) with a Kibana ("Kibana," 2015) frontend. The data in this system can subsequently be queried using customised Kibana queries, and the Kibana frontend allows the user to create many custom views of the data for anomaly and threat detection. Demonstrative figures extracted from Kibana are included later in this paper.

GAINING ACCESS

To gain access to these honeypot systems, the correct username and password must be entered at the emulated login screen, as would be the case for a real system. While general user accounts on well-administered systems may be locked out after unsuccessful attempts, this feature is not normally enabled on administrative and root accounts, because deliberate repeated unsuccessful login attempts could otherwise be used to lock out administrative or root access, resulting in a denial of service. The lack of an account lockout for unsuccessful password attempts is the Achilles heel of availability for administrative or system accounts, and is routinely exploited by automated attack tools. The generic tool used for this type of activity is colloquially called a password cracker. Password crackers can be deployed to identify the correct password by trying different passwords against a particular service or system. It should be noted that the rate of password attempts reaches billions of passwords a second when using multi-CPU or GPU-enabled password crackers, with the limiting factor being the target machine's network or processing capacity.
There is a finite number of passwords for any given system password implementation, often referred to as a key space, and while finite, these key spaces can be computationally large. For example, the standard Windows LM password key space for all possible passwords is 2^43. While it is relatively infeasible for a single conventional computer to derive these passwords in a timely fashion, this does not hold true for advanced techniques using compute clustering or GPU technology, which can attempt these passwords at a rate of billions per second. Furthermore, techniques such as pre-computed rainbow tables (Oechslin, 2003) can greatly increase speed, as the key space is computed once and each possible password stored as a hash within a database table or binary file structure for easy reuse. The limiting factor then becomes the speed at which the password hash can be compared against every entry in the rainbow table database. Passwords are typically stored in file structures as a cryptographic hash or fixed-length ciphertext, not as plaintext. Without hashing or encryption, compromise of the password is trivial: it is achieved by simply opening the file that contains the password and reading it. To increase the security of passwords, a cryptographic process is therefore applied to the password, with the resulting output referred to as a hash. In this form, the probability of an attacker obtaining or guessing the password on a first guess is very low. The MD5 hash algorithm is a common method employed to achieve password obfuscation in this manner (Marechal, 2008). There are different techniques that can be used to break or crack passwords. A brute force attack uses a systematic method of guessing the password by enumerating combinations of the allowable characters, symbols and numbers.
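The relationship between key space size and cracking time can be made concrete with a little arithmetic; this sketch assumes the illustrative rate of one billion guesses per second mentioned above, not a measured figure:

```python
# Exhaustive-search time for a key space, assuming a guess rate of
# one billion attempts per second (an illustrative figure only).
def seconds_to_exhaust(keyspace_bits: int, guesses_per_second: float = 1e9) -> float:
    return (2 ** keyspace_bits) / guesses_per_second

# The ~2^43 LM key space cited in the text falls in hours:
lm = seconds_to_exhaust(43)
print(f"{lm:.0f} s (~{lm / 3600:.1f} h)")

# By contrast a modern 128-bit space remains intractable:
print(f"{seconds_to_exhaust(128):.3e} s")
```

This is why the LM scheme is considered broken at GPU-cracking rates, while large key spaces force attackers toward dictionary attacks and pre-computation instead of exhaustion.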
A dictionary attack creates hashes of words that appear in a dictionary and compares them to the stored password hashes, or feeds each candidate as input to the login mechanism of a live system. The former method is commonly referred to as an offline attack, the latter as a live attack. Rainbow tables are databases of pre-computed hashes of various character combinations, typically stored in an efficient binary structure that allows fast retrieval. Password techniques that utilise plaintext wordlists can also be deployed; these types of attack tend to utilise social engineering techniques and deductive reasoning to pick viable candidate passwords. In some cases, such wordlists are provided as defaults with security software distributions or attack utilities, for example those included in Kali. Evidence presented by Rabadia and Valli (2014) demonstrates the use of these password lists by attackers. Kippo itself uses a configurable list of default passwords to determine which login attempts are accepted.

ATTACKER BEHAVIOUR POST-COMPROMISE

After achieving login on an account, an attacker will typically want administrative control of the device, also referred to as "owning" the system. The attacker then typically downloads malicious code and executes it, compromising the machine with infected binaries or privilege escalations that allow remote administrative access. Achieving remote access provides persistent access and allows the cyber-criminal to use the computing device for their own activities at will. By design, the kippo honeypot allows all of this malicious activity to occur: if attackers log in, they are able to interact with a fake shell and download files to the honeypot. The files are downloaded using emulated wget functionality and stored in a sandbox for later retrieval and examination by the honeypot operator.
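An offline dictionary attack of the kind just described reduces to a few lines of code; the wordlist and target hash here are invented for illustration, and real systems mitigate this by salting and iterating the hash:

```python
import hashlib

def dictionary_attack(target_hash, wordlist):
    """Hash each candidate word and compare it against the stored hash.

    Returns the matching plaintext word, or None if the wordlist is
    exhausted without a match (an offline attack: no login attempts
    are made against a live system).
    """
    for word in wordlist:
        if hashlib.md5(word.encode()).hexdigest() == target_hash:
            return word
    return None

# Hypothetical stored credential: an unsalted MD5 hash of "123456".
stored = hashlib.md5(b"123456").hexdigest()
print(dictionary_attack(stored, ["password", "letmein", "123456", "qwerty"]))
# → 123456
```

The live variant is the same loop with the comparison replaced by an actual login attempt, which is exactly the traffic the kippo honeypots record.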
Apart from logging and recording the shell interactions as attack activity occurs, kippo also extracts other relevant artefacts from sessions with the attacker. As mentioned previously, one such artefact is the SSH signature presented during the session, which can be used to identify the attacking entity by its digital fingerprint. This fingerprint information was instrumental in detecting a significant change in malicious SSH activity since this research commenced in late 2011. In addition to the kippo honeypot software, all of the honeypot systems run p0f, a passive operating system fingerprinting tool (Zalewski, 2015). This program inspects TCP transmissions and TCP/IP stack responses and attempts to determine the attacker's operating system through fingerprinting and signature matching. The commonly used offensive tool nmap works on similar principles; the major difference is that p0f operates passively, while nmap is proactive and sends packets to the target.

The story so far

The kippo honeynet in this research has been in existence since early 2011 and has expanded with the addition of new sensors. There are now 22 sensors in total, spread physically around the globe. There are VPS servers located in the USA, Germany, the Netherlands, Singapore, Australia and the Philippines, and as previously mentioned these are all installed on a maintained Ubuntu LTS (Long Term Support) platform, currently Ubuntu 14.04 LTS. In addition to the VPS assets, there are ADSL-based honeypots deployed in Australia; these utilise Raspberry Pi implementations as well as i686-based Ubuntu servers with configurations identical to the VPS servers. The project detected a wide range of SSH fingerprint signatures, as shown in Table 1, prior to 12th November 2014, totalling approximately 1.2 million interactions, increasing to 18.6 million interactions by 5th March 2015 (Table 2).
The attackers that connected to the honeypots prior to 12th November 2014 had predominantly presented Unix/Unix-like signatures, as shown in Table 1, with the Kali and BackTrack Linux distributions, which use the libssh2 libraries, representing 99% of all malicious login attempts on the honeypots.

Table 1 – Top 10 SSH Signatures detected by honeypots

 1  SSH-2.0-libssh2_1.4.2                    825729
 2  SSH-2.0-libssh2_1.4.3                    342920
 3  SSH-2.0-libssh2_1.4.1                      7101
 4  SSH-2.0-JSCH-0.1.51                        4390
 5  SSH-2.0-libssh2_1.4.0                      2230
 6  SSH-2.0-OpenSSH_5.2                        1530
 7  SSH-2.0-paramiko_1.8.1                     1157
 8  SSH-2.0-libssh2_1.0                         843
 9  SSH-2.0-OpenSSH_6.0p1 Debian-4+deb7u2       322
10  SSH-2.0-libssh2_1.4.3 PHP                   134
    Total                                   1186356

Table 2 - Top 10 SSH Signatures until 05/03/2015

 1  SSH-2.0-PUTTY                          12477973
 2  SSH-2.0-libssh2_1.4.2                   3536116
 3  SSH-2.0-libssh2_1.4.3                   1853226
 4  SSH-2.0-libssh-0.1                       310530
 5  SSH-2.0-libssh2_1.4.1                    225762
 6  SSH-2.0-JSCH-0.1.51                       65791
 7  SSH-2.0-PuTTY_Release_0.63                51646
 8  SSH-2.0-libssh-0.4.8                      37160
 9  SSH-2.0-libssh2_1.4.0                      9131
10  SSH-2.0-JSCH-0.1.44                        6472
    Total                                  18573807

As of 5th March 2015 (Table 2), these Linux signatures represented only 30.2% of all malicious login attempts. At that point in time, the dominant signature was SSH-2.0-PUTTY, which represented 67.1% of all attempts. It should be noted that the SSH-2.0-PUTTY signature had not been seen on the honeynets prior to 27th October 2014, when 10 connections were observed in a relatively short period of time. The next significant event was on 13th November, when 69 attempts were recorded. A significant increase in the use of the tool commenced on 22nd November, when 13,788 attempts were made from a single /24 network. This /24 could not initially be identified in IP-based geolocation databases, but it is now identified as apparently originating from China. Initial traceroute reconnaissance by the researchers also indicated that the traffic was propagating from Chinese mainland assets.
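The shares quoted above follow from the counts in Table 2; a minimal aggregation over the top-ten figures (transcribed from the table) shows the PuTTY signature at roughly 67% of these attempts, consistent with the 67.1% cited in the text:

```python
# Client-signature counts transcribed from Table 2 (to 05/03/2015).
signatures = {
    "SSH-2.0-PUTTY": 12477973,
    "SSH-2.0-libssh2_1.4.2": 3536116,
    "SSH-2.0-libssh2_1.4.3": 1853226,
    "SSH-2.0-libssh-0.1": 310530,
    "SSH-2.0-libssh2_1.4.1": 225762,
    "SSH-2.0-JSCH-0.1.51": 65791,
    "SSH-2.0-PuTTY_Release_0.63": 51646,
    "SSH-2.0-libssh-0.4.8": 37160,
    "SSH-2.0-libssh2_1.4.0": 9131,
    "SSH-2.0-JSCH-0.1.44": 6472,
}

total = sum(signatures.values())
putty_share = 100 * signatures["SSH-2.0-PUTTY"] / total
print(f"total={total}, SSH-2.0-PUTTY share={putty_share:.1f}%")
```

The top-ten counts sum exactly to the table's stated total of 18,573,807, which is a useful sanity check when transcribing honeypot statistics.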
The other interesting aspect of this traffic is that, prior to 12th November, there had been fewer than 20 contacts in total from that /24 IP address space over the entire period of operation of the honeynet. A histogram of all attacks with the signature SSH-2.0-PUTTY is shown in Figure 2.

Figure 2 – Histogram of login attempts against all honeypot sensors where the SSH-2.0-PUTTY client was used

All IPs within the particular /24 still have daily contact with the honeynet, with the total number of attempts ranging from 0 to 50,000 on any given day from single IPs in that network address range. The histogram covers 14,130,288 login attempts, of which only 42,236 were successful, a 0.30% success rate. The attack "network" has grown significantly as the attackers have compromised machines globally. Initial contact with the honeypots using the SSH-2.0-PUTTY client signature was restricted to the same /24, but as machines were successfully compromised they in turn started to contact the honeynet nodes. Contacts from nodes other than the /24 numbered 210,159, and Figure 3 shows the geographic spread of these contacts.

Figure 3 – Geographic spread of new attackers attacking the honeypot sensor network

The top 5 countries identified as attacking the honeynet are 73341, France 25643, USA 15416, Turkey 7509 and Brazil 6472. This is reflected pictorially in decreasing shades of green, i.e. largest = darkest shade. It should be noted that while these are the attackers seen by the 22 nodes in the honeynet, this is not an exhaustive mechanism. However, given that the modus operandi of repeated multiple attempts from the new members of the attack network is consistent with that of the "original" /24, it seems likely that the same actor is responsible. The brute-force nature of the attempts indicates automated retries of logins.
Where login was achieved, the packet captures also evinced highly repetitious reuse of the same scripts or code signatures to attack the systems once compromised.

DISCUSSION AND CONCLUSION

These attacks have now been ongoing and persistent for over four months, and appear to be increasing in magnitude over time. The attack would appear to be relatively unsophisticated, repetitive, verbose and inefficient. From analysis of the collected data it would appear that the attacking entities are not sharing attack data, and that the attacks are noisy rather than efficient and optimised. One possible explanation is that the honeypots do not respond back to, or provide "alive" tokens to, the attacking entity, since the researchers do not run the malcode the attackers download. This lack of response could be the cause of the retries by the attackers. The logic employed would appear to be: "if the compromise of the box was successful, i.e. we were able to log in and successfully download the malfeasant code, then leave it alone." The observed behaviour, however, was: "if the code has not deployed successfully, because we do not have control, then re-attempt installation." This finding has implications for honeypot design sophistication and deployment, and is a valuable outcome in and of itself. To prevent this behaviour, a method for sending "false positives" back to the attacking entity, mimicking command and control, would need to be developed. The pattern of reattempted compromise in this instance is consistent with the intention of a honeypot, which is to exhaust or distract resources away from legitimate targets through deception. Every retried compromise and install represents resource usage by the attackers. This usage includes, but is not limited to, actual machine run time, consumption of network bandwidth and scanning activity, all of which consume finite resources on the part of the attacker.
In addition to resource wastage, the activity provides, with every attempt, more evidence of the actual attack and in most cases would represent repeat criminal offences. Of particular interest is the recent observation of the initial use, and subsequent significant increase, of attacks using the PuTTY SSH tool. Further, a significant quantity of these attacks apparently originated in China. There may be a number of reasons for this, but one hypothesis is that the attackers leveraged compromised Windows computers in China as the initial attack platform. Data suggest that the majority of computers in China run Windows, with most copies pirated and largely unpatched, and thus insecure and susceptible to compromise themselves (Popa, 2015). Use of compromised computers as a third-party attack platform is not uncommon, as it makes it harder to identify the true origin of a cyberattack (Livadas, Walsh, Lapsley, & Strayer, 2006). This calls into question whether these attacks truly originate in China, as has been suggested in previous honeypot research (Pouget & Dacier, 2004). Further research is now being conducted on the payloads downloaded from the attacking entities. One of the features of the honeynet is that it will download a file, check its MD5 sum, and discard the download if the file already exists; this is advantageous as otherwise there would be significant storage implications for this research alone. There are also data with respect to detected OS fingerprints for attacking entities, which will be presented in further research papers. Finally, the honeynet is functioning as it should, and this particular persistent attack has yielded, and continues to yield, significant data for analysis and interpretation.

REFERENCES

Ciampa, M. D. (2010). Security awareness: Applying practical security in your world (3rd ed.). Boston: Course Technology.

desaster. (2015). kippo.
Retrieved from https://github.com/desaster/kippo

Elasticsearch. (2015). Retrieved from https://www.elastic.co/products/elasticsearch: Elasticsearch BV.

Hosting, G. P. (2013). Kippo SSH honeypot. Retrieved 09.10.2013, from http://code.google.com/p/kippo/

IDS, S. (2013). SURFcert IDS. Retrieved 20.10.2013, from http://ids.surfnet.nl/wiki/doku.php

Kibana (Version 3.1.2). (2015). Retrieved from https://www.elastic.co/products/kibana: Elasticsearch BV.

Labs, B. (2011). Installing Kippo SSH honeypot on Ubuntu. Retrieved 27.09.2013, from http://bruteforce.gr/installing-kippo-ssh-honeypot-on-ubuntu.html

Livadas, C., Walsh, R., Lapsley, D., & Strayer, W. T. (2006). Using machine learning techniques to identify botnet traffic. Paper presented at the 31st IEEE Conference on Local Computer Networks.

Marechal, S. (2008). Advances in password cracking. Journal in Computer Virology, 4(1), 73-81.

Oechslin, P. (2003). Making a faster cryptanalytic time-memory trade-off. Paper presented at the 23rd Annual International Cryptology Conference, CRYPTO '03, Santa Barbara, California, USA.

Popa, B. (2015). More than 97 percent of computers in China now running Windows, mostly pirated. Retrieved March 2015, from http://news.softpedia.com/news/97-Percent-ofComputers-in-China-Now-Running-Windows-Mostly-Pirated-472110.shtml

Pouget, F., & Dacier, M. (2004). Honeypot-based forensics. Paper presented at the AusCERT Asia Pacific Information Technology Security Conference.

Stevens, R., & Pohl, H. (2004). Honeypots und Honeynets. Informatik-Spektrum, 27(3), 260-264. doi: 10.1007/s00287-004-0404-y

Tatham, S. (2015). PuTTY: A free Telnet/SSH client. Retrieved from http://www.chiark.greenend.org.uk/~sgtatham/putty/

TwistedMatrixLabs. (2013). What is Twisted? Retrieved 23.09.2013, from http://twistedmatrix.com/trac/

Valli, C. (2012). SSH: Somewhat Secure Host. Paper presented at Cyberspace Safety and Security, Melbourne, Australia.

Valli, C., Rabadia, P., & Woodward, A. (2013).
Patterns and patter - An investigation into SSH activity using Kippo honeypots. Paper presented at the Australian Digital Forensics Conference, Edith Cowan University.

Zalewski, M. (2015). p0f v3. Retrieved March 2015, from http://lcamtuf.coredump.cx/p0f3/

TWO CHALLENGES OF STEALTHY HYPERVISORS DETECTION: TIME CHEATING AND DATA FLUCTUATIONS

Igor Korkin
National Research Nuclear University Moscow Engineering & Physics Institute (NRNU MEPhI)
Department of Cryptology and Discrete Mathematics
Moscow, 115409, Russia
[email protected]

ABSTRACT

Hardware virtualization technologies play a significant role in cyber security. On the one hand, these technologies enhance security levels by making it possible to design a trusted operating system. On the other hand, they can be incorporated into modern malware, which is rather hard to detect. None of the existing methods is able to efficiently detect a hypervisor in the face of countermeasures such as time cheating, temporary self-uninstalling, memory hiding etc. The new hypervisor detection methods described in this paper can detect a hypervisor under these countermeasures and can even count several nested ones. These novel approaches rely on a new statistical analysis of time discrepancies through examination of a set of instructions which are unconditionally intercepted by a hypervisor. Reliability was achieved through comprehensive analysis of the collected data despite its fluctuation. The proposed methods were comprehensively assessed on both Intel and AMD CPUs.

Keywords: hypervisor threat, rootkit hypervisor, nested hypervisors, instruction execution time, statistics and data analysis, Blue Pill.

1. INTRODUCTION

Nowadays successful malware detection is becoming increasingly important, because malware cyber-attacks can result in financial, reputational, process and other losses. We can overcome these risks only through anticipatory development of advanced cyber security solutions.
Intel and AMD have released more advanced CPUs with hardware virtualization support, which run code directly on top of the physical hardware. This privileged code is named a Virtual Machine Monitor (VMM), bare-metal hypervisor or just "hypervisor". A hypervisor with secure system monitor functions allows us to run multiple OSes at the same time on one PC (see Figure 1). As a result this architecture maximizes hardware utilization and reduces the costs of operation. This is an obvious advantage of hardware virtualization based hypervisors (Derock, 2009; Barrett & Kipper, 2010). At present more than a billion processors with this technology are installed in workstations as well as in cloud computing servers on the Internet. However, at the same time hardware virtualization technology increases the vulnerability of systems, seeing that a rootkit hypervisor with backdoor functionality can be planted in the PC (Ben-Yehuda, 2013). This type of rootkit is also known as a Hardware-based Virtual Machine rootkit (HVM rootkit). The cyber security community faces the challenge of hypervisor detection. Presently there is no built-in tool to detect a hypervisor reliably. Of course we can check basic things: the CR4.VMXE bit in the Intel case (Intel, 2014) or the EFER.SVME bit in the AMD case (AMD, 2013), but a hypervisor can hide the original value. Moreover, it is impossible to block, stop or unload a hypervisor by using existing known cyber security tools residing at the virtualized OS level.

Figure 1 PC without Hypervisor and under Control of the Two Nested Hypervisors: a Legitimate one and a Rootkit

The difficulties of this challenge arise from the following causes. Firstly, hypervisors can use a wide variety of different techniques to prevent detection. Secondly, it is possible to run several nested hypervisors.
Thirdly, a hypervisor can be installed via a driver or boot records as well as via BIOS (Kovah, Kallenberg, Butterworth, & Cornwell, 2014) or UEFI (Bulygin, Loucaides, Furtak, Bazhaniuk, & Matrosov, 2014), which makes deleting a hypervisor rather difficult. Utin (2014) analyzed the possibility of a BIOS-based hypervisor threat. The author's ideas are based on a suspicious hypervisor (Russian Ghost) whose detection is simple, because it does not apply any countermeasures. Despite the fact that hardware virtualization is not new and involves a world-wide community of researchers, the development of effective hypervisor detection methods has so far been without success. The goal of this paper is to tackle this issue. This article presents new detection methods which are based on the difference between the instruction execution time (IET) with a hypervisor and without it. We applied a set of specific instructions which cause VM exits unconditionally or are trapped by a hypervisor. As a result, IET takes significantly more time with a hypervisor than without one. This time discrepancy is commonly used to detect hypervisors. However, detection by time is possible only if a hypervisor is not hiding itself via timestamp cheating (Fritsch, 2008; Garfinkel, Adams, Warfield, & Franklin, 2007) or via temporary self-uninstalling – the Blue Chicken technique (Rutkowska & Tereshkin, 2007). Under these conditions, hypervisor detection methods based on time discrepancies will not work. Therefore, a totally new hypervisor detection approach, which is resilient to countermeasures, is needed. In a nutshell, the proposed methods consider the IET as a random variable whose properties depend on hypervisor presence. That is why, by applying probabilistic and statistical methods to IET, it may be possible to detect a hypervisor.
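The core idea — treating IET as a random variable and comparing its distribution against a hypervisor-free baseline — can be sketched with synthetic data as follows (on real hardware the samples would come from timing an unconditionally intercepted instruction such as CPUID with RDTSC; the distributions and the simple k-sigma statistic here are illustrative assumptions, not the paper's full method):

```python
import random
import statistics

def looks_virtualized(samples, baseline_mean, baseline_stdev, k=6.0):
    """Flag a hypervisor if the mean of measured instruction execution
    times sits far above a hypervisor-free baseline.

    A k-sigma threshold on the mean is one simple statistic; richer
    statistics are needed to survive time cheating and fluctuating
    measurements, which is the subject of the paper.
    """
    return statistics.mean(samples) > baseline_mean + k * baseline_stdev

random.seed(1)
# Synthetic baseline: CPUID-like timings on bare metal (cycles).
bare = [random.gauss(200, 10) for _ in range(1000)]
mu, sigma = statistics.mean(bare), statistics.stdev(bare)

# Synthetic virtualized timings: a VM exit adds a large, noisy overhead.
virt = [random.gauss(2000, 300) for _ in range(1000)]

print(looks_virtualized(bare, mu, sigma))  # expected False
print(looks_virtualized(virt, mu, sigma))  # expected True
```

A hypervisor that cheats the timestamp can collapse the difference in means, which is why the methods developed later in the paper analyse further distributional properties of the samples rather than the mean alone.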
Our detection methods improve on the current time-based detection method, which uses unconditionally intercepted instructions. Unlike the original method, our approach is able to detect any stealthy hypervisor which has applied countermeasures: time cheating, temporary self-uninstalling etc. This is a distinct advantage of these new methods. The remainder of the paper is organized as follows. Section 2 is devoted to the analysis of the most popular software and hardware hypervisor detection approaches. The analysis considers the case of a hypervisor using countermeasures to prevent its detection, such as time cheating, temporary self-uninstalling, preventing memory dump acquisition etc. Section 3 contains the processor behavior analysis in three cases: without a hypervisor, with one, and with several nested hypervisors. The analysis identifies new useful statistics for the IET which can reveal hypervisors. In Section 4 the experimental results of the statistics examination are presented. The positive results of these checks make it possible to analyze IET as a random variable, which in turn allows us to use threshold values of the statistics to detect each hypervisor. This approach works well under countermeasures and fluctuations of measured time durations. The author's threshold generation methods and hypervisor detection approaches, together with their analysis, are briefly presented. Section 5 contains the main conclusions and further research directions.

2. RELATED WORK

Nowadays there is no built-in hypervisor detection tool for Intel CPUs, and the built-in tool for AMD CPUs is vulnerable to hypervisor countermeasures. Therefore researchers are working hard to solve this challenge. This paper gives a classification and analysis of all publicly available hypervisor detection methods and approaches. The history of hypervisor detection started in 2007 after the first hypervisor rootkit, "Blue Pill", was presented by Rutkowska (2006).
"Blue Pill" is a Windows-based driver for AMD CPUs. At the same time Dai Zovi (2006) released "Vitriol" – a similar hypervisor for Mac OS and Intel CPUs. A comparative analysis of these two hypervisors was presented by Fannon (2014). "Blue Pill" and "Vitriol" became high-profile tools in the information security sphere and motivated the creation of a lot of different approaches to hypervisor detection. Their classification is given in Figure 2. We can classify these into four categories: signature-based, behavior-based, detection based on a trusted hypervisor, and approaches which use time analysis. Signature-based detection uses memory scanning for hypervisors' patterns. The latter three categories are based on interaction with a hypervisor.

Figure 2 Hypervisor Detection Methods Classification: signature-based; behavior-based (by Translation Lookaside Buffer (TLB), by errors in hypervisors, by errors in CPU); based on the trusted hypervisor; time-based (by TLB, by Return Stack Buffer (RSB), by memory access, by unconditionally intercepted instructions)

2.1. Signature-Based Detection

After a hypervisor has been loaded into memory, its dispatcher (VMM handler) and Virtual Machine Control Structure (VMCS in the Intel case) will be located in memory. The hypervisor can be detected by signature analysis of the physical memory (Bulygin & Samyde, 2008; Desnos, Filiol, & Lefou, 2011; Medley, 2007). This approach consists of two stages, memory dump and its inspection, neither of which is resilient to the hypervisor's countermeasures. Analysis shows that software-based memory dump approaches are vulnerable, whereas the hardware ones are only applicable under laboratory conditions (Korkin & Nesterov, 2014). Let us analyze how resistant the current hypervisor signatures are. Fritsch (2008) proposed to detect the "Blue Pill" hypervisor by searching for the "BLPB", "BLUE" and "BLUP" strings in a memory dump.
However, in common cases such strings will be unknown to analysts. The Actaeon system (Graziano, Lanzi, & Balzarotti, 2013) is based on searching for VMCS fragments. However, this method can sometimes fail: for example, a hypervisor can allocate 100 VMCS-like structures in memory to hamper detection. These structures are similar to the original VMCS, so the Actaeon system may report many false VMCSes, and separating the original one from the rest will require a considerable amount of manual work. As a result, signature-based detection is ineffective for resistant hypervisors.

2.2. Behavior-Based Detection

Behavior-based detection relies on the differences in system activity in the two cases, with and without a hypervisor. There are three behavior-based detection methods: TLB-based detection, and methods based on errors in hypervisors and errors in CPUs.

2.2.1. TLB-Based Detection

It is possible to apply the Translation Lookaside Buffer (TLB), a memory cache used to speed up address translation, to detect a hypervisor (Desnos et al., 2011; Fritsch, 2008; Morabito, 2012; Wailly, 2014). The TLB includes a set of recently accessed virtual addresses and their corresponding physical addresses. Every time the OS accesses memory, a corresponding TLB entry is searched for. If the requested virtual address is present in the TLB, the retrieved physical address will be used to access memory. Otherwise, a longer search with the help of the Page Directory will occur. This peculiarity will be discussed later in Section 2.4.1. It is known that a VM exit leads to flushing of the TLB when a hypervisor is present; without a hypervisor such clearance does not occur. This is why hypervisor detection reduces to checking the TLB content, which can be done in several ways, for example by modifying a page table entry (Myers & Youndt, 2007). However, TLB-based detection does not work on AMD CPUs and new Intel CPUs: the new supplementary TLB fields "ASID" and "PCID" prevent a VM exit from flushing the TLB.

2.2.2.
Detection Based on Bugs in CPU

A hypervisor can be detected with the help of bugs in certain CPU models, where the results of some instructions depend on whether or not a hypervisor is present. The "Erratum 140" in AMD CPUs is based on using the result of "RDMSR 10h": the original value of the time stamp counter (TSC) is returned by "RDMSR 10h", while "RDTSC" gets the sum of the TSC value and the VMCS.TSC_OFFSET value (AMD, 2011). Another bug, "VMSAVE 0x67", freezes the system: the execution of the VMSAVE instruction with the 0x67 prefix stops the virtualization system, while without a hypervisor this error does not occur (Barbosa, 2007). These detection methods are applicable only to outdated CPUs and require non-trivial adaptation to new CPUs.

2.2.3. Detection Based on Bugs in Hypervisors

There are software hypervisor bugs similar to hardware bugs in CPUs. Microsoft published the "Hypervisor Top-Level Functional Specification", which describes how to detect a hypervisor and get the "Hypervisor Vendor ID Signature" by using CPUID (O'Neill, 2010; Microsoft, 2013). A spoofing attack is likely to occur, since a hypervisor can replace the data returned by a trapped CPUID execution. The "Blue Pill" hypervisor has a built-in control interface, which uses "Bpknock" hypercalls (BluePillStudy, 2010; Fritsch, 2008). Calling CPUID with EAX=0xbabecafe changes EAX to 0x69696969 if "Blue Pill" is present; otherwise no such change occurs. Due to the hypervisor's built-in control interface it is possible not only to detect, but also to unload a hypervisor (Gabris, 2009). A hypervisor can also be detected by reading debugging messages; for example, a developer or hacker might have forgotten to remove DbgPrint calls, which can disclose a hypervisor's activity. These approaches can reveal only well-known hypervisors which do not take countermeasures.

2.3. Detection Based on the Trusted Hypervisor

A hypervisor which is loaded first can control and block the activity of hypervisors which are loaded later.
This detection method was used in "McAfee DeepSAFE" (McAfee, 2012), "Hypersight Rootkit Detector" (North Security Labs, 2011), and "Symantec Endpoint Protection" (Korkin, 2012), and it has also been discussed in the literature (Park, 2013; Wang & Jiang, 2010). The approach is vulnerable to a "Man-In-The-Middle" (MITM) attack, in which an illegitimate hypervisor gains control first and compromises a legitimate one loaded later. TPM-based attestation of the hypervisor can prevent this attack, although the TPM mechanism is vulnerable too (Berger et al., 2006; Brossard & Demetrescu, 2012; Wojtczuk & Rutkowska, 2009; Wojtczuk, Rutkowska, & Tereshkin, 2009). A MITM attack can also be prevented by loading the hypervisor from BIOS, or by applying a Trusted Startup Hardware Module (Accord, 2010). However, due to the difficulty of porting this detection method, it is applicable only in laboratory settings.

2.4. Time-Based Detection
Time-based detection measures the duration of specific operations or profiles their execution time. When a hypervisor is present, the execution of such operations is intercepted by the hypervisor, so their duration is longer than without a hypervisor. Four time-based methods can be mentioned: TLB- and RSB-based detection, detection based on memory access, and detection by unconditionally intercepted instructions. Let us focus on how these methods fare when a hypervisor prevents its detection by time cheating and temporary self-uninstalling.

2.4.1. TLB-Based Detection. As mentioned in Section 2.2.1, the TLB is flushed every time a VM exit occurs, after which it must be slowly refilled. This fact can be used to detect a hypervisor as follows (Ramos, 2009; Rutkowska, 2007):
1. Read the content of a specific memory address.
2. Repeat step 1 and measure its duration. In this case the TLB entry added in step 1 will be used.
3.
Execute an unconditionally intercepted instruction (forcing a #VMEXIT).
4. Repeat step 2.
5. Decide whether a hypervisor is present by comparing the results of steps 2 and 4.
This approach fails if the hypervisor uses time cheating, because then there is no significant difference between the results of steps 2 and 4. It also shares the disadvantages listed in Section 2.2.1.

2.4.2. RSB-Based Detection. Another detection method is based on the Return Stack Buffer (RSB), a structure that increases CPU performance. Like the TLB, the RSB is altered when a VM exit occurs, but unlike the TLB it holds the return addresses used by RET instructions. Applying the RSB to hypervisor detection was described by Bulygin (2008) and later by Fritsch (2008) and Athreya (2010). After 16 nested function calls, the RSB contains the 16 corresponding return addresses. The idea of the detection is to fill the RSB, cause a VM exit (for example by executing an unconditionally intercepted instruction), and measure the execution time of returning through these 16 functions. If a hypervisor is present, its handling of the VM exit overwrites part of the RSB entries, so the total duration is longer than without a hypervisor. This method is vulnerable to countermeasures: a hypervisor whose dispatcher calls no sub-functions leaves the RSB intact, and the method is also vulnerable to a time cheating attack (Athreya, 2010).

2.4.3. Detection Based on Memory Access. A hypervisor can prevent its signature-based detection by controlling memory access (Section 2.1), which increases the duration of memory accesses and can itself be applied to hypervisor detection (Fisher-Ogden, 2006; Fritsch, 2008). Walking successively through memory, we measure the access duration of each memory page. A memory region with excessive access duration is a stealth memory region, which may contain the hypervisor dispatcher and its corresponding structures.
However, this method works only if the hypervisor does not use time cheating for self-protection.

2.4.4. Detection by Unconditionally Intercepted Instructions. The execution time of unconditionally intercepted instructions increases after any hypervisor has been loaded in the system. We can detect hypervisor presence by comparing measured durations with threshold values (Athreya, 2010; Lakshminarayanan, Patel, Robinson, & Soulami, 2012). Hardware virtualization on Intel CPUs defines a set of unconditionally intercepted instructions, e.g. CPUID (Intel, 2014); on AMD CPUs we can use RDMSR (Morabito, 2012), which must be trapped by a hypervisor. Some authors suggest measuring HDD access time, RAM access time, or the duration of a cryptographic computation (Kyte, Zavarsky, Lindskog, & Ruhl, 2012; Pek & Buttyan, 2014), but such events can be intercepted only by specialized hypervisors, so these measurements do not work in ordinary cases. The approach is vulnerable to the "Blue Chicken" technique and to time cheating (Rutkowska & Tereshkin, 2008). Nevertheless, it appears to be the most attractive one because of its usability and portability. It is also universal: a hypervisor always spends time on VM exits (and VM entries), and this time has to be hidden. Because of these advantages this approach was chosen and significantly improved.

2.5. Analysis of Counters to Measure Instruction Execution Time
Instruction execution time (IET) is the main focus of this research, so let us classify and analyze the computer counters that can be applied to measure, for example, the execution time of ten CPUID instructions. Counters can be classified as software and hardware ones. Hardware counters use device capabilities and may be further classified as local and remote.
The software counter (or SMP counter) is based on the simultaneous work of two loops running on different CPU cores (Desnos et al., 2011; Jian, Huaimin, Shize, & Bo, 2010; Morabito, 2012). The first thread increments a control variable, while the second executes the unconditionally intercepted instruction in a loop, for example 1000 times. The conclusion about hypervisor presence is made by comparing the final value of the control variable with a threshold. Li, Zhu, Zhou, and Wang (2011) describe how to defeat this approach by modifying the memory that holds the control variable. To measure IET we can use the following hardware counters: TSC, RTC, ACPI timer, APIC timer, HPET, PIT, local device counters (e.g. a GPU timer), and NTP-based clocks. Our analysis shows that all these counters apart from TSC and the SMP counter have low resolution and cannot be used in ordinary cases. SMP counting requires at least two CPU cores and can be cheated. The best choice for measuring IET is the TSC because of its accuracy and high resolution; it also works on all CPUs. To eliminate the influence of other running programs on IET, we can read the TSC at the highest IRQL and set the affinity of the measuring code to one CPU core. Another useful property of the TSC is that it is easy to cheat, so we can simulate a stealthy hypervisor and test our detection approach under realistic conditions.

2.6. Conclusion
The above analysis shows that the existing approaches and hypervisor detection tools have the following drawbacks:
1. Signature-based approaches are vulnerable to hypervisor countermeasures. Only the Actaeon project can detect nested hypervisors, and it too can be defeated.
2. Behavior-based detection methods do not reveal new hypervisors and do not work on new CPUs.
3. The trusted-hypervisor approach is susceptible to MITM attacks.
4. Time-based detection approaches are vulnerable to time cheating and the Blue Chicken technique.
Detection by unconditionally intercepted instructions is highly attractive, because it relies on a generally applicable mechanism. By improving data acquisition and processing, we can overcome the drawbacks of this method.

3. THEORETICAL PRINCIPLES FOR ENHANCEMENT OF TIME-BASED DETECTION
Detection by unconditionally intercepted instructions works well only if a hypervisor does not apply countermeasures: time cheating and temporary self-uninstalling. This section describes an enhancement of the method. Our prerequisites are based on specific features of IET. One is the relation between the average IET and the presence of a hypervisor. Another well-known feature is the random nature of IET, but it has been unclear how to use it in practice. To bridge this gap, let us look at the switching schemes between the CPU operating modes that apply after the OS is loaded. We demonstrate and analyze what actually happens when a set of CPUID instructions is executed in three cases: when a hypervisor is present, when none is present, and when several nested ones are present. We then focus on two IET characteristics: the dispersion of the IET array and the layering of the IET array. According to several papers (Duflot, Etiemble, & Grumelard, 2006; Embleton, Sparks, & Zou, 2008; Zmudzinski, 2009), without a hypervisor a CPU operates in one of two modes: the Protected Mode (P-mode) or the System Management Mode (S-mode), as depicted in Figure 3a. A System Management Interrupt (SMI) switches the CPU from P-mode to S-mode; the CPU leaves S-mode and returns to the previous mode via the RSM instruction. Because SMIs are random in nature, the CPU is a stochastic system with random transitions between states, and IET is therefore a random value determined by the number of SMIs. After a hypervisor is loaded, the CPU can switch between three modes.
As in the previous case the P- and S-modes are present, but an additional VMX root mode (V-mode) is added, and the P-mode becomes the VMX non-root mode (Intel, 2014). The P-mode is treated as the main one; the S-mode is duplicated for clarity, see Figure 3b. Execution of each CPUID instruction in P-mode always causes a switch to the V-mode (VM exit), and after execution the CPU switches back to the P-mode. A switch to the S-mode may occur either from P-mode or from V-mode. As before, the CPU works as a stochastic system, but switching to the V-mode enhances its random nature. As a result, switching increases both the average value of IET and its variability.

[Figure 3. Switching between modes in two cases: (a) without a hypervisor, (b) with one hypervisor]

The CPU works in a similar way when several hypervisors are present (Ben-Yehuda et al., 2010). It can still switch between three modes, but the situation differs because there are several hypervisor dispatchers, see Figure 4. In this case execution of each CPUID instruction in P-mode always causes a switch to the V-mode, after which each hypervisor's dispatcher is called in turn, from dispatcher #1 to dispatcher #2 and so on to dispatcher #e and back. Finally execution switches back to P-mode. The S-mode can gain control at any point. The CPU again works as a stochastic system, but the participation of several nested dispatchers significantly lengthens the execution time and increases IET variability. These schemes reveal that the root of the randomness of IET is the randomness of SMIs. Suppose that the probability, or frequency, of SMI is constant.
After a hypervisor is loaded, the increased IET means that more SMIs occur during the measurement as well. That is why the dispersion of IET increases after a hypervisor is loaded, and this fact can be used for detection. During the execution of a set of CPUID instructions the number of SMIs is limited. If we repeat the measurement of IET in a loop, we see that some of its values recur; hence the array of IET values can be grouped into sets with the same values (for details see Section 4). As a result, the array of IET values has a layered nature in all the cases described. The number of layers increases after a hypervisor is loaded, and this fact can also be used for hypervisor detection. The revealed IET variability indexes, the variance (or second moment) and the number of layers (or spectral width), are resilient to time cheating: a hypervisor can only decrease the mean value of IET, not its variability characteristics. As a first approximation this analysis reveals two theoretical hypervisor indicators. This result is based on a hypothesis and now has to be comprehensively verified by experiments.

[Figure 4. Switching between modes with several nested hypervisors]

4. INSTRUCTION EXECUTION TIME RESEARCH & NEW STEALTH HYPERVISORS DETECTION ALGORITHMS
Probabilistic hypervisor detection is discussed in several papers (Desnos et al., 2011; Fritsch, 2008; Jian et al., 2010; Morabito, 2012). All these methods work only if a hypervisor is not hiding itself. Moreover, these papers do not give enough attention to the random nature of IET. Detection of stealthy hypervisors faces two challenges, time cheating and data fluctuations, which are described in this paper.

4.1.
Experiments on Measurements of Instruction Execution Time
To detect a hypervisor we improve the detection method based on unconditionally intercepted instructions. We analyze IET sets in two cases: with a hypervisor and without one. Experimental data were obtained by measuring a set of ten CPUID instructions with RDTSC in a loop in a Windows driver, see Figure 5. To exclude the influence of other applications and drivers we bound the measuring thread to a single CPU core and raised the IRQL to its maximum level. It is also possible to use a deferred procedure call (DPC) to achieve exclusive access to the hardware; an example of this scheme is described by Blunden (2012). We use the CPUID instruction because it is unconditionally intercepted by any Intel-based hypervisor and is also a serializing instruction, which prevents out-of-order execution (Fog, 2014; Intel, 1998). Our proof-of-concept hypervisor (PoC hypervisor) is based on the VMM framework by Embleton (2007) with an added TSC cheating function. There are three ways to cheat the TSC: via the TSC_OFFSET field in the VMCS, or by intercepting the execution of RDTSC or of CPUID. We chose the last one: our hypervisor decreases the TSC value every time CPUID is executed, so its dispatcher is minimal. By cheating the TSC we ensure that the average values of IET are the same to within one clock tick whether the hypervisor is present or not; this is therefore the most difficult case for detection. To obtain data we used two nested loops. The inner loop, shown in Figure 5, was executed 1000 times without any delays. The outer loop was executed 10 times with a two-second delay between iterations. The results of one experiment were recorded in a 1000x10 array (see Table 1); each column contains the data from one inner loop. According to ISO 5725 repeatability requirements we repeated the complete experiment five times with a two-second delay between repetitions.
To control the reproducibility of the data we checked the results on 10 different days. In total over this period we measured 50 arrays of size 1000x10, which are processed below. That period was sufficient to reduce the variation intervals of the statistics: average values, variance, etc. Six PCs were involved in the testing, see Table 2. On the first five PCs we used our PoC hypervisor; on the last PC we used a specialized hypervisor loaded from BIOS, TRace EXplorer (TREX) by Tichonov and Avetisyan (2011).

KeSetSystemAffinityThread(affinity)
KfRaiseIrql(HIGH_LEVEL)
for (...) {
    RDTSC
    MOV hst, EDX
    MOV lst, EAX
    CPUID  // 1
    ...
    CPUID  // 10
    RDTSC
    MOV hfin, EDX
    MOV lfin, EAX
    save_time(hst, lst, hfin, lfin)
}
Figure 5 Code Fragment for Obtaining Data

Table 1 Example of an Array of Measured IET without a Hypervisor

                        Inner loop iteration
Measurement no.         1       2       …       10
1                       2896    2888    …       2896
2                       2896    2888    …       2880
…                       …       …       …       …
1000                    2888    2888    …       2888
Average of a column     2895    2888    …       2888
Variance of a column    1738    1267    …       1196

Table 2 CPU Models and OS Versions

PC#   CPU model and OS version
1     Intel Core 2 Duo E6300 / Windows 7
2     Intel Core 2 Duo E8200 / Windows 7
3     Intel Core 2 Duo E8600 / Windows Live CD XP
4     Intel Core i7 950 / Windows XP
5     Intel Xeon X5600 / Windows 7
6     AMD Phenom X4 945 / Windows Live CD XP

4.2. Probabilistic Nature of Instruction Execution Time
Desnos, Filiol, and Lefou (2011) suggested that instruction execution time is normally distributed and that there are no problems with the precision (repeatability and reproducibility) of the measurement data. However, all our experiments on different PCs showed that the measurement data are non-normally distributed: the data match no well-known distribution. Moreover, data fluctuation is so large that the mean and variance statistics differ significantly between sets of experiments. Therefore the precision of the measurement data does not comply with ISO 5725 (2004) requirements.
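The non-normality claim can be checked on any measured column by looking at sample skewness and excess kurtosis, both of which are near zero for normal data. A minimal Python sketch on synthetic stand-in data (the values are illustrative, not the authors' measurements):

```python
import random

def skewness_and_excess_kurtosis(xs):
    """Sample skewness and excess kurtosis; both are near 0 for a normal sample."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

random.seed(1)
# Synthetic stand-in for an IET column: a dominant layer plus rare large outliers.
iet = [2888 + random.choice([0, 8]) for _ in range(990)] + [20000] * 10
skew, ex_kurt = skewness_and_excess_kurtosis(iet)
print(skew, ex_kurt)  # both far from 0: the sample is clearly non-normal
```

Long-tailed, outlier-heavy data of this shape produce large positive values for both statistics, consistent with the observation that the measurements match no well-known distribution.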
We also have to take into consideration that outliers and jumps (discontinuities) are very common and alter the statistics, see Figure 6. A possible reason for the outliers and jumps is instruction pipelining. Because the time measurement procedure is quite simple and a PoC hypervisor with time cheating is available, we can obtain an abundance of experimental data for the research and detection phases, which helps significantly. The probabilistic nature of IET had to be taken into account when setting up the experiments; it revealed the data peculiarities, and only after preliminary processing of the data did we apply statistical methods.

4.3. Peculiarities of Instruction Execution Time and Ways of Hypervisors Detection
Our experiments confirmed the following:
1. IET measured by TSC is a random value, which depends on the CPU model, the OS version, and whether or not a hypervisor is present.
2. The average and variance of IET arrays are larger when a hypervisor is present than when it is not.
3. The difference in the average and variance of IET arrays becomes more significant after every new nested hypervisor is loaded.
We can easily and reliably detect a non-hidden hypervisor simply by comparing the average values of IET arrays: with a non-hidden hypervisor they are almost 10 times larger than without one.

[Figure 6. Scatter plot of an IET array fragment with one outlier and one jump]

But a hypervisor can apply the time cheating technique, making the average values of IET the same as the corresponding values without a hypervisor. No existing time-based detection methods work well under such circumstances, so our experiments focused on this challenging case. Common statistical methods proved inapplicable to hypervisor detection; the reasons are given below.
Statistics lets us determine whether there is a significant difference between two sets of data when we know in advance which set was measured under the effect and which without it. In the current situation, however, we have several sets. We could merge them into one large set and use classical approaches, but such an operation has to be justified, and there are no proven statistical methods for this case. Applying existing approaches to determine a significant difference between the sets did not yield positive results, for several reasons. The columns of the arrays can be treated either as random samples or as realizations of a random process. The first view fails because of the fluctuation of measurements and the lack of homogeneity; the second is not applicable either, because of overlapping variation intervals and the instability of the characteristics. Homogeneity of variances (HOV) is violated in all our experimental data, so we cannot use analysis of variance (ANOVA) in the data processing. We conclude that methods of parametric and nonparametric statistics are not applicable in this situation. That is why we developed the following methods, including the present authors' approaches:
1. Low-frequency filtration.
2. Calculation of experimental probability.
3. Two-step calculation of statistics.
4. Variation interval as confidence interval.
5. Iteration of measurements if the statistical value falls at the intersection of two variation intervals.
Filtration lets us decrease fluctuation and stabilize the variation characteristics. Calculating experimental probabilities lets us find threshold values and thereby minimize type I and II errors. We chose a two-step way of calculating statistics in order to reduce the overlap of the characteristic intervals.
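The low-frequency filtration step (method 1 above) can be sketched in Python. We assume here, as the filtration levels used later in Section 4.4 suggest, that filtration means discarding values whose relative frequency in the sample falls below the chosen level; the column values are made up:

```python
from collections import Counter

def low_frequency_filter(values, level):
    """Drop values whose relative frequency is below `level` (e.g. 0.05 = 5%)."""
    n = len(values)
    freq = Counter(values)
    return [v for v in values if freq[v] / n >= level]

# Illustrative column of IET measurements (hypothetical tick counts):
column = [2888] * 70 + [2896] * 25 + [2904] * 3 + [12000, 15000]  # rare outliers
filtered = low_frequency_filter(column, 0.05)
print(sorted(set(filtered)))  # [2888, 2896] - rare values and outliers removed
```

Rare values, including outliers, are removed while the dominant layers survive, which is what stabilizes the variation characteristics.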
To calculate a confidence interval we adopt the idea of the confidence interval method of Strelen (2004) and Kornfeld (1965), in which the confidence interval is taken to be the variation interval, i.e. the interval between the maximum (S_max) and minimum (S_min) values of the statistic. The confidence level is P{S_min ≤ θ ≤ S_max} = 1 − 0.5^(n−1), where n is the length of the sample. We also have to handle the situation when a calculated statistic value falls in the intersection of two variation intervals. In that situation it is impossible to decide whether a hypervisor is present or not, so we repeat the measurement of the IET arrays and the calculation of the statistics. By the multiplication theorem, repeated hits in the intersection zone are very unlikely.

4.3.1. Applying the Layered Structure of IET Arrays to Hypervisor Detection. Numerous experiments show that IET arrays have a layered structure: each IET array consists of several layers whose characteristics depend on the CPU, the OS, and whether or not a hypervisor is present. First of all, our experiments confirm that the number of layers with a hypervisor is larger than without one. To illustrate, the results of one experiment are given below. We measured IET arrays in two cases: without a hypervisor and with one. The right part of Figure 7 is a scatter plot of the IET arrays; each point corresponds to the measured duration of ten CPUID instructions. Experiment numbers are on the x-axis, IET values on the y-axis. Blue corresponds to IET without a hypervisor; red corresponds to IET with a hypervisor that applies time cheating, which makes its mean value approximately equal to the mean value without a hypervisor. The left part of Figure 7 shows the corresponding frequency polygons, or relative frequency chart.
We can see that with a hypervisor the number of polygon points (the number of layers) is larger than without one. A similar shape of the polygons was also noted by Morabito (2012), whose observations show that the data are generally not normally distributed and that skewed, long-tailed data with outliers are fairly common. Similar plots of IET array fragments are given by Fritsch (2008) in the section "A.4 Empirical results" and by Li, Zhu, Zhou, and Wang (2011). However, the fact that the layered structure can be used for hypervisor detection had not been mentioned. If several hypervisors are present, the layered structure of IET arrays is still evident. We measured IET arrays in four cases: without a hypervisor (black), with only our PoC hypervisor (green), with only the Acronis hypervisor (blue), and with two nested hypervisors (red). The scatter plots of the corresponding IET arrays are shown in Figure 8; for clarity the plots are spaced vertically. Without a hypervisor the plot consists of a single line with quite rare jumps. If the PoC hypervisor is present, the plot has 2-3 layers with significant jumps; the situation is similar when only the Acronis hypervisor is present. If two nested hypervisors are present, the plot becomes a cloud of points: there are many layers with low frequency. The best way to reveal the number of layers is to use the frequency distribution of the measured IET arrays. We calculate the frequency distribution with one class per distinct value, i.e. without grouping values into intervals; the number of layers then equals the number of classes. It is also possible to detect a stealth hypervisor that uses the Blue Chicken technique. Temporary self-uninstalling of such a hypervisor occurs only after 50-100 measurements of IET, because the hypervisor needs time to recognize the time-based detection.
As a result we will see a change in the nature of the scatter plot: the first 50-100 measurements will have a layered nature, while the remaining measurements will have just 1-2 layers because the hypervisor has already uninstalled itself. This change in the scatter plot is repeated in the subsequent columns, because they were measured with a two-second delay. However, our experiments show that direct use of these indicators is problematic for two reasons: the characteristics are not always constant (they are unstable), and their variation ranges overlap regardless of whether a hypervisor is present or not. We discuss below how to deal with this.

[Figure 7. Scatter plots of IET array fragments and corresponding frequency polygons]

[Figure 8. Scatter plots of IET array fragments in four different cases]

4.3.2. Applying the Second and Fourth Moments to Hypervisor Detection. All our experiments also confirmed the following result of Section 3: after a hypervisor is loaded, the numerical values that measure the spread of IET arrays increase. We obtained good results with the second and fourth moments. Moreover, these sample characteristics increase again after each nested hypervisor is loaded, which is clearly seen in Figure 8. Experiments show that the sixth and higher moments of IET arrays are seriously inaccurate.
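The per-column second and fourth central moments can be sketched as follows. The columns here are made-up stand-ins: both have the same mean, imitating a time-cheating hypervisor, but the spread differs:

```python
def central_moment(xs, k):
    """k-th central moment of a sample."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n

# Hypothetical columns: without a hypervisor the measurements sit on two tight
# layers; with a time-cheating hypervisor the mean is identical (2892) but the
# values are spread over more, wider layers.
no_hv   = [2888] * 50 + [2896] * 50
with_hv = [2856] * 25 + [2880] * 25 + [2904] * 25 + [2928] * 25

m2_no, m2_hv = central_moment(no_hv, 2), central_moment(with_hv, 2)
m4_no, m4_hv = central_moment(no_hv, 4), central_moment(with_hv, 4)
print(m2_no, m2_hv)  # second moment is larger in the hypervisor case
print(m4_no, m4_hv)  # so is the fourth moment, despite identical means
```

Even with means matched to the tick, the second and fourth moments separate the two cases, which is exactly the property the detection relies on.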
As mentioned before, outliers and jumps (discontinuities) significantly affect the values of the second and fourth moments, which makes it impossible to achieve a stable distinction between the sample characteristics and to decide whether a hypervisor is present. The negative impact of these factors can be eliminated by simultaneously applying three techniques: fitting, low-frequency filtering, and "length-averaging". For "length-averaging" we compute the sample characteristics before and after an outlier and obtain the final value by averaging them, weighted by the lengths of the corresponding fragments. To reduce the overlap of the characteristic intervals we chose a two-step way of calculating. We calculate the second and fourth moments for each column of the table (IET array), see Table 1. This yields a set of these characteristics, which we treat as a new sample, and we then calculate the characteristics of this set. In other words, from the primary columns of the IET array we obtain secondary characteristics, which we then process by statistical methods. This helps us to significantly reduce or even avoid the overlap of the new characteristic intervals. All the theoretical principles from Section 3 were thus confirmed by experiment: the number of layers of IET arrays and the second and fourth moments increased, and remained at the increased level, after a new hypervisor was loaded, i.e. they can be used to detect one hypervisor or several nested ones. The ways of calculating the threshold values of each statistic, with due consideration of data fluctuations, are given below.

4.4. How to Calculate Threshold Values of Statistics to Detect Hypervisors
Hypervisor detection consists in comparing calculated statistic values with threshold values. If the statistic values are greater than the threshold values, we conclude that a hypervisor is present; otherwise there is no hypervisor.
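The two-step calculation can be sketched as follows, using the variance as the per-column statistic. The array is a tiny hypothetical stand-in (3 columns instead of the real 1000x10 data):

```python
def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def two_step(array_columns, statistic):
    """Step 1: compute the statistic for each column of the array.
    Step 2: treat the resulting values as a new sample and characterize it."""
    per_column = [statistic(col) for col in array_columns]        # primary
    return sum(per_column) / len(per_column), variance(per_column)  # secondary

# Tiny stand-in for an IET array (made-up tick counts):
columns = [
    [2888, 2896, 2888, 2896],
    [2888, 2888, 2896, 2896],
    [2880, 2896, 2888, 2904],
]
mean_of_vars, var_of_vars = two_step(columns, variance)
print(mean_of_vars, var_of_vars)
```

The secondary characteristics (here, the mean and variance of the per-column variances) are the values that are then compared against variation intervals and thresholds.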
The main goal is to find a suitable filtration level and a statistic with an appropriate threshold value, i.e. a minimal sum of type I and type II errors. To calculate threshold values we measure 50 arrays of size 1000x10 for each of the two cases, hypervisor present and absent, 100 arrays in total. We use our own PoC hypervisor because it contains the minimal set of instructions in its CPUID dispatcher and its only role is TSC cheating; this is the most difficult case. The PoC hypervisor's threshold values will help to detect any other hypervisor with more functionality, as such a hypervisor will cause larger changes in the IET variation. Calculating threshold values involves computing the statistics in two ways after low-frequency filtering with the levels {0, 0.02, 0.05, 0.1, 0.15, 0.2}, i.e. {0%, 2%, 5%, 10%, 15%, 20%}. One way is to calculate the statistic for each 1000x1 column of a 1000x10 array; we then treat the resulting set of 10 values as a new sample and average it (l̄, the "averaged columns value"). The other way is to calculate the statistic for one big 10,000x1 column obtained from the 1000x10 array by vectorization (Vectorization (mathematics), 2014) (l_v, the "vectorized array value"). It should not be forgotten that outliers and jumps (discontinuities) significantly change the statistic values, and therefore we have to delete them. We find a jump as the maximum value of the first-order difference (The SciPy Community, 2009); the threshold value for a jump is 300 CPU ticks, which can be adjusted later. The calculation algorithm for threshold values is the same for all statistics and includes three steps:
1. Receive and process the IET arrays every day; obtain preliminary results.
2. Process the preliminary results obtained over 10 days; obtain threshold values and the probabilities of type I and II errors.
3. Create the final table with all appropriate statistics.
We now describe how to calculate threshold values for a new statistic, the number of layers.
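The jump-removal step mentioned above can be sketched as follows. The 300-tick threshold is from the text; the series values and the exact splitting behavior are illustrative assumptions:

```python
def remove_largest_jump(xs, jump_threshold=300):
    """Split the series at the largest first-order difference if it exceeds
    the threshold (300 CPU ticks in the text); return the fragments, which
    can then be processed separately (e.g. for length-averaging)."""
    diffs = [abs(b - a) for a, b in zip(xs, xs[1:])]
    i = max(range(len(diffs)), key=diffs.__getitem__)
    if diffs[i] < jump_threshold:
        return [xs]                      # no jump: keep the series whole
    return [xs[:i + 1], xs[i + 1:]]

# Made-up series with a 400-tick jump in the middle:
series = [2888, 2896, 2888, 3288, 3296, 3288]
fragments = remove_largest_jump(series)
print(fragments)  # [[2888, 2896, 2888], [3288, 3296, 3288]]
```

Each fragment is stable on its own, so statistics computed per fragment are no longer distorted by the discontinuity.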
The first step is to filter each column from Table 1 at the different filtration levels. For each filtered column we calculate the number of layers; the calculated values are given in the corresponding columns of Table 3. The penultimate column of Table 3 holds the mean value of the number of layers for each filtration level; for example, the first value 12 is (28+29+...+10)/10. The last column holds the number of layers calculated from the 10,000x1 column for each filtration level; e.g. the first value 53 is the number of layers in the 10,000x1 array after filtration at level 0%.

Table 3 Example of Calculating the Number of Layers if no Hypervisor is Present

Filtering   Number of layers for each column    Averaged columns   Vectorized array
level       in the 1000x10 array (1, 2, …, 10)  value, l̄           value, l_v
0           28   29   …   10                    12                 53
0.02        4    3    …   3                     4                  6
0.05        3    3    …   3                     3                  6
0.1         2    2    …   3                     3                  3
0.15        1    2    …   2                     2                  3
0.2         1    2    …   2                     2                  2

Table 4 Number of Layers of IET Arrays for the Two Cases: Hypervisor Present and Absent

                       No hypervisor            Hypervisor present
Code of experiments    l̄        l_v             l̄        l_v
day #1 (Ig10)          5        23              11       47
                       4        18              11       52
                       4        15              10       34
                       5        21              13       53
                       4        15              14       68
…                      …        …               …        …
day #10 (Ig19)         4        20              19       102
                       6        32              15       77
                       6        32              16       79
                       6        32              20       88
                       10       50              21       105
Variation intervals    [4, 14]  [10, 110]       [8, 21]  [29, 105]
Threshold values       ≤ 7      ≤ 32            ≥ 8      ≥ 33
Type I error           0.04     0.12            –        –
Type II error          0        0.16            –        –

We can see that at filtration level 0.1 the values of l̄ and l_v stabilize, so we use this filtration level for this PC from here on. A similar table is created for the case when the PoC hypervisor is present. Four numbers, the values l̄ and l_v for the two cases (hypervisor present and absent), are thus evaluated from a single 1000x10 array in each case. This procedure was repeated for each of the five 1000x10 arrays every day, for 10 days.
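Counting layers after filtration, as in Table 3, can be sketched as follows. We take "layer" to mean a distinct value whose relative frequency survives the filtration level (one frequency class per value, as described in Section 4.3.1); the column is hypothetical:

```python
from collections import Counter

def number_of_layers(column, level):
    """Number of layers = number of distinct values whose relative
    frequency is at least `level` (one frequency class per value)."""
    n = len(column)
    return sum(1 for count in Counter(column).values() if count / n >= level)

# Hypothetical column: two dominant layers plus ten rare one-off values.
column = [2888] * 60 + [2896] * 30 + list(range(2900, 2910))
for level in (0.0, 0.02, 0.05, 0.1):
    print(level, number_of_layers(column, level))
```

As in Table 3, the raw layer count at level 0 is inflated by rare values, and it drops and then stabilizes as the filtration level grows.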
After that we create a preliminary table with threshold values and type I and II errors, see Table 4. The stabilization of the statistics is evident in both cases, with and without a hypervisor. We managed to achieve this stabilization only through the filtration of jumps and length-averaging, as previously mentioned. Variation intervals were determined by the minimum and maximum values of the statistics in the columns. The variation intervals overlap, so if a statistic's value falls into the overlap it is impossible to reliably detect a hypervisor; in these cases we have to repeat the IET array measurements. We chose threshold values so that the sum of the probabilities of type I and II errors was minimal. A type I error means that we conclude from the calculations that a hypervisor is present when in fact it is not; its probability is estimated experimentally as the fraction of values that exceed the threshold value. A type II error means that we conclude that a hypervisor is absent when in fact it is present; its probability is likewise estimated as the fraction of values that fall below the threshold value. In other words, we calculate the probability of type I and II errors as r/g, where r is the number of values in the column that lie outside the threshold and g = 50 is the total number of values in the column. For detection we used only those statistics whose sum of type I and II errors is less than 0.2 (20%). Below is a fragment of the final table (Table 5) with all appropriate statistics for all the tested PCs from Table 2. T̄ is the average value of IET over all arrays without a hypervisor; the other statistical notations are given in Table 6. As mentioned above, we can calculate the statistics in two ways: per column and after vectorization. Our research findings suggest that threshold values depend on the Windows version.
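The threshold-selection criterion described above, minimizing the sum of the two empirical error rates r/g, can be sketched as follows (an illustrative Python reconstruction, not the authors' MATLAB code):

```python
import numpy as np

def choose_threshold(no_hyp_values, hyp_values):
    """Pick the cutoff t that minimizes the sum of the empirical error rates.

    Values above t are read as 'hypervisor present'.  Each error rate is the
    fraction r/g of values on the wrong side of the threshold, as in the text.
    """
    no_hyp = np.asarray(no_hyp_values, dtype=float)
    hyp = np.asarray(hyp_values, dtype=float)
    best = None
    for t in np.unique(np.concatenate([no_hyp, hyp])):
        type1 = np.sum(no_hyp > t) / len(no_hyp)  # false alarm: no hypervisor, value above t
        type2 = np.sum(hyp <= t) / len(hyp)       # miss: hypervisor present, value at or below t
        if best is None or type1 + type2 < best[1] + best[2]:
            best = (float(t), float(type1), float(type2))
    return best  # (threshold, type I error, type II error)
```

With g = 50 measured values per case, as in the experiments, the returned error rates are multiples of 0.02.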
For the same hardware, threshold values differ between Windows XP and Windows 7: the variation intervals of the statistics on Windows XP are smaller than on Windows 7. This occurs because Windows 7 enables more SMI handlers than Windows XP. We performed similar experimental checks for nested hypervisors, using the following iterative algorithm: 1. First, we obtained threshold values for the case without a hypervisor. To do this we measured IET arrays both without a hypervisor and with our PoC hypervisor. We found that L ≤ 31 (number of layers) means there is no hypervisor, with a false-positive probability of 0.14, while L ≥ 32 means a hypervisor is present, with a false-negative probability of 0.06. 2. Second, we installed Acronis Disk Director, which loads its own hypervisor, and obtained threshold values for this case in the same way, measuring IET arrays with only the Acronis hypervisor and with two nested hypervisors, PoC and Acronis. We found that L ≤ 67, or more precisely 32 ≤ L ≤ 67, means that only the Acronis hypervisor is present, whereas L ≥ 86 means that two nested hypervisors are working simultaneously. The probabilities of type I and II errors in the latter case are 0. Table 7 gives the threshold values for all the mentioned cases.
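The resulting decision rule for nested hypervisors can be written out directly from the thresholds above. Note that the text leaves the band 68–85 unspecified; treating it as "indeterminate, repeat the measurements" is our assumption:

```python
def classify_layers(L):
    """Decision rule for nested-hypervisor detection, using the number-of-layers
    thresholds reported in the text (cf. Table 7)."""
    if L <= 31:
        return "no hypervisor"
    if 32 <= L <= 67:
        return "Acronis hypervisor only"
    if L >= 86:
        return "two nested hypervisors"
    # 68..85 falls between the reported intervals; the text does not cover it
    # (assumed handling): repeat the IET measurements.
    return "indeterminate: repeat IET measurements"
```

Each additional nesting level would add a further band of thresholds obtained by the same iterative procedure.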
Table 5 Final Table with All Appropriate Statistics

                               Threshold values             Probability
PC  Statistic  Filtration      No           Hypervisor      Type I   Type II
               level           hypervisor   is present      error    error
1   T̄          0               ≤ 2911       –               –        –
    L̄          0               ≤ 7          ≥ 8             0.04     0
    D̄          0               ≤ 14         ≥ 18            0.02     0
    M̄          0.1             ≤ 679        ≥ 947           0.02     0
    μ          0.1             ≤ 104161     ≥ 111041        0.02     0.10
2   T̄          0               ≤ 2492       –               –        –
    L̄          0               ≤ 11         ≥ 12            0.1      0.06
    D̄          0.2             ≤ 100        ≥ 101           0.08     0.1
    M̄          0.2             ≤ 168        ≥ 13030         0.14     0.02
3   T̄          0               ≤ 2431       –               –        –
    L̄          0               ≤ 6          ≥ 8             0        0
    D̄          0.1             ≤ 15         ≥ 41            0        0
    μ          0.1             ≤ 609        ≥ 3410          0        0
4   T̄          0               ≤ 5018       –               –        –
    L̄          0               ≤ 22         ≥ 26            0.02     0.02
    D̄          0.1             ≤ 177        ≥ 181           0.1      0.1
5   T̄          0               ≤ 2852       –               –        –
    L̄          0               ≤ 67         ≥ 71            0.04     0
    D̄          0               ≤ 16416      ≥ 48920         0        0
6   T̄          0               ≤ 2126       –               –        –
    L̄          0               ≤ 34         ≥ 241           0        0
    l̄          0               ≤ 134        ≥ 593           0        0
    D̄          0               ≤ 216        ≥ 5478          0        0
    d          0               ≤ 345        ≥ 5422          0        0
    M̄          0.02            ≤ 54         ≥ 956           0        0

Table 6 Statistical Notations

Statistic           Averaged columns value   Vectorized array value
Number of layers    L̄                        l
2nd central moment  D̄                        d
4th central moment  M̄                        μ

4.5. Detection of Stealthy Hypervisors

According to our experiments, the detection of hypervisors proceeds in two stages, a preliminary stage and an operational stage, see Table 8. First of all we have to make sure that there is no hypervisor in the BIOS. To achieve this we update (flash) the BIOS with a known and trusted image. Malware in the BIOS can prevent its being updated by a software utility; that is why the most reliable way to overwrite the BIOS is to desolder the microchip from the motherboard, flash it with a hardware programmer and solder it back (Muchychka, 2013).

Table 7 Threshold Values for Two Nested Hypervisors

Threshold values  Conclusion about hypervisors and their numbers  Type I error  Type II error
L ≤ 31            No hypervisor                                   0.14          0
32 ≤ L ≤ 67       Only the Acronis hypervisor is present          0             0.06
L ≥ 86            Two nested hypervisors are present              0             0

Table 8 Detection of Stealthy Hypervisors

Stage                    Stage description
Preliminary              1. Flash BIOS with a trusted image or firmware.
                         2. Install OS.
                         3. Get threshold values for the no-hypervisor case.
Operational (detection)  4. Check in a loop if a hypervisor is present.
                         5.
Install supplementary software (optional). 6. Monitor messages about a hypervisor presence. 7. To adapt the tool to a new legitimate hypervisor, go to step 3.

In the second step we install the OS. We have to use official images to be certain that the OS image does not include any malware or illegitimate hypervisors. Additionally, OS components may be checked, e.g., by reverse engineering. In the third step we get threshold values by using the PoC hypervisor, as described above. In the fourth step we run the hypervisor presence check in an infinite loop: we measure IET arrays and compare the calculated statistics with the threshold values obtained in step 3. We successively check whether a hypervisor is present on each physical CPU core. All the source code for obtaining threshold values, the PoC hypervisor and the detection tool is available at (Korkin, 2014). The tool for obtaining threshold values consists of two parts: a subsystem for IET array acquisition (C++) and a subsystem for threshold value calculation (MATLAB). The PoC hypervisor was developed in C++ and ASM and is compiled with Visual Studio. The detection tool likewise consists of two parts: a subsystem for IET array acquisition and a subsystem for threshold value checks in MATLAB. In the fifth and sixth steps we install supplementary software and monitor messages about new hypervisors. If we get a message about a new hypervisor after a program installation, we check whether this hypervisor is legitimate. The approaches for doing this are beyond the scope of this paper; for instance, we could contact the corresponding support service. Once we conclude that the hypervisor is legitimate, we adapt the detection tool by obtaining new threshold values (step 3). If we conclude that the hypervisor is illegitimate, it must be removed from the system. In some cases this is achieved by simply uninstalling the previously installed program; in more complicated cases, however, we have to check all the system components, including the BIOS image.

5. CONCLUSIONS, DISCUSSIONS AND FUTURE WORK

1. Hypervisor detection is a well-known challenge. Malware with hypervisor facilities is a serious information security threat, and many authors and companies are trying to tackle this challenge. 2. In this paper we focused on and improved time-based detection using unconditionally intercepted instructions. We studied the case when a hypervisor uses time cheating and temporary self-uninstalling to prevent its detection; appropriate time-based approaches for this situation are not publicly available. Only the described methods are able to detect stealthy hypervisors in the presence of all known countermeasures, such as time cheating. 3. We explored the probabilistic characteristics of instruction execution time and proposed a new technique for the detection of a hypervisor, including several nested ones. 4. We developed techniques for calculating threshold values of various statistics and a step-by-step guide on how to detect a hypervisor. 5. These methods work well on different PCs with both Intel and AMD CPUs and detected both the PoC hypervisor and a special BIOS hypervisor.

5.1. Hypervisor Detection without Flashing BIOS and Reinstalling OS

The proposed hypervisor detection method (or its preliminary procedure) requires stopping system activity to flash the BIOS, reinstall the OS, etc. For some systems such an interruption of work is prohibited or impossible. However, on the basis of our experimental results, we can establish that no hypervisor is present without performing steps 1-2 and without an unwanted system shutdown. To achieve this we acquire IET arrays on a PC that is already in operation: if after the IET array filtering step we get 1-2 stable layers, this means that there is no hypervisor. This peculiarity occurs on PCs with Windows XP and should be investigated further.

5.2. Applying Numerical Values of Layers for Hypervisor Detection

We have discovered another pattern which can be used to detect a hypervisor.
In most of our experiments the numerical values of the layers are unique. For example, in Figure 7 we see that the numerical values of the different layers after filtering indicate hypervisor presence. We obtained the following numerical values of layers without a hypervisor, {2160, 2168, 2184, 2192, 2200, 2478, 2480, 2880, 2888, 2904, 2920, 2936}, and these values, {2876, 2884, 2892, 2900, 2908, 2916, 2924}, with the PoC hypervisor. These two sets contain no equal values. Moreover, even if a hypervisor cheats the TSC so that the first members of each set are equal, the second and subsequent members of the sets will still differ, because the deltas of the layer values relative to the first member differ between the sets: {8, 24, 32, 40, 318, 320, 720, 728, 744, 760, 776} versus {8, 16, 24, 32, 40, 48}. The reasons for this difference and its resilience to hypervisor countermeasures require further research.

5.3. Ways to Develop Computer Forensics and Statistics for Universities

The proposed statistical methods and tools for hypervisor detection can be used in two different disciplines. First, they may become part of a computer forensics course, in which students can acquire practical skills in working with hardware virtualization technology. The PoC hypervisor can be used as a basic platform for further improvements, for example to create an event tracing tool which monitors low-level events and is resilient to modern malware. The hypervisor detection tools can be used to devise new detection approaches, based for example on all the unconditionally intercepted instructions (CPUID, INVD, MOV from/to CR3, all VMX instructions, RDMSR and WRMSR). These may be called with different parameters, including wrong or invalid ones, and the execution time of different sets and sequences of instructions can be profiled, not just the ten CPUIDs described in this paper. Analysis of physical memory access times can also be applied to find timing anomalies caused by possible hidden objects.
Such a detection approach may need to check all the memory pages, including valid and invalid addresses. We would compare IET characteristics before and after disabling the CPU's cache control mechanism; a stealthy hypervisor would have to cheat the TSC with different deltas for each case, which does not always occur. Second, the methods may become part of a course on statistics and data analysis. Because they make it possible to acquire many real experimental data sets, students can gain practical experience of data processing and analysis. They can learn how to solve repeatability and reproducibility problems, and can apply different statistical criteria to test correlations between arrays for the two cases, with and without a hypervisor. As a result students will not only better understand the theoretical material of the course, but will also acquire new practical skills and apply them in their own research.

5.4. Applying Hidden BIOS Hypervisor to Track Stolen Laptops

It is well known that an organization pays heavily every time an employee's laptop is lost or stolen. The idea is to create a software agent which will track a laptop, block it if it is stolen, control it remotely, etc. This tool would work like Computrace LoJack by Absolute Software (2014). The key point is to create a software agent which is really hard to detect, delete or block. Using hardware virtualization technology we can create a hypervisor which works permanently; to guarantee that it autoruns reliably, it will be loaded from the BIOS. This hypervisor can hide memory areas and prevent its own overwriting by software tools with the help of Nested Page Tables on AMD CPUs or Extended Page Tables on Intel CPUs. Such a hypervisor can easily be planted in any PC which supports hardware virtualization.
To facilitate development of this hypervisor we can use open source components, for example Coreboot (2014) for the BIOS firmware, TianoCore (2014) for UEFI and Xen (The Xen Project, 2014) as a basis for the hypervisor.

5.5. Applying Hypervisor as USB Firewall to Prevent BadUSB Attack

Nohl and Lell (2014) presented the idea and a prototype of a malicious USB stick. The idea is to reprogram a USB device in order to add new unauthorized functions; as a result, for example, a reprogrammed USB stick can act as a USB keyboard and take over a computer by issuing malicious commands. This vulnerability is serious because such a USB device works transparently to the user and to antivirus software, and formatting the USB flash drive does not erase the malicious firmware. We can address this threat by using a hypervisor's facilities to control all device access to the PC. In a manual configuration mode the hypervisor can block the malicious activities of such devices, in effect playing the role of a USB firewall. For example, after a USB device is plugged into a port, the hypervisor displays the list of all registered devices and allows the user to choose the appropriate entry; thereafter the hypervisor controls the operation of all USB devices according to their access policies. As a result, a hypervisor working as a USB firewall can protect PCs from BadUSB attacks and other malicious USB devices.

6. ACKNOWLEDGEMENTS

I would like to thank Andrey Chechulin, Ph.D., research fellow of the Laboratory of Computer Security Problems of the St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (scientific advisor: Prof. Igor Kotenko), for his insightful comments and feedback, which helped to substantially improve the quality of the paper.
I would like to thank Iwan Nesterov, head of the department of mathematical methods of cyber security, LLC "CSS Security", Moscow, Russia, for his invaluable contribution and support. I would also like to thank Ben Stein, teacher of English, Kings College, London, UK, for his invaluable corrections of the paper. I am grateful to my grandfather Peter Prokoptsev, Ph.D., Kirovograd, Ukraine, for his help with statistics and data analysis.

7. AUTHOR BIOGRAPHY

Igor Korkin, Ph.D., has been in cyber security since 2009. He works at the Moscow Engineering & Physics Institute, training post-graduate students and supervising students. His research interests include rootkit and anti-rootkit technologies. He took part in CDFSL 2014.

8. REFERENCES

[1] Absolute Software (2014). Theft Recovery Software for Laptops, Mac, Smartphones and Tablets – Absolute Lojack. Retrieved on October 12, 2014, from http://lojack.absolute.com/en/products/absolute-lojack
[2] Accord. (2010). Trusted Startup Hardware Module. Retrieved on October 12, 2014, from http://www.okbsapr.ru/akk_amdz_en.html
[3] AMD. (2011). Revision Guide for AMD NPT Family 0Fh Processors. Technical Report, Publication Number 33610, Revision 3.48.
[4] AMD. (2013). AMD64 Architecture Programmer's Manual Volume 2: System Programming. Technical Report, Publication Number 24593, Revision 3.23.
[5] Athreya, B. (2010, August). Subverting Linux On-The-Fly Using Hardware Virtualization Technology (Master's thesis). Georgia Institute of Technology, Atlanta, GA, USA. Retrieved on October 12, 2014, from https://smartech.gatech.edu/handle/1853/34844
[6] Barbosa, E. (2007). Detecting of Hardware Virtualization Rootkits. Paper presented at the Symposium on Security for Asia Network (SyScan), Singapore.
[7] Barrett, D., & Kipper, G. (2010). Virtualization and Forensics: a Digital Forensic Investigator's Guide to Virtual Environments. Amsterdam, Netherlands: Syngress/Elsevier.
[8] Berger, S., Caceres, R., Goldman, K., Perez, R., Sailer, R., & Doorn, L. (2006, July 31 - August 4). vTPM: Virtualizing the Trusted Platform Module. Proceedings of the 15th Conference on USENIX Security Symposium (USENIX-SS), 305-320, Vancouver, BC, Canada.
[9] Ben-Yehuda, M., Day, M., Dubitzky, Z., Factor, M., HarEl, N., Gordon, A., Liguori, A., Wasserman, O., & Yassour, B. (2010, October 4-6). The Turtles Project: Design and Implementation of Nested Virtualization. Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation (OSDI), 423-436, Vancouver, BC, Canada.
[10] Ben-Yehuda, M. (2013). Machine Virtualization: Efficient Hypervisors, Stealthy Malware. Muli Ben-Yehuda Homepage. Retrieved on October 12, 2014, from http://www.mulix.org/lectures/vmsecurity/vmsec-cyberday13.pdf
[11] Blunden, B. (2012). The Rootkit Arsenal: Escape and Evasion in the Dark Corners of the System (2nd ed.). Burlington, MA: Jones & Bartlett Publishers.
[12] BluePillStudy. (2010). Learn the Open Source of Blue Pill Project. Project Hosting on Google Code. Retrieved on October 12, 2014, from https://code.google.com/p/bluepillstudy
[13] Brossard, J., & Demetrescu, F. (2012, April). Hardware Backdooring is Practical. Proceedings of the Hackito Ergo Sum (HES), Paris, France.
[14] Bulygin, Y. (2008, April). CPU Side-channels vs. Virtualization Malware: The Good, the Bad, or the Ugly. Proceedings of the ToorCon, Seattle, WA, USA.
[15] Bulygin, Y., & Samyde, D. (2008, August). Chipset Based Approach to Detect Virtualization Malware a.k.a. DeepWatch. Proceedings of the Black Hat Security Conference, Las Vegas, NV, USA.
[16] Bulygin, Y., Loucaides, J., Furtak, A., Bazhaniuk, O., & Matrosov, A. (2014, August). Summary of Attacks against BIOS and Secure Boot. Proceedings of the DefCon, Las Vegas, NV, USA.
[17] Coreboot. (2014). Retrieved on October 12, 2014, from www.coreboot.org
[18] Dai Zovi, D. (2006). Hardware Virtualization-Based Rootkits.
Proceedings of the Black Hat Security Conference, Las Vegas, NV, USA.
[19] Derock, A. (2009, October 28-30). HVM Virtual Machine Monitor, a Powerful Concept for Forensic and Anti-Forensic. Paper presented at the Hack.lu Conference, Luxembourg.
[20] Desnos, A., Filiol, E., & Lefou, I. (2011). Detecting (and creating!) a HVM rootkit (aka BluePill-like). Journal in Computer Virology, 7(1), 23-50. http://dx.doi.org/10.1007/s11416-009-0130-8
[21] Duflot, L., Etiemble, D., & Grumelard, O. (2006, April). Using CPU System Management Mode to Circumvent Operating System Security Functions. Proceedings of the CanSecWest Applied Security Conference, Paris, France.
[22] Embleton, S. (2007). The VMM Framework. Hacker Tools. Retrieved on October 12, 2014, from http://www.mobiledownload.net/tools/software/vmxcpu.rar
[23] Embleton, S., Sparks, S., & Zou, C. (2008, September 22-25). SMM Rootkits: A New Breed of OS Independent Malware. Proceedings of the 4th International Conference on Security and Privacy in Communication Networks (SecureComm), Istanbul, Turkey.
[24] Fannon, C. (2014, June). An Analysis of Hardware-assisted Virtual Machine Based Rootkits (Master's thesis). Naval Postgraduate School, Monterey, CA, USA. Retrieved on October 12, 2014, from http://hdl.handle.net/10945/42621
[25] Fisher-Ogden, J. (2006). Hardware Support for Efficient Virtualization. Technical report, University of California, San Diego, CA, USA. Retrieved on October 12, 2014, from http://cseweb.ucsd.edu/~jfisherogden/hardwareVirt.pdf
[26] Fog, A. (2014). Lists of Instruction Latencies, Throughputs and Micro Operation Breakdowns for Intel, AMD and VIA CPUs. Agner Fog Homepage. Retrieved on October 12, 2014, from http://www.agner.org/optimize/instruction_tables.pdf
[27] Fritsch, H. (2008, August). Analysis and Detection of Virtualization-based Rootkits (Master's thesis). Technical University of Munich, Germany.
Retrieved on October 12, 2014, from http://www.nm.ifi.lmu.de/pub/Fopras/frit08/PDF-Version/frit08.pdf
[28] Gabris, F. (2009, August 22). Turning off Hypervisor and Resuming OS in 100 Instructions. Paper presented at the Fasm Conference (Fasm Con), Myjava, Slovak Republic.
[29] Garfinkel, T., Adams, K., Warfield, A., & Franklin, J. (2007). Compatibility is Not Transparency: VMM Detection Myths and Realities. Paper presented at the 11th USENIX Workshop on Hot Topics in Operating Systems (HotOS), Berkeley, CA, USA.
[30] Graziano, M., Lanzi, A., & Balzarotti, D. (2013, October 23-25). Hypervisor Memory Forensics. Proceedings of the 16th International Symposium on Research in Attacks, Intrusions, and Defenses (RAID), 21-40, Rodney Bay, Saint Lucia. http://dx.doi.org/10.1007/978-3-642-41284-4_2
[31] Intel. (1998). Using the RDTSC Instruction for Performance Monitoring. Carleton Computer Security Lab (CCSL). Retrieved on October 12, 2014, from http://www.ccsl.carleton.ca/~jamuir/rdtscpm1.pdf
[32] Intel. (2014, September). Intel 64 and IA-32 Architectures Software Developer's Manual Volume 3C: System Programming Guide, Part 3, Order Number: 326019-052US.
[33] ISO 5725 (2004). Accuracy (Trueness and Precision) of Measurement Methods and Results, Parts 1-6.
[34] Jian, N., Huaimin, W., Shize, G., & Bo, L. (2010). CBD: A Counter-Based Detection Method for VMM in Hardware Virtualization Technology. Proceedings of the First International Conference on Pervasive Computing, Signal Processing and Applications (PCSPA), Harbin, China. http://dx.doi.org/10.1109/pcspa.2010.92
[35] Kovah, X., Kallenberg, C., Butterworth, J., & Cornwell, S. (2014, September 24). Into the Unknown: How to Detect BIOS-level Attackers. Paper presented at the 24th Virus Bulletin International Conference (VB2014), Seattle, WA, USA.
[36] Korkin, I., & Nesterov, I. (2014, May 28-29). Applying Memory Forensics to Rootkit Detection.
Proceedings of the 9th Annual Conference on Digital Forensics, Security and Law (CDFSL), 115-141, Richmond, VA, USA.
[37] Korkin, I. (2012, July 12). Hypervisor Detection in Symantec Endpoint Protection. Retrieved on October 12, 2014, from http://igorkorkin.blogspot.ru/2012/07/hypervisor-detection-in-symantec.html
[38] Korkin, I. (2014). Hypervisor Detection Platform. Retrieved on October 12, 2014, from http://sites.google.com/site/iykorkin/hdp.zip
[39] Kornfeld, M. (1965, March). Accuracy and Reliability of a Simple Experiment (in Russian). UFN, 85(3), 533-542. Retrieved on October 12, 2014, from http://ufn.ru/ufn65/ufn65_3/Russian/r653e.pdf
[40] Kyte, I., Zavarsky, P., Lindskog, D., & Ruhl, R. (2012, June 10-12). Detection of Hardware Virtualization Based Rootkits by Performance Benchmarking. Concordia University College of Alberta. Retrieved on October 12, 2014, from http://infosec.concordia.ab.ca/files/2012/04/2011Kyte.pdf
[41] Kyte, I., Zavarsky, P., Lindskog, D., & Ruhl, R. (2012, January). Enhanced Side-channel Analysis Method to Detect Hardware Virtualization Based Rootkits. Proceedings of the World Congress on Internet Security (WorldCIS), 192-201, Guelph, ON, Canada.
[42] Lakshminarayanan, K., Patel, K., Robinson, D., & Soulami, T. (2012). U.S. Patent No. 8,205,241 B2. Washington, DC: U.S. Patent and Trademark Office.
[43] Li, H., Zhu, J., Zhou, T., & Wang, Q. (2011, May 27-29). A New Mechanism for Preventing HVM-Aware Malware. Proceedings of the 3rd International Conference on Communication Software and Networks (ICCSN), 163-167, Shaanxi, China. http://dx.doi.org/10.1109/ICCSN.2011.6014696
[44] McAfee. (2012). Root out Rootkits: An Inside Look at McAfee Deep Defender. Intel. Retrieved on October 12, 2014, from http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/mcafee-deep-defender-deepsafe-rootkit-protection-paper.pdf
[45] Microsoft.
(2013, August 8). Hypervisor Top-Level Functional Specification: Windows Server 2012 R2. Released Version 4.0a.
[46] Medley, D. (2007, March). Virtualization Technology Applied to Rootkit Defense (Master's thesis). AFIT/GCE/ENG/07-08, Wright-Patterson Air Force Base, OH, USA, Accession Number ADA469494.
[47] Morabito, D. (2012, June). Detecting Hardware-Assisted Hypervisor Rootkits within Nested Virtualized Environments (Master's thesis). AFIT/GCO/ENG/12-20, Wright-Patterson Air Force Base, OH, USA, Accession Number ADA563168.
[48] Muchychka, A. (2013). Lenovo IdeaPad B560: How to Flash KBC BIOS. Hardware Today. Retrieved on October 12, 2014, from http://hardwaretoday.com/articles/notebooks/lenovo_ideapad_b560_how_to_flash_kbc_bios_reasons_no_power_led_indicators_don_t_work
[49] Myers, M., & Youndt, S. (2007, August). An Introduction to Hardware-Assisted Virtual Machine (HVM) Rootkits. Mega Security. Retrieved on October 12, 2014, from http://megasecurity.org/papers/hvmrootkits.pdf
[50] North Security Labs (2011). Blog. Retrieved on October 12, 2014, from http://northsecuritylabs.blogspot.ru/2011/11/greetings-to-all-we-have-great-news.html
[51] Nohl, K., & Lell, J. (2014, August 2-7). BadUSB - On Accessories That Turn Evil. Proceedings of the Black Hat Security Conference, Las Vegas, NV, USA.
[52] O'Neill, D. (2010). How to Detect Virtualization. Dave O'Neill Homepage. Retrieved on October 12, 2014, from http://www.dmo.ca/blog/detecting-virtualization-on-linux/
[53] Park, J. (2013, October 30). A Study on Detection of Hacking and Malware Codes in Bare Metal Hypervisor for Virtualized Internal Environment of Cloud Service. International Journal of Computer Science and Network Security (IJCSNS), 13(10), 78-82.
[54] Pek, G., & Buttyan, L. (2014, June 10-14). Towards the Automated Detection of Unknown Malware on Live Systems.
Proceedings of the IEEE International Conference on Communications (ICC), 847-852, Sydney, NSW, Australia. http://dx.doi.org/10.1109/ICC.2014.6883425
[55] Ramos, J. (2009, December). Security Challenges with Virtualization (Master's thesis). University of Lisbon, Portugal. Retrieved on October 12, 2014, from http://core.kmi.open.ac.uk/download/pdf/12424378.pdf
[56] Rutkowska, J. (2006). Subverting Vista Kernel for Fun and Profit. Proceedings of the Symposium on Security for Asia Network (SyScan) & Black Hat Briefings, Singapore & Las Vegas, NV, USA.
[57] Rutkowska, J., & Tereshkin, A. (2007). IsGameOver(), Anyone? Proceedings of the Black Hat Security Conference, Las Vegas, NV, USA.
[58] Rutkowska, J., & Tereshkin, A. (2008, August). Bluepilling the Xen Hypervisor. Proceedings of the Black Hat Security Conference, Las Vegas, NV, USA.
[59] Strelen, J. (2004, December 5-8). The Accuracy of a New Confidence Interval Method. Proceedings of the Winter Simulation Conference, 654-662, Washington, DC, USA. http://dx.doi.org/10.1109/WSC.2004.1371373
[60] TianoCore. (2014). Retrieved on October 12, 2014, from http://tianocore.github.io/
[61] Tichonov, A., & Avetisyan, A. (2011). Development of Taint-analysis Methods to Solve the Problem of Searching of Undeclared Features. Proceedings of the Institute for System Programming of the Russian Academy of Sciences (ISPRAS), 20, 9-24, Moscow, Russia.
[62] The SciPy Community. (2009). Calculate the First Order Difference. SciPy Developers. Retrieved on October 12, 2014, from http://docs.scipy.org/doc/numpy/reference/generated/numpy.diff.html
[63] The Xen Project. (2014). Retrieved on October 12, 2014, from http://www.xenproject.org/
[64] Utin, M. (2014, November 20-21). A Myth or Reality – BIOS-based Hypervisor Threat. Proceedings of the In-Depth Security Conference (DeepSec), Vienna, Austria.
[65] Vectorization (mathematics). (2014). In Wikipedia.
Retrieved on October 12, 2014, from http://en.wikipedia.org/wiki/Vectorization_(mathematics)
[66] Wailly, A. (2013, June 20). Malware vs Virtualization: The Endless Cat and Mouse Play. Proceedings of the Hack in Paris (HIP) Conference, Paris, France. Retrieved on October 12, 2014, from http://aurelien.wail.ly/publications/hip-2013-slides.html
[67] Wang, Z., & Jiang, X. (2010, May 16-19). HyperSafe: A Lightweight Approach to Provide Lifetime Hypervisor Control-Flow Integrity. Proceedings of the 31st IEEE Symposium on Security and Privacy (SP), 380-395, Oakland, CA, USA. http://dx.doi.org/10.1109/SP.2010.30
[68] Wojtczuk, R., & Rutkowska, J. (2009, February 18-19). Attacking Intel Trusted Execution Technology. Proceedings of the Black Hat Security Conference, Washington, DC, USA.
[69] Wojtczuk, R., Rutkowska, J., & Tereshkin, A. (2009, December). Another Way to Circumvent Intel Trusted Execution Technology. Retrieved on October 12, 2014, from http://invisiblethingslab.com/resources/misc09/Another TXT Attack.pdf
[70] Zmudzinski, K. (2009). U.S. Patent No. US20090172229 A1. Washington, DC: U.S. Patent and Trademark Office.

An Empirical Comparison of Widely Adopted Hash Functions ... 2015 CDFSL Proceedings

AN EMPIRICAL COMPARISON OF WIDELY ADOPTED HASH FUNCTIONS IN DIGITAL FORENSICS: DOES THE PROGRAMMING LANGUAGE AND OPERATING SYSTEM MAKE A DIFFERENCE?

Satyendra Gurjar, Ibrahim Baggili, Frank Breitinger and Alice Fischer
Cyber Forensics Research and Education Group (UNHcFREG)
Tagliatela College of Engineering, ECECS Department
University of New Haven, West Haven CT, 06511
{agurj1, ibaggili, fbreitinger, afischer}@newhaven.edu

ABSTRACT

Hash functions are widespread in computer sciences and have a wide range of applications such as ensuring integrity in cryptographic protocols, structuring database entries (hash tables) or identifying known files in forensic investigations.
Besides their cryptographic requirements, a fundamental property of hash functions is efficient and easy computation, which is especially important in digital forensics due to the large amount of data that needs to be processed when working on cases. In this paper, we correlate the runtime efficiency of common hashing algorithms (MD5 and the SHA family) with their implementations. Our empirical comparison focuses on C-OpenSSL, Python, Ruby and Java on Windows and Linux, and C♯ and the WinCrypto API on Windows. The purpose of this paper is to recommend appropriate programming languages and libraries for coding tools that include intensive hashing processes. In each programming language, we compute the MD5, SHA-1, SHA-256 and SHA-512 digests on datasets from 2 MB to 1 GB. For each language, algorithm and dataset, we perform multiple runs and compute the average elapsed time. In our experiments, we observed that OpenSSL and languages utilizing OpenSSL (Python and Ruby) perform better across all the hashing algorithms and data sizes on both Windows and Linux. However, on Windows, the performance of Java (Oracle JDK) and C with WinCrypto is comparable to OpenSSL, and better for SHA-512.

Keywords: Digital forensics, hashing, micro benchmarking, security, tool building.

1. INTRODUCTION

Cryptographic hash functions are critical to digital forensic science (DFS). Almost all tools written for forensic acquisition and analysis compute hash values throughout the digital forensic process to ensure the integrity of seized devices and data. For instance, to ensure the integrity of digital evidence in court, a forensic examiner traditionally computes the hash digest of the entire disk image, which is then securely stored. When it becomes necessary to verify that a disk has remained intact without alteration after being acquired, a new hash digest is computed on the entire disk and compared against the stored hash digest.
If both hashes coincide, we conclude that no alteration to the original drive took place during the acquisition process.

Meanwhile, the availability and use of electronic devices has increased dramatically. Traditional books, photos, letters and records have become e-books, digital photos, e-mail and music files. This transformation has also influenced the capacity of storage media, which has grown from a few megabytes to terabytes. According to the Federal Bureau of Investigation (FBI)'s Regional Computer Forensics Laboratory annual report in 2012 (Regional Computer Forensics Laboratory. Annual report, 2012), there was a 40% increase in the amount of data analyzed in investigations. Due to the amount of data to be processed, runtime efficiency has become an important and timely issue. To that end, automatic data filtration has become critical for speeding up investigations. A common procedure known as file filtering, which requires hash functions, is in use by today's digital forensic scientists and examiners. The procedure is quite simple:

1. compute the hashes for all files on a target device and
2. compare them to a reference database.

Depending on the underlying database, files are either filtered out (e.g., files of the operating system) or filtered in (e.g., known illicit content). A commonly used database for 'filtering out' data is the National Software Reference Library Reference Data Set (RDS) (RDS Hashsets, 2014) maintained by the National Institute of Standards and Technology (NIST).

Traditional hash functions can only match files exactly, down to every single bit. Forensic examiners frequently face situations where they need to know whether files are similar, for example whether two files are different versions of the same software package or system files, or whether files partially match an image or video. Research has found utility in hash functions for finding similar files: Kornblum's Context Triggered Piecewise Hashing (CTPH) (Kornblum, 2006) and Roussev's similarity digest hashing (sdhash, (Roussev, 2010a)) have presented these ideas. These algorithms provide a probabilistic answer for the similarity of two or more files. Although these algorithms are designed to detect similarity, they make use of traditional cryptographic hash functions.

A common field of application for hash functions is digital forensics. Since this area has to deal with large amounts of data, the ease of computation (runtime efficiency) of hashing algorithms is very important. In this paper we compare the runtime efficiency of hashing algorithm implementations in multiple programming languages across different operating systems. Namely, we compare MD5, SHA-1, SHA-256 and SHA-512 in C, C♯, Java, Python and Ruby on both Windows and Linux. While the DFS community has performed extensive research on hash function applications, little to no experimental work has been published with regard to the variance in the runtime efficiency of hash functions across different programming languages and libraries. This is of critical importance and may help scientists and practitioners alike when choosing a particular programming language if their forensic applications are hash function intensive.

2. RELATED WORK

Hash functions (e.g., SHA-1 (Gallagher & Director, 1995)) have a long tradition and are applied in various fields of computer science like cryptography (Menezes, van Oorschot, & Vanstone, 2001), databases (Sumathi & Esakkirajan, 2007, Sec. 9.6) or digital forensics (Altheide & Carvey, 2011, p. 56ff). Garfinkel (Garfinkel, Nelson, White, & Roussev, 2010) also discussed small block forensics using cryptographic hash functions by calculating hashes on individual blocks of data rather than on entire files. Techniques described in his paper can be applied to data acquired from memory images as well.
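The two-step file-filtering procedure described above can be sketched in a few lines of Python, using the same hashlib module the paper later benchmarks. This is an illustrative sketch, not a tool from the paper: the function names are ours, and the plain Python set stands in for a reference database such as the NSRL RDS.

```python
import hashlib

def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file incrementally so large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def filter_files(paths, known_hashes, algorithm="sha1"):
    """Split paths into (known, unknown) by membership in a reference set."""
    known, unknown = [], []
    for p in paths:
        (known if file_digest(p, algorithm) in known_hashes else unknown).append(p)
    return known, unknown
```

Whether the "known" bucket is discarded (filtering out OS files) or flagged (filtering in known illicit content) depends only on how the reference set was built.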
A hash-based carving tool, frag_find (frag_find, 2013), is used to find a MASTER file or fragments thereof in a disk image using small block hashing. It computes hashes of small blocks (512 bytes) of the MASTER files and then compares them with the disk image blocks in each sector.

In contrast to cryptographic hash functions, bytewise approximate matching does not have a long history and probably had its breakthrough in 2006 with an algorithm called context triggered piecewise hashing (CTPH). Kornblum (Kornblum, 2006) used this algorithm to identify similar files. The idea of CTPH is based on spamsum (spamsum, 2002-2009), a spam detection algorithm by Tridgell (Tridgell, 1999). The basic idea behind it is simple: split an input into chunks, hash each chunk independently and concatenate the chunk hashes into a final similarity digest (a.k.a. fingerprint). The sdhash tool¹ was introduced four years later (Roussev, 2010b) in an effort to address some of the shortcomings of ssdeep. Instead of dividing an input into chunks, the sdhash algorithm picks statistically improbable features to represent each object. A feature in this context is a byte sequence of 64 bytes, which is hashed using SHA-1 and inserted into a Bloom filter (Bloom, 1970). The similarity digest of the data object is a sequence of 256-byte Bloom filters, each of which represents approximately 10 KB of the original data. Besides these two very prominent approaches, further tools have been published over the last decade; mrsh-v2 (Breitinger & Baier, 2012) seems promising since it combines concepts from sdhash and ssdeep. In addition to the tools, Breitinger (Breitinger, Stivaktakis, & Baier, 2013) presented a testing framework entitled FRamework to test Algorithms of Similarity Hashing (FRASH), which is used to compare these algorithms; efficiency was one of the important metrics.
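The small-block idea behind frag_find can be illustrated with a short sketch. This is a simplification under our own assumptions, not frag_find's actual implementation (names are ours, and frag_find may use a different hash and per-sector alignment): hash every 512-byte block of the master file, then scan the image block by block for matching hashes.

```python
import hashlib

BLOCK = 512  # small-block / sector size used for carving

def block_hashes(master, block=BLOCK):
    """Map the hash of each fixed-size block of the master file to its offset."""
    return {hashlib.sha1(master[i:i + block]).hexdigest(): i
            for i in range(0, len(master) - block + 1, block)}

def find_fragments(master, image, block=BLOCK):
    """Return (image_offset, master_offset) pairs where an image block
    matches a block of the master file."""
    known = block_hashes(master, block)
    hits = []
    for off in range(0, len(image) - block + 1, block):
        digest = hashlib.sha1(image[off:off + block]).hexdigest()
        if digest in known:
            hits.append((off, known[digest]))
    return hits
```

Because only block hashes are compared, fragments of a deleted or partially overwritten master file can still be located even when the whole file no longer matches.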
Saleem, Popov and Dahman (Saleem, Popov, & Dahman, 2011) presented a comparison of multiple security mechanisms, including hashing algorithms, in accordance with the Information Technology Security Evaluation Criteria (ITSEC). One of the criteria chosen in their analysis that is most relevant to this research is computational efficiency. In their experiments, they concluded that SHA-256 had the slowest average time. They also referred to collision attacks on MD5 (Wang & Yu, 2005) and SHA-1 (Wang, Yin, & Yu, 2005) and concluded that SHA-256 and SHA-512 show more Strength of Mechanism compared to MD5 and SHA-1.

¹ http://sdhash.org (last visited 2014-09-29).

3. METHODOLOGY

In this section, we first explain our experimental environment in Sec. 3.1, followed by an explanation of how we present our results in Sec. 3.2.

3.1 Experimental environment

In order to compute runtime, we generated files of multiple sizes from 2 MB to 1 GB using Python's os.urandom function. On UNIX-like systems, this Python function uses /dev/urandom, and on Windows it uses CryptGenRandom, to generate random binary data. In our experiments, the programs written in the respective languages take four command line arguments:

warmup-count is the number of times to run a hash function before we start collecting elapsed time. This helps programming languages like Java, which have a Just-In-Time compiler, to start, compile, and optimize code before we start collecting measurements.

repeat-count is the number of times we run the hash function on the same data to collect the elapsed time. Elapsed time is collected for computing the digest only; the time to load the file from disk into the buffer is not included. For this experiment, we set repeat-count = 10 for each hashing algorithm and data file.

algorithm is the name of the hashing algorithm to be used for a run. We use MD5, SHA-1, SHA-256 and SHA-512 in our experiments.
data-file is the name of the data file whose hash digest is to be computed. As we have data files of multiple sizes, each run computes the hash digest on every single data file.

Each program prints the elapsed time, repeat index and computed digest (for the verification of correctness of the program). Table 1 shows the hardware used for our experiments, while Table 2 describes the runtime environments of the hashing algorithms. On Linux we tested the hashing algorithms using Java, Ruby, Python and C (with OpenSSL). For Windows we tested using Java, Ruby, Python, C♯ and C (with two libraries, OpenSSL and WinCrypto). The source code of the experiment is available on GitHub: https://github.com/sgurjar/hashfun-benchmark.

3.2 Data analysis and results

For each language, hashing algorithm and data size, we recorded the elapsed time of the n = 10 runs. Next, we computed the mean values for all runs. In addition, we wanted to identify the best curve/graph that represents this set of data points. More precisely, we wanted to identify the best coefficients of the linear equation y = a + bx, for which we used the least squares method² (LSM). According to LSM, we identified the coefficients as follows:

b = \frac{n \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}   (1)

a = \frac{\sum_{i=1}^{n} y_i - b \sum_{i=1}^{n} x_i}{n}   (2)

where x is the independent variable representing the size of the data we are computing the hash digest for, and y is the dependent variable representing the average elapsed time for a given language, algorithm and data size. b is called the slope of the line and is a measure of how well the implementation of an algorithm scales in a programming language, i.e., the higher the slope, the slower the implementation for large amounts of data.

4. ASSESSMENT AND EXPERIMENTAL RESULTS

We divided this section into five subsections.
The first four subsections are named according to the tested algorithms MD5, SHA-1, SHA-256 and SHA-512 and present the detailed results of our experiments. The last section visually summarizes our results and discusses critical findings.

² http://en.wikipedia.org/wiki/Least_squares (last visited 2014-09-29).

To present our results, we decided to have three tables per algorithm:

1. The first table shows the average elapsed time in milliseconds on Linux, dependent on the file size.
2. The second table shows the coefficients a and b computed using equations (1) and (2).
3. The third table shows the average elapsed time in milliseconds on Windows, dependent on the file size. The column header C indicates that C-OpenSSL is used, while C (win) stands for the WinCrypto library.

4.1 MD5

The detailed results for the MD5 algorithm are shown in Tables 3, 4 and 5. As indicated by Table 3, languages using OpenSSL (C, Python and Ruby) showed similar performance on Linux, whereas Java was approximately half as fast.

Table 3: Average elapsed time for MD5 on a Linux system (in ms).

Data in MB |    C | Java | Python | Ruby
         2 |    5 |   11 |      4 |    5
         4 |   12 |   22 |     11 |   10
         8 |   23 |   44 |     21 |   21
        16 |   44 |   89 |     42 |   44
        32 |   87 |  178 |     87 |   88
        64 |  177 |  356 |    175 |  176
       128 |  353 |  712 |    353 |  352
       256 |  705 | 1420 |    704 |  706
       512 | 1409 | 2848 |   1413 | 1407
       640 | 1761 | 3523 |   1763 | 1765
       768 | 2114 | 4252 |   2116 | 2119
       896 | 2465 | 4978 |   2470 | 2470
      1024 | 2820 | 5716 |   2823 | 2824

On Windows (Table 5), C-OpenSSL and Ruby were the fastest and performed similarly to the Linux system. Python is faster than C♯, C WinCrypto, and Java, but slower than when run on Linux. C♯ and WinCrypto showed similar performance.
Table 1: Test environment.

Part      | Specification
RAM       | 4 GB
Processor | Intel® Core2™ Duo E6400 (2M Cache, 2.13 GHz, 1066 MHz FSB)
Unix      | Ubuntu 12.04.1 LTS, 32-bit
Win       | Windows Server 2012 R2, 64-bit

Table 2: Runtime environment.

OS      | Language | Version                    | Module
Linux   | Java     | 1.7.0_51 OpenJDK           | java.security.MessageDigest
Linux   | Ruby     | 2.1.0                      | openssl
Linux   | Python   | 2.7.3                      | hashlib
Linux   | C        | GCC 4.6.3 -Ofast           | OpenSSL 1.0.1
Windows | Java     | 1.7.0_51 SunJDK            | java.security.MessageDigest
Windows | Ruby     | 2.0.0                      | openssl
Windows | Python   | 2.7.6                      | hashlib
Windows | C♯       | MS C♯ Compiler ver 12      | System.Security.Cryptography
Windows | C        | MSVC C/C++ Compiler ver 18 | WinCrypto and openssl-1.0.1f

Table 4: MD5 a and b coefficients.

OS      | Language |      a |     b
Linux   | C        |  0.209 | 2.752
Linux   | Python   | −0.575 | 2.758
Linux   | Ruby     | −0.488 | 2.758
Linux   | Java     | −1.642 | 5.558
Windows | C        | −0.276 | 2.751
Windows | C (win)  |  0.523 | 3.736
Windows | C♯       | −0.367 | 3.669
Windows | Python   | −0.422 | 3.374
Windows | Ruby     | −0.206 | 2.647
Windows | Java     |  3.445 | 5.497

Again, Java showed the slowest results: around 2 times slower than its counterparts (C-OpenSSL, Python and Ruby) and 1.5 times slower than C♯ and WinCrypto. These findings also coincide with Table 4. Comparing b shows that C and Ruby have similar efficiency regardless of the operating system, while Python is faster on the Linux system. As expected, Java is almost two times slower, evident by the value of b = 5.558 (Linux).

4.2 SHA-1

The detailed results for SHA-1 are shown in Tables 6, 7 and 8, which show that overall, SHA-1 is slower than MD5 in all tested scenarios. Again, OpenSSL (C, Python and Ruby) on Linux performed very well, while we identified a slight drawback for Python. The Windows system shows a similar behavior: C has the fastest implementation, followed by Ruby. Next are C (win), C♯ and Python, with a small disadvantage for the latter. Regardless of the operating system, Java was almost three times slower than OpenSSL, with slope values of 9.303 and 8.345.

4.3 SHA-256

The detailed results for SHA-256 are shown in Tables 9, 10 and 11.
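The a and b coefficients reported in the tables follow from equations (1) and (2). As a minimal sketch (our own pure-Python helper, not code from the paper's repository), the computation over (size, mean elapsed time) pairs is:

```python
def lsm_coefficients(points):
    """Least-squares fit of y = a + b*x over (x, y) pairs,
    following equations (1) and (2)."""
    n = len(points)
    sx = sum(x for x, _ in points)           # sum of x_i
    sy = sum(y for _, y in points)           # sum of y_i
    sxy = sum(x * y for x, y in points)      # sum of x_i * y_i
    sxx = sum(x * x for x, _ in points)      # sum of x_i^2
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b
```

Feeding in the thirteen (size, average time) pairs of one column of an elapsed-time table should reproduce (up to rounding) the corresponding a and b row of the coefficient table.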
Table 5: Average elapsed time for MD5 on a Windows system (in ms).

Data in MB |    C | C (win) |   C♯ | Java | Python | Ruby
         2 |    6 |       8 |    7 |   11 |      6 |    6
         4 |    9 |      16 |   16 |   24 |     14 |    9
         8 |   22 |      31 |   29 |   45 |     27 |   21
        16 |   44 |      61 |   58 |   86 |     54 |   42
        32 |   88 |     120 |  117 |  177 |    108 |   86
        64 |  177 |     239 |  234 |  356 |    215 |  169
       128 |  352 |     478 |  470 |  709 |    431 |  339
       256 |  705 |     956 |  938 | 1414 |    863 |  677
       512 | 1408 |    1922 | 1877 | 2849 |   1727 | 1354
       640 | 1759 |    2381 | 2346 | 3506 |   2159 | 1693
       768 | 2113 |    2863 | 2817 | 4230 |   2590 | 2033
       896 | 2464 |    3358 | 3285 | 4920 |   3023 | 2372
      1024 | 2817 |    3825 | 3761 | 5628 |   3454 | 2711

Table 6: Average elapsed time for SHA-1 on a Linux system (in ms).

Data in MB |    C | Java | Python | Ruby
         2 |    6 |   19 |      6 |    6
         4 |   12 |   38 |     12 |   12
         8 |   24 |   75 |     24 |   24
        16 |   49 |  149 |     48 |   48
        32 |   97 |  297 |     98 |   96
        64 |  193 |  595 |    195 |  192
       128 |  387 | 1187 |    393 |  386
       256 |  771 | 2380 |    785 |  771
       512 | 1542 | 4748 |   1570 | 1548
       640 | 1934 | 5974 |   1969 | 1933
       768 | 2320 | 7134 |   2362 | 2320
       896 | 2705 | 8308 |   2757 | 2700
      1024 | 3093 | 9550 |   3149 | 3094

Table 7: SHA-1 a and b coefficients.

OS      | Language |      a |     b
Linux   | C        | −0.572 | 3.019
Linux   | Python   | −0.357 | 3.023
Linux   | Ruby     | −1.217 | 3.076
Linux   | Java     | −1.353 | 9.303
Windows | C        |  0.342 | 3.017
Windows | C (win)  |  0.468 | 3.568
Windows | C♯       |  0.167 | 3.576
Windows | Python   | −0.346 | 3.774
Windows | Ruby     | −0.781 | 3.346
Windows | Java     | −0.668 | 8.345

While our experiments showed consistent results on the Linux system, with OpenSSL outperforming Java, the tests on Windows vary. For SHA-256, Ruby was fastest on Windows, followed by C and C (win). Again, Java remained the slowest. Compared to the previous tests, we uncovered an odd behavior of C♯, which performed well except for the 1 GB file. We hypothesize that C♯ had a larger memory footprint and 4 GB of RAM was not sufficient when handling large data.

4.4 SHA-512

The detailed results for SHA-512 are shown in Tables 12, 13 and 14. The results for SHA-512 are similar to SHA-256. On the Linux system, Java is the slowest while all other results are almost identical. On Windows, Ruby was the fastest, followed by Python. C, C♯, C (win) and Java showed similar efficiency.
However, on the 1 GB data file, C♯ again was slow, mostly due to what we hypothesize is its memory footprint. Overall, we note that on Windows, SHA-512 was faster than SHA-256 for all of the languages, especially for larger data sizes. On Linux, the speeds for SHA-512 and SHA-256 were similar for all of the languages except for Java, where SHA-512 was much slower than SHA-256.

Table 8: Average elapsed time for SHA-1 on a Windows system (in ms).

Data in MB |    C | C (win) |   C♯ | Java | Python | Ruby
         2 |    6 |       6 |    7 |   16 |      7 |    6
         4 |   13 |      14 |   14 |   33 |     15 |   12
         8 |   25 |      28 |   32 |   66 |     31 |   26
        16 |   48 |      58 |   57 |  136 |     60 |   52
        32 |   97 |     114 |  114 |  269 |    120 |  107
        64 |  194 |     228 |  229 |  547 |    241 |  214
       128 |  386 |     459 |  457 | 1062 |    482 |  427
       256 |  773 |     913 |  916 | 2148 |    966 |  856
       512 | 1544 |    1836 | 1831 | 4231 |   1931 | 1712
       640 | 1931 |    2280 | 2288 | 5305 |   2417 | 2139
       768 | 2317 |    2738 | 2745 | 6467 |   2897 | 2568
       896 | 2703 |    3205 | 3204 | 7452 |   3382 | 3000
      1024 | 3091 |    3649 | 3663 | 8567 |   3864 | 3424

Table 9: Average elapsed time for SHA-256 on a Linux system (in ms).

Data in MB |    C |  Java | Python | Ruby
         2 |   18 |    29 |     17 |   17
         4 |   36 |    59 |     36 |   35
         8 |   73 |   117 |     72 |   71
        16 |  144 |   236 |    142 |  143
        32 |  291 |   470 |    287 |  287
        64 |  577 |   942 |    573 |  575
       128 | 1150 |  1897 |   1144 | 1146
       256 | 2301 |  3769 |   2286 | 2336
       512 | 4601 |  7545 |   4579 | 4568
       640 | 5733 |  9395 |   5767 | 5730
       768 | 6884 | 11417 |   6913 | 6875
       896 | 8031 | 13226 |   7990 | 8034
      1024 | 9163 | 15161 |   9135 | 9183

Table 10: SHA-256 a and b coefficients.

OS      | Language |         a |      b
Linux   | C        |     1.724 |  8.946
Linux   | Python   |     3.486 |  8.956
Linux   | Ruby     |     2.237 |  8.959
Linux   | Java     |    −4.972 | 14.789
Windows | C        |    −5.176 |  9.037
Windows | C (win)  |     5.264 |  9.688
Windows | C♯       | −1044.546 | 17.53
Windows | Python   |    −0.503 | 10.189
Windows | Ruby     |    −0.180 |  8.038
Windows | Java     |     2.561 | 13.085

4.5 Result summary

This section discusses and summarizes the main findings. A visual summary of all the experimental results is presented in Figures 1 and 2. While most graphs show the expected behavior, there are two striking results. On Linux, the Java implementation of SHA-512 shows an unexpected behavior, while on Windows C♯ is particularly eye-catching.
More precisely, on the Linux system, programming languages using OpenSSL showed similarly high performance. Regarding Java, which was significantly slower than the OpenSSL library, we expected that for large data files the efficiency would go up, as the Just-in-Time (JIT) compiler should have compiled and optimized the byte code into native code. However, the slow performance of Java was related to the underlying cryptographic primitives, as noted by Garfinkel (Garfinkel et al., 2010).

Table 11: Average elapsed time for SHA-256 on a Windows system (in ms).

Data in MB |    C | C (win) |    C♯ |  Java | Python | Ruby
         2 |   17 |      19 |    19 |    27 |     20 |   17
         4 |   36 |      37 |    38 |    53 |     40 |   33
         8 |   74 |      78 |    78 |   103 |     81 |   64
        16 |  142 |     156 |   157 |   209 |    163 |  129
        32 |  286 |     309 |   313 |   422 |    326 |  256
        64 |  572 |     622 |   631 |   847 |    651 |  514
       128 | 1145 |    1255 |  1263 |  1683 |   1303 | 1029
       256 | 2291 |    2486 |  2541 |  3359 |   2607 | 2057
       512 | 4650 |    4988 |  5077 |  6695 |   5219 | 4115
       640 | 5720 |    6242 |  6352 |  8361 |   6520 | 5143
       768 | 7005 |    7431 |  8342 | 10028 |   7822 | 6174
       896 | 8009 |    8683 |  9008 | 11793 |   9128 | 7201
      1024 | 9295 |    9903 | 28859 | 13370 |  10433 | 8231

Figure 1: Overview of the measurement results for Linux (Ubuntu).

Table 14: Average elapsed time for SHA-512 on a Windows system (in ms).

Data in MB |    C | C (win) |    C♯ | Java | Python | Ruby
         2 |   20 |      19 |    13 |   19 |     13 |   12
         4 |   42 |      36 |    26 |   39 |     27 |   20
         8 |   77 |      73 |    53 |   75 |     55 |   40
        16 |  153 |     148 |   107 |  147 |    110 |   81
        32 |  306 |     297 |   209 |  300 |    220 |  162
        64 |  613 |     597 |   421 |  595 |    441 |  326
       128 | 1231 |    1192 |   842 | 1180 |    883 |  651
       256 | 2452 |    2378 |  1686 | 2356 |   1766 | 1302
       512 | 4894 |    4747 |  3425 | 4706 |   3534 | 2606
       640 | 6114 |    5933 |  4428 | 5895 |   4416 | 3256
       768 | 7342 |    7138 |  5475 | 7073 |   5300 | 3908
       896 | 8567 |    8319 |  5981 | 8238 |   6183 | 4557
      1024 | 9808 |    9492 | 22381 | 9411 |   7067 | 5214

Figure 2: Overview of the measurement results for Windows.

MD5 and SHA-1 were three times faster than
On the Windows system, Java surprisingly was faster and outperformed C, Python and C♯ Page 65 2015 CDFSL Proceedings Data in MB 2 4 8 16 32 64 128 256 512 640 768 896 1024 An Empirical Comparison of Widely Adopted Hash Functions ... Avg. elapsed time in milli-Sec. C Java Python Ruby 19 71 19 19 38 144 37 38 77 286 75 75 153 572 152 152 303 1144 304 303 611 2292 612 606 1221 4569 1224 1223 2431 9155 2448 2447 4870 18283 4864 4891 6079 22770 6123 6092 7335 27347 7335 7341 8565 31964 8572 8630 9812 36527 9785 9714 Table 12: Average elapsed time for SHA-512 on a Linux system. for SHA-512. We could not find any explanation why SHA-512 on Java has such high efficiency. Again, programming languages using OpenSSL, such as Ruby and Python, steadily showed good and constant results on Windows. WinCrypto API showed good performance, and was better than OpenSSL for SHA-512. Overall Ruby showed the best times for SHA-256 and SHA-512 Main remarks: • OpenSSL showed good performance across both platforms. This also applies to programming languages using OpenSSL as a library, such as Ruby and Python. • SHA-256 and SHA-512 have a similar runtime. However, on Windows SHA-512 was faster while on Linux it was the other way round. • On Windows Ruby was discovered to be faster than Python. On Linux the two languages were very similar. • OracleJDK showed a higher performance on Windows than OpenJDK did on Linux. OracleJDK was specially good for SHA-512 on large data sizes. Page 66 Language a Linux: C −0.009 Python −1.803 Ruby −5.256 Java 4.422 Windows: C 0.952 C (win) 0.906 C♯ −874.141 Python −0.548 Ruby −0.102 Java 3.091 b 9.547 9.557 9.559 35.648 9.565 9.277 13.348 6.902 5.089 9.194 Table 13: SHA-512 a and b coefficients • C♯ started showing sudden spikes on elapsed time for SHA-256 and SHA-512 when the data size reached 1 GB. This may be attributed to a lack of available RAM on the system used. 
4.6 Impact on the real world

In this section we discuss the impact of our findings on a real-world scenario. We assume that an investigator receives one hard drive of 512 GB, a smartphone with 32 GB of memory, an SD card of 8 GB and an external backup device of 160 GB. Furthermore, the user has 10 GB of cloud storage. We argue that this is a realistic scenario and that (512 + 32 + 8 + 160 + 10 =) 722 GB can easily be found in a household nowadays, especially when storing multimedia files such as videos and images.

Table 15 shows the upscaled results. For upscaling, we used the times for processing 1024 MB = 1 GB, multiplied them by 722 and divided them by 1000 – except for the star-marked numbers: since C♯ had problems with the 1024 MB file, we upscaled those using the 512 MB file. Thus, the table shows the estimated time in seconds. To conclude, there might be time differences of over 83 minutes for SHA-256, or even 322 minutes on Linux systems when using SHA-512.

Table 15: Estimated time in seconds for processing 722 GB.

OS      | Language |  MD5 | SHA-1 | SHA-256 | SHA-512
Linux   | C        | 2036 |  2233 |    6615 |    7084
Linux   | Python   | 2038 |  2273 |    6595 |    7064
Linux   | Ruby     | 2039 |  2233 |    6630 |    7013
Linux   | Java     | 4126 |  6895 |   10946 |   26372
Windows | C        | 2033 |  2231 |    6711 |    7081
Windows | C (win)  | 2761 |  2634 |    7149 |    6853
Windows | C♯       | 2715 |  2644 |   7331* |   4945*
Windows | Python   | 2493 |  2789 |    7532 |    5102
Windows | Ruby     | 1957 |  2472 |    5942 |    3764
Windows | Java     | 4063 |  6185 |    9653 |    6794

5. CONCLUSION

Although most results were as expected, our experiments uncovered some strange behavior. The results on Linux are solid and predictable – MD5 is the fastest, SHA-512 is the slowest, and the others lie in between. Since most programming languages access the OpenSSL library, the times are quite consistent. The slowest implementation was OpenJDK Java, which is therefore not recommended for hashing large amounts of data. We did not test OracleJDK on Linux. Regarding Windows, the results are different and show unexpected behavior.
The results for C (independent of the library) are reasonable and mainly coincide with the Linux results. C♯ showed strange behavior for SHA-256 and SHA-512 on larger files; we hypothesize that this is due to a larger memory footprint. Results for Python and Ruby are similar to the Linux results, except for SHA-512, which is considerably faster on Windows. We cannot explain this behavior as of right now, and further experimentation is needed. In conclusion, for writing a tool that needs to be portable across Unix-like and Windows platforms, C-OpenSSL is a good choice; however, scripting languages such as Ruby and Python showed strong promise for quick prototyping.

REFERENCES

Altheide, C., & Carvey, H. (2011). Digital forensics with open source tools: Using open source platform tools for performing computer forensics on target systems: Windows, Mac, Linux, Unix, etc (Vol. 1). Syngress Media.

Bloom, B. H. (1970). Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7), 422–426.

Breitinger, F., & Baier, H. (2012, October). Similarity preserving hashing: Eligible properties and a new algorithm MRSH-v2. 4th ICST Conference on Digital Forensics & Cyber Crime (ICDF2C).

Breitinger, F., Stivaktakis, G., & Baier, H. (2013). FRASH: A framework to test algorithms of similarity hashing. Digital Investigation, 10, S50–S58.

frag_find. (2013). https://github.com/simsong/frag_find. [Online; accessed Sep-2014]

Gallagher, P., & Director, A. (1995). Secure Hash Standard (SHS) (Tech. Rep.). National Institute of Standards and Technology, Federal Information Processing Standards Publication 180-1.

Garfinkel, S., Nelson, A., White, D., & Roussev, V. (2010). Using purpose-built functions and block hashes to enable small block and sub-file forensics. Digital Investigation, 7, S13–S23.

Kornblum, J. D. (2006, August).
Identifying almost identical files using context triggered piecewise hashing. In Proceedings of the Digital Forensic Workshop (pp. 91–97). Retrieved from http://dfrws.org/2006/proceedings/12-Kornblum.pdf

Menezes, A. J., van Oorschot, P. C., & Vanstone, S. A. (2001). Handbook of applied cryptography (Vol. 5). CRC Press.

RDS Hashsets. (2014). http://www.nsrl.nist.gov/. [Online; accessed Sep-2014]

Regional Computer Forensics Laboratory. Annual report. (2012). http://www.rcfl.gov/downloads/documents/RCFL_Nat_Annual12.pdf. [Online; accessed Sep-2014]

Roussev, V. (2010a). Data fingerprinting with similarity digests. In Advances in Digital Forensics VI (pp. 207–226). Springer.

Roussev, V. (2010b). Data fingerprinting with similarity digests. In K.-P. Chow & S. Shenoi (Eds.), Advances in Digital Forensics VI (Vol. 337, pp. 207–226). Springer Berlin Heidelberg. doi: 10.1007/978-3-642-15506-2_15

Saleem, S., Popov, O., & Dahman, R. (2011). Evaluation of security methods for ensuring the integrity of digital evidence. In Innovations in Information Technology (IIT), 2011 International Conference on (pp. 220–225).

spamsum. (2002-2009). http://www.samba.org/ftp/unpacked/junkcode/spamsum/. [Online; accessed Sep-2014]

Sumathi, S., & Esakkirajan, S. (2007). Fundamentals of relational database management systems (Vol. 1). Springer Berlin Heidelberg.

Tridgell, A. (1999). Efficient algorithms for sorting and synchronization. Australian National University, Canberra.

Wang, X., Yin, Y. L., & Yu, H. (2005). Finding collisions in the full SHA-1. In Advances in Cryptology – CRYPTO 2005 (pp. 17–36).

Wang, X., & Yu, H. (2005). How to break MD5 and other hash functions. In Advances in Cryptology – EUROCRYPT 2005 (pp. 19–35). Springer.
INVESTIGATING FORENSICS VALUES OF WINDOWS JUMP LISTS DATA

Ahmad Ghafarian
University of North Georgia
Department of Computer Science and Information Systems
Dahlonega, GA 30597
[email protected]

ABSTRACT

Starting with Windows 7, Microsoft introduced a new feature to the Windows operating systems called Jump Lists. Jump Lists store information about user activities on the host machine. These activities may include links to recently visited web pages, applications executed, or files processed. Computer forensics investigators may find traces of misuse in Jump Lists' automatically saved files. In this research, we investigate the forensic value of Jump Lists data. Specifically, we use several tools to view Jump Lists data on a virtual machine. We show that each tool reveals certain types of information about the user's activity on the host machine. This paper also presents a comparative analysis of the tools' performance. In addition, we suggest a different method of viewing the contents of hidden folders, present another approach for deleting files from hidden folders, and propose an innovative way of gaining access to application identification numbers (AppIDs).

Keywords: Windows 7, Jump Lists, operating systems, computer forensics tools, virtual machine, VM

1. INTRODUCTION

Jump Lists are a feature of the Windows 7 operating system that shows the files and tasks most recently or most frequently used by a user. They are similar to shortcuts in that they take the user directly to files or directories that are regularly used. They differ from normal shortcuts in that they are more extensible in what information they display. For example, Internet Explorer will use Jump Lists to display websites frequently visited; Microsoft Office products like Excel, PowerPoint and Word, on the other hand, will show the most recently opened documents.
From a user's standpoint, Jump Lists increase productivity by providing quick access to the files and tasks associated with applications. From a forensic investigator's standpoint, Jump Lists are a good indicator of which files were recently opened and which websites were visited frequently. Limited research results have been reported on the forensic value of Jump Lists data. Barnett (2011) has reported on the forensic value of Windows Jump Lists data; however, in his experiment he did not use any computer forensic tool. The author used a PC running Windows 7 with various web browsers to download pictures from a website, and the amount and type of information stored by Jump Lists was then compared manually across the different web browsers. Roblyness (2012) has evaluated the data stored by Jump Lists for different applications such as Notepad, MS Word, etc. He concluded that programs that use a default application to open a related file store less information than when the application is chosen by the user to open the same file. This researcher also did not use any tool, and all the information was retrieved manually.

In Windows 7, details of accessed files, such as opening a file by right-clicking the application's taskbar square, are held within structured storage files which are themselves stored within the user's profile. The files are named with 16 hexadecimal digits, known as the AppID, followed by one of two hidden file extensions, automaticDestinations and customDestinations. The first set stores information about data file usage; items are sorted either by Most Recently Used (MRU) or by Most Frequently Used (MFU), depending on the application. The content contained within the latter set, and the tasks specified by this category of file, are maintained by the specific application responsible for that Destination file. These two sets of files can be parsed to obtain forensic data.
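On a live Windows 7 system these Destination files typically reside under the user profile at %APPDATA%\Microsoft\Windows\Recent\AutomaticDestinations and ...\CustomDestinations (a common location, but one that should be verified on the target system). A short sketch, with hypothetical helper names of our own, that enumerates such a folder and pulls the 16-hex-digit AppID out of each filename:

```python
import os
import re

# e.g. "1b4dd67f29cb1962.automaticDestinations-ms" -> AppID "1b4dd67f29cb1962"
JUMPLIST_RE = re.compile(
    r"^([0-9a-f]{16})\.(automatic|custom)Destinations-ms$", re.IGNORECASE)

def list_appids(folder):
    """Return {appid: filename} for Jump List files found in `folder`."""
    found = {}
    for name in os.listdir(folder):
        m = JUMPLIST_RE.match(name)
        if m:
            found[m.group(1).lower()] = name
    return found
```

The AppIDs recovered this way can then be matched against published AppID lists to identify which application produced each Destination file.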
Cowen (2011) has tested several applications on Windows 7 Professional SP1 and noted that the application identification numbers (AppIDs) of those applications are different for different versions of the same applications.

The purpose of this research is to further investigate various aspects of Jump Lists auto-saved data. To do this, we needed to decide on the applications we intended to use. There are many applications that a user or suspect can execute on a machine. Jump Lists keep different types of information for each type of application and for each action (e.g., open, update, delete, etc.) on the file. For example, the type and amount of the Jump Lists hidden files for a Microsoft Word file would be different from the same data for a graphic file. In this work, we limit our experiment to Microsoft Office 2010, standard web browsers, and portable web browsers. In contrast to most of the previous work, we perform our experiment on a virtual machine (VM), i.e., VMware. This is because we wanted to make sure that the applications we use in this experiment are the only ones installed on the VM. The impact of this restriction is that we will only have a limited set of AppIDs to evaluate.

Since the tools behave differently for different applications, we present a comparative analysis of the performance of the tools we used to view Jump Lists data. Additional contributions of this research include proposing a different method of viewing the contents of hidden folders, presenting another approach for deleting files from hidden folders, and suggesting an innovative way of gaining access to AppIDs.

2. VIRTUAL MACHINE AND TOOLS

In order to make our experiment consistent, we use a virtual machine to examine the forensic values of Windows Jump Lists. This is because the experiment was done at different dates and times during the course of this research. At any given time, a physical machine may have a different status as far as the applications running on the machine and the resource usage of the system are concerned. However, with the virtual machine, we use a bare machine with only the activities that are related to this research. Throughout this research, we use several tools to retrieve information from Windows Jump Lists. In general, there are two types of tools: tools like Jumplist-Launcher, which allow the user to create customized jump lists, and tools like JumpLister, which parse the jump lists and deliver details about the activities of the user on the Windows machine.

2.1 Virtual Machine (VM)

A virtual machine is a software implementation of a computing environment. The virtual machine typically emulates a physical computer, but requests for physical resources are managed by a hypervisor which translates these requests to the underlying physical hardware (vmWare, 2014).

2.2 Jumplist-Launcher

Jumplist-Launcher is a free portable tool for Windows 7 and 8 that allows computer forensics investigators to add their favorite programs to a single Jump List for easy accessibility. Up to 60 jump-list items can be added, and they can be categorized into self-defined groups for easy accessibility (Madalina, 2014).

2.3 JumpListsView

JumpListsView is an open source tool that is used to display the information stored by Jump Lists. For every record found in the Jump Lists, JumpListsView displays the following information: the filename that the user accessed; the date/time of the file opening event; the ID of the application that was used to open the file; and the size/time/attributes of the file at the time it was opened (NirSoft, 2013).

2.4 JumpLister

JumpLister is designed to open one or more Jump List files, parse the compound file structure, and then parse the link file streams contained within. It uses the LNK parser (Woanware, 2012). The latest version also parses out the Destination Lists (DestList) and performs a lookup on the AppIDs (Cowen, 2011).
For example, when a user opens a file and saves it under a new name, JumpLister shows the Count increasing by one in Root, and the DestList is updated. In addition, the path, type, and name of the file are shown to the examiner.

2.5 Jump Lister Parser (JMP)

JMP is a command-line Windows parser for Jump Lists. The tool is geared toward outputting data in a parseable comma-delimited (CSV) format. For example, the command

Jmp <Destinations filename> > results.txt

parses an individual destinations file and saves the results in results.txt.

2.6 Jump List File Extract

Jump List File Extract is a program that extracts file information from Jump Lists data. This information contains links to the files accessed through Jump Lists; these are called destination files and were introduced in section 1 of this paper.

3. OUR EXPERIMENT

In this section we describe the environment and the setup in which we performed our experiment. We installed vmWare 8.0 on a Windows 7 machine and then set up the logical environment for JumpLister, JumpListsView, and Jmp on the VM. In order to view Jump Lists data, we need to run an application on our Windows machine. To make the data more meaningful, we limited ourselves to three applications: Microsoft Office 2010, Mozilla Firefox, and Google Chrome Portable. We installed all three applications on the VM.

3.1 Results of Actions on Various Application Files

First we worked with MS Word. We created a sample MS Word document on the VM and then performed actions such as open, rename, and delete on the file. After each action, we used several tools to open the Jump Lists auto-saved data. The three most significant Jump Lists data items whose changes we monitored are AppID, Count, and DestList. The results of this experiment are shown in Table 1 below. AppID was briefly described in section 1 above. Count and DestList are briefly described below.
Count indicates the number of times a file has been referred to, and DestList represents the action on the file. The DestList stream acts as a most-recently/frequently-used list. This stream consists of a 32-byte header, followed by the structures that correspond to each of the individual numbered streams. Each of these structures is 114 bytes in size, followed by a variable-length Unicode string (NirSoft, 2013).

For the installed web browser experiment, after installing the Firefox web browser and connecting to the Internet for the first time, we noticed that the AppID was not created even after viewing the Welcome Firefox HTML page. Further examination showed that the AppID was created when Count increased for the first time. Each increase of the Count indicates some action, such as visiting a web site or downloading a picture or a video clip. Table 1 shows the changes to Count, AppID, and DestList for opening a web page.

For the portable web browser experiment, we used the Google Chrome portable web browser. Generally, we cannot pin a portable app to the taskbar, since the AppID of the launcher is different from that of the actual app executable; therefore, the Windows taskbar cannot group them into one place. However, we followed the solution offered by Roblyness (2012) and were able to create shortcuts and pin them to the taskbar. After we opened a web page, we checked the Jump Lists data and noticed that the AppID was not created. The reason is that both installed and portable web browsers were present, and the operating system probably did not know which one to use. After we uninstalled the Firefox browser and tried to open a page with the portable browser, the web page opened. We repeated this action several times to make sure that this observation is accurate. In Windows XP, we can set a portable web browser as the default browser; however, in Windows 7 and 8 this cannot be done easily. See Table 1 for the results.
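Given the layout described above (a 32-byte header followed by 114-byte entries, each trailed by a variable-length Unicode string), a DestList stream can be walked with a short script. The sketch below additionally assumes, beyond what is stated here, that the last two bytes of each 114-byte entry hold the trailing string's length in characters (as in the Windows 7 on-disk format); it is an illustration, not one of the tools used in this paper.

```python
import struct

def walk_destlist(stream: bytes):
    """Walk a DestList stream: a 32-byte header followed by 114-byte
    entries, each trailed by a variable-length UTF-16 string.
    Assumption: the final two bytes of each entry hold the string
    length in characters (Windows 7-style layout)."""
    entries = []
    pos = 32                                    # skip the 32-byte header
    while pos + 114 <= len(stream):
        entry = stream[pos:pos + 114]
        # hypothetical field: string length (in chars) at entry offset 112
        (num_chars,) = struct.unpack_from("<H", entry, 112)
        pos += 114
        path = stream[pos:pos + num_chars * 2].decode("utf-16-le")
        pos += num_chars * 2
        entries.append(path)
    return entries
```

Run against a real AutomaticDestinations stream, each recovered string would be a path such as the destination files discussed in section 2.6.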
Table 1 - Results of Actions on MS Word, Installed Browser, and Portable Browsers

Row | Action                               | Result                  | AppID                       | Count       | DestList
1   | Open fixed-disk Word file            | After opening the file  | Visible after Count changed | Changed     | Updated
2   | Open Word file from removable media  | After opening the file  | Not visible                 | Not changed | Not updated
3   | Right mouse click & delete Word file | After deleting the file | Visible after Count changed | Changed     | Updated
4   | Rename Word file                     | After action finished   | Visible after Count changed | Changed     | Updated
5   | Regular browser                      | After opening a page    | Visible after Count changed | Changed     | Updated
6   | Portable browser                     | After opening a page    | Not visible                 | Not changed | Not updated

Comparison of the Table 1 entries with the results reported in Larson (2011) shows that the Jump Lists data revealed on our VM and on a physical machine are, for the most part, the same. The exceptions are removable media and the portable browser. In the case of removable media such as a flash drive, Count and DestList were not changed. In the case of the portable web browser, an installed version of any browser must not be present together with the portable web browser on the same machine; otherwise, the portable web browser will not work. In addition, with the portable web browser, when a saved web page was opened and changed, and these changes were then saved, the Date/Time was not updated in the Jump Lists. Also, when we used the portable web browser to open a page, the AppID was not visible, the Count was not changed, and the DestList was not updated. For regular web browsers, after opening a web page and saving it as a new web page, the Count and DestList were updated. Overall, examining traces of removable media use and of web browsing with a portable web browser is a challenging task for computer forensics investigators. However, for non-removable media, the details of a user's activity can be viewed and possible misuse can be identified.

3.2 Comparisons of the Tools

In section 2, we introduced several Windows Jump Lists tools.
Table 2 shows the comparative performance of some of those tools on Windows Jump Lists. However, we plan to report more results covering all of the tools in a future paper.

Table 2 - Comparison of the Tools

Tool Name    | User friendliness | Displays information | Search Option | Recognizes DestList | Recognizes AppIDs
Jmp          | CMD               | Yes                  | Has           | Does                | Doesn't have
JumpLister   | GUI               | Yes                  | Doesn't have  | Doesn't have        | To certain extent
JumpListView | GUI & CMD         | Yes                  | Has           | Doesn't have        | Has

With JMP we can extract more data than with the other two tools. JMP and JumpListView have a search option, but JumpLister does not. Overall, no single tool does everything; rather, each tool has its own unique features. We recommend using a combination of the tools.

4. ADDITIONAL RESEARCH

During the course of this research, we developed different ways of handling issues such as deleting files, accessing applications, etc. In this section, we present the details of our approaches to handling those issues.

4.1 Detecting Files from Hidden Folders

As we discussed earlier, Jump Lists create hidden files and folders on the host machine. There are specific methodologies and tools to detect these files. Two methods of detecting the files of hidden folders have been discussed by Madalina (2014). In this work, we propose a third method: copying the hidden files to a new destination. By typing the following command at the MS-DOS prompt, the hidden files are copied to a new medium, here drive d:

copy %appdata%\Microsoft\Windows\Recent\AutomaticDestinations\*.* "d:\new folder"

4.2 Deleting Jump List Data

A suspect may decide to delete file entries from the Jump Lists so that no trace of them can be found. Various methods of deleting entries from Jump Lists have been tested by Harvey (2011). Here we propose another way of deleting file entries: using the free Track Eraser Pro software (AceSoft, 2014) to delete the AutomaticDestinations folder and its contents. An investigator should therefore consider that a suspect may have used this utility to erase his or her footprint.

4.3 Finding AppIDs

Jump Lists file names are created using hash-like values that are in turn based on the AppID. A forensics investigator may be interested in determining AppIDs, which in turn identify the associated applications that have been used by a suspect. Two methods of finding AppIDs are listed in (Forensics Focus, 2012). We propose a third way of finding AppIDs. In this approach, we delete the AutomaticDestinations files (for example, with Track Eraser). Recall that when we delete AutomaticDestinations, the hidden files will still be there. We then use specific tools to retrieve the AppIDs of the deleted files. From the AppIDs we can determine the applications that have been used by a suspect. An AppID contains 16 characters. Table 3 shows the AppIDs of several applications (List of Jump List IDs).

Table 3 - Selected AppIDs and Corresponding Applications

AppID            | Application Description
271e609288e1210a | Microsoft Office Access 2010 x86
6e855c85de07bc6a | Microsoft Office Excel 2010 x86
3094cdb43bf5e9c2 | Microsoft Office OneNote 2010 x86
9c7cc110ff56d1bd | Microsoft Office PowerPoint 2010 x86
a7bd71699cd38d1c | Microsoft Office Word 2010 x86
28c8b86deab549a1 | Internet Explorer 8 / 9
6824f4a902c78fbd | Mozilla Firefox 29

5. CONCLUSIONS

Our results show that Jump Lists data in the physical-machine and virtual-machine cases are the same in most cases. However, when we use a removable drive, the traces of Jump Lists data are inconsistent. Similarly, when we had a standard web browser installed on the VM, we could not launch the portable web browser on the VM. Overall, we conclude that forensic analysis of Jump Lists data for removable media and portable web browsers is more challenging for computer forensics investigators.
Comparisons of the performance of the tools show that each tool has its own unique features. We found that the type and the amount of data varied based on the tool we used; this is because the tools are designed with different features. Our suggestion is that, for the analysis of Jump Lists data, a combination of the tools will yield better results. Finally, we made recommendations on how to detect Jump Lists hidden files, how to find AppIDs, and how to delete Jump Lists data.

6. FUTURE WORK

We plan to experiment with more tools on a physical machine for parsing Windows Jump Lists data to extract forensically valuable data. This may include the type and the amount of data they can retrieve from Jump Lists information. We also plan on evaluating Jump Lists data for different file types, such as PDF, image, and multimedia files. Writing new parsing tools with different features is also a good line of future research. Additional work can be done on verifying the consistency of the tools. This can be done by performing the same action more than once on the same application file to see whether the same results are obtained.

REFERENCES

AceSoft (2014). Track Eraser software. Retrieved from http://www.acesoft.net/download.htm

Barnett, A. (2011). The forensics value of the Windows 7 Jumplist. Purdue University. Retrieved from http://www.alexbarnett.com/jumplistforensics.pdf

Cowen, D. (2011). Jump Lists forensics: AppIDs, parts 1 & 2. Retrieved from http://www.4n6k.com/2011/09/jump-listforensics-appids-part-1.html

Forensics Focus (2011). Windows Jumplist parser (Jmp). Retrieved from http://www.forensicfocus.com/Forums/viewtopic/t=9316/

FtpArmy (2011). Jumplist File Extract. Retrieved from http://ftparmy.com/143017-jumplistfileextract.html

Harvey, H. (2011). Windows incident response. Retrieved from http://windowsir.blogspot.com/2011/08/jump-list-analysis.html

Larson, T. (2011).
Forensic analysis of Windows 7 Jump Lists. Retrieved from http://www.slideshare.net/ctin/windows-7forensics-jump-listsrv3public#

Madalina, M. (2014). How to create your own Windows 7 and 8.1 Jump Lists using Jumplist-Launcher. Retrieved from http://en.www.ali.dj/jumplist-launcher/

NirSoft (2013). JumpListsView. Retrieved from http://www.nirsoft.net/utils/jump_lists_view.html

Roblyness, T. (2012). Forensic analysis of Windows 7 Jump Lists. Retrieved from http://articles.forensicfocus.com/2012/10/30/forensic-analysis-of-windows-7-jump-lists/

vmWare (2014). Virtualization for desktop. Retrieved from http://www.vmware.com/

Wiki. List of Jump List IDs. Retrieved from http://www.forensicswiki.org/wiki/List_of_Jump_List_IDs

WoanWare (2012). JumpLister info. Retrieved from http://www.woanware.co.uk/forensics/jumplister.html

A SURVEY OF SOFTWARE-BASED STRING MATCHING ALGORITHMS FOR FORENSIC ANALYSIS

Yi-Ching Liao
Norwegian Information Security Laboratory
Gjøvik University College, Norway
[email protected]

ABSTRACT

Employing a fast string matching algorithm is essential for minimizing the overhead of extracting structured files from a raw disk image. In this paper, we summarize the concept, implementation, and main features of ten software-based string matching algorithms, and evaluate their applicability for forensic analysis. We provide comparisons between the selected software-based string matching algorithms from the perspective of forensic analysis by conducting a performance evaluation for file carving. According to the experimental results, the Shift-Or algorithm (R. Baeza-Yates & Gonnet, 1992) and the Karp-Rabin algorithm (Karp & Rabin, 1987) have the shortest search time for identifying the locations of specified headers and footers in the target disk.

Keywords: string matching algorithm, forensic analysis, file carving, Scalpel, data recovery

1.
INTRODUCTION

File carving is the process of extracting structured files from a raw disk image without knowledge of the file-system metadata, and it is an essential technique for digital forensics investigations and data recovery. There is no guarantee that metadata exists to provide the location of each file within a file system, and file headers can be anywhere in a raw disk image. Therefore, it is inevitable for file carving applications to search every byte of a raw disk image, at the physical level, to locate the specific file headers and footers of interest to the investigation. To minimize the overhead of searching for file headers and footers, it is important to employ a fast string matching algorithm to reduce the search time (Richard III & Roussev, 2005). In this paper, we summarize the concept, implementation, and main features of several software-based string matching algorithms, and provide comparisons between them from the perspective of forensic analysis.

The rest of this paper is organized as follows. Section 2 describes the state of the art and summarizes other surveys of string matching algorithms. Section 3 illustrates the importance of file carving and introduces the implementation of one of the most popular open source file carving applications, Scalpel (Richard III & Roussev, 2005). Section 4 summarizes the concept, implementation, and main features of ten software-based string matching algorithms. Section 5 presents the experimental results of comparisons between the different string matching algorithms from the perspective of forensic analysis. Finally, section 6 concludes this paper and provides recommendations for future work.

2. RELATED WORK

Baeza-Yates (R. A. Baeza-Yates, 1989) surveys several important string matching algorithms and presents empirical results of the execution time for searching 1,000 random patterns in random texts and an English text.
The evaluated algorithms include the brute force algorithm, the Knuth-Morris-Pratt algorithm (Knuth, Morris, & Pratt, 1977), the Boyer-Moore algorithm (Boyer & Moore, 1977) and its variants, the Shift-Or algorithm (R. Baeza-Yates & Gonnet, 1992), and the Karp-Rabin algorithm (Karp & Rabin, 1987). The empirical results show that the Horspool algorithm (Horspool, 1980), a simplification of the Boyer-Moore algorithm (Boyer & Moore, 1977), is the best known algorithm for almost all pattern lengths and alphabet sizes.

Navarro (Navarro, 2001) presents an overview of the state of the art in approximate string matching, which tolerates a limited number of errors during string matching. The most important application areas of approximate string matching include computational biology (e.g., DNA and protein sequences), signal processing (e.g., speech recognition), and text retrieval (e.g., correction of misspellings and information retrieval). Navarro states that information retrieval is among the most demanding areas of approximate string matching, because it involves extracting relevant information from a large text collection. Navarro also presents empirical comparisons among the most efficient algorithms by running them on three kinds of texts: DNA, natural language, and speech.

Tuck et al. (Tuck, Sherwood, Calder, & Varghese, 2004) regard the string matching algorithm as the essential component of modern intrusion detection systems, since intrusion detection systems depend heavily on the content identified in packets by string matching algorithms. In addition to modifying the Aho-Corasick algorithm (Aho & Corasick, 1975) to reduce the resource overhead, Tuck et al. also explain some core string matching algorithms, such as the SFKSearch algorithm utilized for low-memory situations in Snort and the Wu-Manber algorithm (Wu & Manber, 1994).
Even though the average-case performance of the modified Wu-Manber algorithm (Wu & Manber, 1994) is among the best of all multi-pattern string matching algorithms, its worst-case performance is no better than that of the brute force algorithm.

AbuHmed et al. (AbuHmed, Mohaisen, & Nyang, 2007) present a survey of deep packet inspection algorithms and their usage in intrusion detection systems. They regard string matching algorithm complexity as one of the challenges of deep packet inspection, since resource-consuming pattern matching significantly decreases the throughput of intrusion detection systems. In their opinion, string matching algorithms suffer from two factors: the computational operations performed during comparisons and the number of patterns to be compared. AbuHmed et al. list the Knuth-Morris-Pratt algorithm (Knuth et al., 1977), the Boyer-Moore algorithm (Boyer & Moore, 1977), the Aho-Corasick algorithm (Aho & Corasick, 1975), the AC BM algorithm (Coit, Staniford, & McAlerney, 2001), the Wu-Manber algorithm (Wu & Manber, 1994), and the Commentz-Walter algorithm (Commentz-Walter, 1979) as the most famous software-based string matching algorithms, and present a throughput comparison between existing intrusion detection systems with their algorithms and hardware implementations.

3. FILE CARVING

File carving is the process of recovering files without the file-system metadata describing the actual file system, and it is vitally important for digital forensics investigations and data recovery. File carving is essential for digital forensics investigations because it is able to provide human-readable information, instead of low-level details, for forensic investigators (Richard III & Roussev, 2005). File carving is also a topic of great interest to enterprises, because raw file recovery can minimize the impact of data loss when the file system of a disk is damaged (Pungila, 2012).
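The byte-level search that makes carving possible, locating known header and footer patterns anywhere in a raw image, can be sketched in a few lines. This is an illustrative simplification, not the algorithm of Scalpel or any other carver; the JPEG magic values used in the demonstration (header FF D8 FF, footer FF D9) are standard.

```python
def carve(image: bytes, header: bytes, footer: bytes, max_size: int):
    """Return candidate (start, end) byte ranges for files whose header
    and footer both appear in the image, with the footer found within
    max_size bytes of the header. A sketch of a carver's first phase."""
    hits = []
    pos = image.find(header)
    while pos != -1:
        # look for a matching footer within max_size bytes of the header
        end = image.find(footer, pos + len(header), pos + max_size)
        if end != -1:
            hits.append((pos, end + len(footer)))
        pos = image.find(header, pos + 1)
    return hits

# Demonstration with the standard JPEG magic values.
raw = b"junk" + b"\xff\xd8\xff" + b"payload" + b"\xff\xd9" + b"slack"
print(carve(raw, b"\xff\xd8\xff", b"\xff\xd9", 10 * 1024 * 1024))  # [(4, 16)]
```

Here `bytes.find` stands in for the string matching algorithms surveyed in section 4; in a carver, that call is exactly the component whose speed the rest of this paper evaluates.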
Scalpel (Richard III & Roussev, 2005) is one of the most popular open source file carving applications, and it runs on Linux and Windows. To reassemble files from fragments, Scalpel first reads the entire disk image with a buffer of size 10 MB and searches for the locations of file headers and footers. Since the configuration file "scalpel.conf" includes the known header and footer patterns of different file formats, forensic investigators can customize the configuration file to specify their target file formats. After the initial pass over the disk image, Scalpel matches each file header with an appropriate footer. The newest public release of Scalpel utilizes a modified Boyer-Moore algorithm (Boyer & Moore, 1977) as the default string matching algorithm. Since the aim of this paper is to investigate the applicability of software-based string matching algorithms for forensic analysis, we concentrate on the first phase of Scalpel, in which the locations of the specified headers and footers are identified in the target disk.

4. STRING MATCHING ALGORITHMS

Since there is no guarantee that file-system metadata exists to provide the location of each file within a file system, searching every byte of a raw disk image is unavoidable for file carving applications to identify the locations of structured files. Therefore, employing a fast string matching algorithm is indispensable for minimizing the overhead of file carving applications. The objective of string matching algorithms is to find one or more occurrences of a pattern in a text through the sliding-window mechanism. In this paper, we denote the pattern length as m, the text length as n, and the alphabet size of the pattern and text as σ. We summarize the concept, implementation, and main features of ten software-based string matching algorithms as follows:

4.1 The Brute Force Algorithm

The brute force algorithm checks for the pattern by shifting the window by exactly one position, with time complexity O(m×n).
The algorithm can perform the string matching in any order and requires no preprocessing phase. During the searching phase, it performs 2n text character comparisons (Aoe, 1994). The worst-case scenario for the brute force algorithm is searching for a repetitive pattern in a repetitive text. Moreover, the brute force algorithm requires constant extra space to back up the text stream.

4.2 The Boyer-Moore Algorithm

The Boyer-Moore algorithm (Boyer & Moore, 1977) and the Knuth-Morris-Pratt algorithm (Knuth et al., 1977) are among the most widely used single-pattern matching algorithms, in which each pattern is searched for within a given text separately. The Boyer-Moore algorithm is considered the most efficient string searching algorithm in both theory and practice, and it has become the standard for practical string searching. To improve the performance of searching, it performs the string matching from right to left, and it requires a preprocessing phase to determine the possibility of large shifts of the window, with time complexity O(m+σ). The precomputed functions for shifts of the window are the "good-suffix shift" and the "bad-character shift". During the searching phase, it runs with time complexity O(m×n) and performs at most 3n character comparisons (Aoe, 1994). The best-case performance of the Boyer-Moore algorithm is O(n/m), which improves as the pattern length m increases.

4.3 The Knuth-Morris-Pratt Algorithm

Knuth et al. (Knuth et al., 1977) present the Knuth-Morris-Pratt algorithm with time complexity proportional to the sum of the lengths of the pattern and text, O(m+n), which is independent of the alphabet size. The algorithm performs the string matching from left to right, and it needs a preprocessing phase to construct a partial-match table, with time complexity O(m). The table determines how many characters to slide the pattern when a mismatch occurs. During the searching phase, it performs at most 2n-1 character comparisons (Aoe, 1994).
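The partial-match table just described can be sketched with the standard textbook construction; this illustration is not drawn from any of the surveyed implementations.

```python
def kmp_search(text: str, pattern: str):
    """Knuth-Morris-Pratt search: O(m) table construction plus a
    left-to-right scan in O(n), independent of the alphabet size."""
    m = len(pattern)
    # failure[i]: length of the longest proper prefix of pattern[:i+1]
    # that is also its suffix; tells how far to slide on a mismatch
    failure = [0] * m
    k = 0
    for i in range(1, m):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    # scan the text, reusing the table so no text character is re-read
    hits, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == m:
            hits.append(i - m + 1)
            k = failure[k - 1]
    return hits
```

For the carving workload considered in this paper, the inputs would be bytes rather than str; the construction is identical.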
The Knuth-Morris-Pratt algorithm is a practical algorithm for on-line search, and it can be modified to search for multiple patterns in one single search.

4.4 The Karp-Rabin Algorithm

Since hashing provides a simple method for avoiding a quadratic number of character comparisons, Karp and Rabin (Karp & Rabin, 1987) propose an efficient randomized pattern matching algorithm that only checks whether the window of text resembles the pattern through a hashing function. Therefore, the algorithm can examine the resemblance without checking whether the pattern occurs at each position of the text. The algorithm demands a preprocessing phase to compute hash values, with time complexity O(m), and it runs with time complexity O(m×n) during the searching phase (Charras & Lecroq, 2004). The Karp-Rabin algorithm can easily be extended to find multiple patterns; however, the arithmetic operations can be slower than character comparisons.

4.5 The Horspool Algorithm

The Horspool algorithm (Horspool, 1980) is a simplified version of the Boyer-Moore algorithm (Boyer & Moore, 1977) that only utilizes the precomputed "bad-character shift" function for shifts of the window. Even though utilizing the "bad-character shift" is inefficient for small alphabets, it can be effective when the alphabet size is large enough compared to the pattern length. The Horspool algorithm requires a preprocessing phase with time complexity O(m+σ), and it performs the comparisons in any order, with time complexity O(m×n) during the searching phase (Charras & Lecroq, 2004). Baeza-Yates (R. A. Baeza-Yates, 1989) conducted a survey of several important string matching algorithms, and the empirical results show that the Horspool algorithm is the best known algorithm for almost all pattern lengths and alphabet sizes.
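The single "bad-character shift" that Horspool retains from Boyer-Moore can be sketched as follows; this is a standard textbook formulation, shown for illustration, and operates on bytes as a carver would.

```python
def horspool_search(text: bytes, pattern: bytes):
    """Horspool search: precompute, for every byte value, how far the
    window may shift based on the text byte aligned with the pattern's
    last position (the 'bad-character shift')."""
    m, n = len(pattern), len(text)
    # default shift is the full pattern length (byte not in pattern)
    shift = {b: m for b in range(256)}
    for i in range(m - 1):
        shift[pattern[i]] = m - 1 - i
    hits, pos = [], 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # shift by the table entry for the byte under the window's end
        pos += shift[text[pos + m - 1]]
    return hits
```

Because the table is indexed by the 256 possible byte values, the large effective alphabet of raw disk data is exactly the situation in which the bad-character shift pays off.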
4.6 The Quick Search Algorithm

Similar to the Horspool algorithm (Horspool, 1980), the Quick Search algorithm (Sunday, 1990) is a simplified version of the Boyer-Moore algorithm (Boyer & Moore, 1977) that only utilizes the precomputed "bad-character shift" function for shifts of the window. Likewise, the Quick Search algorithm needs a preprocessing phase with time complexity O(m+σ), and it performs the comparisons in any order, with time complexity O(m×n) during the searching phase. However, the Quick Search algorithm has a quadratic worst-case time complexity in the searching phase.

4.7 The Shift-Or Algorithm

The main idea of the Shift-Or algorithm (R. Baeza-Yates & Gonnet, 1992) is to represent the search state as a number, so that each search attempt performs a small number of arithmetic and logical operations. By utilizing bitwise techniques, the Shift-Or algorithm can be efficient if the pattern length is smaller than the memory-word size of the machine. The Shift-Or algorithm demands a preprocessing phase with time complexity O(m+σ), and the time complexity of its searching phase is O(n), which is independent of the alphabet size and the pattern length (Charras & Lecroq, 2004).

4.8 The Smith Algorithm

Unlike the Quick Search algorithm (Sunday, 1990), which depends on the statistics of the language to determine the order of comparisons, the Smith algorithm (Smith, 1991) performs the string matching language-independently. It utilizes the precomputed "bad-character shift" functions for shifts of the window from both the Horspool algorithm (Horspool, 1980) and the Quick Search algorithm (Sunday, 1990). The Smith algorithm requires a preprocessing phase with time complexity O(m+σ), and it runs with time complexity O(m×n) during the searching phase (Charras & Lecroq, 2004).
Since the Smith algorithm is a language-independent algorithm with competitive performance, it can perform the string matching efficiently without knowledge of the text type.

4.9 The Raita Algorithm

Since neither the pattern nor the text is random in practice, Raita (Raita, 1992) proposes a new implementation that makes use of the dependencies between successive symbols. The Raita algorithm can perform 21 to 27 percent faster than the Horspool algorithm (Horspool, 1980) for all pattern lengths. After comparing the last character of the pattern with the rightmost character of the text window, it compares the first and then the middle character before comparing the remaining characters. The Raita algorithm needs a preprocessing phase with time complexity O(m+σ), and it runs with time complexity O(m×n) during the searching phase (Charras & Lecroq, 2004).

4.10 The Berry-Ravindran Algorithm

Berry and Ravindran (Berry & Ravindran, 1999) introduce a new string matching algorithm that proved more efficient than the existing algorithms in over 1,500,000 separate experiments. The Berry-Ravindran algorithm is a composite of the Quick Search algorithm (Sunday, 1990) and another variant of the Boyer-Moore algorithm (Boyer & Moore, 1977), the Zhu-Takaoka algorithm. It performs the window shifts by considering the "bad-character shift" for the two consecutive text characters to the right of the window. The Berry-Ravindran algorithm demands a preprocessing phase with time complexity O(m+σ²), and it runs with time complexity O(m×n) during the searching phase (Charras & Lecroq, 2004).

Table 1: Time Complexity of String Matching Algorithms

Table 1 summarizes the time complexity, including the preprocessing and searching phases, of the string matching algorithms described in this section. However, the theoretical analysis can only show how an algorithm is likely to perform, not its actual performance.
Therefore, it is necessary to conduct real experiments in order to evaluate the performance of the algorithms in practice.

5. EVALUATION RESULTS

To provide comparisons between the string matching algorithms described in section 4 from the perspective of forensic analysis, we deploy an experimental testbed implemented with VMware Workstation and Ubuntu 12.04.3 on the AMD64 architecture. The virtual machine utilizes a single CPU core with 1 GB of memory. To evaluate the performance of each string matching algorithm, we utilize two test images for Scalpel 2.0 to extract various file formats. The first image, "11-carve-fat.dd" (Nick Mikus, 2005a), is a raw partition image of a 65 MB FAT32 file system, and the second image, "12-carve-ext2.dd" (Nick Mikus, 2005b), is a raw partition image of a 129.4 MB EXT2 file system. Since the file formats within the two images include doc, gif, jpg, mov, pdf, wav, and wmv, we include 12 known header and footer patterns in the configuration file "scalpel.conf" to specify the target file formats, as shown in Table 2.

Since this paper aims to evaluate the applicability of software-based string matching algorithms for forensic analysis, we concentrate on the performance of each algorithm during the first phase of Scalpel, in which the locations of the specified headers and footers are identified in the target disk. In order to get more accurate results, we revert to the same snapshot when we evaluate each algorithm, and all evaluation results reported in this paper are averages over 30 repetitions of the experiments. Moreover, to find out each algorithm's performance for different file formats, we separate each file format in the configuration file "scalpel.conf", as shown in Table 2. Table 3 presents the experimental results of the search time and the number of files carved for different file formats for the ten string matching algorithms on the image "11-carve-fat.dd".
Table 2: Header and Footer Patterns in the "scalpel.conf" Configuration File
*We distinguish file extensions with different headers and footers by adding numbers to the file extension.

Table 3: Search Time (in secs) and Number of Files Carved for Image "11-carve-fat.dd"
¹The modified Boyer-Moore algorithm that Scalpel utilizes

According to the experimental results in Table 3, some carved files are missed when utilizing the Karp-Rabin algorithm (Karp & Rabin, 1987), the Horspool algorithm (Horspool, 1980), the Quick Search algorithm (Sunday, 1990), the Shift-Or algorithm (R. Baeza-Yates & Gonnet, 1992), the Smith algorithm (Smith, 1991), the Raita algorithm (Raita, 1992), and the Berry-Ravindran algorithm (Berry & Ravindran, 1999). The Karp-Rabin algorithm (Karp & Rabin, 1987), the Shift-Or algorithm (R. Baeza-Yates & Gonnet, 1992), and the Raita algorithm (Raita, 1992) are unable to discover the mov and wav file formats. The Horspool algorithm (Horspool, 1980), the Quick Search algorithm (Sunday, 1990), and the Smith algorithm (Smith, 1991) cannot locate the mov2 file format. In addition to the mov2 file format, the Horspool algorithm (Horspool, 1980) also has problems finding the wav file format. The Berry-Ravindran algorithm (Berry & Ravindran, 1999) is unable to discover the wav file format either, although it is able to locate one mov2 file. It appears that the types missed are those with the "?" character in the header pattern and with no footer pattern, which we regard as an open problem for future work. Since there is no difference between the numbers of files carved by the string matching algorithms for the image "12-carve-ext2.dd" (3 doc1, 3 doc2, 1 gif, 3 jpg1, 1 pdf1, and 2 pdf2 files), Table 4 only shows the experimental results of the search time for different file formats for the ten string matching algorithms on the image "12-carve-ext2.dd".
Figure 1 and Figure 2 present clear comparisons of the search times of the different string matching algorithms for the images "11-carve-fat.dd" and "12-carve-ext2.dd", respectively.

Table 4: Search Time (in secs) for Image "12-carve-ext2.dd"
¹The modified Boyer-Moore algorithm that Scalpel utilizes

Figure 1: Search Time Comparison for Image "11-carve-fat.dd"
Figure 2: Search Time Comparison for Image "12-carve-ext2.dd"

According to Figure 1 and Figure 2, the experimental results show that the Shift-Or algorithm (R. Baeza-Yates & Gonnet, 1992) and the Karp-Rabin algorithm (Karp & Rabin, 1987) have the shortest execution times during the first phase of Scalpel, in which the locations of the specified headers and footers are identified on the target disk. However, both fail to identify the mov and wav file formats, which can be improved in the future.

6. CONCLUSIONS AND FUTURE WORK

In this paper, we summarize the concept, implementation, and main features of ten software-based string matching algorithms, and compare them from the perspective of forensic analysis. Since theoretical analysis can only show how an algorithm is likely to perform, not its actual performance, we conduct experiments that survey the performance of the ten software-based string matching algorithms by utilizing them for file carving, an essential technique for digital forensics investigations and data recovery. Our experimental results show that the Shift-Or algorithm and the Karp-Rabin algorithm have the shortest search times for identifying the locations of the specified headers and footers on the target disk. Although file carving is an essential technique for digital forensics investigations and data recovery, other application areas in forensic analysis are also eager for better string matching algorithms, such as information retrieval and digital forensic text string searches.
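Given that the Shift-Or algorithm came out fastest in these measurements, its bit-parallel core may be worth sketching. The following is an illustrative single-pattern reimplementation of the Baeza-Yates & Gonnet technique, not the code used in the experiments; the sample text is hypothetical.

```python
def shift_or_search(text: bytes, pattern: bytes) -> int:
    """Return the first offset of pattern in text using Shift-Or, or -1.

    State bit j is 0 when pattern[0..j] matches a suffix of the text read
    so far. In C the pattern length must fit in a machine word; Python's
    arbitrary-precision integers sidestep that limit.
    """
    m = len(pattern)
    # Precompute the mask table: mask[c] has bit j cleared iff pattern[j] == c.
    mask = [~0] * 256
    for j, c in enumerate(pattern):
        mask[c] &= ~(1 << j)
    state = ~0
    for i, c in enumerate(text):
        # One shift and one OR per text byte, regardless of pattern content.
        state = (state << 1) | mask[c]
        if (state & (1 << (m - 1))) == 0:
            return i - m + 1
    return -1

print(shift_or_search(b"xxGIF87a-rest", b"GIF87a"))  # 2
```

The per-byte cost is two word operations with no data-dependent branching, which is one plausible reason for its short search times here.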
Moreover, there are several more string matching algorithms for future evaluation, including the AC-BM algorithm (Coit et al., 2001), the Wu-Manber algorithm (Wu & Manber, 1994), the Commentz-Walter algorithm (Commentz-Walter, 1979), and the Aho-Corasick algorithm (Aho & Corasick, 1975). Even though the evaluation method is valid, the evaluation results would be less biased if more test images were utilized. In addition to the execution time, other evaluation criteria, such as storage overhead, CPU usage, and accuracy, can also be considered in future work.

ACKNOWLEDGMENT

Yi-Ching Liao is supported by the COINS Research School of Computer and Information Security.

REFERENCES

[1] AbuHmed, T., Mohaisen, A., & Nyang, D. (2007, November). A survey on deep packet inspection for intrusion detection systems. Magazine of Korea Telecommunication Society, 24, 25–36. arXiv:0803.0037
[2] Aho, A. V., & Corasick, M. J. (1975, June). Efficient string matching: an aid to bibliographic search. Commun. ACM, 18(6), 333–340. doi: 10.1145/360825.360855
[3] Aoe, J.-i. (1994). Computer algorithms: string pattern matching strategies (Vol. 55). Wiley.
[4] Baeza-Yates, R., & Gonnet, G. H. (1992, October). A new approach to text searching. Commun. ACM, 35(10), 74–82. doi: 10.1145/135239.135243
[5] Baeza-Yates, R. A. (1989, April). Algorithms for string searching. SIGIR Forum, 23(3-4), 34–58. doi: 10.1145/74697.74700
[6] Berry, T., & Ravindran, S. (1999). A fast string matching algorithm and experimental results. In Stringology (pp. 16–28).
[7] Boyer, R. S., & Moore, J. S. (1977, October). A fast string searching algorithm. Commun. ACM, 20(10), 762–772. doi: 10.1145/359842.359859
[8] Charras, C., & Lecroq, T. (2004). Handbook of exact string matching algorithms. King's College Publications.
[9] Coit, C., Staniford, S., & McAlerney, J. (2001). Towards faster string matching for intrusion detection or exceeding the speed of snort.
In DARPA Information Survivability Conference & Exposition II, 2001 (DISCEX '01), Proceedings (Vol. 1, pp. 367–373). doi: 10.1109/DISCEX.2001.932231
[10] Commentz-Walter, B. (1979). A string matching algorithm fast on the average. Automata, Languages and Programming, 6th Colloquium, 71, 118–132.
[11] Horspool, R. N. (1980). Practical fast searching in strings. Software: Practice and Experience, 10(6), 501–506. doi: 10.1002/spe.4380100608
[12] Karp, R. M., & Rabin, M. (1987). Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development, 31(2), 249–260. doi: 10.1147/rd.312.0249
[13] Knuth, D. E., Morris, J., & Pratt, V. R. (1977). Fast pattern matching in strings. SIAM Journal on Computing, 6(2), 323–350. doi: 10.1137/0206024
[14] Navarro, G. (2001, March). A guided tour to approximate string matching. ACM Comput. Surv., 33(1), 31–88. doi: 10.1145/375360.375365
[15] Nick Mikus. (2005a, March). Digital forensics tool testing image. Available at http://dftt.sourceforge.net/test11/index.html
[16] Nick Mikus. (2005b, March). Digital forensics tool testing image. Available at http://dftt.sourceforge.net/test12/index.html
[17] Pungila, C. (2012). Improved file-carving through data-parallel pattern matching for data forensics. In 2012 7th IEEE International Symposium on Applied Computational Intelligence and Informatics (SACI) (pp. 197–202). doi: 10.1109/SACI.2012.6250001
[18] Raita, T. (1992). Tuning the Boyer-Moore-Horspool string searching algorithm. Software: Practice and Experience, 22(10), 879–884. doi: 10.1002/spe.4380221006
[19] Richard III, G. G., & Roussev, V. (2005). Scalpel: A frugal, high performance file carver. Refereed Proceedings of the 5th Annual Digital Forensic Research Workshop. Retrieved 2014-12-12, from http://www.dfrws.org/2005/proceedings/richard_scalpel.pdf
[20] Smith, P. D. (1991). Experiments with a very fast substring search algorithm. Software: Practice and Experience, 21(10), 1065–1074.
doi: 10.1002/spe.4380211006
[21] Sunday, D. M. (1990, August). A very fast substring search algorithm. Commun. ACM, 33(8), 132–142. doi: 10.1145/79173.79184
[22] Tuck, N., Sherwood, T., Calder, B., & Varghese, G. (2004). Deterministic memory efficient string matching algorithms for intrusion detection. In INFOCOM 2004, Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies (Vol. 4, pp. 2628–2639). doi: 10.1109/INFCOM.2004.1354682
[23] Wu, S., & Manber, U. (1994). A fast algorithm for multi-pattern searching (Tech. Rep. TR-94-17). University of Arizona. Retrieved 2014-12-12, from http://webglimpse.net/pubs/TR94-17.pdf

A NEW CYBER FORENSIC PHILOSOPHY FOR DIGITAL WATERMARKS IN THE CONTEXT OF COPYRIGHT LAWS

Vinod Polpaya Bhattathiripad, Ph.D., Cyber Forensic Consultant, GJ Software Forensics, Kozhikode - 673004, Kerala, India, [email protected]
Sneha Sudhakaran, Cyber Forensic Consultant, GJ Software Forensics, Kozhikode - 673004, Kerala, India, [email protected]
Roshna Khalid Thalayaniyil, College of Engineering, Kallooppara, Kerala, India, [email protected]

ABSTRACT

The objective of this paper is to propose a new cyber forensic philosophy for watermarks in the context of copyright laws, for the benefit of the forensic community and the judiciary worldwide. The paper first briefly introduces the various types of watermarks, and then situates watermarks in the context of the idea-expression dichotomy and the copyright laws. It then explains the forensic importance of watermarks and proposes a forensic philosophy for them in the context of copyright laws. Finally, the paper stresses the vital need to incorporate watermarks in the forensic tests used to establish software copyright infringement, and urges judiciary systems worldwide to study and legalize the evidential aspects of digital watermarks in the context of copyright laws.
Keywords: Digital Watermarks, Software Copyright, Idea-Expression Dichotomy, Programming Blunders, Copyright Infringement, AFC, POSAR

1. INTRODUCTION

Software can be copyright protected. When an infringement of the copyright is suspected, the copyright owner has every moral and legal right to ensure the exclusivity of their property rights to the software. It is only natural that when such rights have been flagrantly violated, particularly for commercial profit, the injured parties will invariably resort to legal measures, both for the protection of their property and for the restitution of the damages involved. Such an issue can trigger a legal battle. In the process of legally establishing copyright infringement, the watermark contained in the software can play an important role. In order to use a watermark as evidence to establish the criminal activity behind the infringement allegation, both the forensic procedure used as part of the investigation and the judge's decision-making process need to be sensitive to the forensic role of watermarks. Although much has been done on the design, programming, and implementation aspects of watermarks (Cox et al., 2008), there has not been any effort from cyber forensic researchers to explain the forensic locus standi and philosophical rationale of watermarks for the benefit of the entire forensic community and the judiciary across the world. As a result of this deficiency, a cognitive (or expertise) gap can exist between the forensic community and the judiciary, and the goal of this work is to fill this gap. As several different forms of digital watermarks exist, it is the duty of forensic professionals to explain the forensic roles of the various watermarks separately and then generalize these different roles into a single forensic philosophy that can ultimately be used by the judiciary for effective decision making in any software copyright infringement litigation.
Before getting into the forensic philosophy of watermarks, a quick overview of digital watermarks will help readers situate this work properly.

2. OVERVIEW OF WATERMARKS

File watermarking is not uncommon in the digital world. It is a widely used mechanism for protecting the ownership of a digital file, including software. A digital watermark (or simply a watermark) in a digital file (whether a text, image, audio, or video file) is a kind of electronic thumb impression introduced by the owner into the file for easy establishment of his or her creativity (Nagra et al., 2002). Since any digital file has a source code (or a hex dump) as part of it (see Figure 1), file watermarking virtually becomes a process of embedding some kind of information into the source code of the file for the purpose of introducing some degree of personalization (or identity) into it (Cox et al., 2008). When a watermark is embedded into any digital file, the source code of the watermark also gets embedded into the source code of the digital file (see Figures 2 and 3). Watermarks can exist in different forms such as text, image, audio, and video (and also combinations of these forms). The best way to further explain a watermark is to quickly demonstrate the technicalities of an image file, first in its non-watermarked form and then in its watermarked form. There is a general feeling that a watermark is always a single, identifiable, and easily separable entity in a watermarked file, and that a watermarked file always differs from its non-watermarked form by only a few hex values. This is not true. Most watermarks do not remain single, identifiable, and easily separable entities in the watermarked file. Also, the hexdump of a non-watermarked image (for example, see Figure 2) differs substantially from that of the watermarked form of the same image (see Figure 3), and this difference can be easily verified by comparing the corresponding hex values in Figures 2 and 3.
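When both the non-watermarked and watermarked files are available, the extent of such a difference can be quantified with a simple byte-level comparison. The sketch below uses two hypothetical in-memory byte strings as stand-ins for the image pair; it is an illustration, not a tool used in the paper.

```python
def byte_difference_ratio(a: bytes, b: bytes) -> float:
    """Fraction of byte positions that differ between two files.

    The files are compared position by position over their common length,
    and any length difference also counts as differing bytes. A ratio near
    0 would mean the watermark touched only a few bytes; a large ratio
    supports the observation that watermarking rewrites most of the image.
    """
    n = max(len(a), len(b))
    if n == 0:
        return 0.0
    common = min(len(a), len(b))
    diffs = sum(1 for i in range(common) if a[i] != b[i]) + (n - common)
    return diffs / n

# Hypothetical stand-ins for the original and watermarked JPEGs:
original = bytes(range(16))
watermarked = bytes((x + 1) % 256 for x in range(16))
print(byte_difference_ratio(original, watermarked))  # 1.0
```

In practice one would read the two files with `open(path, "rb").read()`; the positional comparison also shows why a watermark rarely survives as a contiguous, separable byte run.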
This large difference arises because the watermarking algorithm not only inserts the hex values of the watermark into the original (non-watermarked) image but also modifies most hex values of the original image. In the same manner, the hexdump of any non-watermarked audio, video, or text file also differs substantially from that of the watermarked form of the file. Just as there are different forms of digital files (image, audio, video, and text, and their combinations), watermarks can also exist in many forms. Further, watermarks can be classified in many different ways based on several factors. An overview of two sample classifications will help readers situate the forensic aspects of watermarks properly. Based on their technique of generation, watermarks are classified into two types: static watermarks, which are embedded as code segments within the source code of a digital file, and dynamic watermarks, which are generated at runtime with the help of code segments embedded within the source code of a digital file (Collberg and Thomborson, 1999). Again, based on the roles played by the different persons involved in the development of the software, watermarks can be classified as authorship marks, fingerprinting marks, validation marks, and licensing marks, which are the unique identities of the author, distributor, publisher, and consumer, respectively, of the software that contains the watermark (Nagra et al., 2002). Every watermark has certain desirable features, such as effectiveness (the correctness and aptness of the intended purpose of the watermark), integrity (the watermark's ability not to interfere with the performance of the source code), fidelity (how accurately the watermark helps to identify the owner of the software), robustness (the watermark's ability to withstand any alteration of the content of the file in which it is embedded), etc.
(Nagra et al., 2002; Cox et al., 2008; Marcella and Menendez, 2008).

3. THE IDEA-EXPRESSION DICHOTOMY AND WATERMARKS

The idea-expression dichotomy (Walker, 1996) provides an excellent theoretical perspective from which to look at and further delineate watermarks embedded as part of the source code of any software. Any software is (or consists of) a collection of code segments, and each code segment is an expression of one or more ideas. This being so, software as a whole can be considered a collection of expressions of one or more ideas.

Figure 1. A JPEG file and its source code in C, generated using the HxD tool. (Only the beginning and the end of the C code are shown here; the hidden portion is indicated by a thick white space.) (Picture courtesy: Kadalundi Mangrove Reserve preserved by Kerala Forests, Kozhikode district, India)

Figure 2. The hexdump (generated using the HxD tool) of the non-watermarked JPEG image shown in Figure 1

Figure 3. The watermarked form of the image shown in Figure 1 and its hexdump (First, the picture was watermarked using the TSR Image Watermark tool, and then the corresponding hexdump was generated using the HxD tool)

From the idea-expression perspective, any watermark embedded as part of any software is a genuine idea that is expressed in a manner that does not adversely affect the syntax (or sometimes even the semantics) of the software. It is a part of the source code of the software that is not a functional requirement of the software. In other words, a watermark in any software is part of the requirements marking and identifying the original ownership of the software, not part of the requirements of the potential users of the software.
The above explanation of watermarks in terms of the idea-expression dichotomy clearly opens the door to linking watermarks directly to copyright infringement of any software, because the idea-expression perspective is the basis of the formulation of the software copyright laws of several countries (Newman, 1999; Hollaar, 2002). The idea-expression basis of the copyright laws of several countries (especially the US copyright laws) says that if there is only one way, or a limited number of (exclusive) ways, of effectively expressing an idea, the idea and its expression tend to "merge" (Walker, 1996), and in such instances the idea and its expression are not protectable through copyright (Hollaar, 2002). Conversely, if the same idea can be realized through more than a limited number of expressions, all such different realizations are protected by copyright laws. Thus, if the idea behind the expressions in a watermark (embedded in any particular copyrighted software) can be expressed in more than a limited number of ways, then the copyright obtained for the software can extend to the watermark contained in it. Thus, watermarks are directly linked to copyright. This link requires further explanation. Even if the copyright of the main software is extendable to the watermark contained in it, the copyright may not be extendable to all the elements of the watermark. This non-extendability is because a watermark can contain several legally unprotectable elements, such as globally common mnemonics, names, and expressions; globally shared notations, codes, or expressions due to the shared nature of technology; and globally common functional-area elements. If all the elements in the watermark are unprotectable, then the copyright obtained for the software will not be extendable to the watermark contained in the software.
Finally, if there is at least one protectable element in the watermark contained in software, the copyright of the software will extend to the watermark contained therein as well. To summarize, watermarks can be perceived in terms of the idea-expression dichotomy, and thus can be directly linked to copyright and can also be an indicator of software copyright infringement.

4. FORENSIC IMPORTANCE OF COPYRIGHTED WATERMARKS

Despite their functionally irrelevant and thus apparently innocuous status in any software, watermarks, when copyrighted (that is, when there is at least one protectable element in a watermark), can be of great value to the cyber forensic expert, and a discussion of this evidence is the prime objective of this article. The forensic importance of a watermark is best approached through the concept of programming blunders (Bhattathiripad, 2012). A programming blunder has been defined as a "variable in a program or a code segment in a program … which is …. unnecessary for the user's functionality". From this definitional point of view, a watermark is technically a programming blunder (or can be explained in terms of one), because a watermark in any software is not part of the functional requirements of the software, or, in other words, is unnecessary for the user's functionality. The locus standi and functionality of watermarks can thus best be situated through their inclusion in the category of blunders. Even so, unlike a typical programming blunder, a watermark is neither unintentional nor accidental. Rather, it is an intentional 'programming blunder', introduced into the software by its developer for a specific purpose. In general, every watermark is an intentionally introduced software element and is technically an intentional programming blunder.
Because watermarks are intentionally introduced code segments, the three etiological factors of programming blunders (see Bhattathiripad, 2012) are not sufficient to explain the etiology of watermarks. All the existing etiological factors of programming blunders assume that blunders can happen only due to the inability or inattention of the programmer (or the quality engineer) to completely remove statements that are not required for the user's functionality. This also means that the existing etiological aspects of programming blunders do not consider the possibility of blunders arising from the software developer's intentional effort to introduce into the software a code segment (like a watermark) that is not required for the user's functionality. In a juxtaposed comparison of two sets of software to establish possible copyright infringement, the existence of a particular watermark in identical contexts in both the complainant's and the defendant's versions can be a more positive indication of illegal copying than other kinds of blunders, since the watermark was deliberately inserted into, not carelessly left over in, the complainant's version. It is highly unlikely that two programmers will design and insert exactly the same watermark in exactly the same position and in exactly the same way, and this elevates the similarity into potential evidence of copyright infringement. Thus, most watermarks can provide direct evidence (or at least probable, corroborative, or supporting evidence) to establish copyright infringement more decisively than other programming blunders. In the absence of other direct evidence of copyright infringement, watermarks can form the only basis of the expert opinion to the judiciary about the possibility of copyright infringement. 5.
WATERMARKS AS EVIDENCE IN COPYRIGHT INFRINGEMENT FORENSIC TESTS

Watermarks have not been given any role or status in the forensic procedure of the Abstraction-Filtration-Comparison (AFC) test, which is the only judiciary-accepted procedure for establishing software copyright infringement in the US (Bhattathiripad, 2014). Watermarks are not even considered during this test, because during the abstraction of the software only the functionally active or relevant parts of the two sets of software are considered for abstraction and used for further investigation (Hollaar, 2002). As a result, the functionally irrelevant parts (items that are irrelevant to the user's functionality, like watermarks) may not be considered for abstraction. In such a case of unfortunate non-consideration, the watermarks will not be available for the final comparison, and this unavailability adversely affects the rigour of the AFC test and can thus affect its reliability. Hence, this paper proposes that, along with the AFC test results, any evidence concerning watermarks should also be identified and gathered separately by the forensic expert before the final findings and inferences are presented to the court. The software forensic research community is encouraged to take up this proposal and find ways to incorporate watermarks into the AFC test. Judiciary systems worldwide also need to be encouraged to study and legalize the evidential aspects of digital watermarks in the context of copyright laws. Some preliminary suggestions are presented below. During the forensic analysis in any software copyright infringement litigation, any watermark (embedded into a software package by the developer and identified and detected by the forensic expert) needs to be considered as a separate program segment.
In other words, during the forensic test in any copyright infringement litigation, the embedded watermark needs to be first separated¹ from the main software and then subjected to the forensic test separately. This is in order to ensure that the watermark has (or does not have) protectable elements. The ultimate goal here is to establish whether the copyright of the main software is (or is not) extendable to the watermark as well. For instance, if the test used is AFC, then the watermarks in both software packages need to be separated first, and then separately abstracted. Subsequently, the unprotectable elements in both watermarks need to be filtered out and removed. Finally, the comparable elements in the remaining "golden nuggets" (Walker, 1996) need to be compared, and the resulting evidence (evidence of infringement of protectable elements) needs to be reported to the court. If the test used is POSAR (Bhattathiripad, 2014), the watermarks need to be separately subjected to its 5-stage forensic test process and the resulting evidence² reported to the court. Although outside the purview of AFC and POSAR, the evidence of copyright infringement of the watermark will form part of the evidence of copyright infringement of the main software as well (because the watermark is a part of the main software) and can sometimes turn out to be valuable evidence to establish copyright infringement of the main software. Before concluding, a note on what a judge expects from a forensic expert will add value to the special attention and consideration given to watermarks. In any software comparison report, what the judge expects from the forensic expert is a set of details that helps the court arrive at a decision on the copyrightable aspects of the elements in both software packages (Newman, 1999). So, what is expected in the case of watermarks is not a mere statement on the extendability of the copyright to the watermarks. Rather, the statement should be substantiated and supported by a set of details on the merger aspects of the ideas and expressions contained in the watermarks. It needs to be stated here that future research on the forensics of watermarks should not ignore the complex aspects that determine the status and role of watermarks in copyright cases.

¹ The task of separating the source code of a watermark from the source code of the main software (or any digitally watermarked file) can be easy if and only if the source code of the watermark can be perfectly identified in the original source code as a single unit of code segments. To put it more clearly, the task of separating watermarks from an image, audio, or video file can be complicated and strenuous for many reasons. Two such potential reasons are (a) the hex values of the watermark get fragmented (as against remaining an identifiable single unit) in the ocean of hex values of the watermarked file, and (b) the watermarking algorithm not only inserts the hex values of the watermark into the original (non-watermarked) file but also modifies a few, if not all, hex values of the original. Even so, this task of separation is not impossible if the algorithm for separation is sensitive to both the insertions and the modifications done by the watermarking algorithm.

² The evidence set here contains the evidence of infringement of protectable elements, along with the evidence of post-piracy modifications and the evidence of infringement of programming blunders that are part of the watermark.

6. CONCLUSION

In the process of legally establishing copyright infringement, the watermark contained in the software can play an important role. As any watermark can be considered technically a programming blunder, the forensic importance and philosophy of programming blunders (explained in the context of the idea-expression dichotomy) can be extended to every watermark as well.
In a juxtaposed comparison of two sets of software to establish possible copyright infringement, the existence of watermarks in identical contexts in both versions can be a more positive indication of illegal copying than other kinds of blunders, since they were deliberately inserted into, not carelessly left over in, the complainant's version. In order to use a watermark as evidence to establish the criminal activity behind the infringement allegation, both the forensic procedure used as part of the investigation and the judge's decision-making process need to be sensitive to the forensic role of watermarks. Hence, the forensic tests used to establish software copyright infringement need to be re-designed to ensure that possible evidence like watermarks is procedurally collected, forensically analyzed, and then properly reported to the court. The forensic report should be substantiated and supported by a set of details on the merger aspects of the ideas and expressions contained in the watermarks. Future research in this area should not leave out the importance of watermarks and their role in establishing software copyright infringement.

REFERENCES

[1] Bhattathiripad, P. V. (2012). Software piracy forensics: A proposal for incorporating dead codes and other programming blunders as important evidence in AFC test. Proceedings of the IEEE 36th International Conference on Computer Software and Applications Workshops, Turkey.
[2] Bhattathiripad, P. V. (2014). Judiciary-friendly forensics of software copyright infringement. IGI Global.
[3] Collberg, C., & Thomborson, C. (1999). Software watermarking: Models and dynamic embeddings. Proceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (pp. 311–324). ACM.
[4] Cox, I. J., Miller, M. L., Bloom, J. A., Fridrich, J., & Kalker, T. (2008). Digital Watermarking and Steganography. Morgan Kaufmann Publishers.
[5] Hollaar, L. A. (2002). Legal Protection of Digital Information. BNA Books.
[6] Nagra, J., Thomborson, C., & Collberg, C. (2002). A functional taxonomy for software watermarking. Australian Computer Science Communications, 24(1), 177–186. Australian Computer Society.
[7] Marcella, A. J. Jr., & Menendez, D. (2008). Cyber Forensics. Auerbach Publications, p. 3.
[8] Newman, J. O. (1999). New lyrics for an old melody: The idea/expression dichotomy in the computer age. Cardozo Arts & Entertainment Law Journal, 17, 691.
[9] Walker, J. (1996). Protectable 'nuggets': Drawing the line between idea and expression in computer program copyright protection. Journal of the Copyright Society of the USA, 44(79).

A REVIEW OF RECENT CASE LAW RELATED TO DIGITAL FORENSICS: THE CURRENT ISSUES

Kelly Anne Cole, Shruti Gupta, Dheeraj Gurugubelli and Marcus K. Rogers
Department of Computer and Information Technology
Purdue University
West Lafayette, IN 47907
[email protected], [email protected], [email protected] and [email protected]

ABSTRACT

Digital forensics is a new field without established models of investigation. This study uses thematic analysis to explore the different issues seen in the prosecution of digital forensic investigations. The study looks at 100 cases from different federal appellate courts to analyze the causes of the appeals. The issues are categorized into one of four categories: 'search and seizure', 'data analysis', 'presentation' and 'legal issues'. The majority of the cases reviewed related to search and seizure activity.

Keywords: Computer Investigation, Case Law, Digital Forensics, Legal Issues, and Courts

1. INTRODUCTION

Digital forensics (DF) is still in its infancy, resulting in rapid growth and formation. Legal concerns surrounding this field must soon be addressed in order for it to function fittingly as a scientific field.
Several dominant legal issues relevant to DF have come to light, including a lack of standards and certifications, analysis and preservation concerns, and admissibility-of-evidence issues (Meyers & Rogers, 2004). For this paper, the issues in appellate court proceedings surrounding the digital forensics field are examined and more fully addressed. But first, what is digital evidence? The DoJ (2008) describes digital evidence as "information and data of value to an investigation that is stored on, received, or transmitted by an electronic device. This evidence is acquired when data or electronic devices are seized and secured for examination" (p. 1). Illegal photos, chats, log files, and emails are examples of potential digital evidence used in the courts. Who relies on digital forensic evidence and research related to cyber crimes? Academia, law enforcement, the military, the private sector, and the legal system all rely on digital forensic evidence and research related to cyber crimes, as they are all using and/or interpreting the same technologies (Palmer, 2002). Differences exist in how each of these disciplines puts digital forensics into practice. Investigators in law enforcement (LE), conducting investigations in search of electronic evidence useful for a prosecution, must follow the exact guidelines set by the court. The primary objective of the private sector is to maintain business continuity in the face of an incident; thus the goal of the digital investigation is recovery from the incident, in real time, and prosecution goals (if any) are secondary. The military acquires digital evidence in the same way that businesses do, except that its objectives are more focused on the protection of highly confidential digital data (Palmer, 2002). All of these groups look to digital forensic research to formulate best practices when using digital technology, and they also look to the courts for protection and retribution against malicious attacks.
Currently the courts are facing rather tough questions from the fairly new digital world. Smith and Kenneally (2008) ask how we can prevent previous case law decisions from overlooking new issues or disregarding more complex ones. For instance, they posed the question, “should an e-mail or log be denied admissibility because it was retrieved from a database that was unsecured and subject to tampering?” Information technology experts are frequently called upon to objectively answer such data integrity questions for the court. Currently the bar for proving reliability and authenticity of digital evidence is not very high (Smith & Kenneally, 2008). Typically, evidence will be admitted if the testifying witness had firsthand knowledge of the evidence, if the evidence is the product of an automated process or system, or if the digital record(s) meet the business records exception to the Hearsay Rule. Thus, data tampering is considered unlikely by the courts (Smith & Kenneally, 2008). As courts become more familiar with digital evidence vulnerabilities, they will start scrutinizing the trustworthiness of evidence from computer systems and investigative methods (Chaikin, 2007). Over time the courts will also better apply constitutional amendments to the digital world. There is still ambiguity about the interpretation of 4th Amendment protections in the digital world (Nance & Ryan, 2011). With regard to the 4th Amendment and digital evidence searches, the plain view exception and the closed container rule have attracted significant attention. When an investigator conducting a search within the scope of a warrant comes across contraband in plain view, the officer is allowed to seize it. The issue with digital evidence is that the scope is sometimes overbroad. With a valid warrant the investigator can search the whole hard drive as if it were a container, and thus all of its contents are in plain view.
Depending on the judge and the evidence submitted, courts may limit the scope of such searches (Trepel, 2007). Stahl et al. (2012) claim that lawyers, computer experts, legislators and judges do not share the knowledge and understanding of computer technologies needed to address the conflicts between forensic technology and law. The following section reviews the related work surrounding legal issues in the computer forensics field, followed by the methods, results, limitations and our conclusion.

2. RELEVANT LITERATURE
Meyers and Rogers (2004) argued that search and seizure methods are the most frequently disputed aspect of digital forensic investigations, and that improper search and seizure methodology (missing steps) used during a digital investigation can result in the inadmissibility of the evidence. The current research investigates whether this has held true over the past 10 years and which steps are missed most often. Shinder (2003) addresses the legal issues in a similar manner as the present paper. She identifies the various issues and discusses the case law that highlights those issues. However, she restricts her discussion mostly to an in-depth analysis of the issues related to search and seizure. This paper looks at all the issues that arise within the dataset of cases. Also, Shinder (2003) looks at milestone cases instead of examining “random” cases like the present study. Meyers and Rogers (2004) presented an overview of the issues faced in the field of computer forensics. They highlighted the lack of standardization as the biggest issue, but also explained the legal hurdles related to search and seizure and expert qualifications. Brungs and Jamieson (2005) conducted research to identify and classify the main legal issues associated with digital forensics. Conducting the research in Australia, they recruited eleven experts to discuss and identify the legal issues related to computer forensics.
They then ranked the issues and provided a classification scheme for the various legal issues. Wegman (2006) discusses the various issues related to the admissibility of evidence in a court of law. He outlines the main laws related to computer forensic investigation and highlights the difficulties in applying the usual criminal laws to digital investigations. He provides more of an overview of the legal aspects. Liles et al. (2009) extended the research of Brungs and Jamieson (2005) by conducting a similar survey in the United States. They increased the survey size to sixty-nine respondents and performed a comparative analysis of the results with those of Brungs and Jamieson. Greiman and Chitkushev (2010) deal with the legal aspects of computer forensics from an academic perspective. They delve into the ramifications of understanding the legal framework for digital investigations, and attempt to design an academic curriculum that effectively addresses legal concepts such as cyber-law and jurisdiction issues.

3. METHODS
The appellate cases were randomly selected from the FindLaw database using the keywords ‘computers’, ‘computer’, ‘online’, ‘digital’, ‘computer crime’, ‘digital evidence’, and ‘computer investigations’. The researchers examined 100 appellate court cases from all districts related to digital forensic investigations within the past 10 years, in search of the most prominent issues arising during digital investigations (see Appendix A for a list of reviewed cases). The thematic analysis method was used (Braun & Clarke, 2006). Thematic analysis involves searching across a dataset to find repeated patterns of meaning (Braun & Clarke, 2006). The researchers took an inductive data analysis approach. An inductive approach means the themes identified are strongly linked to the data themselves and are not fit to a pre-existing coding frame (Patton, 1990).
The researchers read and re-read the cases many times, and used open coding of the data until major themes emerged. Eighty-seven cases fell into four themes. The four themes, presented next, offer valuable insight into the issues taking place in courts surrounding digital technologies (see Figure 1).

4. RESULTS
Overall, 24 of the cases were reversed and the rest were upheld in favor of the prosecutor. Four major themes emerged from the data:

4.1 Search and Seizure
Among the 100 cases that the researchers examined, 41 of the appeals dealt with issues during the collection phase of the digital forensic process. The issue most dealt with by the court was exceeding the scope of the warrant (15), followed by the defendants’ claim to an expectation of privacy, which includes warrantless searches (9), the claim that standards for probable cause were not met (7), the claim that consent to search was not given (5) and, lastly, staleness or an invalid warrant (5). Our findings are consistent with the research of Meyers and Rogers (2004), who suggested that search and seizure methods would be disputed most often in digital forensic investigations. Improper search and seizure methodology (missing steps) used during a digital investigation can result in the inadmissibility of the evidence (Meyers & Rogers, 2004).

4.2 Data Analysis
Among the 100 cases that the researchers examined, 10 fell into the data analysis theme. The issues dealt with most were errors in a program’s output or a program not working correctly (4), unreliability of time stamps and MAC times (3), and a computer being wiped or contaminated during examination (3).

4.3 Presentation Issues
Among the 100 cases that the researchers examined, 5 of the appeals fell into the presentation and expert witness theme.
The issue most dealt with by the courts was the failure to preserve text messages or images for presentation (3), followed by whether or not an expert witness must fully understand the source code of a tool or how it works (2).

4.4 Legal Issues
Among the 100 cases that the researchers examined, 31 fell into the legal theme. A popular issue dealt with by the court was whether or not an image of an abused child was real, virtual, or computer generated (6), followed by the defendant’s refusal to decrypt passwords or files (2), unauthorized access or whether one had access or not to specific files (6), sentencing issues, which include double counting and sentence enhancement issues (13) and, lastly, knowing possession (4). The four major themes that emerged reveal the major issues being brought up by the courts.

Figure 1. Theme frequencies: number of cases per issue, and how many of those were reversed on appeal

Search and Seizure                                                Cases  Reversed
Exceeding the scope of the warrant                                   15      4
Expectations of privacy / warrantless search                          9      2
Standards for probable cause were not met                             7      2
Consent to search was not given                                       5      1
Staleness or invalid warrant                                          5      1
Total                                                                41     10

Data Analysis                                                     Cases  Reversed
Errors in a program’s output or a program not working correctly       4      0
Unreliability of time stamps and MAC times                            3      1
Computer was wiped or contaminated during examination                 3      1
Total                                                                10      2

Presentation and Expert Witness                                   Cases  Reversed
Failure to preserve text messages or images for presentation          3      1
Must an expert witness fully understand the source code of a
tool or how it works                                                  2      0
Total                                                                 5      1

Legal Issues                                                      Cases  Reversed
Whether or not an image of an abused child was real, virtual,
or computer generated                                                 6      1
The defendant’s refusal to decrypt passwords or files,
or pleading the 5th                                                   2      1
Unauthorized access or whether one had access or not to
specific files                                                        6      2
Sentencing issues, which includes double counting                    13      6
Knowing possession                                                    4      1
Total                                                                31     11

Overall Total                                                        87     24

5. CONCLUSION
This study consisted of a small sample size.
While it would be difficult to make generalizations about the nature of prosecution issues in digital forensics investigations, the study gives a good glimpse into a subset of the problems experienced. One major benefit of knowing the issues being raised in the courts around digital evidence is improved awareness for law enforcement. Now that the specific search and seizure issues are known, police officers can be better educated in that area of computer investigation. The study showed that 24 of the cases had their decisions reversed in the appellate court. This is a concern for the digital forensics community. The study also reaffirms that search and seizure procedures need to be carefully adapted to work within the digital realm. The largest issue seen was ambiguity in the scope of the warrant. There were also cases where law enforcement officers did not stop the search and apply for another warrant when they encountered new information. Another warrant-related issue was that the warrant was not specific enough. In most of these cases the court found that officers had acted in good faith, but this could change as courts become stricter regarding the scope of the warrant. In general, law enforcement officers need specific training in search and seizure procedures for digital evidence. Another issue observed related to defendant claims that a tool was not functioning properly. The reliability of tools is often discussed as an area of concern, with most of the tools used not subject to scientific testing. The authenticity of digital images was also questioned in court. With child pornography being a major cyber crime to contend with, ways to prove the “realness” of an image will be important. This study is limited to 100 cases within the last 10 years.
The cases were randomly selected from the FindLaw database using the keywords ‘computers’, ‘computer’, ‘online’, ‘digital’, ‘computer crime’, ‘digital evidence’, and ‘computer investigations’. The researchers could not get access to police reports; therefore, some of the issues may not have been raised in the appellate court briefs. As mentioned earlier, the study employed a small sample size, which makes it difficult to generalize the results. However, the trend seen among the 100 cases is consistent with the discussion in the digital forensic community about the nature of the issues seen. With attention drawn to these issues, it might be possible to speed up the prosecution of cases and lower the rate at which cases are appealed. Future work in this area should target a much bigger sample size and perform a more detailed analysis of the issues seen.

REFERENCES
[1] Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3, 77-101.
[2] Beebe, N. (2009). Digital forensic research: The good, the bad and the unaddressed. In G. Peterson & S. Shenoi (Eds.), Advances in digital forensics (pp. 17-36). Boston: IFIP International Federation for Information Processing.
[3] Brungs, A., & Jamieson, R. (2005). Identification of legal issues for computer forensics. Information Systems Management, 22(2), 57-67.
[4] Chaikin, D. (2007). Network investigations of cyber attacks: The limits of digital evidence. Crime, Law and Social Change, 46, 239-256. doi:10.1007/s10611-007-9058-4
[5] Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993).
[6] Kumho Tire Co. v. Carmichael, 119 S. Ct. 37, 142 L.Ed.2d 29, 69 USLW 3228 (1998).
[7] DoJ (2001). Electronic crime scene investigation: A guide for first responders. U.S. Department of Justice, 1-74. Retrieved from https://www.ncjrs.gov/txtfiles1/nij/187736.txt
[8] Greiman, V., & Chitkushev, L. (2010). Legal frameworks to confront cybercrime: A global academic perspective. 5th International Conference on Information Warfare and Security.
[9] Wegman, J. (2005). Computer forensics: Admissibility of evidence in criminal cases. Journal of Legal, Ethical and Regulatory Issues, 8(1).
[10] Liles, S., Rogers, M., & Hoebich, M. (2009). A survey of the legal issues facing digital forensic experts. Advances in Digital Forensics V, 267-276.
[11] Maxwell, J. (2005). Qualitative research design: An interactive approach (2nd ed.). Sage Publications, 41.
[12] Meyers, M., & Rogers, M. (2004). Computer forensics: The need for standardization and certification. International Journal of Digital Evidence, 3(2).
[13] Nance, K., & Ryan, D. J. (2011, January). Legal aspects of digital forensics: A research agenda. In 2011 44th Hawaii International Conference on System Sciences (HICSS) (pp. 1-6). IEEE.
[14] Palmer, G. (2002). Forensic analysis in the digital world. International Journal of Digital Evidence, 1, 1-6.
[15] Shinder, D. L. (2003, December). Understanding legal issues. Law & Order, 51(12), 38-42.
[16] Smith, C. F., & Kenneally, E. E. (2008). Electronic evidence and digital forensics testimony in court. In J. Barbara (Ed.), Handbook of Digital and Multimedia Forensic Evidence (pp. 103-132). Totowa, NJ: Humana Press.
[17] Stahl, B., Carroll-Mayer, M., Elizondo, D., Wakunma, K., & Zheng, Y. (2012). Intelligence techniques in computer security and forensics: At the boundaries of ethics and law. Computational Intelligence for Privacy and Security, 394, 237-258.
[18] Trepel, S. (2007). Digital searches, general warrants, and the case for the courts. Yale J.L. & Tech., 120, 1-45.

APPENDIX A: REVIEWED CASES
1. U.S. v. Habershaw, Criminal No. 01-10195PBS, 2002 U.S. Dist. LEXIS 8977.
2. Williford v. Texas, 127 S.W.3d 309 (Tex. App. 2004).
3. Taylor v. Texas, 02-11-00092-CR.
4. Ohio v. Brian Cook, 149 Ohio App. 3d 422; 429 (2002).
5. U.S. v. Marinko, No. 09-30430 (9th Cir. 10-22-2010).
6. Ohio v. Anderson (2004), Case No.
03CA3 302-0415.
7. Four Seasons Hotels and Resorts v. Consorcio Barr, 320 F.3d 1205 (11th Cir. 2003).
8. Melendez-Diaz v. Massachusetts, 07-591 (Supreme Court June 25, 2009).
9. U.S. v. Rosa, 09-0636-cr (2nd Cir. 10-27-2010).
10. U.S. v. Dye, No. 09-3410 (3rd Cir. 10-22-2010).
11. U.S. v. Merz, No. 09-3692 (3rd Cir. 10-12-2010).
12. U.S. v. Dennington, No. 10-1357 (3rd Cir. 10-7-2010).
13. U.S. v. Jean-Claude, 09-5138 (10th Cir. 10-29-2010).
14. U.S. v. Suarez, Criminal Action No. 09-932 (JLL) (N.J. 10-21-2010).
15. U.S. v. Christie, No. 09-2908 (3rd Cir. 9-15-2010).
16. In Matter of the Application of U.S., No. 08-4227 (3rd Cir. 9-7-2010).
17. U.S. v. Comprehensive Drug Testing, Nos. 05-10067, 05-15006, 05-55354 (9th Cir. 9-13-2010).
18. U.S. v. Williams, No. 10-10426, Non-Argument Calendar (11th Cir. 9-22-2010).
19. U.S. v. Norman, Case No. 2:09-CR-118-WKW [WO] (M.D. Ala. 9-24-2010) (expectation of privacy for P2P).
20. Maggette v. BL Development Corp., No. 2:07CV181-M-A (lead case), No. 2:07CV182-M-A (N.D. Miss. 9-2-2010).
21. United States v. Highbarger, No. 09-1483 (3rd Cir.), 380 Fed. Appx. 127; 2010 U.S. App. LEXIS 9963, May 6, 2010.
22. United States v. Giberson, 527 F.3d 882, 889-90 (9th Cir. 2008).
23. United States v. Hill, 459 F.3d 966, 977-78 (9th Cir. 2006).
24. United States v. Carey, 172 F.3d 1268 (10th Cir. 1999).
25. United States v. Grant, 490 F.3d 627, 633-34 (8th Cir. 2007).
26. US v. Stanley, No. 10-50206 (9th Cir. 08/02/11).
27. People v. Stipo, B218512 (05/16/11).
28. US v. Nosal, 10-10038 (9th Cir. 04/28/2011).
29. US v. Rodriguez, 09-15265 (11th Cir. 12/27/2010).
30. US v. Koch, 10-1789 (8th Cir. 11/16/2010).
31. US v. Allen, 09-50283 (5th Cir. 11/05/2010).
32. US v. Payton, 07-10567 (9th Cir., decided 07/21/2009).
33. US v. Christie, No. 09-2908 (3d Cir. Sept. 15, 2010).
34. US v. Ellyson, 326 F.3d 522 (4th Cir. 2003).
35. State v. Grenning, 174 P.3d 706, 142 Wash. App. 518 (Ct. App. 2008).
36.
United States v. Doe, 556 F.2d 391 (6th Cir. 1977).
37. US v. Evers, 669 F.3d 645 (6th Cir. 2012).
38. US v. Freerksen, No. 11-6044 (10th Cir. Jan. 24, 2012).
39. US v. Vadnais, 667 F.3d 1206 (11th Cir. 2012).
40. US v. Moreland, 665 F.3d 137 (5th Cir. 2011).
41. United States v. Lynn, No. 09-10242.
42. United States v. Hardy, No. 10-4104 (3rd Cir.), 2011 U.S.
43. United States v. Broussard, No. 11-30274 (5th Cir.), 2012 U.S. App. LEXIS 1876 (filed February 1, 2012).
44. United States v. Evers, Sr., No. 08-5774 (6th Cir.), 12a0042p.06; 2012 U.S. App. LEXIS 2641; 2012 FED App. 0042P.
45. United States v. Berk, No. 09-2472 (July 27, 2011).
46. United States v. Perazza-Mercado, No. 07-1511 (January 21, 2009).
47. United States v. LaFortune, No. 06-1699 (March 18, 2008).
48. United States v. Rodriguez-Pacheco, No. 05-1815 (February 5, 2007).
49. US v. Goldsmith, 432 F. Supp. 2d 161 (D. Mass. 2006).
50. United States v. Councilman, No. 03-1383 (August 11, 2005).
51. People v. Nazary, 191 Cal.App.4th 727 (2011).
52. People v. Hawkins, 98 Cal.App.4th 1428, 121 Cal.Rptr.2d 627 (2002).
53. United States v. Nosal (argued Oct. 27, 2009; decided April 30, 2010).
54. United States v. Diaz-Lopez.
55. United States v. Hilton, No. 03-1741 (April 2, 2004).
56. United States v. Robinson, No. 03-1403 (March 2, 2004; decided and filed February 10, 2012).
57. US v. Richardson, 09-4072 (June 11, 2010).
58. United States v. King, No. 09-1861 (argued July 28, 2011).
59. United States v. Tenuto, No. 09-2075 (argued Nov. 12, 2009; decided February 3, 2010).
60. United States v. Kain, No. 08-3396.
61. United States v. Lay, No. 07-4062 (argued Jan. 16, 2009; decided October 13, 2009).
62. LVRC Holdings LLC v. Brekka et al., No. 07-17116 (argued and submitted March 13, 2009; decided September 15, 2009).
63. United States v. Stults, No. 08-3183 (August 14, 2009).
64. United States v. Romm, No. 04-10648 (argued and submitted Dec. 5, 2005; decided July 24, 2006).
65. United States v. Otero, Criminal No. 1:CR-96-005-03 (M.D. Pa. Oct. 31, 2005).
66. United States v. Albertson, No. 09-1049 (argued Sept. 23, 2010; decided May 4, 2011).
67. Snyder v. Blue Mountain School District, No. 08-4138 (argued June 2, 2009; decided June 13, 2011).
68. US v.
Cioni, 649 F.3d 276 (4th Cir. 2011).
69. US v. Mann, 592 F.3d 779 (7th Cir. 2010).
70. US v. Voelker, 489 F.3d 139 (3d Cir. 2007).
71. US v. Trotter, 478 F.3d 918 (8th Cir. 2007).
72. US v. Schaffer, 586 F.3d 414 (6th Cir. 2009).
73. US v. Mutschelknaus, 592 F.3d 826 (8th Cir. 2010).
74. US v. Tenuto, 593 F.3d 695 (7th Cir. 2010).
75. US v. John, 597 F.3d 263 (5th Cir. 2010).
76. US v. Lewis, 594 F.3d 1270 (10th Cir. 2010).
77. US v. Batti, 631 F.3d 371 (6th Cir. 2011).
78. US v. Quinzon, 643 F.3d 1266 (9th Cir. 2011).
79. US v. Felix, 561 F.3d 1036 (9th Cir. 2009).
80. US v. Luken, 560 F.3d 741 (8th Cir. 2009).
81. US v. Mitchell, 365 F.3d 215 (3d Cir. 2004).
82. US v. Lewis, 594 F.3d 1270 (10th Cir. 2010).
83. US v. Patterson, 576 F.3d 431 (7th Cir. 2009).
84. US v. Nichols, 512 F.3d 789 (6th Cir. 2008).
85. US v. Payton, 573 F.3d 859 (9th Cir. 2009).
86. US v. Griesbach, 540 F.3d 654 (7th Cir. 2008).
87. US v. Giberson, 527 F.3d 882 (9th Cir. 2008).
88. US v. Hansel, 524 F.3d 841 (8th Cir. 2008).
89. US v. Griffin, 150 F.3d 778 (7th Cir. 1998).
90. People v. Hertzig, C053674.
91. US v. McCoy, 323 F.3d 1114 (9th Cir. 2003).
92. United States v. Ray Andrus, No. 06-3094 (10th Cir. August 24, 2007).
93. Coburn v. PN II, Inc., 2:07-cv-00662-KJD-LRL (Nev. 9-30-2010).
94. USA v. Slanina, No. 00-20926 (5th Cir. Feb. 21, 2002).
95. Paroline v. US, 134 S. Ct. 1710, 572 U.S., 188 L. Ed. 2d 714 (2014).
96. United States v. David Daniel Anderson, No. 01-1368 (7th Cir.; argued November 27, 2001; decided February 12, 2002).
97. United States v. Jory Michael Nance, No. 13-6188 (10th Cir.; decided September 23, 2014).
98. United States v. Gustave Wilhelm Brune, No. 12-3322 (10th Cir.; decided September 19, 2014).
99.
United States v. John Edward Mullikin, No. 13-1290 (10th Cir.; decided July 15, 2014).
100. United States v. Lawrence L. Lucero, No. 13-2084 (10th Cir.; decided May 2, 2014).

ON THE NETWORK PERFORMANCE OF DIGITAL EVIDENCE ACQUISITION OF SMALL SCALE DEVICES OVER PUBLIC NETWORKS

Irvin Homem, [email protected]
Spyridon Dosis, [email protected]
Department of Computer and Systems Sciences, Stockholm University, Postbox 7003, 164 07 Kista, Sweden

ABSTRACT
While cybercrime proliferates, becoming more complex and surreptitious on the Internet, the tools and techniques used in performing digital investigations still largely lag behind, effectively slowing down law enforcement agencies at large. Real-time remote acquisition of digital evidence over the Internet is still an elusive ideal in the combat against cybercrime. In this paper we briefly describe the architecture of a comprehensive proactive digital investigation system termed the Live Evidence Information Aggregator (LEIA). This system aims at collecting digital evidence from potentially any device in real time over the Internet. Particular focus is placed on the efficiency of the network communication in the evidence acquisition phase, in order to retrieve potentially evidentiary information remotely and with immediacy. Through a proof-of-concept implementation, we demonstrate the live, remote evidence capturing capabilities of such a system on small scale devices, highlighting the necessity for better throughput, envisioned through the use of Peer-to-Peer overlays.

Keywords: Digital Forensics, Digital Evidence, Remote acquisition, Proactive forensics, Mobile devices, P2P, Network performance

1.
INTRODUCTION
Malevolent activities quickly adapt and evolve to the particularities of their environment. They are no longer confined to the physical world: they have readily adapted to the digital realm, taking up their niche markedly on the Internet. Examples are the Zeus banking Trojan (Stone-Gross, 2012) and the Flame malware (sKyWIper Analysis Team, 2012), stealing banking credentials and performing espionage activities respectively. They are no longer rare occurrences with mild consequences. They have permanently set up camp in intricate and surreptitious forms, taking unjust advantage of unsuspecting users going about commonplace activities on the Internet. The Regin malware (Kaspersky Lab, 2014), formally analyzed and documented in 2014 as a cyberespionage tool, is an example of this, thought to have been in the wild since possibly 2003. Today, all activities in the digital realm are at risk of being compromised by malicious actors aiming to perpetrate theft, impersonation or sabotage, or to paralyze others’ activities for personal benefit. The consequences of such malicious activities for the unsuspecting user have also become more detrimental, persistent and far-reaching, in that they are largely untraceable and easily invisible to the untrained eye. Developing novel and innovative methods that enable malicious activities to remain effectively undetected and untraceable is the hallmark of these evildoers. They are almost always one step ahead of their pursuers. Furthermore, it is relatively easy to hide among the deluge of data created by the communication devices that support basic network communication on the Internet. Malevolent activity in the “Digital Realm” can thus easily become rampant and uncontrollable if there are no equally innovative methods to counter the offending actors and their activities.
The rate of innovation and uptake of novel techniques by law enforcement agencies, digital forensics practitioners and incident responders must at the very least be equivalent to that of their criminal counterparts if they are to keep up with the proliferation of crime on the Internet. One of the foremost areas in digital crime investigations where innovative means of combatting crime are highly necessary, but largely lacking, is the evidence capture process. This is the initial stage of an investigation, in which artifacts from the scene of the crime need to be retrieved in their original form or, in the case of digital investigations, as some form of complete copy of the original artifact that can be proven to be devoid of any tampering (National Institute of Standards and Technology, 2004) (Scientific Working Group on Digital Evidence (SWGDE), 2006). This process needs to be performed meticulously, carefully and, in many cases, slowly, in order to ensure that no potentially crucial piece of evidence is left behind. This is the state of affairs in the real, physical world. However, today’s crime scene is rapidly edging away from a physical reality into a more virtual one. The forms of evidence found in these “Digital Crime Scenes” have also moved from the traditional fingerprints, footprints, hair samples, blood samples or other DNA-related evidence into more digital artifacts. Such digital forms of evidence commonly include hard-disk drives, live (RAM) memory, network traffic captures, mobile devices, RAID sets (M. Cohen, Garfinkel, & Schatz, 2009), and virtually any other form of technology that records past events of its actions, that can be captured and analyzed during or after the criminal event, and whose integrity can be verified. This opens the floor to almost any form of computer appliance (physical or virtual) that can be thought of.
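As an illustrative aside, the requirement that a copy of an artifact "can be proven to be devoid of any tampering" is conventionally met by computing a cryptographic hash at acquisition time and re-checking it later. The following minimal Python sketch (not part of the LEIA system described in this paper; file paths and function names are hypothetical) shows the idea:

```python
import hashlib


def acquire_with_hash(source_path, image_path, chunk_size=1 << 20):
    """Copy a source artifact to an image file while computing a SHA-256
    digest over the same bytes, so the copy can later be verified."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)  # hash exactly the bytes written out
            dst.write(chunk)
    return digest.hexdigest()


def verify_image(image_path, expected_hexdigest, chunk_size=1 << 20):
    """Re-hash the acquired image and compare it with the recorded digest."""
    digest = hashlib.sha256()
    with open(image_path, "rb") as img:
        while chunk := img.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest() == expected_hexdigest
```

In practice the recorded digest would be stored in the chain-of-custody record; any later bit-level change to the image makes `verify_image` return `False`.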
Thus arises the heterogeneity problem among devices: simply put, the seeming lack of standardization among vendors of devices that perform related tasks. Different devices may have different physical connectors, operating systems, software applications, storage formats, encoding schemes and communication protocols (CDESF Working Group, 2006). This heterogeneity makes the job of a digital investigator a lot more difficult because of the wide variety of forms in which evidence can manifest itself in the wild. It greatly hampers any manual efforts to collect evidence, even with the assistance of today’s semi-automated tools such as disk imagers. In addition to this, electronic crime cases today often involve more than just a single device. Several computer-like appliances, including tablets, mobile phones, digital cameras, GPS devices, smart TVs and even embedded devices such as onboard vehicle computer systems (from trucks, cars and even ships), could be seized for a single case in order to be subjected to further investigative analysis. If we also bring the vast realm of the Internet into play, such evidence sources could include web application accounts, online email accounts, cloud storage facilities, network traffic captures and logs (Raghavan, Clark, & Mohay, 2009). It is not difficult to imagine that all these evidence forms could easily be part of a single case in today’s world, and even more so in the imminent realm of the Internet of Things. The sheer volume of data that one would have to sift through in order to investigate a single case could be in the order of terabytes, and doing so can be a more than daunting task (Case, Cristina, Marziale, Richard, & Roussev, 2008). Furthermore, in the realm of the Internet, composed of massively interconnected devices sharing vast amounts of highly varying data crossing paths at high velocities, the speed of capture of potentially evidentiary information is of the essence.
The same levels of meticulousness and carefulness found in physical evidence acquisition may have to be sacrificed to some extent for the agility that is needed in reacting to crime in the digital world. This is because potentially evidentiary information that is not captured almost instantaneously is likely to be lost forever in just a matter of seconds. This does not mean that all accuracy and care in the collection of digital evidence artifacts is ignored; rather, it is traded off and reduced in favour of speed. Nevertheless, the maintenance of the chain of custody remains very important in any digital investigation. New methods of achieving standards of preservation for digital evidence similar to those for physical evidence also need to be sought and integrated into legal standards. Finally, at present, investigators grapple with the relative immaturity of the forensic tools available to them. Current industry-standard forensic tools such as EnCase, FTK, XRY, Volatility and Wireshark do not, at the time of writing, cater for the highly divergent nature of digital evidence sources. Most, if not all, tools focus on a single niche area such as filesystem data, live memory, network traffic, mobile devices or log data. None of these tools provides a comprehensive method to interface with all the variety present and provide a uniform investigation platform. In addition, current tools have rather limited capabilities for capturing potentially evidentiary data on demand over networks, as well as for dealing with extremely large datasets. Most of the tools would struggle and quickly become problematic when presented with Internet-scale crime scenes. In this paper, we present the architecture of a scalable, distributed, multi-component incident response and digital investigation platform aimed at dealing with large-scale distributed cybercrime investigations.
We name this system the Live Evidence Information Aggregator, or LEIA in short. The LEIA architecture aims at curbing cybercrime by assisting digital forensics practitioners and law enforcement agencies in improving their digital crime response capabilities. It does so by addressing several of the aforementioned problems, such as the innate and growing complexity of "Internet of Things" types of cases, and the constantly growing amounts of heterogeneous data vis-à-vis the present shortage of physical resources and technical capacity within law enforcement. We also address the need for proactive, on-demand collection of evidence from potential evidence sources over public networks, and further show the need for higher-throughput network transfers such as those seen in peer-to-peer technologies. The rest of this paper is organized as follows: In Section 2, we review related work, outlining the shortcomings of previous similar solutions. Section 3 describes the requirements for a comprehensive distributed digital investigation platform. The functionality of the LEIA system, with particular focus on the networking component, is described in Section 4. The network-focused proof of concept implementation and its results are outlined in Section 5. In Sections 6 and 7, we summarize the work done in this study and propose further work that may be done in this area, respectively. 2. BACKGROUND AND RELATED WORK Several progressive efforts have been made towards improving the efficiency of the digital investigation process. The motivations behind these have stemmed from the changing requirements of national and international legal systems, the evolution of the digital crime scene, the visible backlogs of cases overburdening law enforcement agencies, and advances in technological capabilities.
Some of these efforts include: delegation and collaboration among teams; reduction of evidence sizes through filtering out known files; and simple automation of important but mundane, repetitive tasks (such as indexing data for subsequent searches, file carving, and parsing running processes in memory or TCP flows in network captures). Most of these capabilities have been implemented in current industry standard forensic tools; however, investigators and analysts still remain overburdened (van Baar, van Beek, & van Eijk, 2014). This is because of the presently abundant and steadily growing amounts of heterogeneous and disjointed datasets from multiple sources that they are tasked to collect and analyze. Methods to alleviate this problem by fully automating the remote collection and pre-processing of such data are so far lacking in either efficiency or scalability. Several unidirectional solutions, such as (Almulhem & Traore, 2005), have been proposed in a bid to solve this multi-faceted problem; however, they have not been unequivocally successful. In recent times there have been initiatives to centralize evidence storage (Ren & Jin, 2005) but distribute processing among several machines (Roussev & Richard III, 2004). There has also been a push towards having the different parties involved in solving a case work together, even from geographically separate locations (Davis, Manes, & Shenoi, 2005), particularly with respect to technical staff in niche areas (such as filesystem forensics, network forensics, live memory forensics or mobile forensics) and legal experts. Collaboration has been the mainstay of the attempt to get cases solved faster. Reducing the amount of data that needs to be collected is also a means of reducing the time needed to analyze it. This has previously been done through "known file filtering" as well as through scripts crafted to use heuristics (Koopmans & James, 2013).
Network Security Monitoring has also been an avenue for gathering data with the help of Intrusion Detection Systems (IDSs) assisted through data mining (Leu & Yang, 2003). However, this has been the specific mandate of the IDS, centralized or distributed as the case may be, with terminating (end) devices or intermediary devices generally playing very minor roles in the task. As far as is known to the author, not much has been done, through any single initiative, towards expanding the scope of data capture to be the mandate of all possible devices of reasonable capability. Enabling individual devices to natively act as part of the incident response system, towards the aim of collecting potentially evidentiary data, has not been widely studied. Additionally, collaboration at the human processing level has been emphasized, but it has not been introduced among unrelated networked devices. These devices could possibly be harnessed to work together towards aiding in intelligent real-time capturing, filtering and processing, in order to attain and retain that which could be considered possible evidentiary data, antecedent to the event of a crime being detected. It is for these reasons that we delve into this area to explore it further. Notable related studies include (Zonouz, Joshi, & Sanders, 2011), which describes a live network forensics system that provisions varying intrusion detection systems on host machines based on their respective resource costs. It works in a virtualized environment where snapshots are taken periodically and used to revert the system back to the point before an attack began. Each system rollback results in a different IDS being deployed to collect new and possibly better information. This presupposes that the attacker re-enacts their malicious behavior in a similar way to their previous attempts, each time their efforts are thwarted by the system.
Storage of the potential evidentiary information in a forensically sound manner is not particularly dealt with in that study; the aim was to understand attacks better in order to make better decisions on what kind of preventive measures to deploy. (Shields, Frieder, & Maloof, 2011), (Yu et al., 2005), (M. I. Cohen, Bilby, & Caronni, 2011) and (Moser & Cohen, 2013) describe distributed system architectures for proactive collection and summarization of evidence, with centralized data storage and processing. They are, however, particularly directed at closed-domain enterprise systems, where there is some form of control and order instigated by system administrators; participation of computer systems outside the control of the enterprise is not considered. The system proposed in this study is aimed at being universal – applying to the entire Internet. The work done by Redding in (Redding, 2005) is the most closely related study in the area of proactive and collaborative computer forensic analysis among heterogeneous systems. Redding proposes a peer-to-peer framework for network monitoring and forensics through which network security events can be collected and shared among the peers. "Analysis, forensic preservation and reporting of related information can be performed using spare CPU cycles" (Redding, 2005), together with other spare, under-utilized or unused resources. This system, however, seems designed to collect only network security events and not other forms of evidence from individual host devices. Furthermore, it seems to be aimed at an "administratively closed environment" under the control of some systems administrator within an enterprise. An open system that has the Internet as its domain of operation, assisting in the collection of any form of computer-based evidence, is not dealt with in Redding's work; it is this that is sought in the current study, as will be described later in this paper.
In order to facilitate the uniform, seamless exchange of forensic artifacts between heterogeneous entities, some form of standardization of the transmitted evidence formats is necessary. One of the bodies that has made proposals related to this is the Common Digital Evidence Storage Format Working Group (CDESF Working Group, 2006). Other notable efforts include (Schatz & Clark, 2006), which makes use of the Resource Description Framework (RDF) from Semantic Web technologies as a common data representation layer for digital evidence related metadata, using ontologies to describe the vocabulary related to this data, and (Kahvedžić & Kechadi, 2009), where a detailed ontology of Windows Registry artifacts of interest is introduced. The Open Forensic Integration Architecture (FIA) in (Raghavan et al., 2009) and FACE (Case et al., 2008) describe methods for the integration of digital evidence from multiple evidence sources in a bid to facilitate more efficient analysis. The Advanced Forensic Format (Garfinkel, 2006), AFF4 (M. Cohen et al., 2009) and XIRAF (Alink, Bhoedjang, Boncz, & de Vries, 2006) describe annotated evidence storage formats that allow for the addition of arbitrary metadata as well as interoperability among different tools. In AFF4 (M. Cohen et al., 2009), notably, remote evidence capture, some form of availability through manually driven redundancy, and some parallelism in the evidence capture process for RAID data sets are also present. However, it appears that these processes are initiated through human intervention; they are not fully automated through machine triggers, and thus could be slow to react in acquiring evidence. The availability (fail-over) provided through redundancy is based on whether the captured evidence is required in other locations; if it is not required elsewhere, the fail-over mechanism would not work because there would be only one copy of the evidence.
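The common thread in these formats is an evidence container that pairs the raw payload with arbitrary, tool-independent annotations. A minimal, format-agnostic sketch of that idea follows; the field names are our own illustration, not the actual AFF4 or RDF schema:

```python
import hashlib

def make_evidence_record(name: str, data: bytes, annotations: dict) -> dict:
    """Toy annotated evidence container: the payload digest and any
    caller-supplied metadata travel together, so unrelated tools can
    exchange both the evidence and its provenance."""
    return {
        "name": name,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "annotations": annotations,  # arbitrary key/value metadata
    }

record = make_evidence_record(
    "memory.dump", b"\x00" * 16,
    {"acquired-by": "examiner-01", "source-host": "host-A"},
)
```

A real format would add signing, compression and a standardized vocabulary for the annotation keys; the sketch only shows the pairing of digest and metadata.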
It is unclear whether the parallelism (described particularly for acquiring individual disks in a RAID set) could also apply to parallelizing other potential evidence data sources such as RAM memory or NAND storage on mobile devices. The idea proposed in this study draws on several areas of specialization, namely: the Internet of Things (IoT), intrusion detection systems, peer-to-peer networks, virtualization infrastructures, large scale cloud storage and Semantic Web technologies. Most of these technologies have previously been harnessed in different capacities, singly or in small clusters, towards the benefit of digital forensics for today's complex, internetworked and intertwined cyber realm. However, to the author's knowledge, there has so far not been any work that aims to merge all these technologies together in order to provide a single scalable solution to the recurring problems – large amounts of data, several sources of evidence, the inability to collect evidence efficiently over networks, heterogeneity among systems, insufficient processing power, security and privacy – that constantly trouble digital forensic analysts and law enforcement agencies worldwide. 3. CHARACTERISTICS OF THE DESIRED SOLUTION Inspired by the challenges documented by Palmer at the first DFRWS conference (Palmer, 2001), we describe below a wish-list of characteristics that one would like to have in a comprehensive digital forensics and incident response system for a public, open-domain networked environment such as the Internet. They are aimed at complementing and updating Palmer's list in light of the current state of electronic crime and the present state of forensic tools, as described earlier. i. Distribution: The ability to deal with massive amounts of distribution in terms of participants, data storage, processing and dissemination. The system needs to be able to handle the heterogeneity that may come with distributed systems as well.
ii. Scalability: Large scale interconnectivity, as well as the possibility of new entities joining and others leaving the system dynamically and gracefully, without drastic negative effects on the system. The ability to easily improve or extend the capabilities of the system through new modules is also desired. iii. Availability: Providing suitable levels of functionality as and when required. iv. Universality: Amid the heterogeneity and lack of standardization among vendors of different systems, there needs to be some standardization and common understanding between the systems at the level of communication and storage of potential evidentiary information. v. Responsiveness: The system should be able to aptly detect when a security policy has been irrecoverably violated, and thus collect information in order to pursue the perpetrators of the criminal actions. This also improves efficiency and privacy, in that the system does not have to perpetually collect all possible information from all possible systems. vi. Resource Sharing: Today, large complex problems are being solved through collaboration and the sharing of resources, as seen in crowdsourcing, P2P networks and cloud infrastructures. These provide rapid, on-demand availability of large amounts of resources from collective resource pools, yielding speed, efficiency and the benefits of "the wisdom of the crowd". vii. Integrity (Trust, Reliability & Accuracy): As a system facilitating law enforcement in digital crimes, the levels of trust, reliability, accuracy and integrity of the information need to be high enough for it to be accepted as a veritable source of evidentiary information in a court of law. The Daubert standards and the chain of custody need to be adhered to. viii. Privacy & Confidentiality: Personally identifiable and secret information must be kept as anonymous and confidential as is reasonably acceptable, unless incriminated.
Unauthorized access to such information is not to be allowed. ix. Security: In addition to ensuring the security of the potential evidentiary information that it aims to collect and process, the system must also take its own security into consideration – especially in terms of authentication, authorization, accountability and non-repudiation of activities undertaken. 4. LEIA: THE LIVE EVIDENCE INFORMATION AGGREGATOR LEIA is a 4-tiered system architecture that may be described as a combination of hypervisors, intrusion detection systems, peer-to-peer systems and cloud storage. It is made up of the following components: a) The Host-based Hypervisor (HbH) b) The Peer-to-Peer Distribution Architecture (P2P-da) c) The Cloud-based Backend (CBB) d) The Law Enforcement Controller (LEC) The functionality of each of the layers of the LEIA system is briefly described in the following sections. 4.1. The Host-based Hypervisor (HbH) The Host-based Hypervisor (HbH) system is composed of a virtualization layer managed by a hypervisor – a privileged, secure platform managing the guest operating system (OS). The hypervisor contains an inbuilt host-based intrusion detection system, also termed the embedded intrusion detection system (em-IDS). Security utilities within the guest OS, such as anti-malware tools and intrusion detection systems, maintain their own data and logs that are accessible to the HbH. The HbH collects and assimilates the information from its own inbuilt intrusion detection system together with information collected from the other security utilities that may exist within the guest OS. This helps in getting a better perspective of current malicious activity that may be underway. Further to this sharing of information within a single HbH system, individual HbH systems also share information about malicious activity they may have discovered with each other. This communication is facilitated through the Peer-to-Peer Distribution Architecture (P2P-da).
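The HbH's assimilation of em-IDS output with guest-OS security-tool alerts might be sketched as follows; the classes and field names are our own illustration, not a specification of the architecture:

```python
from dataclasses import dataclass, field

@dataclass
class Alert:
    source: str      # e.g. "em-IDS" or "guest-antimalware" (illustrative labels)
    host: str
    signature: str
    severity: int    # higher = more severe

@dataclass
class HostHypervisor:
    """Toy HbH: merges its embedded IDS view with guest-OS tool output."""
    host: str
    alerts: list = field(default_factory=list)

    def ingest(self, new_alerts):
        self.alerts.extend(new_alerts)

    def local_picture(self, min_severity: int = 0):
        """Unified, severity-ordered view of suspected activity on this host."""
        return sorted(
            (a for a in self.alerts if a.severity >= min_severity),
            key=lambda a: -a.severity,
        )

hbh = HostHypervisor(host="host-A")
hbh.ingest([Alert("em-IDS", "host-A", "port-scan", 3)])
hbh.ingest([Alert("guest-antimalware", "host-A", "trojan.gen", 7)])
top = hbh.local_picture(min_severity=5)
```

The merged picture is what an HbH would then gossip to its peers over the P2P-da.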
This collaborative effort among the HbH systems further helps improve the accuracy of the IDSs and, eventually, of forensic data acquisition. In order to reduce the amount of data that may need to be collected for analysis, each HbH maintains a hash list of the local files on its guest operating system (the Local Known Data Hash-List, L-KDHL). This L-KDHL is periodically cross-checked and updated against a Master Known Data Hash-List (M-KDHL) stored at the Cloud-based Backend (CBB). This is managed by the Cloud-based Backend Differencing Engine (CBB-DE) component of the CBB. The aim of this is to quickly filter out known system data or files by matching the files on a HbH against hashes of system files that are known to be benign and have not been modified in any way. A user data profile with its corresponding hash lists is also created. The user-data hash list is likewise maintained in a dual format – a local copy residing on the HbH and a remote master copy maintained at the CBB. Further to this, the actual user data is backed up at the CBB. Thus, the user data hash lists are used to check which files have changed and may need to be backed up to the CBB. Figure 1: The components of the HbH With respect to "Known Malicious Files" – files that have previously been identified as having malicious content – a "Known Malicious File" hash list is maintained only on the CBB. It is not held on individual HbH systems, as it can easily become too large and unmanageable for a single HbH to maintain. The hypervisor is the critical element when it comes to the collection of potential evidentiary data. Having privileged access, the hypervisor is able to directly interact with the file system, network interfaces, memory caches and other low-level resources, which are all primary sources of common evidentiary data in digital investigations.
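The L-KDHL filtering step described above can be sketched as follows; the paths, file contents and hash-list entries are invented for illustration:

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def filter_known(files: dict, known_hashes: set) -> dict:
    """Drop files whose digest appears in the known-benign hash list,
    leaving only candidates that may need backup or deeper analysis."""
    return {
        path: data
        for path, data in files.items()
        if sha256(data) not in known_hashes
    }

# Hypothetical local view of a guest OS.
files = {
    "/bin/ls": b"known system binary",
    "/home/u/notes.txt": b"user data, never seen before",
}
l_kdhl = {sha256(b"known system binary")}  # the local known-data hash list
suspects = filter_known(files, l_kdhl)
# only the unknown user file survives the filter
```

In the architecture, the same comparison also runs remotely against the M-KDHL at the CBB, so only genuinely unknown material is ever shipped.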
The embedded IDS (em-IDS) also collects information, mostly in the form of logs, which are parsed to produce synthesized alerts. When evidentiary data from the local HbH is collected, it is transmitted towards the CBB via neighbouring HbH systems through the action of the P2P-da system (described in the next section). While such data is in transit through a neighbouring HbH system on its way to the CBB, it is always held in encrypted form and only within temporary storage. 4.2. The Peer-to-Peer Distribution Architecture (P2P-da) The essence of the P2P-da is to provide reliability, scalability and rapid throughput of transmitted data, even in the face of high rates of "churn", that is, large numbers of nodes joining and leaving the network. In order to achieve this, a cocktail of P2P protocols is put together so as to extract the best qualities of each and allow each to compensate for the others' shortcomings. The particular P2P protocols put together to build the P2P-da are: gradient overlay protocols (Sacha, Dowling, Cunningham, & Meier, 2006); epidemic protocols (Jelasity, Voulgaris, Guerraoui, Kermarrec, & Steen, 2007); and the BitTorrent protocol (B. Cohen, 2003). There are 3 main functionalities of the P2P-da: I. Maintenance of the P2P overlay II. Dissemination and aggregation of malicious behaviour information and alerts III. Incident response data collection These functionalities generally correspond to the P2P protocols that make up the essence of the P2P-da. The maintenance of the P2P overlay is facilitated mainly through gradient (hierarchical) overlays assisted by epidemic (gossip-based) overlays. The dissemination and aggregation of malicious behaviour information is mainly facilitated by epidemic (gossip-based) overlays. Incident response data collection is mainly facilitated through an adaptation of the BitTorrent protocol.
The details behind these individual functionalities are dealt with in the following sections. 4.2.1. Maintenance of the P2P Overlay The essence of this functionality is for the overall P2P network to maintain connectivity among neighbouring nodes as well as across the larger HbH node population. Further to this, the aim is to link HbH nodes in such a way that they are most beneficial to each other, as well as to the overall aims of security event communication and evidence transmission. In order to do this, a hierarchy is created among the peer nodes such that those less endowed with resources are lower in the hierarchy and those that are better endowed are higher. The aim is to ensure that nodes that lack resources generally communicate security event information, or transmit potentially large evidence files, towards more reliable and stable peers. It is assumed that nodes with more resources are more likely to be equipped to deal with larger amounts of information, and are also more likely to be online and available for communication. A gradient overlay network is suited to ensuring this form of network structure. It is built in such a way that a utility metric is used to determine which nodes are most suitable to connect to, and which nodes to avoid. This utility metric is determined from a combination of factors, including the amount of resources available on a node, the current state of use of the node, and the amount of time it has been online. These utility metrics are shared through random node interactions, typical of "gossip-based" (epidemic) P2P protocols, in order for nodes to learn of other nodes that might be better to link to. As gossip-based P2P protocols are known to eventually converge to a generally stable state, a hierarchy of the HbH systems is thus formed, with the less endowed elements on the outer edges and the more capable elements closer to the centre of the LEIA system (that is, the CBB). 4.2.2.
Dissemination and Aggregation of Malicious Behaviour Information & Alerts This capability is necessary in order to facilitate the collaborative mechanisms needed to ensure that security event information is shared, and that potentially useful evidence information is captured efficiently and transmitted securely. Security event information known by individual HbH peers is duly shared with others in order for the overall system to have a more informed security posture as well as to be forewarned of imminent malicious events. This includes the distribution of malicious activity signatures as well as the discovery of malicious activity originating from certain hosts. When such messages are received, only a set of the most common and recently active malicious activity signatures is maintained at the HbH. These kinds of messages are termed "Management messages" and can be shared with any peers that a particular HbH has address information about and connectivity to. The other major type of message involved in this functionality is termed the "Security Incident Control message". These messages facilitate the reaction to the detection of a malicious event. This mainly includes communicating procedures to initiate the evidence capture process on certain components of certain nodes, as well as initiating initial pre-processing, such as determining the IP addresses of recently connected devices in order to extend the evidence capture process to other suspected devices. There may be other forms of messages that need to traverse the P2P-da; however, the two categories mentioned thus far are the major types. 4.2.3. Incident Response Data Collection This functionality is triggered by the detection of malicious events via the collective knowledge gained through collaborating HbH systems, the em-IDS and guest OS security mechanisms.
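As a rough illustration of how such a trigger could pool these signals, consider the following sketch; the threshold, weights and scoring rule are entirely invented, since the architecture does not prescribe a specific heuristic:

```python
TRIGGER_THRESHOLD = 0.8  # illustrative cut-off, not part of the design

def should_capture(local_alerts, peer_reports, em_ids_score):
    """Decide whether to start incident-response data collection,
    pooling the local em-IDS confidence (0..1) with corroboration
    from guest-OS tools and from collaborating HbH peers."""
    peer_score = min(1.0, 0.2 * len(peer_reports))    # peer corroboration
    local_score = min(1.0, 0.3 * len(local_alerts))   # guest-OS tool hits
    combined = min(1.0, em_ids_score + 0.5 * peer_score + 0.3 * local_score)
    return combined >= TRIGGER_THRESHOLD

# A strong em-IDS detection alone is enough to trigger capture,
# and weak local signals corroborated by several peers also trigger it.
strong_alone = should_capture([], [], em_ids_score=0.9)
corroborated = should_capture(["av-hit", "log-anomaly"],
                              ["peer1", "peer2", "peer3", "peer4"],
                              em_ids_score=0.5)
```

The point of the sketch is only that collective knowledge lowers the bar for any single detector, which is what the gossiped alerts buy the system.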
For more volatile data such as network traffic and live memory, a fixed time period is chosen over which to perform the capture process (or a fixed number of snapshots of the data over a short period of time, particularly for live memory), after which a decision is made as to whether subsequent captures are needed, or whether what has been collected so far suffices. Correspondence with the Cloud-based Backend Differencing Engine (CBB-DE) filters out known system files by facilitating the hash comparisons. Primary analysis of the collected data for IP addresses and hostnames may trigger other HbH systems to capture data as well. The actual data collection procedure involves 3 stages, described in the following sections. The diagram below (Fig. 2) depicts the data collection and transfer process of the P2P-da. Figure 2: The P2P-da data transfer process a) Data Partitioning: Different data formats (memory dumps, logs, files, packet captures, disk images) are compressed and stored temporarily on the HbH system in a modified AFF4 data structure that also contains simple RDF metadata describing the evidence. This data structure is termed the Incident Data Archive (IDA). Each IDA is partitioned into equal-size pieces that will be referred to as shards. A shard is a signed and encrypted partition of the IDA, analogous to the idea of a "piece" in the BitTorrent protocol. A metadata file termed the "reflection" (which corresponds to the BitTorrent metadata file) is also created and sent directly to the CBB. In this way the CBB acts as the "tracker" and "leeches" IDAs from participating HbH systems in the P2P-da, thus benefiting from the high throughput of the BitTorrent protocol. b) Shard Distribution: Multiple copies of each individual shard are distributed to more capable neighbours (supporters), facilitated by the gradient overlay. Each time a shard is passed on, it increases its "heat level".
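The partitioning step (a), together with the hash-verified reassembly used in step (c), can be sketched as follows; the shard size is shrunk to a few bytes purely for illustration, and signing/encryption of shards is omitted:

```python
import hashlib

SHARD_SIZE = 4  # bytes; unrealistically small, for illustration only

def partition(ida: bytes):
    """Split an Incident Data Archive into equal-size shards and build
    the 'reflection' metadata (shard digests) sent ahead to the CBB."""
    shards = [ida[i:i + SHARD_SIZE] for i in range(0, len(ida), SHARD_SIZE)]
    reflection = {
        "length": len(ida),
        "shard_size": SHARD_SIZE,
        "hashes": [hashlib.sha256(s).hexdigest() for s in shards],
    }
    return shards, reflection

def reconstruct(shards, reflection) -> bytes:
    """Reassemble at the CBB, verifying each shard against the reflection;
    in the full design a corrupt shard is simply re-fetched from another
    supporter rather than failing outright."""
    for shard, expected in zip(shards, reflection["hashes"]):
        if hashlib.sha256(shard).hexdigest() != expected:
            raise ValueError("corrupt shard")
    return b"".join(shards)

shards, reflection = partition(b"memory dump bytes...")
roundtrip = reconstruct(shards, reflection)
```

Because the reflection travels ahead of the shards, the CBB can verify and reassemble pieces in any order, exactly as a BitTorrent client does with its metainfo file.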
After a certain "heat" threshold (which we refer to as the "melting point"), a HbH system is obliged to upload directly to the CBB (more specifically, to the HbH Master Peers of the CBB); otherwise an election procedure is initiated to determine which previously supporting HbH should be delegated the uploading task. In order to avoid an individual node being the only "proxy", and thus a potential single point of failure, individual HbH systems are only allowed to partake in uploading a certain number of IDA shards, governed by the "dependency value". This improves the overall reliability of the larger system by reducing the possibility of a single point of failure in the transmission process. c) Rapid Fragment Reconstruction: For a particular shard, downloads are initiated from all of its respective supporter locations. This is done for redundancy and bandwidth maximization purposes. As in a BitTorrent protocol download, priority is given to the shards that are the least commonly available, that is, those with the fewest recorded supporters. In order to reconstitute the IDA, the individual hashes of shards are verified, as they are received, against those in the reflection. Several supporters upload at the same time; thus, if a shard is in error, the copy from another supporter is taken. Once successfully transferred, shards are deleted from the supporting HbH systems. 4.3. The Cloud-based Backend (CBB) The CBB system is a highly available, scalable, responsive, centralized back-end storage service capable of storing large amounts of data in a homogeneous form. It is subdivided into 3 major components: the Storage System (SS), the Differencing Engine (DE) and the HbH Master Peers. The Storage System (SS) is built upon the Hadoop HDFS architecture (Shvachko, Kuang, Radia, & Chansler, 2010), which provides not only the raw storage capabilities but also scalability, availability, reliability and responsiveness.
The Differencing Engine (DE) filters out known files before they are stored on the CBB. This is provisioned through the MapReduce (Dean & Ghemawat, 2008) capabilities supported by Hadoop. The DE also provides a query-response mechanism to the HbH systems with information on known benign data as part of the Master Known Data Hash-List (M-KDHL). The M-KDHL contains data about known files, memory processes, protocol flows and log entries, and thus enables their removal from IDAs being prepared. This reduces the size of IDAs before they are stored on the Storage System (SS) of the CBB. The HbH Master Peers are a particular set of well-endowed peers that are directly connected to the core CBB system (that is, the SS and DE), providing an interface to the rest of the LEIA system through the P2P-da. They have no other core functionalities unrelated to their LEIA responsibilities and are essentially the backbone of the P2P-da and, ultimately, the provider of connectivity from the LEIA system outwards to the other HbH systems. The HbH Master Peers also serve as the central point from which system software updates and malicious event detection heuristics originate and are disseminated outwards to the HbH systems in the wild. Figure 3: The Cloud-based Backend components 4.4. The Law Enforcement Controller System The Law Enforcement Controller is the main interface that law enforcement personnel interact with in order to perform their directed analysis for a particular digital investigation case. Through it, a law enforcement agent can initiate specific queries on the data sets stored in the CBB, thus retrieving detailed, structured information as well as new knowledge inferred through correlation of data originating from different sources, which may help in solving a case.
The aim of this is to automate the otherwise manual tasks of correlating data from heterogeneous sources in order to pose valid assertions based on the data, assisting a forensic analyst in the duty of making sense of digital artifacts. This functionality is described in more detail by Dosis in (Dosis, Homem, & Popov, 2013). Additionally, from the new-found knowledge obtained through correlation, patterns of malicious activities are to be learnt and stored. These malicious activity patterns are to be used as feedback to the HbH systems in order to improve the detection capabilities of the inbuilt IDSs and thereby also improve the accuracy of collection of data of potential forensic evidentiary use. 5. PROOF OF CONCEPT EVALUATION AND RESULTS As the first part of testing the motivations behind the designed architecture, we decided to focus on the network transmission component, as it is critical to speedier evidence collection. In order to demonstrate the need for higher-throughput networks such as those exhibited by P2P overlays, an experiment was set up to simulate the conditions of the LEIA system, but without the P2P-da component. That is, the experiment was performed with the transmission of potentially evidentiary information from a HbH system to the CBB over a traditional client-server paradigm. The experiment itself focused on the time taken to perform remote extraction, compression and transmission of increasingly larger disk images over an encrypted channel from small scale devices over the Internet, and the subsequent reconstruction and storage of this data on a Hadoop HDFS cluster. It should be mentioned that, for the sake of simplicity of the experiment, the actual hypervisor of the HbH system was not built; however, closely similar conditions – particularly in terms of the LEIA prototype application having privileged access – were met.
In order to test and measure the performance of the proof of concept application working over the client-server paradigm, four different small scale devices were used. The table below outlines the specifications of the devices being captured.

Table 1: Small scale device specifications

Device | Platform | Processor | Chipset | RAM | Disk
Chumby Classic | Busybox v1.6.1 | 350MHz ARM926EJ-S | Freescale i.MX233 | 64MB | 64MB
HTC Incredible S | Android OS v2.3.3 (Gingerbread) | 1GHz Scorpion | Qualcomm MSM8255 Snapdragon | 768MB | 1.1GB
HTC MyTouch 4G Slide | CyanogenMod 10.2 Alpha | Dual-core 1.2GHz Scorpion | Qualcomm Snapdragon S3 MSM8260 | 768MB | 4GB
Samsung Galaxy Tab 2 (WiFi Only) | Android OS v4.0.3 (Ice Cream Sandwich) | Dual-core 1GHz | TI OMAP 4430 | 1GB | 8GB

Figure 4: The experimental set up

In order to perform the testing and the performance evaluation, partitions of the various devices were filled to specific size limits with random files, including images, PDFs, music files and compressed archives (RARs), in order to simulate normal disk usage. These devices were subsequently captured over the network. The capture process was repeated 10 times for each individual partition size in order to obtain the average file transfer time for each size. The sizes measured were taken at 9 intervals of gradually increasing size. The maximum size of 4GB was taken as the largest because the average capture (file transfer) times were beginning to take rather long periods (50-80 minutes) per test acquisition round. Furthermore, the maximum disk size on any of the devices available for testing was 8GB (with the rest being 4GB, 1.1GB and 64MB). A 4GB mini-SD card was also available and was used to supplement the HTC Incredible S in order to simulate a larger disk size. The Chumby Classic had only 64MB of flash (NAND) memory and no expansion capabilities; it was therefore not included in the testing of remote data transfer performance, as there was no way to increase its storage capacity.
It was, however, used in testing to show that remote capture of such a small scale device running on a Linux-based platform was possible. It also served as the main prototyping device because its small storage capacity enabled quick disk acquisitions while testing the software under development. The repetition and averaging were done to compensate for the effects of random processes that could have affected network transmission times, such as network traffic from other users of the networks being used, incoming phone calls interfering with the I/O processes of the devices, or applications being updated on the devices. The tables below show the partition sizes used and the average times (in milliseconds) taken to perform the transfer:

Table 2: Results from Test Cases on "HTC Incredible S"

| Partition Amount Used | # of Test Runs | Avg. File Transfer Time (ms) |
|---|---|---|
| 16MB | 10 | 13664 |
| 133MB | 10 | 84600.8 |
| 250MB | 10 | 392323.9 |
| 507MB | 10 | 553933.1 |
| 1000MB | 10 | 978571.8 |
| 1500MB | 10 | 1360375 |
| 2000MB | 10 | 2932376.8 |
| 3000MB | 10 | 3877676.8 |
| 4000MB | 10 | 4814006.6 |

Table 3: Results from Test Cases on "HTC MyTouch 4G Slide"

| Partition Amount Used | # of Test Runs | Avg. File Transfer Time (ms) |
|---|---|---|
| 21.4MB | 10 | 8583 |
| 87.0MB | 10 | 31467 |
| 255MB | 10 | 230709 |
| 500MB | 10 | 338180 |
| 1000MB | 10 | 1174482 |
| 1550MB | 10 | 1323845.90 |
| 2000MB | 10 | 1673928 |
| 3000MB | 10 | 2052952.40 |
| 4000MB | 10 | 3015056.60 |

Table 4: Results from Test Cases on "Samsung Galaxy Tab 2"

| Partition Amount Used | # of Test Runs | Avg. File Transfer Time (ms) |
|---|---|---|
| 4MB | 10 | 1235 |
| 11MB | 10 | 67608 |
| 250MB | 10 | 286947 |
| 500MB | 10 | 426783 |
| 1000MB | 10 | 960952 |
| 1500MB | 10 | 1488236 |
| 2000MB | 10 | 2829355 |
| 3000MB | 10 | 2951551 |
| 4000MB | 10 | 3707556 |

The data from these three devices is plotted on a graph in order to visualize the general trend of file transfer time against partition size for the client-server network paradigm of remote evidence acquisition.
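The trend in these measurements can also be checked numerically. The sketch below is our illustration (not part of the original experiment): it fits a log-log regression to the Table 2 data to estimate how transfer time grows with partition size, where an exponent of 1.0 would mean linear scaling and larger values indicate superlinear growth.

```python
import math

# Average transfer times for the "HTC Incredible S" (Table 2).
sizes_mb = [16, 133, 250, 507, 1000, 1500, 2000, 3000, 4000]
times_ms = [13664, 84600.8, 392323.9, 553933.1, 978571.8,
            1360375, 2932376.8, 3877676.8, 4814006.6]

# Least-squares slope of log(time) against log(size): the growth exponent.
xs = [math.log(s) for s in sizes_mb]
ys = [math.log(t) for t in times_ms]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))

print(f"growth exponent ~ {slope:.2f}")  # values above 1 indicate superlinear scaling
```

On this data set the exponent comes out slightly above 1, consistent with transfer time growing faster than partition size at the larger sizes.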
The graph that was attained is shown below:

Figure 5: Performance of the LEIA Proof of Concept with the Client-Server paradigm (file transfer time in seconds plotted against partition size used in MB, for the HTC MyTouch 4G Slide, HTC Incredible S and Samsung Galaxy Tab 2)

From the figure, the curves start off with what appears to be a linear relationship which soon turns into something closer to an exponential one. The "HTC MyTouch 4G Slide" portrays this characteristic most clearly, with the other devices also exhibiting it, though not as vividly. Overall, the relationship between partition size and file transfer time appears increasingly exponential at the larger partition sizes. One could posit that as partition sizes increase, even to sizes substantially larger than those in the graph, the relationship will become ever more pronounced. The time taken to acquire such partitions would then grow disproportionately, which suggests that the client-server paradigm is likely not suitable for the task of performing remote evidence acquisition, especially in the type of environment that the LEIA system is aimed at, and that a more efficient network transfer paradigm is needed for this type of activity. From this need, we postulate that the use of P2P networks, between the evidence capture location and the eventual storage location, could be a suitable replacement, as certain P2P overlays are known to provide better network throughput, and thus shorter latency between evidence capture and storage.

6. CONCLUSION

In this study we outlined the numerous problems that blight the digital investigation process, and law enforcement agencies at large, rendering them slow and ultimately ineffective.
We proposed a comprehensive architecture for a proactive system – one that makes use of hypervisors, P2P networks, the RDF framework and cloud storage – that could essentially revolutionize the digital investigation process through automation. Finally, through a small proof of concept, we demonstrated a limited part of this system and motivated the need for a network paradigm with better throughput. Some P2P overlays demonstrate this and could provide a solution for improving the speed of remote evidence capture.

7. FUTURE WORK

Though this architecture is promising, larger disk acquisitions need to be performed with more modern small scale devices equipped with larger storage capacities in order to further confirm the need for a more efficient form of network data transfer in the form of P2P communication. Within the proposed architecture, several parameters of the P2P communication protocols need further optimization and testing. Additionally, a PKI could be integrated into the system in order to improve the security of the communication and storage facilities. Also, the storage capabilities of the Cloud-based Backend could be supplemented by those of participating HbH nodes in order to realize a more distributed and independent storage solution.
The concept of privacy also needs to be addressed within the scope of this solution. Finally, an experiment with a wider scope, in terms of multiple devices being tested simultaneously, would be greatly desired in order to better drive this architecture towards becoming a reality.
MEASURING HACKING ABILITY USING A CONCEPTUAL EXPERTISE TASK

Justin Scott Giboney, School of Business, University at Albany, Albany, NY 12222, [email protected]
Jeffrey Gainer Proudfoot, Information and Process Management, Bentley University, Waltham, MA 02452, [email protected]
Sanjay Goel, School of Business, University at Albany, Albany, NY 12222, [email protected]
Joseph S. Valacich, Eller College of Management, The University of Arizona, Tucson, AZ 85721, [email protected]

ABSTRACT

Hackers pose a continuous and unrelenting threat to organizations. Industry and academic researchers alike can benefit from a greater understanding of how hackers engage in criminal behavior. A limiting factor of hacker research is the inability to verify that self-proclaimed hackers participating in research actually possess their purported knowledge and skills. This paper presents current work in developing and validating a conceptual-expertise based tool that can be used to discriminate between novice and expert hackers. The implications of this work are promising since behavioral information systems researchers operating in the information security space will directly benefit from the validation of this tool.

Keywords: hacker ability, conceptual expertise, skill measurement

1.
INTRODUCTION

Governments, businesses, universities and other organizations are prime targets for hackers. A common motive for hackers targeting these organizations is data theft [1], which results in billions of dollars of losses annually [2]. Due to the threat that hackers pose to organizations, researchers have been encouraged to investigate hacker motives and behavior [3]. There have been recent attempts to further understand hacker behavior, e.g., [2], [4], [5]. However, these studies rely on data collected from self-reported hackers. Respondents can pose as hackers in order to gain the incentives provided during data collection. It is unverifiable whether the samples utilized in prior research are based on data collected from actual hackers or from persons misrepresenting their hacking abilities and experience. A consequence of this uncertainty is the questionable validity and generalizability of the findings reported in prior hacking research. A second issue is the tendency for researchers to lump all hackers into a single category during data analysis. This is typically done as a means of comparing hackers to other groups, but previous research indicates that there is more than one type of hacker [6]. Categories of hackers include script kiddies, petty thieves, virus writers, professional criminals, and government agents [6], [7]. Furthermore, the motivations and skill levels of different types of hackers vary [6]. In light of these differences, hacking researchers would benefit tremendously from the ability to more accurately measure each hacker's level of skill. The ability to measure hacking skill would allow researchers to verify that a self-proclaimed hacker indeed possesses the requisite technical skills. It would also allow analyses to be conducted on subsets of data for different groups of hackers based on their level of skill and areas of expertise.
In short, there is currently no scientific measure that can be used to assess hacking skill level without employing qualitative research methods (e.g., interviews) [8]. While effective, qualitative methods are much less scalable than survey-based methods, as surveys can be administered widely with few temporal or geographic limitations. Furthermore, hacking activities often require behavior that is criminal in nature; a survey-based methodology for data collection may elicit more candid responses from participants since the identity of a respondent can remain anonymous. The goal of this research is to develop a survey-based methodology for determining a hacker's skill level using an 18-scenario scale. If such a scale can be developed, researchers can (1) more accurately discriminate between categories of hackers, (2) more accurately quantify who is a hacker and who is not, and (3) provide evidence that their findings are indeed generalizable to the population of interest. The scale development process used for this research is in accordance with recognized scale development protocols [9] and is based on measuring conceptual expertise, an approach previously utilized by researchers in a variety of disciplines [10], [11]. Upon completing scale development, this research proposes to collect and analyze data to validate the accuracy of the measurement tool. This paper presents an overview of the scale development process concerning the validity of this novel approach to measuring hacker ability.

2. SCALE DEVELOPMENT

There are six recommended phases of scale development: 1) conceptualization, 2) development of measures, 3) model specification, 4) scale evaluation and refinement, 5) validation, and 6) norm development [9]. Each of these steps is discussed in the following sections.
2.1 Conceptualization

The goal of the conceptualization phase is to provide a precise definition of the construct of interest and establish conceptual arguments for how the construct can be discriminated from previously-specified and evaluated constructs found in the literature [9]. This paper introduces hacking conceptual expertise as a new construct based on the conceptual expertise construct found in the cognitive science literature [10], [12]. Hacking conceptual expertise is comparable to, and should be distinguished from, two similar constructs: computer self-efficacy [13] and computer ability [14]. This section will first discuss expertise before addressing computer self-efficacy and computer ability. Expertise is a "manifestation of skills and understanding resulting from the accumulation of a large body of knowledge" [12, p. 167]. A hacker's expertise is manifested in their ability to write code or scripts that can circumvent security protocols, disrupt the intended functions of a system, collect valuable information, and not get caught [6]. Many hackers are novices, sometimes referred to as "script kiddies", who have only a surface understanding of hacking but still employ software and scripts written by experts to perform their attacks [6], [15]. Expert hackers understand hacking at a deeper level, as they have a command of the common weaknesses and vulnerabilities of information systems. Therefore, we formally define the construct hacking conceptual expertise as the manifestation of skills and understanding about circumventing security protocols, disrupting the intended functions of systems, collecting valuable information, and not getting caught. Computer self-efficacy is a construct similar to hacking conceptual expertise and is based on Social Cognitive Theory (SCT). SCT explains that social pressures, context, cognitive and personal characteristics, and behavior are reciprocally determined [13], [16].
Self-efficacy is part of the cognitive and personal characteristics that drive behavior [13], [17]. Self-efficacy is a belief people have about their capacity to perform an action and their skill at performing the action [13], [17]. According to SCT, people are more likely to act if they believe that there will be positive outcomes as a result of their action [13], [17]. If people believe that they are good at an action, they are more likely to believe that they will receive a positive outcome by performing the action, and therefore will be more likely to perform the action [13], [17]. Computer self-efficacy is the belief people have about their capacity to perform actions that accomplish a computer-based task [13]. Computer self-efficacy has three main components: magnitude, strength, and generalizability [13]. Magnitude refers to the perception that people can accomplish more difficult tasks [13]. Strength refers to the confidence people have in being able to perform the tasks [13]. Generalizability refers to the extent to which a person's judgments include multiple activities [13]. Experts perform tasks better than novices as they have superior mental representations of problems; this is attributable to larger quantities of stored information and solution nodes (i.e., steps to achieve the solution) [10]. In other words, experts have more strategies at their disposal for finding solutions to problems than novices do. Due to larger quantities of stored information, experts typically organize information into abstract categories [10]. Abstract categories are used to filter tasks and potential solutions; they allow experts to more quickly and efficiently solve problems [10]. These abstract categories form the basis for testing expertise. Another measure similar to hacking conceptual expertise is computer ability [14].
Computer ability is based on two concepts: how important a skill is to a task and the perceived skill level of the individual in performing a task [14]. This research will demonstrate that hacking conceptual expertise is both distinct from and an antecedent to computer self-efficacy and computer ability. The abstract categorization of experts can be leveraged to distinguish between experts and novices. When given a set of related problems (e.g., end of chapter questions in a textbook) and asked to group the problems, experts will rely on their abstract categories to organize the problems based on principles of the domain [12]. When novices are given the same task, however, they will organize the problems based on physical evidence, explicit words, or formulas [12]. For example, in a study distinguishing between expert and novice programmers, experts sorted programming problems by solution algorithms and novices sorted the same problems by application areas [22].

2.2 Development of Measures

Measurement development is a process traditionally completed in two steps. First, a set of items that represent the construct is generated, and second, the content validity of the set of items is assessed [9]. Before discussing the generation of items, this paper will first discuss how these items will be used in a specialized task to measure hacker conceptual expertise rather than in a traditional survey.

2.2.1 The Conceptual Expertise Task

When people solve problems they approach tasks based on the mental representation they have of the problem [10], [18]. Their mental representations are based on stored information (memories) that assist in knowledge-based decisions [19]. People use stored information as nodes (or waypoints) that allow them to follow a solution path [20], [21]. Therefore, problem solvers use the mental representations they have of the task to find a solution.
For example, chess masters can identify more ways to achieve checkmate (where the nodes are necessary moves) than novice chess players. Researchers can capitalize on this difference between experts and novices (abstract categories versus physical evidence) to create a scoring system to measure expertise [10], [11]. To create this scoring system, underlying principles from the domain of expertise are derived from literature, textbooks, and other related sources. These underlying principles are termed deep features, as they show understanding of a given domain (e.g., social engineering). Deep features are contrasted with surface features, which are the objects or contexts (e.g., stealing financial data) represented in a problem [10]. When participants group deep features together more often than they group surface features, the participants are considered experts. When participants group surface features together more often than they group deep features, the participants are considered novices. The conceptual knowledge task is typically performed using a card sort of relevant scenarios on 3x5 cards, with each card having one deep feature and one surface feature [11]. An example of a scenario with a deep feature of system resource consumption and a surface feature of financial data is as follows:

Eve sends out requests to millions of machines using an IP address assigned to a server at a stock brokerage.

While this is an example of a scenario that could be displayed on one card, assume that a researcher creates 18 cards lettered A-R, each possessing a unique hacking-related scenario containing a deep feature and a surface feature. The hypothesized groupings could look something like Table 1. Participants, without seeing Table 1 or knowing the hypothesized features, are asked to sort the cards into groups with the following restrictions: you must create more than one group; each group must have at least 2 cards and fewer than 15 cards; each card can only be a part of 1 group; and you must create a name for your groups.

Table 1 Example problem matrix (rows: hypothesized deep features; columns: hypothesized surface features)

| Hypothesized deep features | Fake website | Usernames and passwords | Financial data |
|---|---|---|---|
| Authentication/Authorization | H | D | O |
| Hiding tracks | F | N | A |
| Input validation/Memory override | Q | J | E |
| Resource consumption | M | P | R |
| Social engineering | K | G | C |
| Vulnerability detection | B | L | I |

Table 2 provides an example of how a participant might group the scenarios. Once grouped, researchers can score the pairings of every combination in each group, classifying each as a surface feature pair (S), a deep feature pair (D), or an unexpected pair (U). For example, the pair P-G in the participant's first group is a surface feature pair, as both scenarios are in the "usernames and passwords" column. The pair L-I in the participant's second group is a deep feature pair, as both scenarios are in the "vulnerability detection" row. The pair F-E in the participant's third group is an unexpected pair, as the two scenarios are neither in the same row nor in the same column. In total, this participant identified 6 deep pairings, 12 surface pairings, and 18 unexpected pairings. This participant is likely more novice than expert, as he or she identified more surface features than deep features. However, the participant could have created an even higher number of surface feature pairs, thus he or she is likely not a complete novice.

2.2.2 Generating items for the hacking conceptual expertise task

The next step in scale development is to "generate a set of items that fully represent the conceptual domain of the construct" [9, p. 304]. For the conceptual expertise task, the generation of items begins with the identification of deep features. Previous researchers using the conceptual expertise task have used textbook problems to identify deep features; see [10], [11].
Table 2 Example participant grouping result

| Group | Cards |
|---|---|
| Group 1 – Hacks that involve numerous targets | P, O, M, J, Q, G, D |
| Group 2 – Hacks that involve hacker input | N, L, A, I |
| Group 3 – Hacks that involve pretending to be someone else or pretending to do something good | F, E, B, R |
| Group 4 – Hacks that involve programming | K, H, C |

As information security textbooks do not typically provide information on how to engage in criminal hacking behavior, we relied on both information security textbooks and academic literature addressing hacking scenarios to generate deep features for the hacking conceptual exercise. Table 3 contains a thorough, but not exhaustive, list of vulnerabilities and security measures identified in the set of textbooks and relevant literature used for this study [1], [4], [6], [8], [23]–[40].

Table 3 Hacks, vulnerabilities, and security measures referenced in relevant literature

| Category | Examples |
|---|---|
| Authentication/Authorization | Encryption, Security tokens, Permissions, Password cracking, Two-step commit, Certificate authorities, Password salting, Keystroke logging, Rainbow tables, Brute force attacks |
| Hiding tracks | Malware signatures, Removing log files, Audit-disabling software, Disabling security controls, Using proxies, IP spoofing, Steganography |
| Input validation/Memory override | Buffer overflow, Cross-site scripting, Maladvertising, SQL injection, Heap spraying, Format string attacks, Dangling pointers |
| Resource consumption | Denial of service attacks, Syn flood, ACK storm, Email bombs, HTTP POST DDOS, Smurf attacks, Spamming |
| Social engineering | Spear phishing, Pharming, Nigerian scam, Phishing |
| Vulnerability detection | Man in the middle attacks, Port scanning, Ping sweeps, Packet sniffing, Network mapping, War driving, Bluesnarfing |
| Actions/Outcomes | Electronic espionage, Zombie networks, Spyware, Website defacement, Computer worms, Trojan horses, Root kits, Ransomware, Leak of information, Bot net, Trap doors, Logic bombs |

A careful review of the hacks, vulnerabilities, and security
measures identified in the relevant literature allowed us to organize seven principles of hacking that form the basis for our deep features. In the next section we will empirically test and validate the categorization of these hacking principles. It is worth noting that the last category in Table 3, "Actions/Outcomes", does not contain hacking techniques, but rather outcomes of hacking activities. This category was not considered ideal for evaluating a person's ability to carry out a hacking attempt, but rather how well someone knows about hacking activities in general; therefore, it was excluded from our final set of deep features. For the conceptual expertise task, deep features are coupled with their corresponding surface features to create a matrix. We created three areas of surface features, namely financial data, fake websites, and usernames/passwords, to correspond with the six deep features. Table 4 contains the 18 scenarios resulting from the use of the features contained in the matrix. Recall that Table 1 contains the matrix depicting how the deep features are crossed with the surface features.

2.2.3 Assessing Content Validity

Before using this task to discriminate between novice and expert hackers, the items must first be scientifically validated. The validation process will be completed using two approaches. First, expert information security practitioners and academics will review our proposed methodology and provide feedback. Second, the scenarios presented in Table 4 will be empirically validated using an item-ranking task. We have already compiled feedback on the hacking conceptual expertise task proposed in this paper; feedback was solicited from four security experts with either an industry or academic background. The general consensus of the polled experts is that this is a feasible approach for discriminating between expert and novice hackers.
A common concern is that our approach may only measure how well a hacker conceptually understands hacking methods without directly assessing a hacker's actual ability.

Table 4 Hacking conceptual expertise scenarios

| Hack | # | Scenario |
|---|---|---|
| Removing log files | A | Eve deletes log files as she combs through a compromised machine looking for tax returns. |
| Port scanning | B | Eve uses a malicious website to scan for open ports of visitors. |
| Phishing | C | Eve, pretending to be a bank website, emails Kelly asking for her bank information. |
| Rainbow tables | D | Eve uses a rainbow table to decrypt secret military intranet links. |
| SQL injection | E | Eve uses a semicolon in a web form to access user account balances in the database. |
| Using proxies | F | Eve uses a proxy while creating a website to create a zombie network. |
| Nigerian scam | G | Eve sends Twitter messages en masse asking people to click on an Internet link in return for some secret information. |
| Certificate authority | H | Eve becomes her own certificate authority as she creates a fictitious ecommerce business. |
| Man-in-the-middle attack | I | Eve captures Wi-Fi network traffic from a conference to watch for financial transactions. |
| Improper file validation | J | Eve uploads an executable to a server expecting an image; the executable sends out instant messages with Internet links to random email addresses. |
| Pharming | K | Eve creates a website similar to a well-known company using a similar domain name. |
| Ping sweep | L | Eve sends a ping to networked machines and then sends an Internet link as a message to live machines. |
| HTTP POST DOS attack | M | Eve creates fake websites that post to a targeted website normally, but that are extremely slow (e.g. 1 byte/110 seconds). |
| Malware signature avoidance | N | Eve has created a virus that looks for Internet links to sensitive data stored on a computer and that changes itself after every install. |
| Password salting | O | Eve is attempting to figure out the salt that was used for some financial transactions. |
| Email DOS | P | Eve created a script to send hundreds of emails with an Internet link using fake email addresses to a particular company leader. |
| Cross-site scripting | Q | Eve posts a response on a forum that allows Eve to redirect users to a malicious website. |
| Smurf attack | R | Eve sends out requests to millions of machines using the IP address of the server of a stock trading institution. |

More specifically, one of the security experts stated that the deep features "…are more clear cut than the surface features." Another security expert suggested that the deep pairings may be too intuitive. However, we do not consider these responses troubling, as experts should find deep features both clear and intuitive. We take these comments as a sign that the measurement method is well specified. Upon incorporating this feedback into a new set of items, we will empirically validate the items with 20 new security experts selected from a state information security team. To empirically validate the items, each item should be adequately representative of the deep feature to which it is assigned [9], [41], [42]. Mackenzie et al. [9] recommended a technique suggested by Hinkin and Tracey [43] in which a matrix is created with the items in the first column and the deep features listed as column headers. The matrix is then distributed to raters, who are asked to rate how well each item fits with each column header on a five-point Likert-type scale (1 = not at all; 5 = completely). Table 5 contains a hypothetical example of the matrix that we will use. Data collection for both of these approaches is currently underway; preliminary findings from both approaches will be reported at the conference. There were a number of other requests from the security experts that we will incorporate in the next iteration of the measurement tool. For example, one security expert suggested that we include script kiddie scenarios that reference the use of existing prepackaged tools.
Another security expert suggested that we include more hardware exploits.

Table 5. Hypothetical example of the item-rating task (Rater # = 001). Columns, in order: Authentication/Authorization, Hiding tracks, Input validation/Memory override, Resource consumption, Social engineering, Vulnerability detection.

"Eve deletes log files as she combs through a compromised machine looking for tax returns.": 1, 5, 1, 2, 1, 2
"Eve uses a malicious website to scan for open ports of visitors.": 1, 1, 1, 2, 2, 4
…
"Eve sends out requests to millions of machines using the IP address of the server of a stock trading institution.": 2, 1, 1, 4, 1, 2

2.3 Model Specification

The next phase in scale development is to specify how the indicators capture the expected relationships with the construct [9]. While this stage typically involves specifying formative or reflective indicators for a construct, the conceptual expertise task does not treat the indicators as formative measures in a scale; rather, they are used to calculate a single expertise score. Therefore, our model specification will be the percentage of deep pairs identified by a participant compared to the percentage of surface pairs identified by the participant.

2.4 Scale Evaluation and Refinement

Scale evaluation and refinement is a two-step process based on (1) conducting a pilot study and (2) modifying items in the survey. After revising the scenarios based on the feedback we receive from experts participating in our item-validation tasks, we will conduct a pilot study comprising 20 security experts, 20 novices, and 20 claimed hackers from Amazon's Mechanical Turk. Security experts will be selected from government and corporate information security teams with connections to the university. Novices will be selected from introductory Computer Science, Informatics, or Information Systems courses. The pilot study will allow us to further refine the scale by adjusting the scenarios based on our results.
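The model specification in Section 2.3 reduces a participant's sort to two percentages: the share of their pairings that match on a deep feature and the share that match on a surface feature. A minimal scoring sketch, using hypothetical mappings from scenario letters to features (the helper names and toy data are ours, not the paper's):

```python
def expertise_score(pairs, deep_of, surface_of):
    """Return (deep %, surface %) of a participant's pairings, per the
    Section 2.3 model specification. `pairs` is a list of scenario-letter
    tuples; `deep_of`/`surface_of` map letters to features (illustrative)."""
    deep = sum(1 for a, b in pairs if deep_of[a] == deep_of[b])
    surface = sum(1 for a, b in pairs if surface_of[a] == surface_of[b])
    n = len(pairs)
    return deep / n, surface / n

# Toy data: A and D share a deep feature; A and B share a surface feature.
deep_of = {"A": "hiding tracks", "B": "vulnerability detection", "D": "hiding tracks"}
surface_of = {"A": "financial data", "B": "financial data", "D": "fake websites"}

deep_pct, surface_pct = expertise_score([("A", "D"), ("A", "B")], deep_of, surface_of)
print(deep_pct, surface_pct)  # 0.5 0.5
```

Experts are expected to push the first percentage up and the second down relative to novices.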
We will look to refine scenarios that experts pair using surface features as well as scenarios that novices pair based on deep features. We will also look for scenarios commonly paired in unexpected ways. The item-refinement process is iterative and will be carried out until the scale possesses sufficient discriminatory power.

2.5 Validation

Validation is a three-step process comprising the following tasks: (1) gathering data from a complete sample, (2) assessing scale validity, and (3) cross-validating the scale [9]. As the scenarios will evolve throughout the pilot-testing process, we will conduct the main data collection with a full-sized sample. We will sample 50 security experts, 50 novices, and 50 self-identified hackers from Amazon's Mechanical Turk. Once the full-sized sample is collected, we can assess the scale's validity in two ways: first, using a known-group comparison method, and second, by assessing the nomological validity of the scale. The known-group comparison is the use of groups (novices and experts) that should demonstrate differences on the scale [9]. We expect that novices will create more surface pairings than experts, and that novices will create fewer deep pairings than experts. To assess nomological validity, we will measure how well hacking conceptual expertise relates to similar measures. Specifically, we expect hacking conceptual expertise to increase perceptions of computing ability measured through computer self-efficacy and computer ability (see Figure 1).

Figure 1. Measurement model

2.6 Norm Development

The last step in scale development is to develop norms for the scale. This involves discovering the distribution of scores from different populations. While we currently do not have plans to create norms for this scale, we are optimistic that this paper will serve as a foundation for developing norms in future work.

3. CONCLUSION

Hackers continue to pose a serious threat to organizations.
Security researchers can benefit from a greater understanding of how and why hackers engage in criminal behavior. A limiting factor of such studies is the inability to verify that self-proclaimed hackers participating in research actually possess their purported knowledge and skills. This paper presents a cogent plan to develop and validate a conceptual-expertise-based tool that can be used to discriminate between novice and expert hackers. The proposed tool operates on the premise that, given a set of scenarios, experts will rely on their understanding of abstract categories to organize problems based on principles of the domain, whereas novices organize problems based on physical evidence, explicit words, or formulas. In other words, experts will group items based on deep features while novices will group items based on surface features. To create a conceptual-expertise-based tool for measuring hacker ability that possesses sufficient discriminatory power, items must first be developed and validated. We have developed 18 scenarios and are in the process of refining both the task and the scenarios by soliciting feedback from information security experts. These 18 scenarios will form a scale that can be used in survey-based research to measure hacker skill level. Once feedback from the solicited experts is analyzed, our model will be refined, followed by iterative pilot testing and data collection. The implications of this work are promising, as behavioral information systems researchers operating in the information security space will directly benefit from the validation of this tool. Furthermore, adaptations of this tool have the potential to be utilized in a variety of contexts and applications in information systems research.

REFERENCES

[1] Verizon, "2012 data breach investigations report," 2012.
[2] D. Dey, A. Lahiri, and G. Zhang, "Hacker behavior, network effects, and the security software market," J. Manag. Inf. Syst., vol. 29, no. 2, pp. 77–108, Oct. 2012.
[3] M. A.
Mahmood, M. Siponen, D. Straub, H. R. Rao, and T. S. Raghu, “Moving toward black hat research in Information Systems security: An editorial introduction to the special issue,” MIS Q., vol. 34, no. 3, pp. 431–433, 2010. [4] Z. Xu, Q. Hu, and C. Zhang, “Why computer talents become computer hackers,” Commun. ACM, vol. 56, no. 4, p. 64, Apr. 2013. [5] J. S. Giboney, A. Durcikova, and R. W. Zmud, “What motivates hackers? Insights from the Awareness-Motivation-Capability Framework and the General Theory of Crime,” in Dewald Roode Information Security Research Workshop, 2013, pp. 1–40. [6] M. K. Rogers, “A two-dimensional circumplex approach to the development of a hacker taxonomy,” Digit. Investig., vol. 3, no. 2, pp. 97–102, Jun. 2006. [7] R. Chiesa and S. Ducci, Profiling Hackers: The Science of Criminal Profiling as Applied to the World of Hacking. Boca Raton, FL: Auerbach Publications, 2009. [8] A. E. Voiskounsky and O. V Smyslova, “Flow-based model of computer hackers’ motivation.,” CyberPsychology Behav., vol. 6, no. 2, pp. 171–180, Apr. 2003. [9] S. B. Mackenzie, P. M. Podsakoff, and N. P. Podsakoff, “Construct measurement and validation procedures in MIS and behavioral research: Integrating new and existing techniques,” MIS Q., vol. 35, no. 2, pp. 293– 334, 2011. [10] M. T. H. Chi and P. J. Feltovich, “Categorization and representation of physics problems by experts and novices,” Cogn. Sci., vol. 5, no. 2, pp. 121–152, 1981. [11] J. I. Smith, E. D. Combs, P. H. Nagami, V. M. Alto, H. G. Goh, M. A. A. Gourdet, C. M. Hough, A. E. Nickell, A. G. Peer, J. D. Coley, and K. D. Tanner, “Development of the Biology Card Sorting Task to Measure Conceptual Expertise in Biology,” CBE-Life Sci. Educ., vol. 12, no. 4, pp. 628–644, Dec. 2013. [12] M. T. H. Chi, “Laboratory methods for assessing experts’ and novices' knowledge,” in The Cambridge Handbook of Expertise and Expert Performance, K. A. Ericsson, N. Charness, P. J. Feltovich, and R. R. Hoffman, Eds. 
Cambridge University Press, 2006, pp. 167–184. [13] D. R. Compeau and C. A. Higgins, “Computer self-efficacy: Development of a measure and initial test,” MIS Q., vol. 19, no. 2, pp. 189–211, 1995. [14] P. H. Cheney and R. R. Nelson, “A tool for measuring and analyzing end user computing abilities,” Inf. Process. Manag., vol. 24, no. 2, pp. 199–203, 1988. [15] T. J. Holt, “Subcultural evolution? Examining the influence of on- and off-line experiences on deviant subcultures,” Deviant Behav., vol. 28, no. 2, pp. 171–198, Feb. 2007. [16] A. Bandura, “Social cognitive theory: An agentic perspective,” Annu. Rev. Psychol., vol. 52, no. 1, pp. 1–26, 2001. [17] A. Bandura, “Self-efficacy mechanism in human agency,” Am. Psychol., vol. 37, no. 2, pp. 122–147, 1982. [18] A. Newell and H. A. Simon, Human Problem Solving. Englewood Cliffs, NJ: Prentice Hall, 1972. [19] A. Chandra and R. Krovi, “Representational congruence and information retrieval: Towards an extended model of cognitive fit,” Decis. Support Syst., vol. 25, pp. 271–288, 1999. [20] H. A. Simon and J. R. Hayes, “Understanding written problem instructions.,” in Knowledge and Cognition, L. W. Gregg, Ed. Potomac, MD: Lawrence Erlbaum Associates, 1974, pp. 165–200. [21] J. R. Hayes and H. A. Simon, “Psychological differences among problem isomorphs,” in Cognitive Theory, 2nd ed., J. N. Castellan Jr., D. B. Pisoni, and G. R. Potts, Eds. Hillsdale, NJ: Lawrence Erlbaum Associates, 1977, pp. 21–41. [22] M. Weiser and J. Shertz, “Programming problem representation in novice and expert programmers,” Int. J. Man. Mach. Stud., vol. 19, no. 4, pp. 391–398, 1983. [23] C. P. Pfleeger and S. L. Pfleeger, Security in Computing, 4th ed. Upper Saddle River, NJ, USA: Prentice Hall, 2006. [24] M. T. Goodrich and R. Tamassia, Introduction to Computer Security. Boston, MA: Pearson Education, Inc., 2011. [25] T. 
Jordan, “Mapping Hacktivism: Mass Virtual Direct Action (MVDA), Individual Virtual Direct Action (IVDA) And Cyber- wars,” Comput. Fraud Secur., vol. 4, no. 1, pp. 8–11, 2001. [26] T. Jordan and P. Taylor, “A sociology of hackers,” Sociol. Rev., vol. 46, no. 4, pp. 757–780, Nov. 1998. [27] S. M. Furnell and M. J. Warren, “Computer hacking and cyber terrorism: The real threats in the new millennium?,” Comput. Secur., vol. 18, no. 1, pp. 28–34, 1999. [28] O. Turgeman-Goldschmidt, “Hackers’ Accounts: Hacking as a Social Entertainment,” Soc. Sci. Comput. Rev., vol. 23, no. 1, pp. 8–23, Feb. 2005. [29] V. Mookerjee, R. Mookerjee, A. Bensoussan, and W. T. Yue, “When hackers talk: Managing information security under variable attack rates and knowledge dissemination,” Inf. Syst. Res., vol. 22, no. 3, pp. 606–623, 2011. [30] D. P. Twitchell, “Augmenting detection of social engineering attacks using deception detection technology,” in International Conference on i-Warfare and Security, 2006. [31] R. E. Bell, “The prosecution of computer crime,” J. Financ. Crime, vol. 9, no. 4, pp. 308–325, 2002. [32] H. Liang and Y. Xue, “Avoidance of information technology threats: A theoretical perspective,” MIS Q., vol. 33, no. 1, pp. 71– 90, 2009. [33] G. B. Magklaras and S. M. Furnell, “Insider threat prediction tool: Evaluating the probability of IT misuse,” Comput. Secur., vol. 21, no. 1, pp. 62–73, 2002. [34] Symantec Corporation, “Internet Security Threat Report,” Mountain View, California, 2013. [35] L. Holmlund, D. Mucisko, K. Kimberland, and J. Freyre, “2010 cybersecurity watch survey: Cybercrime increasing faster than some company defenses,” 2010. [36] CyberEdge Group, “2014 Cyberthreat Defense Report,” 2014. [37] R. T. Wright and K. Marett, “The influence of experiential and dispositional factors in phishing: An empirical investigation of the deceived,” J. Manag. Inf. Syst., vol. 27, no. 1, pp. 273–303, 2010. [38] B. Parmar, “Protecting against spearphishing,” Comput. 
Fraud Secur., vol. 2012, no. 1, pp. 8–11, Jan. 2012.
[39] S. Goel and H. A. Shawky, "Estimating the market impact of security breach announcements on firm values," Inf. Manag., vol. 46, no. 7, pp. 404–410, Oct. 2009.
[40] R. Boyle and J. G. Proudfoot, Applied Information Security: A Hands-On Guide to Information Security Software, 2nd ed. New Jersey: Pearson, 2014.
[41] F. N. Kerlinger and H. B. Lee, Foundations of Behavioral Research, 4th ed. New York: Cengage Learning, 1999.
[42] D. W. Straub, M.-C. Boudreau, and D. Gefen, "Validation guidelines for IS positivist research," Commun. Assoc. Inf. Syst., vol. 13, no. 1, pp. 380–427, 2004.
[43] T. R. Hinkin and J. B. Tracey, "An analysis of variance approach to content validation," Organ. Res. Methods, vol. 2, no. 2, pp. 175–186, 1999.

2015 CDFSL Proceedings

HTML5 ZERO CONFIGURATION COVERT CHANNELS: SECURITY RISKS AND CHALLENGES

Jason Farina, Mark Scanlon, Nhien-An Le-Khac, Stephen Kohlmann, Tahar Kechadi
School of Computer Science & Informatics, University College Dublin, Ireland.
{jason.farina, stephen.kohlmann}@ucdconnect.ie, {mark.scanlon, an.lekhac, tahar.kechadi}@ucd.ie

ABSTRACT

In recent months there has been an increase in the popularity and public awareness of secure, cloudless file transfer systems. The aim of these services is to facilitate the secure transfer of files in a peer-to-peer (P2P) fashion over the Internet without the need for centralized authentication or storage. These services can take the form of client installed applications or entirely web browser based interfaces. Due to their P2P nature, there is generally no limit to the file sizes involved or to the volume of data transmitted, and where these limitations do exist they will be purely reliant on the capacities of the systems at either end of the transfer. By default, many of these services provide seamless, end-to-end encryption to their users.
The cybersecurity and cyberforensic consequences of the potential criminal use of such services are signicant. The ability to easily transfer encrypted data over the Internet opens up a range of opportunities for illegal use to cybercriminals requiring minimal technical know-how. This paper explores a number of these services and provides an analysis of the risks they pose to corporate and governmental security. A number of methods for the forensic investigation of such transfers are discussed. Keywords 1. : Covert Transfers, Encrypted Data Transmission, Counter-forensics INTRODUCTION tion being intercepted or duplicated and their data being downloaded by others. Regarding For the typical home user, sending anything the security of their data stored on this third- larger than a single image or document elec- party provider, users must blindly trust this tronically is still a cumbersome task when re- third-party to not access or share their data with liant on popular online communication meth- any unauthorized party. ods. Most email providers will limit the le While the requirement to send larger volumes size of attachments to something in the order of of information over the Internet is ever increas- megabytes and many will additionally restrict ing, the potential for third-party interception, or le types such as executables or password pro- illicit access to, this data has become a common tected archives based on internal security poli- story in the general media. cies. Sending larger les usually requires users whistle-blowers regarding the degree of surveil- to upload the content to third party storage lance conducted by large government funded providers, e.g., Dropbox, OneDrive, Box.net, spying agencies on everyday citizens has pushed etc., and provide a link to the content to their the topic of cybersecurity into the public realm. intended recipients. 
From a security standpoint, Increasingly, Internet users are becoming con- this leaves user vulnerable to their communica- scious of their personal responsibility in the pro- © 2015 ADFSL Recent leaks from Page 135 2015 CDFSL Proceedings HTML 5 Zero Conguration Covert Channels: Security ... 1.1 Contribution of this work tection of their digital information. This results in many users being discontent with their personal data stored on these third party servers likely stored in another jurisdiction with privacy requirements much lower than those of their own locale. For many, the topic of covert channels immediately brings to mind some form of steganography likely in combination with an Internet anonymizing service, such as Tor and I2P. While some work has been conducted on the re- verse engineering/evidence gathering of these To respond to this demand a number of le anonymizing P2P proxy services, little work has exchange/transfer services have grown in pop- been done in the area of online services provid- ularity in recent months facilitating the secure ing end users with the ability to securely and transfer of les in a peer-to-peer (P2P) fashion covertly transfer information from peer to peer from point A to point B. Most of these services in an largely undetectable manner. aord the user encrypted end-to-end le trans- presented as part of this paper examines a num- fer and add an additional level of anonymity ber of popular client application and web based compared to regular le transfer services, e.g., services, outlines their functionality, discusses email attachments, FTP or instant message le- the forensic consequences and proposes a num- sharing. The more security concerned users will ber methods for potentially retrieving evidence opt for the cloudless versions of these services. from these services. This work This independent control over personal informa- 2. tion has advantages and disadvantages for the end user. 
The advantage is that the user has BACKGROUND READING precise knowledge over who has initial access to his/her information and what country the data The technology used to facilitate covert le is stored in. The downside comes in terms of transfers is not new, however, until recently such The data stored or transferred us- measures were deemed too complex to be con- ing these services is only available if at least one sidered of benet to the average user. Mainte- host storing the le is online. nance of FTP servers, dynamic DNS for pub- reliability. lic IP based system shares and management of public/private keypairs for SSH are not tasks As with most security or privacy enhanc- a non-technical individual can accomplish with- ing Internet services, these services are open out technical assistance or a very thorough how- to abuse by cybercriminals. to guide. In eect, the ad- The current transfer options, while ditional level of anonymity and security pro- easy to implement and in some cases, completely vided by these services provides cybercriminals transparent to the user, can all be found to have with o-the-shelf counter-forensic capabilities a more complex root in one of the following for information exchange. methods or techniques. Cybercriminal ac- tivities such as data exltration, the distribution of illicit images of children, piracy, indus- 2.1 Anonymizing Services trial espionage, malicious software distribution, Today there are many anonymizing services and can all potentially be facilitated by the available for communication and data trans- use of these services. fer. The ad hoc nature and The popular anonymous browser Tor al- mainstream protocol features of some of these lows users to explore the Internet without the services make secure detection and prevention risk of their location or identity becoming known problematic without causing disruption to nor- [Loesing et al., 2010]. The Tails operating sys- mal Internet usage. 
This also causes forensic tem which works in conjunction with Tor oers event reconstruction diculties as any traces an extra layer of anonymity over traditional op- have a high degree of volatility when compared erating systems. When a user is operating Tails to more traditional le transfer options. all connections are forced to go through the Tor Page 136 © 2015 ADFSL HTML 5 Zero Conguration Covert Channels: Security ... network and cryptographic tools are used to encrypt the users data. The operating system will leave no trace of any activity unless explicitly dened by the user to do so. 2015 CDFSL Proceedings 2.2 Decentralized Osite Data Storage The MAIDSafe network is a P2P storage facility that allows members to engage in a data stor- The Invisible Internet Project, also known as age exchange. Each member of the network en- I2P is another anonymous service similar to Tor. ables the use of a portion of their local hard As I2P is designed as an anonymous network drive by other members of the network. In re- layer which will allow users to utilize their own turn the member is given the use of an equiv- applications across the network. Unlike Tor cir- alent amount of storage distributed across the cuits the I2P network trac utilises multiple network and replicated to multiple locations re- layers of encryption and an addressing system ferred to as Vaults. This allows the authorized not based on IP or ISP to provide anonymity. member access from any location and resilience This process decouples a user's online identity should a portion of the network not be active and physical location [Timpanaro et al., 2014]. at any time. I2P also groups network messages together in is deduplicated and replicated in real time with irregular groupings for encryption to discourage le signatures to ensure integrity. network trac analysis. 
the data is encrypted allowing secure storage on Like Tor, I2P allows All data stored on the network In addition for passage through its network to a server or untrusted remote systems. service not hosted within its area of inuence. is managed through a two factor authentication This is managed through the use of Outprox- process involving a password and pin combina- ies which perform the same function as Tor exit tion. nodes. centivized through SafeCoin, a cryptocurrency Both Tor and I2P provide anonymity to the user with an open network of onion routers in the case of Tor and Garlic routing in the case of I2P. These networks of routers are run by Authorized access The use of the MAIDSafe network is in- that members can earn by renting out space or providing resources such as bandwidth for le transfers. Other users can earn SafeCoins by participating in development of the protocol. The result of this network growth 2.3 Data Exltration through Standard File Transfer Channels is an increase in anonymity and privacy for Data exltration refers to the unauthorized ac- each individual user [Herrmann and Grotho, cess to otherwise condential, proprietary or 2011]. sensitive information. participating volunteers and it is continually growing. However these services are not without Giani et al. [2006] out- drawbacks such as a severe reduction in net- lines a number of data exltration methods in- work throughput resulting in much slower access cluding most regular le transfer methods for speeds (though I2P has greater reported perfor- inside man attacks, e.g, HTTP, FTP, SSH and mance than Tor, in particular for P2P down- email, and external attacks including social en- loading protocols but it has fewer Outproxies gineering, botnets, privilege escalation and root- than Tor has exit nodes resulting in a lesser de- kit facilitated access. Detection of most of these gree of anonymization). 
Many software pack- methods is possible using a combination of re- ages (those not SOCKS aware in the case of walls and network intrusion detection systems Tor) are not designed to correctly route through or deep packet inspection [Liu et al., 2009, Sohn these services and will instead provide informa- et al., 2003, Cabuk et al., 2009]. tion that will potentially reveal the identity of response trac to be delivered to. For the end 2.4 File Delivery Services Built on Anonymizing Networks user, there is also the issue of adding yet an- OnionShare is a le sharing application that other step to the already technical task they nd leverages the anonymity of Tor to provide se- themselves performing. cure le transfers for its users. the user such as the local, true, IP address for © 2015 ADFSL File transfers Page 137 2015 CDFSL Proceedings HTML 5 Zero Conguration Covert Channels: Security ... are direct from uploader to recipient though link is no longer considered valid and all in- both users utilize the Tor browser to participate. coming URLs with the same le signature OnionShare itself is a python based application are refused. that sets up a le share on the local system as a limited web server. This web server is then ad- This combination of time and availability in vertised as a Tor Hidden Service using the built conjunction with the anonymity of Tor makes in functionality of the Tor browser. The appli- OnionShare trac extremely dicult to analyze cation uses random data to generate a 16 char- eectively. acter onion address and more random data to link is already invalidated. Similarly, if the le generate a unique name for the le being shared is discovered on a local lesystem by an investi- to use as a reference for. gator any trace of the once o connection to the The process used by OnionShare is as follows: If the trac is observed then the download point, if not already lost from local connection logs or memory, will only lead to a 1. 
Uploader starts Tor Browser to provide an entry point for OnionShare to the Tor net- Tor entry point and not to the actual source of the le. work. OnionShare is started and a temporary di- 3. INVESTIGATIVE rectory is created in the users' default temp folder. TECHNIQUES All randomly generated names in OnionShare follow the same procedure: While no work targeted specically at forensic investigation of these zero conguration ser- (a) A number of random bytes are gener- vices has been published at the time of writ- ated using os.random. 8 are generated ing, there are a number of digital evidence ac- for a directory/host name and 16 for quisition methods published for related services. the lename "slug" used to generate There has , however, been security focused work the le portion of the share URL published on HTML5 additions including an (b) These random bytes are SHA-256 and analysis of the webstore localstore and in- the rightmost 16 characters of the re- troduced in this version of the protocol such as sulting hash are carved that produced by Bogaard and Parody [2012]. (c) h is then Base32 encoded, all characters are converted to lower case and any trailing `=' signs are removed 2. The result is then used as a URL using the format <host>.onion/<fileID> and this is the url the Tor browser advertises to the introduction nodes and registers on DHT. This section outlines a number of related investigation techniques and analyses their relevancy to the forensic recovery of evidence from covert le transfer services. 3.1 Cloud Storage Forensics Forensics of cloud storage utilities can prove challenging, [2012a]. 3. The uploader then sends the URL to the as presented by Chung et al. The diculty arises because, unless complete local synchronization has been per- downloader who must use the URL within formed, a timeframe (24 hours by default) or the ous distributed locations. 
the signature of the HS timestamp will no longer match, a process controlled by the ItsDangerous library for Python. In addition to this time limit, OnionShare also utilizes a download counter, which has a default value of 1. Once the number of downloads successfully initiated matches this counter, the service is shut down.

The data can be stored across various locations: for example, it may not only reside in temporary local files or volatile storage (such as the system's RAM), but may also be dispersed across multiple datacenters of the service provider's cloud storage facility. Any digital forensic examination of these systems must pay particular attention to the method of access, usually the Internet browser connecting to the service provider's storage access page (https://www.dropbox.com/login for Dropbox, for example). This temporary access serves to highlight the importance of live forensic techniques when investigating a suspect machine, as a "pull out the plug" anti-forensic technique would not only lose access to any currently opened documents, but may also lose any currently stored sessions or other authentication tokens that are stored in RAM.

Martini and Choo [2013] published the results of a cloud storage forensics investigation on the ownCloud service from the perspective of both the client and the server elements of the service. They found artifacts on both the client machine and on the server, facilitating the identification of files stored by different users. The mobile client application was found to store authentication and file metadata relating to files stored on the device itself and to files stored only on the server. Using the client artifacts, the authors were able to decrypt the associated files stored on the server instance.

3.2 Network Forensics

Network Forensic Analysis Tools (NFATs) are designed to work alongside traditional network security practices, i.e., intrusion detection systems (IDSs) and firewalls. They preserve a long-term record of network traffic and facilitate quick analysis of any identified issues [Corey et al., 2002]. Most firewalls allow HTTP and HTTPS traffic through so that users behind the firewall have access to regular web services which operate over these protocols. With regard to the web based covert file transfer services (outlined in detail in Section 4 below), blocking network traffic to these systems would require the maintenance of a comprehensive firewall list of such servers to ensure no unauthorized data exfiltration. NFATs collecting this information will only have the ability to capture the encrypted packets, their destination and associated metadata. Identifying precisely what has been transferred will likely prove impossible for network administrators.

The issue with always-on active network forensics is dealing in real time with the large volumes of traffic involved. One approach to overcome the massive volume of network data to process is to simply record every packet sent and received from the Internet, in a similar manner to the tactic employed in the case outlined by Garfinkel [2002]. This would facilitate an after-the-fact reconstruction of any data breaches to aid in determining precisely what was compromised.

3.3 Deep-Web Forensics

Although Tor and I2P are designed for users to communicate and transfer data on the Internet anonymously, it is still viable and possible for investigators to gather user-specific information. The process of an investigation into the Tor network requires advanced digital forensic knowledge, as traditional investigation methods used for standard networks fail to yield the desired results. Loesing et al. [2010] published a study measuring statistical data in the Tor network. The study is weighted towards protecting users' anonymity while using Tor, but nonetheless shows that it is technically possible to gather data on Tor users by setting up a Tor relay and logging all relayed user traffic.

When users install Tor, the software first connects to one of the directory authorities. The directory authorities are operated by trusted individuals of the Tor project, and from these authorities the Tor software downloads the list of currently available Tor nodes. These nodes are relay servers that are run by volunteers. The Tor client then selects three nodes from those available and builds an encrypted channel to the entry node. An encrypted channel is then built from the entry node to the middle node, and lastly this channel connects to the exit node.

Blond et al. [2011] demonstrated the results of an attack on the Tor anonymity network that revealed 10,000 IP addresses over 23 days. The authors used the attack to obtain the IP addresses of BitTorrent users on Tor. The study found that 72% of users were specifically using Tor to connect to the tracker. The authors launched their attacks through six instrumented Tor exit nodes, resulting in 9 percent of all Tor streams being traced. Moreover, the paper analyses the type of content discovered in the attack, culminating in the result that the existence of an underground BitTorrent ecosystem on Tor is plausible. Alongside these attacks, Tor users were also profiled. Using BitTorrent as the insecure application, the authors hijacked the statistical properties of the DHT and the tracker responses.

4. EVOLUTION OF HTML5 POWERED COVERT FILE TRANSFER SERVICES

While services like OnionShare provide a method of file transfer that is difficult to investigate due to its limited lifespan and shifting end points, it still requires software installation and additional support services in the form of the Tor Browser.
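The limited-lifespan behaviour described above, a signed timestamp lifetime plus a download counter defaulting to 1, can be sketched as follows. The class name `OneShotShare`, its fields, and the plain elapsed-time check are illustrative assumptions; OnionShare's actual implementation (which verifies an ItsDangerous signature) differs.

```javascript
// Sketch of an OnionShare-style one-shot share: the share becomes
// unavailable once EITHER the timestamp lifetime expires OR the number
// of initiated downloads reaches the counter limit. Names and the
// wall-clock check are illustrative, not OnionShare's real logic.
class OneShotShare {
  constructor(lifetimeMs, maxDownloads = 1) {
    this.createdAt = Date.now();
    this.lifetimeMs = lifetimeMs;
    this.maxDownloads = maxDownloads;
    this.downloads = 0;
  }
  // A real service would verify a signed timestamp; here we only
  // compare elapsed wall-clock time against the configured lifetime.
  expired(now = Date.now()) {
    return now - this.createdAt > this.lifetimeMs;
  }
  // Returns true if the download may proceed, false if the share is dead.
  tryDownload(now = Date.now()) {
    if (this.expired(now) || this.downloads >= this.maxDownloads) {
      return false;
    }
    this.downloads += 1;
    return true;
  }
}
```

With the default counter of 1, the first initiated download succeeds and every subsequent attempt is refused, mirroring the single-download default described above.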
These requirements allow for several methods of access control, such as a security policy blocking local services attempting to communicate on ports 9150 and 9051, the default Tor control ports. On an application level, local, group or layer-7 firewall policies can block tor.exe or onionshare.py based on path or file hash without undue interruption to normal user network usage.

4.1 Basic HTML5 File Transfer

Figure 1: The Basic HTML5 Data Transfer Process

Basic HTML5 file transfer, as depicted in Figure 1, is accomplished using native browser APIs that allow a user to utilize a data transfer object. This object consists of a customizable array of key:value pairs that represent a group of file objects. This associative array is then accessible by client side scripts run from a web page or web application. These scripts must first be downloaded and allowed to run by the local user (this depends on the trust setting for the website being visited). Any element can be added to the array through Drag and Drop (DnD) functionality, or files can be added through a file browser interface. The actions available by default are:

copy: A copy of the source item may be made at the new location.
link: A link may be established to the source at the new location.
copyLink: A copy or link operation is permitted.
move: An item may be moved to a new location.
copyMove: A copy or move operation is permitted.
linkMove: A link or move operation is permitted.
all: All operations are permitted.
none: The item may not be dropped.

If the element added to the array is a file, then the element is passed to a FileReader object that copies the data contained in the file to a localStorage or sessionStorage object, depending on the settings of the web application. Local Storage is shared across all browser sessions currently active; session storage is only available to the owning application or window (for browsers with multiple tabs or windows). This local/session storage behaves very similarly to standard cookie storage, but with hugely increased capacity (5MB for Chrome, Firefox and Opera; 10MB for Internet Explorer, where native DnD is only available in version 9+ and web storage in version 8+; and 25MB for BlackBerry). For basic data transfer, the FileReader reads the entire indicated file into RAM for processing.

Once stored in web storage, a file can only be accessed by local actions that have permission to access that web storage area, such as client side scripts downloaded from the controlling web page or session. These scripts can use any scripting language, but usually jQuery, JavaScript or AJAX. The local client can also call scripts to run on the remote server in order to pass variables or prepare for the establishment of additional sessions as required.

4.2 Cryptographically Enhanced HTML5 Data Channels

Following on from their acquisition of ON2 in February 2010, Google continued to develop a browser to browser data transfer protocol, which was made open source in 2011 when it was adopted by W3C as a standard for HTML5.

Figure 2: Traditional VOIP Data vs DTLS-SRTP

In Figure 2, the standard VOIP method of communication is displayed alongside the newer WebRTC method. Both systems start with establishing a signaling and control path to handle non-sensitive data such as connection auditing packets. In VOIP, the data stream would follow this established path, with traffic between Client A and Client B involving relay through Relays 1 and 2. Unless the user fully trusts both relays and the security of the network path between each node on the network, there is a risk of an adversary eavesdropping on the data stream and either manipulating the content in transit or capturing it for offline inspection.
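The basic flow of Section 4.1 (Figure 1), a key:value map of file objects filled by drag-and-drop, read by a FileReader, and copied into web storage, can be mimicked outside the browser. In this sketch the real browser APIs (`event.dataTransfer`, `FileReader`, `localStorage`) are replaced with plain objects so it runs under Node; it illustrates only the data flow, not the actual DOM API surface.

```javascript
// Node-runnable mock of the basic HTML5 transfer flow: dropped files
// land in a key:value map (standing in for the browser's DataTransfer
// item list), are read fully into memory (the FileReader step), and
// are then copied into a web-storage-like object.
const webStorage = {}; // stands in for window.localStorage

// Stand-in for the DataTransfer associative array of file objects.
function collectDroppedFiles(files) {
  const dataTransfer = {};
  for (const f of files) {
    dataTransfer[f.name] = f; // key:value pairs representing file objects
  }
  return dataTransfer;
}

// Stand-in for FileReader plus the storage copy: read each file's
// content into memory, then persist it under its name.
function storeFiles(dataTransfer, storage) {
  for (const [name, file] of Object.entries(dataTransfer)) {
    storage[name] = file.content; // analogue of FileReader.readAsText
  }
}

const dropped = collectDroppedFiles([
  { name: 'notes.txt', content: 'hello' },
  { name: 'img.png', content: '<binary>' },
]);
storeFiles(dropped, webStorage);
```

As in the browser case, once the content sits in the storage object it is reachable by any script with access to that storage area.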
The protocol, which supported Real Time Communication between browsers, was released as WebRTC 1.0 and was developed to provide P2P voice, video and data transfers between browsers without any additional software requirements. WebRTC provides a collection of protocols and methods, as well as a group of codec libraries, that can be accessed via a JavaScript API. WebRTC improved data transfer over the standard HTML5 script based method by introducing data integrity, source authentication and end-to-end encryption. This is accomplished through the use of a Datagram Transport Layer Security (DTLS) [Modadugu and Rescorla, 2004] extension to handle key exchange for the Secure Real-time Transport Protocol (SRTP). The use of DTLS-SRTP differs from standard VOIP encryption by removing the need to trust the SIP relays that form the path between the source and destination.

5. ANALYSIS OF EXISTING SERVICES

Services such as those presented in Table 1 are a sample set of HTML5 and WebRTC based file transfer utilities. While at first glance many of these applications appear to be homogeneous, closer examination shows important differences in both capabilities and requirements. Of the services listed, only Sharefest and JustBeamIt allow usage without local installation or some form of authentication. Of these, JustBeamIt is based on basic HTML5 file transfer while Sharefest utilizes WebRTC. While Sharefest purports to offer persistent storage, it does so by virtue of its distributed organization; storage is only available as long as at least one share member is online.

Table 1: Comparison of Browser Based Transfer Services (Sharefest, JustBeamIt, Transfer Big Files, Infinit, Any Send, Rejetto and QikShare, compared on: HTML based, application option, relay server, registration required, encrypted transfer, anonymity, mobile compatibility and persistent storage)

While many of the services do not provide encrypted file transfer, the threat posed lies in the fact that any illegal or illicit data transfer can be difficult, if not impossible, to differentiate from standard web traffic.

5.1 HTML5 Enabled Peer-to-Peer Transfer

This section examines a number of HTML5 enabled P2P transfer sites and describes their operation.

5.2 Sharefest

Sharefest.me is a one-to-many file-sharing website that aims to dynamically generate and maintain file-sharing swarms by connecting peers that are interested in sharing the same data. Like the BitTorrent protocol, multiple peers are utilized simultaneously to transfer portions of the data, thus increasing download speeds by avoiding the bottleneck that is the lower upload speed of a standard ADSL Internet connection. In order to achieve this, Sharefest is built on Peer5's (https://peer5.com/) platform for a distributed Internet, a P2P data transfer mesh network that utilizes the capabilities of the browser without additional plugins beyond a WebRTC capable browser.

Figure 3: Sharefest P2P Mesh over WebRTC

As depicted in Figure 3, the Sharefest process is quite straightforward in design. The sharefest.me server acts as a transfer control server that records all files being offered for sharing and matches the resource to the client system request. In the scenario depicted, Client A has a complete file that it wants to share. Client A connects to the Sharefest server at https://www.sharefest.me/ over port 443 and negotiates TLS 1.2 where possible, using SPDY if available for web content transfer.
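The control-server role just described, recording which peers offer which file and matching requesters to offerers, reduces to a simple registry. The class name `SwarmRegistry` and its methods are invented for illustration; the sketch ignores Sharefest's actual wire protocol and signalling.

```javascript
// Minimal sketch of a Sharefest-style transfer control server: it
// records which peers offer a given swarm and answers a requesting
// peer with the list of peers it should open STUN/WebRTC links to.
class SwarmRegistry {
  constructor() {
    this.swarms = new Map(); // swarmID -> Set of offering peer IDs
  }
  offer(swarmId, peerId) {
    if (!this.swarms.has(swarmId)) this.swarms.set(swarmId, new Set());
    this.swarms.get(swarmId).add(peerId);
  }
  // A requesting peer is matched with every current offerer and is then
  // registered as an offerer itself (one-to-many mesh growth).
  request(swarmId, peerId) {
    const offerers = [...(this.swarms.get(swarmId) || [])]
      .filter((p) => p !== peerId);
    this.offer(swarmId, peerId);
    return offerers;
  }
}
```

In the Figure 3 scenario, Client A would call `offer`, and Clients C and D would each call `request` and be pointed at the peers already in the swarm, while Client B never appears in the registry.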
Given a full range of options, the Sharefest server negotiates the use of ECDHE-ECDSA with AES 128 and GCM 256. As required by IETF RFC 4492 (http://tools.ietf.org/html/rfc4492), the Sharefest server passes the curve details as part of its ServerKeyExchange packet.

Once a secure path is established, the server delivers a small set of helper scripts to the client:

files.js: a script to gather file details from the client.
ui.js: a script to control the update and display of the file management interface on the page.

Once a file has been selected for sharing, the Sharefest server assigns a code value in the form of a URL, such as https://www.sharefest.me/67509cb244257b6643540dda512f8171, where the number after the domain name is the swarmID for this file. The swarmID has 32 characters but is not based on the MD5 of the file; instead it appears to be derived from the Sharefest crypto.js script, which incorporates SHA3 in a lengthy calculation. The swarmID is deterministic, meaning that any client sharing the same file will be identified with the same swarmID.

Once clients offering and requesting the same file or file set are identified, the Sharefest server acts as a central traffic control and initiates a STUN connection directly between the participating clients. In Figure 3, Clients A, C and D are all participating in the swarm for File 1, but Client B is not involved. The STUN connection consists of each pair of clients issuing BIND requests from uploader to downloader using STUN over UDP, which allows each host to discover its own public facing IP address in order to create end-to-end connections through a NAT gateway or firewall. Each BIND/Confirm pair is re-issued at a regular, short interval to ensure the connection remains intact. Once the STUN session is active, the two peers negotiate a WebRTC session and switch over to the protocol's encryption. ACK and STUN confirmation messages continue to be sent to and from the Sharefest server and the peers throughout the exchange.

Sharefest is an example of P2P privacy in a distributed network, ensuring that data can be transferred without risk of interception. This level of privacy comes at a cost, though, as the ability of IT security to inspect the traffic is greatly diminished by the level of encryption in use at all stages of the transfer. Packet analysis can detect the IP addresses in use, but without access to the key to decrypt the traffic, the content being transferred is extremely difficult to determine. One option available to network admins is to block the use of the sharefest.me service by blacklisting the URL. This would have the effect of preventing casual usage of the service, but the source for Sharefest is publicly available on GitHub (https://github.com/Peer5/Sharefest) along with instructions and support for installation of a personal server. Peer5 also provide the API key for free to anyone interested in the code. This means that any IP or URL could become a Sharefest server.

One method of detecting the use of this application is the STUN traffic generated once a peer is identified and a connection is initiated. In testing, an average of 5 STUN negotiation/confirmation exchanges were recorded every second, depending on the level of file transfer data passing between the peers. This level of noise would make the user of Sharefest relatively easy to discover, and no effort is made to obfuscate the communicating peers.

In an attempt to determine if this lack of anonymization could be overcome, we attempted to run Sharefest through a Tor circuit, but both transfer utilities (JustBeamIt and Sharefest) failed to complete the initial negotiation with the relevant server. This was tested using Tor Browser installed on a Windows 7 VMWare image. A possible alternative may be to attempt the use of a SOCKS aware proxy to direct the traffic to and from the application.
Alternatively, a server running Sharefest could be adapted to run as a Tor service, but this would not alleviate the lack of privacy experienced once data transfer was initiated between peers.

5.3 Transfer Big Files

Transfer Big Files (TBF) (https://www.transferbigfiles.com/) offers drag and drop transfer of files, with size limits dependent on the user's account type. At the most basic free account level, the limit is 100MB per file, totalling up to 20GB of files awaiting transfer at any one time. Files are held by the server for a default of 5 days before they are removed. While this service does offer HTML5 drag and drop upload capability, it is not a true P2P application in that it does not transfer directly from one local peer to one or more remote peers. Some features of TBF include:

Recipients can either have their own account or can be sent a shortened URL alias link with the domain tbf.me; senders must have an account to avail of the service. This account requires a name, email address and a password, though of note, there is no verification of any of these fields.
The initial DNS lookup resolves to IP address 69.174.247.183, a server hosted in the US. This server negotiates TLS 1.0 as part of the initial handshake.
File upload prompts DNS lookups of 0storageuk4.transferbigfiles.com and 1storageuk4.transferbigfiles.com, which both resolve to 83.222.233.155 (a server group hosted in the UK). Once uploaded, the files remain on the storage servers until picked up by the recipient, who must be notified separately.
File download is performed from one of the tbfuk4.transferbigfiles.com group servers over standard HTTPS.

TBF also offers an application and a command line client with enhanced capabilities over the browser based interface, but this is beyond the scope of this paper. It is worth noting that while this transfer service is just standard HTTPS, and so will be discovered by standard forensic practices, if the username and password can be recovered, the site's account activity summary contains a full activity log of all files uploaded, including when the transfer was started and when the recipient collected the file, as well as the recipient's account name.

5.4 JustBeamIt

An example of a basic HTML5 transfer application is the file transfer service offered at http://www.justbeamit.com. The sending user connects to the server over port 8080 (an alternate HTTP port) and performs a standard TCP handshake followed by a series of HTTP GET requests for client side JavaScripts:

BrowserDetectUtility.js: determines if the user's browser can properly support HTML5.
JustBeamIt.js: the base script that sets up the variables and defines the communication functions.
UploadManager.js: defines the drag and drop actions and defines the landing zone for the file itself.
UploadHandler.js: determines if the client needs to use XMLHttpRequest (XHR) or FORM based uploading, and generates the QRCode.
UploadHandler.XHR.js and UploadHandler.FORM.js: the actual uploading scripts.
FileHandler.js: manages the transfer to and from the file array, and handles the array data transfers, reset and webpage notifications if the array is emptied.

There is an option to drag and drop a file into the browser, but in this instance the file browser is used to select a file from the local user's Pictures folder. Once selected, the Create Link button is clicked and the link http://www.justbeamit.com/di33x is created, along with a QRCode for mobile use. This link can be copied and sent to the receiving system; in the meantime, the local client is redirected to a relay server URL for the upload itself (http://b1.justbeamit.com/). On the remote system, the URL is pasted into a browser, and the remote client loads http://www.justbeamit.com, immediately requests the download token for the file ID di33x, and downloads and runs the set of JavaScripts. The server passes along the token, together with the current download status (upload waiting) and the file descriptor (name, size, extension). Once the remote user clicks on the link to download, the browser is redirected to http://b1.justbeamit.com, where the file transfer is performed. Once complete, the local user is notified that the transfer has been successful. The token used to download is now invalidated, and a new link must be generated if the file is to be shared again. Similarly, there is a 1000 second timeout period during which the shared file must be downloaded before the opportunity expires and a new share token must be generated.

While this method of file transfer provides ease of use to the end user, all transfers are performed via unencrypted traffic. The data being transferred is susceptible to any form of eavesdropping capable of detecting traffic on any network segment the traffic passes through. The open nature of the transfer and the client side execution of scripts (as well as the open exchange of tokens) allow for trivial man-in-the-middle (MITM) attacks, where an adversary capable of eavesdropping can use a proxy or other interception utility to alter the packets in transit. One possible scenario would be the substitution of a harmless UploadManager.js script for something less benign, identified by Jang-Jaccard and Nepal [2014] as a rising risk, or even the exchange of the generated download token for one that leads to a virus or other form of malware.

From a security standpoint, defense against the use of this service is quite straightforward. Because of the application's use of a centralized set of servers, a standard firewall rule to block access to http://*.justbeamit.com would prevent any upload, but also any attempt to download.

5.5 Any Send

Any Send (http://www.anysend.com/) is a web based file transfer utility that offers a pure web based alternative to its own downloadable application. The web page anysend.com resolves to a set of four IP addresses divided between Utah and Texas, all registered to clickmein.com. The webpage consists of a large background image and a single dropzone for files to be sent. On dropping a file, the background JavaScript helpers manage the file upload to their servers. Once uploaded, the user is presented with a URL to send to the recipient. The URL is comprised of the anysend.com domain and a file identifier generated by the server (a 32 character string). Once the recipient enters the URL into a browser, they are taken to an anysend.com page with details of the file corresponding to the fileID part of the URL used. From here it is a standard HTTPS download from the anysend servers, and the download URL will reflect the anysend server URL as well as the full filename of the file with the term %20via%20AnySend.exe appended, e.g.:

http://www.anysend.com/dl.php?data=7f234e0920d8424833f97d3ab9380883...&fn=orientdb-community-2.0.4%20via%20AnySend.exe&packageID=ED59EE58A0380BBDB197A88F8290BDDE

6. FORENSIC CONSEQUENCES OF UNTRACEABLE FILE TRANSFER

The facilitation of untraceable or anonymous file exchange can lead to a number of potential malicious use cases. For each of the scenarios outlined below, an added dimension can be created by the originator of the content: time. Due to the ability to create one-time or temporary access to any piece of content, the timeframe where evidence may be recovered from remote sharing peers might be very short.

6.1 Cybercriminal Community Backup

Unmonitored covert transfer could be used to create a share-and-share-alike model for the remote encrypted backup of illegal content. The sharing of these backups onto multiple remote machines could effectively provide the user with a cloudless backup solution requiring minimal trust with any remote users. The encryption of the data before distribution to the community can ensure that only the owner will ever have access to decrypt the data. Trust only comes into play should the remote nodes delete the information or it otherwise become unrecoverable. Having a secure, encrypted connection to a remote backup might be desirable to cybercriminals, enabling the use of a kill-switch on their local storage devices should the need arise.

6.2 Secure Covert Messaging

These services can also underpin covert messaging. For example, the proof of concept based on the BitTorrent Sync protocol found at http://missiv.es/ currently operates by saving messages to an outbox folder in a synchronized share between peers, which has a read only key shared with the person you want to receive the message. They in turn send you a read only key to their outbox. One-to-many messaging can be achieved by sharing the read only key with more than one person, but no testing has been done with synchronization timing issues yet, and key management may become an issue, as a new outbox would be needed for each private conversation required.
6.3 Industrial Espionage

Many companies are aware of the dangers of allowing unmonitored traffic on their networks. However, quite often corporate IT departments enforce a blocking of P2P technologies through protocol blocking rules on their perimeter firewalls. This has the effect of cutting off any file-sharing clients installed on the LAN from the outside world. In addition to Deep Packet Inspection (DPI) to investigate the data portion of a network packet passing the inspection point, basic blocking of known IP address blacklists in firewall rulesets can be used. The difficulty in blocking HTTP based file transfers is that the technology is likely used during regular employee Internet usage; HTTP transfers can be used when emailing file attachments or adding items to a content management system. One additional scenario where these services could be used would be to transfer files within a LAN, with subsequent external exfiltration from a weaker/less monitored part of the network, e.g., guest wireless access.

6.4 Piracy

Like any other P2P technology, the ability to transfer files in a direct manner from peer to peer lends itself well to the unauthorized distribution of copyrighted material. The sharing of copyrighted multimedia, software, etc., between peers using these covert services is less likely to lead to prosecution compared with public piracy on open file-sharing networks such as BitTorrent.

6.5 Alternative to Server Based Website Hosting

This scenario involves the creation of static websites served through a shared archive. These websites could be directly viewed on each user's local machine, facilitating the easy distribution of any illegal material. The local copies of the website could receive updates from the webmaster through the extraction of archived updates distributed in a similar manner as the original.

7. POTENTIAL FORENSIC INVESTIGATION TECHNIQUES

Assuming access (physical or remote) can be acquired to either end of the file transfer, then a live acquisition of the evidence should be attainable. Performing evidence acquisition after the fact would rely on traditional hard drive and memory forensic techniques to see if any remnants of the network communication remain.

The investigation of the unauthorized transfer of information through one of these services without access to either end of the transfer can prove extremely difficult. Assuming, through some external means, the precise date and time of the transfer were discovered, the only method available to law enforcement is to effectively wiretap the transfer by running a software or hardware based deep packet inspection tool on the network at either end of the transfer.

To date there has been keen interest in research performed on the forensic examination of file sharing utilities and the types of security risks they pose. Chung et al. [2012b] outlined a best practice approach to the investigation of file sharing using cloud based Storage-as-a-Service (StaaS) utilities such as Dropbox, iCloud and OneDrive. In 2014, Federici [2014] presented Cloud Data Imager (CDI), a utility developed to automate the retrieval of cloud based storage artifacts from a suspect system and use these credentials to access their secure storage online. Scanlon et al. [2014] described a methodology that leveraged the processes used by persistent file synchronization services to ensure data integrity in order to retrieve data that would otherwise have been inaccessible. This could be as a result of deliberate obfuscation, such as encryption or anti-forensic activities, or it could be caused by an error in the imaging process. The methodology presented utilized the synchronization group's need for ongoing communication to enumerate remote peers and to identify any authorized peers that could provide a forensically true copy of the suspect data.

8. CONCLUSION

The evolution of online file transfer systems is becoming more and more covert through employing encryption-based, serverless, P2P protocols. Emerging file transfer utilities, such as the purely browser based file transfer utilities built on WebRTC, do not advertise persistence of availability nor integrity checking beyond the initial transfer, and in many cases the parties are only associated for the length of time that both are online and in communication, directly or otherwise. After this time, such as with OnionShare for example, the address of the file source will change completely and no longer be available to any peer, authorized or otherwise. This ephemeral nature of data transfer can make any attempt to verify or re-create the circumstances of the file transfer difficult if not impossible, and it very much depends on the features of the individual application being employed.

The development of HTML5 and JavaScript based services proves particularly interesting from a digital forensic perspective. As these technologies mature, a future can be easily envisioned whereby the investigation and evidence retrieval from these systems will prove extremely difficult, if not entirely impossible. Daisy-chaining a number of the technologies outlined in this paper has the potential to enable malicious users to securely transfer any desired information to another user/machine without arousing the suspicions of system administrators. Identifying the use of an HTTPS, browser-based P2P file transfer with relatively small transfer sizes might prove prohibitively difficult. The investigation of these transfers may prove cost prohibitive in terms of both time and money for law enforcement to comprehensively investigate, and mitigation of the risk by way of security policies or hardware and software based rulesets may come at a price in terms of system usability that will be deemed too high.

As privacy becomes easier for the end user to accomplish, the role of forensics will become all that much harder, as not even the low hanging fruit of browser history can be expected as a starting point. Additionally, system security will no longer be able to react, as this may already be too late.
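One of the few network-observable signals identified earlier is the STUN chatter measured in Section 5.2 (an average of five negotiation/confirmation exchanges per second between peers). A toy rate-based detector sketches how such a signal might be operationalised; the event records, the function name, and the use of that observed rate as a detection threshold are all assumptions of this sketch, not a validated technique.

```javascript
// Toy detector for STUN "noise": flag a directed host pair whenever
// its STUN binding exchanges meet a per-second rate threshold. The
// 5/sec figure comes from the testing described in Section 5.2;
// treating it as a threshold is this sketch's own assumption.
function flagStunNoise(events, threshold = 5) {
  const perSecond = new Map(); // "src>dst@second" -> exchange count
  for (const e of events) {
    const key = `${e.src}>${e.dst}@${Math.floor(e.ts)}`;
    perSecond.set(key, (perSecond.get(key) || 0) + 1);
  }
  const flagged = new Set();
  for (const [key, count] of perSecond) {
    if (count >= threshold) flagged.add(key.split('@')[0]);
  }
  return [...flagged];
}
```

Even when the payload itself is end-to-end encrypted, this kind of side-channel (who talks to whom, how often, with which protocol fingerprint) remains visible to a monitoring point.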
8.1 Future Work

As a future development in forensics, there is clear potential for utilities and techniques to be developed to help bridge the gap between proactive security and reactive forensics. Some areas of interest are:

Automated detection of HTML5 and WebRTC based data exfiltration.

Approximate hashing signatures: approximate hashing facilitates the analysis of network traffic, as it could be applied to recognise variations on patterns specific to the protocols, and their timings, used in HTML5 and WebRTC.

Forensic analysis of P2P over anonymizing networks: perhaps a lack of a footprint can be proven to be a footprint in and of itself in a networking environment.

REFERENCES

Stevens Le Blond, Pere Manils, Chaabane Abdelberi, Mohamed Ali Dali Kaafar, Claude Castelluccia, Arnaud Legout, and Walid Dabbous. One bad apple spoils the bunch: exploiting P2P applications to trace and profile Tor users. arXiv preprint arXiv:1103.1518, 2011.

Daniel Bogaard, Daryl Johnson, and Robert Parody. Browser web storage vulnerability investigation: HTML5 localStorage object. In Proceedings of The 2012 International Conference on Security and Management, 2012.

Serdar Cabuk, Carla E. Brodley, and Clay Shields. IP covert channel detection. ACM Transactions on Information and System Security (TISSEC), 12(4):22, 2009.

Hyunji Chung, Jungheum Park, Sangjin Lee, and Cheulhoon Kang. Digital forensic investigation of cloud storage services. Digital Investigation, 9(2):81-95, 2012a.

Hyunji Chung, Jungheum Park, Sangjin Lee, and Cheulhoon Kang. Digital forensic investigation of cloud storage services. Digital Investigation, 9(2):81-95, 2012b.

Vicka Corey, Charles Peterman, Sybil Shearin, Michael S. Greenberg, and James Van Bokkelen. Network forensics analysis. IEEE Internet Computing, 6(6):60-66, 2002.

Corrado Federici. Cloud Data Imager: A unified answer to remote acquisition of cloud storage areas. Digital Investigation, 11(1):30-42, 2014.

Simson Garfinkel. Network forensics: Tapping the internet. IEEE Internet Computing, 6:60-66, 2002.

Annarita Giani, Vincent H. Berk, and George V. Cybenko. Data exfiltration and covert channels. In Defense and Security Symposium. International Society for Optics and Photonics, 2006.

Michael Herrmann and Christian Grothoff. Privacy-implications of performance-based peer selection by onion-routers: a real-world case study using I2P. In Privacy Enhancing Technologies, pages 155-174. Springer, 2011.

Julian Jang-Jaccard and Surya Nepal. A survey of emerging threats in cybersecurity. Journal of Computer and System Sciences, 80(5):973-993, 2014.

Yali Liu, Cherita Corbett, Ken Chiang, Rennie Archibald, Biswanath Mukherjee, and Dipak Ghosal. SIDD: A framework for detecting sensitive data exfiltration by an insider attack. In System Sciences, 2009. HICSS'09. 42nd Hawaii International Conference on, pages 1-10. IEEE, 2009.

Karsten Loesing, Steven J. Murdoch, and Roger Dingledine. A case study on measuring statistical data in the Tor anonymity network. In Financial Cryptography and Data Security, pages 203-215. Springer, 2010.

Ben Martini and Kim-Kwang Raymond Choo. Cloud storage forensics: ownCloud as a case study. Digital Investigation, 10(4):287-299, 2013.
CONTINUOUS MONITORING SYSTEM BASED ON SYSTEMS' ENVIRONMENT

Eli Weintraub
Tel Aviv Afeka College of Engineering, Israel
Head of Information Systems specialization

Yuval Cohen
Tel Aviv Afeka College of Engineering, Israel
Head of Production Management specialization

ABSTRACT

We present a new framework (and its mechanisms) for a Continuous Monitoring System (CMS) with new, improved capabilities, and discuss its requirements and implications. The CMS is based on the real-time actual configuration of the system and its environment rather than on a theoretical or assumed configuration. Moreover, the CMS predicts organizational damage by taking into account chains of impacts among systems' components generated by messaging among software components. In addition, the CMS takes into account all organizational effects of an attack; its risk measurement considers the consequences of a threat, as defined in risk analysis standards.
Loss prediction is based on a neural network algorithm with learning and improving capabilities, rather than on a fixed algorithm, which typically lacks the necessary dynamic environmental updates. The framework presentation includes the system design, the neural network architecture design, and an example of the detailed network architecture.

Keywords: Continuous Monitoring, Computer security, Attack graph, Software vulnerability, Risk management, Impact propagation, Cyber attack, Configuration management

1. INTRODUCTION

Personal and organizational computing systems are sometimes subject to cyber-attacks which may cause damage to organizational data, software and computers (Mell et al., 2007). This paper focuses on threats generated by hostile attackers. Vulnerabilities are weaknesses or exposures stemming from bugs that are potential causes of security failures: loss of confidentiality, integrity or availability. An attack is performed by exploiting software vulnerabilities in the target computing system. Exploits are planned to attack certain components having specific vulnerabilities. Langer (2011) states that the Stuxnet worm included a process of checking hardware models and configuration details, and also downloaded program code from the controller to check whether it was the "right" program before launching an attack. This motivates the design of defense systems that are sensitive to changes in their environment. Users' computers might be damaged by exploited vulnerabilities. Organizations make decisions on the actions they have to take in order to limit their risks, according to the amount of potential damage and the vulnerability characteristics (Tom, 2008). Several software products are usually used for defending computers from cyber attacks; antivirus software, antispyware and firewalls are examples of such tools. Several tools are based on periodic assessment of the target computer, comparing the computer's software to known published vulnerabilities.
Antivirus engines store features of known malware and hash signatures, using classification algorithms to identify hostile software. Signature scanning is the most widely used technique in anti-virus programs (Symantec, 1997). Such tools are naturally effective only against known threats, not against new, unpublished threats. Heuristic antivirus scanners detect viruses by analyzing a program's structure or its behavior instead of looking for signatures, and are therefore able to identify new, unpublished malware. Intrusion Detection Systems (IDS) monitor the events occurring in a computer or network, searching for violations of, or threats to, computer security policies and security practices. Static and dynamic code analysis techniques aim to identify malicious activities by analyzing attempts to execute code or by identifying unusual behavior (Scarfone and Mell, 2007). Contrary to popular techniques such as antivirus, antispyware and firewalls, our model analyzes vulnerabilities before fixes are publicly distributed. Moreover, our model uses a prediction algorithm which combines historical exploit data with the computer's configuration to predict the losses of new vulnerabilities. An Information Security Continuous Monitoring (ISCM) system is defined by NIST as: "Maintaining ongoing awareness of information security, vulnerabilities, and threats to support organizational risk management decisions" (Dempsey et al., 2011). We use the acronym CMS since we do not limit our model to software. A CMS monitors computer systems in a near real-time process aimed at detecting vulnerabilities and cyberspace security risks, and alerting the appropriate functions within the organization. Contemporary systems use vulnerability databases (which are continually updated as new vulnerabilities are detected) and a scoring algorithm which predicts potential business losses.
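The signature-scanning technique described above amounts to a set-membership test on digests, which is also why it only catches known, published threats. A minimal sketch (the "signature database" and samples are invented):

```python
import hashlib

# Sketch of signature-based scanning: the engine keeps digests of known
# malware and flags any object whose digest matches. A single changed byte
# evades this check, which is why signature scanners miss unknown threats.

KNOWN_MALWARE_SHA256 = {
    hashlib.sha256(b"malicious payload v1").hexdigest(),  # toy signature DB
}

def scan(blob: bytes) -> bool:
    """Return True if the blob matches a known-malware signature."""
    return hashlib.sha256(blob).hexdigest() in KNOWN_MALWARE_SHA256
```

Heuristic scanners trade this exactness for behavioral features precisely to close the one-byte-variant gap.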
Computers are at risk from known threats until a patch is programmed for the specific vulnerable software, an activity that may last weeks or months. Even after a patch is prepared by the software vendor, a computer might still be at risk until the new patch is loaded onto the vulnerable system. Loading patches onto computer systems is usually performed as a periodic process, not continuously; the reason is to avoid the many interrupts required for uploading and activating patches in the production environment. In today's environment of zero-day exploits, conventionally updating systems for security vulnerabilities has become a cumbersome process, and there is an urgent need for a solution that can rapidly assess system vulnerabilities and immediately fix them (Nũez, 2008). Although zero-day vulnerabilities are kept secret by hackers for exploit programming, after a 90-day period vendors like Google routinely disclose the vulnerability to the public even if no fix has been written. Our system deals with risks during the window in which a vulnerability is published but not yet fixed in the operational organizational environment. Operating techniques for monitoring, detecting and alerting of security threats on a regular basis are known as Security Continuous Monitoring (SCM) systems. After identifying these risks, such tools evaluate the potential impacts on the organization, sometimes suggesting risk mitigation activities to support organizational risk management decisions (Dempsey et al., 2011). SCMs aim to close the gap between the zero-day on which a vulnerability is identified and the moment the computer is loaded with the corresponding patch fixing the vulnerability; this time gap may be considerably long. In this paper we describe the mechanisms of a new SCM system framework that produces better detection and prevention than existing SCM systems.
Our framework is based on four main elements: (1) knowledge of the target system's specific computer configuration and the interrelationships among system components; (2) a prediction algorithm which runs continuously and predicts the potential losses; (3) risk assessment based on vulnerability consequences; and (4) a learning algorithm which continuously improves the predicted losses. The rest of the paper is organized as follows: in section 2 we describe currently known solutions; in section 3 we present the proposed framework, including the system architecture; in section 4 we describe the scoring algorithm which predicts vulnerability losses and present a neural network model for loss prediction and learning; in section 5 we conclude and describe future research directions.

2. EXISTING SOLUTIONS

SCM systems use external vulnerability databases to evaluate the target computers' risk. There are several owners of vulnerability databases (Dempsey et al., 2011), such as the SANS Internet Storm Center services and the National Vulnerability Database (NVD). Vulnerability Identification Systems (VIS) identify vulnerabilities according to three categories: code, design, or architecture. Examples of VIS are the Common Vulnerabilities and Exposures (CVE) and the Common Weakness Enumeration (CWE). In this work we use the NVD vulnerabilities database as an example. Risk evaluation uses scoring systems which enable parameter estimation for assessing the impacts of vulnerabilities on the organization. The Common Vulnerability Scoring System (CVSS) is a framework that enables user organizations to receive the characteristics of IT vulnerabilities (Mell et al., 2007). CVSS uses three groups of parameters to score potential risks: Basic parameters, Temporal parameters and Environmental parameters. Each group is represented by compound parameters ordered as a vector, used to compute the score.
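The way such a vector is combined into a score can be made concrete. The sketch below transcribes the CVSS v2 base and environmental equations as given in the public CVSS v2 guide (Mell et al., 2007); for brevity, the temporal metrics (exploitability, remediation level, report confidence) are collapsed into a single multiplier passed by the caller:

```python
# CVSS v2 equations and metric weights, per the public CVSS v2 guide.
AV  = {"L": 0.395, "A": 0.646, "N": 1.0}    # Access Vector
AC  = {"H": 0.35, "M": 0.61, "L": 0.71}     # Access Complexity
AU  = {"M": 0.45, "S": 0.56, "N": 0.704}    # Authentication
CIA = {"N": 0.0, "P": 0.275, "C": 0.660}    # Conf./Integ./Avail. impact
REQ = {"L": 0.5, "M": 1.0, "H": 1.51}       # CR / IR / AR (environmental)
CDP = {"N": 0.0, "L": 0.1, "LM": 0.3, "MH": 0.4, "H": 0.5}
TD  = {"N": 0.0, "L": 0.25, "M": 0.75, "H": 1.0}

def _score(impact, exploitability):
    f = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f, 1)

def base_score(av, ac, au, c, i, a):
    impact = 10.41 * (1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a]))
    return _score(impact, 20 * AV[av] * AC[ac] * AU[au])

def environmental_score(av, ac, au, c, i, a, cr, ir, ar, cdp, td,
                        temporal_mult=1.0):
    # The security requirements re-weight the impact sub-score...
    adj_impact = min(10.0, 10.41 * (1 - (1 - CIA[c] * REQ[cr])
                                      * (1 - CIA[i] * REQ[ir])
                                      * (1 - CIA[a] * REQ[ar])))
    adj_temporal = round(_score(adj_impact, 20 * AV[av] * AC[ac] * AU[au])
                         * temporal_mult, 1)
    # ...and collateral damage / target distribution scale the result.
    return round((adj_temporal + (10 - adj_temporal) * CDP[cdp]) * TD[td], 1)
```

For example, the vector AV:N/AC:L/Au:N/C:C/I:C/A:C yields the maximal base score of 10.0, and with neutral security requirements, no collateral damage potential, and full target distribution, the environmental score reproduces the base score.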
Basic parameters represent the intrinsic specifications of the vulnerability. Temporal parameters represent the specifications of a vulnerability that might change over time due to technical changes. Environmental parameters represent the specifications of vulnerabilities derived from the local IT environment of the user organization. CVSS permits omitting the environmental metrics from score calculations in cases where the user's environment has no effect on the score. CVSS is a common framework for characterizing vulnerabilities and predicting risks, used by IT managers, risk managers, researchers and IT vendors for several aspects of risk management. CVSS is an open framework which enables managers to deal with organizational risks and to make decisions based on facts rather than estimates. User organizations adopting the CVSS framework may gain the following benefits: a standard scale for scoring vulnerabilities and risks, which enables organizations to normalize vulnerabilities according to specific IT platforms, the computed scores enabling rational decisions correlated to vulnerability risks; an open framework, in which the user organization can see the characteristics of a vulnerability and the logical process of score evaluation; and prioritized risks, in that organizations using the environmental parameters may benefit by considering changes to their IT environment according to the predicted risk scores. There are a few other vulnerability scoring systems besides CVSS, differing in what they measure: CERT/CC puts an emphasis on Internet infrastructure risks; the SANS vulnerability system considers the user's IT configuration and the usage of default parameter definitions; Microsoft's scoring system emphasizes attack vectors and the impacts of the vulnerability. Generally, Basic and Temporal parameters are specified and published by a product's vendor, who has the best knowledge of the product.
Environmental parameters are specified by the users, who have the best knowledge of their environments and of the vulnerabilities' business impacts. This paper focuses mainly on the environmental metrics. The organizational damage caused by a vulnerability is influenced by the specific IT environment in which it is exploited. The CVSS environmental parameters specify the characteristics of a vulnerability associated with the user's IT components compounding the environment. The environmental parameters fall into three groups: I. Collateral Damage Potential (CDP): parameters which measure the potential economic loss caused by a vulnerability. II. Target Distribution (TD): parameters indicating the percentage of vulnerable components in the user environment; a large proportion indicates a greater impact on potential organizational damage. III. Security Requirements (CR, IR, AR): parameters which indicate the user's sensitivity to security risks, subdivided into parameters indicating the Confidentiality (CR), Integrity (IR), and Availability (AR) requirements of the vulnerable component. High security requirements imply higher security damages, and thus greater economic losses. Categorization of IT components according to security requirement measures should be performed by users and should encompass all assets; doing so makes it possible to predict the organizational losses. Federal Information Processing Standards (FIPS) requirements demand the implementation of a categorization system (Dempsey et al., 2011), but do not require any particular scale, so risk comparisons among user systems are difficult.

3. THE PROPOSED FRAMEWORK

Federal organizations are moving from periodic to continuous monitoring, implementing SCMs that will improve the national cyber security posture (Hardy, 2012). The proposed framework includes four capabilities which are not found in current models. Real-time environmental metrics.
Metric evaluations are based on the components of the system as recorded in the system's CMDB (Keller and Subramanianm, 2009). There are several commercial products for asset inventory management, such as IBM Tivoli or Microsoft System Center. This capability enables basing predictions on the real IT environment rather than on users' evaluations. According to Grimalia et al. (2009), it is impossible for organizations to make precise estimates of the economic losses caused by an attack without full knowledge of the users' IT environment. Kotenko and Chechulin (2012) state that the network configuration should be monitored continually and available vulnerabilities must be analyzed in order to provide the necessary security level. The proposed CMS examines a database of published asset vulnerabilities, compares the computers' assets in real time against existing exposures, and calculates the computers' potential losses. Loss evaluation is performed by considering vulnerabilities even before patches are prepared and loaded onto the computers' systems. Components' interdependencies. Current systems focus on the IT infrastructure but not on the interdependencies among components. Several researchers stress the need to deal with interdependencies (Albanese et al., 2013; Jakobson, 2011). Jajodia et al. (2011) present a model that maps possible multi-step environmental vulnerabilities, enabling organizational damage estimations. Kotenko and Chechulin (2012) present a system based on attack modeling using attack graphs, evaluating security risk based on the attack model. Wang et al. (2006) propose an automated process aimed at hardening a network against multi-step intrusions. Our framework deals with loss prediction by looking for past attacks on systems' components and learning from their past organizational impacts.
The proposed algorithm takes component dependencies into account, predicting all potential direct and indirect impacts on the organization stemming from the specific vulnerable component. Loss prediction is implemented by a neural network which represents the IT components and the interdependencies between components, such as reading and writing data from neighboring components. The loss prediction process is based on the propagation of signals among components, starting from the vulnerable component and ending at the organizational losses as stated by the user; signals between components represent the various kinds of dependencies. Risk assessment based on consequences. Risk analysis theory defines risk as a triple specifying the scenario of an event, the likelihood of the event occurring, and the event's consequences, regularly appearing as threat x vulnerability x consequences. According to Collier et al. (2014), CVSS fails to connect risk assessment to risk management. Under CVSS, risk damage potential values are estimated by the organizations themselves (Mell et al., 2007). Under the proposed framework, potential loss prediction is based on the actual losses of similar past attacks on the specific vulnerable component, performed through a similar attack vector. When there has been no similar attack in the past, prediction is based on past losses stemming from past attacks on the specific component across all attack vectors. A learning algorithm. Hardy (2012) states that predictive analysis should be used for threat modeling. Threat projection algorithms are also presented by Holsopple and Yang (2008) to estimate plausible futures. We use predictive analysis for loss prediction, based on historical data of losses caused by past attacks on vulnerable components. The predictive analysis uses a learning algorithm since the organization learns how to deal with the vulnerable component and improves its software, thus limiting or preventing damages.
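The signal-propagation idea over component dependencies can be sketched as a reachability computation over a CMDB-style dependency graph; the component names below are invented for illustration:

```python
from collections import deque

# Sketch: starting from the vulnerable component, follow read/write
# dependency edges (as a CMDB would record them) and collect every
# component that could be affected directly or indirectly.

deps = {                       # component -> components that consume its data
    "billing-db": ["billing-app"],
    "billing-app": ["report-gen", "web-ui"],
    "report-gen": ["web-ui"],
    "web-ui": [],
    "hr-db": ["hr-app"],       # unrelated subsystem: should not be reached
}

def impacted(start, graph):
    """Breadth-first closure of components reachable from `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

In the full framework, each edge would additionally carry a kind of dependency and a probability of occurrence, and the closure would feed the loss-predicting network rather than simply being reported.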
According to the proposed framework, loss prediction is based on environmental parameters and on the actual losses of past events. Both are subject to change: environmental characteristics change all the time in operational systems, and the actual losses of past events are continuously updated according to the users' findings about the incidents' impacts. Losses caused by past attacks may be noticed long after the time of the attack, and such late losses should update the loss predicted by the algorithm. We describe the proposed framework architecture (Figure 1) and its main components.

Figure 1: Continuous Monitoring System architecture (components: Vulnerabilities Database (NVD), Scoring Module (CVSS), CMDB, Historical Events Database of vulnerabilities, losses and IT components, Learning Module, and Potential Losses Database).

The Vulnerabilities Database includes all known vulnerabilities and their specifications as published by the database owners or government agencies. As an example of vulnerability specifications, the NVD defines: category, vendor name, product name, published start and end dates, vulnerability update dates, severity, access vector, access complexity and the security requirements of a patch (Hardy, 2012). The Scoring Module (CVSS) is an algorithm which computes potential losses according to the parameters of the three groups; as stated above, there are also other known algorithms, some for public use and others commercial. The CMDB is a database which includes all hardware and software components of the target system and all of the components' characteristics. Hardware components are handled at the resolution of a machine; software is handled at the resolution of programs, physical exe files or DLLs; data is handled at the resolution of a database or table, not of data items; input/output is handled by screen name or output message. The target system might be one computer or a group of the organization's computers.
The CMDB includes all components in the computers' environment, including components which interface with the target system directly or indirectly, up to external and end-user interfaces. The CMDB also includes the security requirements (CR, IR, AR) of each component; security requirements are specified by the systems' owners according to potential business losses. The CMDB also includes all interfaces among components; for each interface, the direction of data transfer between the components and the probability of occurrence of that connection, according to the system's operational history, are indicated. The Historical Events Database includes all cyber attacks on the system and their details. For each event, the vulnerability that was exploited and all computer components involved in the incident are indicated, as well as the economic loss caused to the organization by the attack, as evaluated by the organizational users or risk management. The Potential Losses Database includes the predicted losses computed by the system. The system's owner is informed about the potential predicted loss of all components at risk and makes decisions concerning each component. The owner might disable a component or a computer when the loss potential is high. When a patch has not yet been developed, the owner might continue using the risky component, or monitor the component closely with higher awareness of possible exploits. When a patch has been developed but not yet loaded onto operational systems, the owner might decide to remediate and deploy the patch, to defer deployment to an appropriate time considering organizational constraints, or to reject deployment when the potential loss is limited. The system runs continuously, with the neural network predicting the losses of new vulnerabilities. Updates to the neural network's parameters due to the learning process are performed periodically, according to operational constraints.
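The owner's decision logic just described can be sketched as a small policy function; the thresholds and action names below are illustrative, not taken from the paper:

```python
# Sketch of the owner's decision options described above (loss categories
# follow the paper's low/medium/high/fatal scale; the mapping of category
# to action is an illustrative assumption).

def decide(predicted_loss: str, patch_available: bool, patch_deployed: bool) -> str:
    if patch_deployed:
        return "no action"
    if not patch_available:
        # No fix exists yet: disable on high risk, otherwise monitor closely.
        return ("disable component" if predicted_loss in ("high", "fatal")
                else "monitor closely")
    # A fix exists but is not yet loaded: remediate, defer, or reject.
    if predicted_loss in ("high", "fatal"):
        return "deploy patch now"
    return "defer deployment" if predicted_loss == "medium" else "reject deployment"
```

In the framework, `predicted_loss` would come from the Potential Losses Database rather than being supplied by hand.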
The system starts the continuous loss prediction process in two cases: first, whenever a new vulnerability is published and indicated in the NVD; second, whenever a change is made, or is intended to be made, in a system component or in the system's environment. In the case of testing a new component, the system computes losses as a simulation before the decision is made to move the component to the operational environment. Loss evaluation is based on the NVD, CVSS, the CMDB and the Historical Events Database. Whenever a component is found to be vulnerable according to the NVD, the system performs a propagation process which computes all impacts on components which read or write data from the vulnerable component. The propagation algorithm runs until the final output has been transferred to the users or written to the output files; the propagation process uses the CMDB to guide the interactions among components. The learning algorithm writes the computed potential losses to the Historical Events Database. The Learning Module forecasts the future potential losses caused by a specific vulnerability exploited on a component; prediction is performed by running the neural network. Actual damage is updated by the organization's owner on a regular basis, to also capture the delayed outcomes of a past vulnerability. The learning algorithm improves the accuracy of economic loss prediction, based on the updated environment and the updated actual losses.

4. LOSS SCORING AND LEARNING

The scoring algorithm is implemented through the neural network. The architecture of the network is described in Table 1, and a detailed design example of a network is illustrated in Table 2. The network design is based on Han and Kamber (2006); implementation may be done using a data mining software tool such as SAS business analytics software or Weka. The network represents all parameters impacting the vulnerable component, comprising the input layer.
The parameters include the vulnerability characteristics as updated in the NVD. Example parameters are the vulnerability category, the vulnerability severity, and parameters describing the component's specification, such as vendor name and product ID (for example, the operating system version). The input layer includes all CVSS parameter groups: Basic, Temporal and Environmental metrics. As illustrated in Table 2, parameters are categorized as they appear in CVSS; for example, vulnerability access complexity includes three categories: high, medium and low. The hidden layers represent messages from the exploited component to all other system components, such as the operating systems in use, the database, the communication protocols used, the UI programming language, and all other application components called directly or indirectly by the vulnerable component. The neural network represents the logical workflow of messaging and data transfers between the component and all other system components. The output layer of the neural net represents the losses occurring due to cyber attacks on components. Losses are categorized as low, medium, high and fatal, and represent the actual business damage of past attacks on a component, as reported by the organization. Losses are reported on a regular basis until all late effects are known, sometime in the future; this requires the nomination of a security person who would be responsible for regular reports of the vulnerabilities and damages. Neural net input signals are represented by zeros and ones according to the existence of the specific parameter, and messages between neural network nodes are binary. Arcs between nodes represent the kinds of dependencies between components. The output layer categories are also binary. At the end of the process, the system presents the predicted business loss category for attacks on one component. Each activation of the network uses all of the computed weights between network nodes.
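A minimal, self-contained sketch of this scheme follows: categorical CVSS-style parameters are one-hot encoded into the binary input signals described above, and a small feedforward network trained by backpropagation maps them to a loss category. NumPy stands in for the data-mining tools (SAS, Weka) the text mentions; the field names, network size, and "historical events" are synthetic illustrations, not the paper's actual design:

```python
import numpy as np

# Sketch: one-hot (0/1) encoding of categorical parameters plus a tiny
# two-layer network trained by backpropagation on synthetic events.
FIELDS = {
    "access_complexity": ["low", "medium", "high"],   # illustrative fields
    "severity":          ["low", "medium", "high"],
}
LOSS = ["low", "medium", "high", "fatal"]             # output categories

def encode(record):
    """Binary input signals, one slot per category, as in the text."""
    bits = []
    for field, cats in FIELDS.items():
        bits += [1.0 if record[field] == c else 0.0 for c in cats]
    return np.array(bits)

rng = np.random.default_rng(0)
n_in, n_hid, n_out = sum(len(c) for c in FIELDS.values()), 8, len(LOSS)
W1 = rng.normal(0, 0.5, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.5, (n_hid, n_out)); b2 = np.zeros(n_out)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    z = h @ W2 + b2
    p = np.exp(z - z.max()); p /= p.sum()             # softmax over categories
    return h, p

# Synthetic "historical events database": easy-to-exploit, severe
# vulnerabilities map to worse loss categories.
events = [({"access_complexity": "low", "severity": "high"}, "fatal"),
          ({"access_complexity": "low", "severity": "medium"}, "high"),
          ({"access_complexity": "medium", "severity": "medium"}, "medium"),
          ({"access_complexity": "high", "severity": "low"}, "low")] * 25

for _ in range(300):                                  # backpropagation epochs
    for rec, label in events:
        x, y = encode(rec), np.eye(n_out)[LOSS.index(label)]
        h, p = forward(x)
        dz = p - y                                    # softmax + cross-entropy
        dh = W2 @ dz
        W2 -= 0.1 * np.outer(h, dz); b2 -= 0.1 * dz
        dzh = dh * (1 - h ** 2)                       # tanh derivative
        W1 -= 0.1 * np.outer(x, dzh); b1 -= 0.1 * dzh

predicted = LOSS[int(np.argmax(forward(encode(
    {"access_complexity": "low", "severity": "high"}))[1]))]
```

The paper's version would use the CMDB-derived component layers and real loss reports in place of this toy data, but the mechanics of encoding, forward propagation and weight updates are the same.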
The network may be programmed to predict attacks using a specific vulnerability, or attacks using all vulnerabilities on that component. After the prediction of the business losses, the organization decides how to handle the vulnerability: whether to accept the risk, try to attenuate the risk, wait for a patch, or live without the risky component. The learning process is activated on a periodic basis, generating updated weights for the network's arcs and components. The learning process is activated by three event types: (1) accepting indicators of a new vulnerability; (2) loss updates concerning past organizational losses; (3) changes performed to the computing configuration or environment. The training and learning process runs on the historical database of attacks through several forward and backward propagation passes until the network's termination conditions are met. The proposed approach differs from existing scoring models such as CVSS in the dynamic generation of the calculations involved in the scoring process: CVSS uses fixed coefficients which were calculated at a specific point in the past, whereas our framework predicts losses on a continuous basis and updates the network coefficients through learning on a periodic (or nearly continuous) basis, subject to operational constraints.

Table 1 (Neural Network architecture; flattened in extraction): the input layer groups are the vulnerability details and the CVSS Basic, Temporal and Environmental metrics (e.g., vendor, vulnerability category, access complexity, exploitability, collateral damage potential); the intermediate layers are system components such as the UI protocol (HTML, Java) and the operating system (Windows 7, Unix); the output layer is the business loss category: Low (1-10k), Medium (10k-100k), High (100k-1000k).

Table 2 (Example of a neural network layer design; flattened in extraction): input layer parameters include the vulnerability category (cross site scripting, buffer overflow, SQL injection), vendor, product ID, severity, duration, access complexity, access vector, authentication, confidentiality/integrity/availability impact, exploitability, remediation level, report confidence, collateral damage potential, target distribution, and the target CR/IR/AR; intermediate layers include UI protocols (HTML, Javascript, AJAX, SQL, .NET), operating systems (Windows 7/8/XP/NT, Unix, OS/360, OS X, Android) and databases (Oracle, DB2) together with application components; the output layer is the business loss category: Low, Medium (10k-100k), High (100k-1000k), Fatal (1000k- ).

5.
CONCLUSIONS

In this work we described a new framework for a Security Continuous Monitoring (SCM) system and its mechanisms, including a neural network architecture, aimed at increasing the security of information systems by improving and accelerating loss prediction. The system introduces four new capabilities: (1) a continuous, real-time loss prediction software agent using real-time environmental parameters for an improved loss prediction algorithm; (2) components' interdependencies used by a propagation algorithm for loss prediction; (3) risk prediction based on actual losses reported by the organization; and (4) a learning algorithm based on a process of updating the facts concerning the vulnerabilities' actual losses and the real-time IT configuration. The framework makes it possible to give computer owners improved recommendations concerning new relevant vulnerabilities. The framework also enables improved security management of the operating systems: for example, when a vulnerability of a new asset is publicly known but still unpatched, loading a new version of a software component will be prevented by a preliminary simulation test which analyzes the vulnerabilities of the new component as incorporated in the operational environment. Several future research directions exist: performing a proof of concept of the framework to evaluate the model, investigating defense methods against attack vectors involving several different vulnerabilities, and searching for new hidden vulnerabilities in a production environment. Further research could extend the resolution of the entities used in our model, so that entities would include data items with the appropriate specifications (such as security requirements) and interdependencies between components indicating data transfers between data items.

REFERENCES

Albanese M., Jajodia S., Jhawar R., and Piuri V., (2013).
Reliable Mission Deployment in Vulnerable Distributed Systems, Proceedings of the 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, Budapest, Hungary, June 24-27, 2013.

Collier Z. A., DiMase D., Walters S., Tehranipoor M., Lambert J. H., and Linkov I., (2014). Cybersecurity Standards: Managing Risk and Creating Resilience, Computer, Vol. 47, No. 9, September 2014, IEEE.

Dempsey K., Chawia N. S., Johnson A., Johnston R., Jones A. C., Orebaugh A., Scholl M., and Stine K., (2011). Information Security Continuous Monitoring (ISCM) for Federal Information Systems and Organizations, NIST.

Grimalia M. R., Fortson L. W., and Sutton J. L., (2009). Design Considerations for a Cyber Incident Mission Impact Assessment (CIMIA) Process, Proceedings of the 2009 International Conference on Security and Management (SAM09), Las Vegas.

Han J., and Kamber M., (2006). Data Mining: Concepts and Techniques, 2nd ed., Morgan Kaufmann Publishers, San Francisco, CA.

Hardy M. G., (2012). Beyond Continuous Monitoring: Threat Modeling for Real-time Response, SANS Institute.

Holsopple J., and Yang S. J., (2008). FuSIA: Future Situation and Impact Awareness, Proceedings of the 11th International Conference on Information Fusion, Cologne, Germany, July 1-3, 2008, IEEE.

Jajodia S., Noel S., Kalapa P., Albanese M., and Williams J., (2011). Cauldron: Mission-Centric Cyber Situational Awareness with Defense in Depth, Proceedings of the Military Communications Conference, pp. 1339-1344, USA.

Jakobson G., (2011). Mission Cyber Security Situation Assessment Using Impact Dependency Graphs, The 14th International Conference on Information Fusion, Chicago, USA, July 5-8, 2011.

Keller A., and Subramanianm S., (2009). Best Practices for Deploying a CMDB in Large-scale Environments, Proceedings of the IFIP/IEEE International Symposium on Integrated Network Management, pp. 732-745, IEEE Press, Piscataway, NJ.

Kotenko I.
and Chechulin A., (2014). Fast Network Attack Modeling and Security Evaluation based on Attack Graphs, Journal of Cyber Security and Mobility Vol. 3 No. 1 pp 2746. Langer L., (2011). Stuxsnet: Dissecting a Cyber Warfare Weapon, Security and Privacy IEEE, Volume: 9 Issue: 3, pages 49-51, NJ, USA. Mell P., Scarfone K., and Romanosky S., (2007). CVSS - A Complete Guide to the Common Vulnerability Scoring System, Version 2.0, Retrieved on October 13, 2014 from http://www.first.org/cvss/cvss-guide. Nũez Y. F., (2008). Maximizing an organization's information security posture by distributedly assessing and remeding system vulnerabilities, 2008 IEEE, International Conference on Networking, Sensing and Control, China, April 6-8, 2008. Scarfone K., and Mell P., (2007). Guide to Intrusion Detection and Prevention Systems (IDPS), NIST, 2007. Symantec, (1997). Understanding Heuristics: Symantec's Bloodhound Technology, White paper XXXIV. Tom S., Christiansen D., Berrett D., (2008). Recommended Practice for Patch Management of Control Systems, DHS National Cyber Security Division Control Systems Security Program. Wang L., Noel S., and Jajodia S., (2006). Minimum-cost network hardening using attack graphs, Computer Communications 29, Issue 18, pp. 3812–3824. IDENTIFYING COMMON CHARACTERISTICS OF MALICIOUS INSIDERS Nan Liang Oklahoma State University Spears School of Business Stillwater, Ok 74075 [email protected] David Biros Oklahoma State University Spears School of Business Stillwater, Ok 74075 [email protected] ABSTRACT Malicious insiders account for large proportion of security breaches or other kinds of loss for organizations and have drawn attention of both academics and practitioners. Although methods and mechanism have been developed to monitor potential insider via electronic data monitoring, few studies focus on predicting potential malicious insiders. 
Based on the theory of planned behavior, certain cues should be observable when an individual acts as a malicious insider. Using text mining to analyze various media accounts of existing insider cases, we strive to develop a method to identify crucial and common indicators that an individual might be a malicious insider.

Keywords: malicious insider, insider threat, the theory of planned behavior, text mining

1. INTRODUCTION

In the field of information security, the subject of "insider threat" garners a lot of attention, yet has been deprived of sound empirical investigation. There is, however, considerable anecdotal mention of the insider threat issue. In a recent report, the FBI noted 10 examples of insider attacks reported in recent years, including the theft of trade secrets, corporate espionage, and the unauthorized disclosure of information (FBI report, 2010). These insider incidents resulted in financial losses in the millions of dollars. Some researchers believe that an insider attack, as opposed to an outsider attack, is easier to carry out because insiders are more familiar with the security structure of the organizations in which they work (Anderson, 1999; Chinchani, Iyer, Ngo, & Upadhyaya, 2005). Insiders of an organization either have legitimate access to organizational resources (Bishop, Engle, Peisert, Whalen, & Gates, 2009) or have knowledge about the operations of the organization (Probst, Hunker, Gollmann, & Bishop, 2010). With their knowledge and legitimate access, they can bypass security protocols and exploit the trust the organization has placed in them (Bellovin, 2008).

The Information Age

The information age has brought new consequences of insider threat. Insider attacks inflict multi-dimensional losses, including financial loss, disruption to the organization, loss of reputation, and long-term impacts on organizational culture (Hunker & Probst, 2011).
When compared to the consequences of outsider attacks, insider attacks yield incidents with higher impact (Chinchani et al., 2005), since insiders are familiar with the countermeasures of their organizations and know how to find their targets. The topics of insider and insider threat have received significant attention from both practitioners and academia in the information age. On one hand, insider threat is considered one of the most serious security concerns (Anderson, 1999), as noted in the results of the 2008 CSI Computer Crime and Security Survey, which listed "insider threat" second only to computer viruses as a significant security concern. On the other hand, insider threat has received a relatively low level of scientific investigation (Chinchani et al., 2005). One important reason for this lack of attention lies in the difficulties of dealing with insider threat; contributing factors include a lack of data for analysis and few useful methods for investigating the topic. As such, organizations employ technical controls, such as firewalls and limits on user access, in order to prevent possible insider breaches of security. Unfortunately, technical controls do little to isolate suspicious and malicious insider activities without unacceptable false-positive alarms. For example, access control based on authentication and authorization rests on the important assumption that insiders will always use legitimate privileges to perform harmful activities and thus be caught; once this assumption is violated, access control loses its power. Monitoring, another prevailing technique for dealing with insider threat, is based on the assumption that abnormal system usage indicates suspicious insiders. But monitoring is more of a post-hoc method to confirm already-suspicious insiders of interest (Hunker & Probst, 2011), which brings into question whether it can serve as a deterrent (Pfleeger, 2008).

Technical approaches to combating insider threat suffer from two major shortcomings: malicious insider intention can be unobservable (Hunker & Probst, 2011), and the behavioral patterns of insiders vary significantly. However, all insider attacks have one thing in common: they are performed by insiders with motivation. In a 2005 study of insider incidents in the banking and finance sector, Randazzo et al. (2005) found that among 23 insider incidents from 1996 to 2003, 81% of incidents involved perpetrators motivated by financial gain; beyond that, 23% involved revenge, 15% dissatisfaction, and 15% a desire for respect. Other research suggests that anger, resentment or feelings of revenge can be root causes of insider attacks (De Cremer, 2006). The extant research also tries to identify psychological indicators of malicious insiders' motivation. Greitzer and Frincke (2010) developed 12 indicators of suspected malicious insiders, the top three of which are disgruntlement, difficulty accepting feedback, and anger-management issues; they also relayed that these indicators are fairly good predictors. However, these indicators are all factors that might be observed in the workplace, and the assumption behind them is that a potential or ongoing malicious insider will reveal them at work. This may not always be the case, as disciplined insiders may stay "under the radar" and not exhibit such indicators. Further, these indicators have yet to be empirically validated. The current state of insider threat research is oriented more toward preventing possible perpetrators and less concerned with their identification and capture. This study aims to advance the existing research on identifying malicious insiders by employing information technology to validate insider threat indicators with empirical evidence.
The rest of this paper is arranged as follows: in the next section, we review extant research on insider threat and introduce a research model to guide our investigation of potential indicators. Following that, we discuss our data collection plan and our methodology for analyzing those data. Finally, we conclude by discussing some challenges and limitations of our forthcoming study.

2. LITERATURE REVIEW

In this section, we first review definitions of both insider and insider threat, and then discuss the theory of planned behavior as the theoretical basis of the current research. Last but not least, indicators of malicious insiders in extant research are reviewed and organized into our framework.

2.1 Terminology

Insiders

One of the challenges of insider threat research is the lack of a widely accepted definition of insider. The term insider can be defined along several dimensions (Hunker & Probst, 2011):

Access to the system: an insider is defined as a legitimate user (Chinchani et al., 2005) who is, or previously has been, authorized to access an information system. Other definitions extend the meaning of access to include physical access, so that an insider is defined as anyone having logical or physical access (Randazzo, Keeney, Kowalski, Cappelli, & Moore, 2005).

Action-based definition: the "access to the system" definition describes who insiders are, whereas an action-based definition describes what insiders do. Bishop and Gates (2008) define an insider as someone who "violates security policy".

Intention-based definition: Hayden (1999) defines four categories of insider: traitor, zealot, browser and well-intentioned insider. The zealot strongly believes that corrections should be made inside the organization; the browser category comprises individuals who are overly curious by nature; the traitor category includes those who have a malicious intent to "destroy, damage, or sell out their organizations".
Moreover, in a more general sense, some research removes the information system context (Bishop, Gollmann, Hunker, & Probst, 2008), and some combines several dimensions, such as Wood's definition, which classifies insiders into different categories based on their system roles, intentions and system consequences (Wood, 2000). As stated by Hunker and Probst (2011), the definition of insider depends highly on the research questions and situations of interest. In this research we focus on all kinds of insiders, not confined to the information technology context, and on all malicious actions performed by these insiders. We therefore use the definition of Bishop and Gollmann (2008): an insider is a person that has been legitimately empowered with the right to access, represent, or decide about one or more assets of the organization's structure. However, as noted above, various kinds of insiders exist, and the subject of this research is the malicious insider, whose profile is consistent with the description of Hunker and Probst: an individual deeply embedded in an organization, highly trusted, and in a position to do great damage if so inclined.

Insider Threat

The definition of insider threat depends on how insider is defined. Intuitively, insider threat is a threat posed by insiders. However, this definition is problematic, since we could not have a clear understanding of the "threat" or evaluate the "risk" of this threat even if "insider" were well defined. As argued by Hunker and Probst (2011), each factor used to determine who is an insider can also be used to build a taxonomy. Although the majority of extant research defines insider threat as a certain type of action, no widely accepted taxonomy of insider threat exists. Hunker and Probst (2011) define insider threats as potential misuses and actions that result in misuse. Chinchani and colleagues (2005) define insider threat as the abuse of privileges with damage or loss as a consequence.
Elsewhere in Chinchani's research, insider threat is also defined as a "violation of policies". Randazzo, specifically, defines insider threat as actions affecting the security of the organization's data, systems or operations (Randazzo et al., 2005). Other research classifies insider threats into different categories by different factors and from different perspectives. Based on intention, insider threats are classified into malicious or inadvertent actions (Brackney & Anderson, 2004). Combined with a technical expertise dimension, actions are categorized into intentional destruction, detrimental misuse, dangerous tinkering, naïve mistakes, aware assurance and basic hygiene (Stanton, Stam, Mastrangelo, & Jolton, 2005).

As noted in extant research, there exist various kinds of insiders whose characteristics differ inherently along multiple dimensions. Consequently, definitions of insider threat depend heavily on the context of the study as well as the research questions. Some research embraces this idea and suggests defining insider threat in a loose and general way to avoid "fine nuances" (Flegel, Vayssiere, & Bitz, 2010), while other research defines insider threat through a contextual taxonomy based on characteristics of the individual, the organization, the system and the environment (Predd, Pfleeger, Hunker, & Bulford, 2008). In this research, we adopt the broad definition of Predd et al. (2008):

Insider threat: an insider's action that puts an organization or its resources at risk.

Predd et al. (2008) also extend this definition by specifying it contextually, as shown below (Figure 1). The diagram states that instead of defining insider threat as a universal term covering various types of activities, it is better to include its context, namely the organization, the system, the environment and the individual, as part of the definition.
However, they do not differentiate non-malicious or careless insiders from malicious insiders.

Figure 1: A Framework of Insider Threat

This definition is adopted for two reasons. First, Predd's definition is consistent with our adopted taxonomy of insider. The research from which we adopt our definition of insider (Predd et al., 2008) defines insider threat as "an individual with privileges who misuses them or whose access results in misuse." This definition is consistent with respect to system usage as well as organizational consequences. What is more, it broadens Hunker's definition by adding context and offering a top-down method for mapping different scenarios. Second, Predd's definition is consistent with our research question. The intended contribution of this study is to empirically identify common characteristics of malicious insiders; our study focuses on identifying indicators that might help identify malicious insiders before they exploit their privileged access.

2.2 Criminology Theory and the Theory of Planned Behavior

As stated by previous research, certain theories in criminology are relevant to insider threat detection and prevention (Hunker & Probst, 2011), such as the earlier theories of deterrence (Kankanhalli et al., 2003; Straub & Welke, 1998), social bonds (Lee et al., 2004), and social learning (Hollinger, 1993; Parker & Parker, 1976; Skinner & Fream, 1997), which are integrated into the theory of planned behavior (Dugo, 2007; Lee & Lee, 2002; Peace et al., 2003). The Theory of Planned Behavior (TPB) was developed to explain and predict specific behaviors in a specific context (Ajzen, 1991). According to the theory, human behavior is guided by three kinds of considerations: behavioral beliefs, normative beliefs, and control beliefs (Ajzen, 1985). Behavioral beliefs are an individual's expectations of outcomes and his or her evaluations of those outcomes; normative beliefs represent others' expectations and the individual's willingness to comply with those expectations; and control beliefs refer to external factors that facilitate the individual's intended action. In the framework of planned behavior, an individual's behavior is thus the result of motivation (behavioral beliefs), environment (normative beliefs) and opportunities (control beliefs). The relevance of TPB is confirmed both by survey evidence, in which, among 23 insider incidents, 81% of the insiders involved planned their actions (Randazzo et al., 2005), and by a theoretical model that includes risk-averse nature and planned action as factors affecting an insider's action (Wood, 2000). Further, Predd et al. (2008) argue that insider threat should be defined specific to its context, which makes it suitable to apply criminology to the study of insider threat.

2.3 Factors Affecting Incidents of Insider Threat

Based on Predd's (2008) work, an insider's actions are shaped by four factors: the organization, the individual, the environment and the system. The organization sets up security rules and policy, and affects the insider's actions via organizational culture. The system reflects implemented policy; the environment shapes and constrains both organizational behavior and the insider's behavior through social and ethical norms; and, last but not least, the individual's motivation directly affects how he or she plans and mounts insider attacks. In this section, we start with this framework and review related research about organizational factors (including the system), the insider's motivation and environmental effects.

2.3.1 Organizational Factors

According to Predd, organizational factors affecting insider threat include organizational security policy and organizational culture. Organizational security policy includes not only articulated policy but also implemented policy (the system).
In addition, organizational culture affects insider threat by leveraging employees' awareness of and compliance with security policy, as well as through management style.

(1) Policy

Three aspects of policy influence its effect: the capability of the policy language, the stated policy and the implemented policy. First, as stated by Hunker and Probst (2011), the capability of policy language is not adequate to effectively prevent insider threat; this is an inherent shortcoming of policy languages, originating from the complex and dynamic situations that policies face. They suggest the deployment of domain-specific policies that clarify situations in which execution should be authorized only when discretionary circumstances justify it. Second, the policies in place are not all of the same kind: as argued by Hunker, a four-level hierarchy exists, comprising oracle policy, feasible policy, configured policy and real-time policy. Unawareness or misunderstanding of this policy hierarchy can result in policy absence or policy conflict. What is more, policies are not always explicit but sometimes implicit, and a gap exists between stated policy and observed policy (Puhakainen, 2010), which can be mitigated by security training programs (Vance, 2012) and increased participation of top managers (Hu, 2012).

(2) Organizational Culture

Specifically, organizational culture affects incidents of insider threat in the following ways. First, whether security policy supports or interferes with organizational workflow will affect compliance with that policy. Second, the level of security awareness among organization members will affect insider strategy (Hunker & Probst, 2011); levels of security awareness include perception, understanding and prediction (Shaw, Post, & Ruby, 1999). Third, organizational purpose and management structure will affect security structure and policy.
2.3.2 Motivation of Insiders

We note that extant research focusing on the motivations of insiders and insider threats does not differentiate the terms "psychological profile" and "motivation": the former focuses more on personal or internal motivations, and the latter focuses more on the goals of insiders' actions. Wood (2000) lists four major goals of malicious insiders: profit; provoking change, such as a change in policy; subverting the mission of the organization; and personal goals, such as being respected or gaining power. Considering the psychological profile, Stolfo and colleagues (2008) list ten types of motivation that might be most harmful: (1) making unintentional mistakes; (2) trying to accomplish needed tasks; (3) trying to make the system do something for which it was not designed, as a form of innovation to make the system more useful or usable; (4) trying innocently to do something beyond the authorized limit, without knowing the action is unauthorized; (5) checking the system for weaknesses, vulnerabilities or errors, with the intention of reporting problems; (6) testing the limits of authorization, i.e., checking the system for weaknesses, vulnerabilities or errors without the intention of reporting problems; (7) browsing, i.e., killing time by viewing data; (8) expressing boredom, revenge or disgruntlement; (9) perceiving a challenge, treating the system as a game to outwit; (10) acting with the intention of causing harm, for reasons such as fame, greed, capability, divided loyalty or delusion. Additionally, Greitzer and Frincke (2010) identified several psychological indicators, such as disgruntlement, anger-management issues and disregard for authority, and in a case study about sabotage and espionage, common characteristics such as antisocial and narcissistic personalities were identified (Moore et al., 2008). The indicators mentioned above are just a small piece of the big picture.
Harmful actions performed by insiders include espionage and sabotage (Gelles, 2005; Krofcheck & Gelles, 2005), but also accidental mistakes (Predd et al., 2008) or innocent errors (Salem, Hershkop, & Stolfo, 2008). The motivations for these actions are just as diverse as the types of actions (Salem et al., 2008).

2.3.3 Environmental Factors

Predd argues that the environment defines whether an action is legal or ethical, and emphasizes punishment enforced by law (Nance & Marty, 2011). What is more, cultural differences and attitudes toward what is appropriate will also affect the bounds insiders observe, as well as definitions of what is malicious. For example, Edward Snowden appears to believe he was "doing the right thing" when he exposed NSA information. As mentioned before, the complexity and dynamics of the external environment affect policy making as well as policy implementation (Hunker & Probst, 2011) and, as a result, affect incidents of insider attacks.

2.3.4 System

The system is the implemented policy (Predd et al., 2008), and its techniques technically support the realization of security policy. Current techniques to mitigate insider threat include access control, monitoring, integrated approaches, trusted systems and predictive models.

2.3.4.1 Access Control

Access control has two aspects: authentication and authorization. Authentication establishes who you are, and authorization defines what you may do. However, access control has limitations; for example, it cannot prevent users from exploiting legitimate privileges to act as malicious insiders.

2.3.4.2 Monitoring

The literature describes two types of monitoring, along with several techniques to perform it: misuse detection and anomaly detection. Misuse detection and modeling identifies defined types of misuse through rule-based detection; its limitation is that it can only detect known types of insider attack. Frameworks used to perform it include finite state machines, Petri nets and regular expressions.
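Rule-based misuse detection of the kind just described can be illustrated with a short sketch. The rules and log format below are invented for illustration only; they show why such detection is limited to known attack patterns, since only behavior matching a predefined rule is flagged.

```python
# Illustrative (not from the paper) rule-based misuse detection: each rule
# is a regular expression over log lines, so only known patterns are caught.
import re

RULES = {
    "bulk_export": re.compile(r"EXPORT\s+\d{4,}\s+records", re.IGNORECASE),
    "off_hours_login": re.compile(r"LOGIN .* at 0[0-4]:\d{2}"),
}

def detect_misuse(log_lines):
    """Return (rule_name, line) pairs for every line matching a known rule."""
    hits = []
    for line in log_lines:
        for name, pattern in RULES.items():
            if pattern.search(line):
                hits.append((name, line))
    return hits

log = ["LOGIN alice at 03:12", "EXPORT 50000 records by alice"]
alerts = detect_misuse(log)
```

A novel attack that matches no rule would pass silently, which is the gap anomaly detection tries to fill.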
Anomaly detection flags significant deviations from expected normal behavior as a proxy for unknown misuse. Methods and theories used here include the co-occurrence of multiple events, high-order Markov chains and naïve Bayesian networks. Problems with monitoring include the lack of evidence that it deters and its potential violation of privacy.

2.3.4.3 Integrated Approaches

Integrated approaches combine several techniques, including honeypots, network-level sensors, physical logs, and models of insiders and pre-attack insiders, to infer malicious intent.

2.3.4.4 Trusted Systems

The key characteristic of a trusted system is reference validation (Neumann, 2010): each execution can be traced back to a specific user. This characteristic makes a trusted system as resistant to insiders as to outsiders. In operation, a trusted system is implemented by isolating higher-privilege execution domains from lower-privilege domains, isolating one user's access from another user's (Saltzer, 1974), or assigning users specific random domains (Neumann, 2010).

2.3.4.5 Predictive Models

Predictive models use system usage as a predictor of insider attacks, for example inconsistent digital behaviors (Dimkov, 2011) and unusual access (Probst, 2009).

3. INTRODUCTION OF RESEARCH MODEL

In this section, we first build our model based on the theory of planned behavior and other related theories and research. Then the constructs and their corresponding measures are discussed.

3.1 Model Derivation

There has been a considerable amount of work on individual motivations in the literature. Researchers from various disciplines propose the constructs depicted below as indicators of insider threat; as noted, our motivation is to validate those constructs. Figure 2 shows our model derived from the Theory of Planned Behavior, in which an insider threat incident is preceded by three constructs: attitude, subjective norm and perceived behavioral control.
Figure 2: Research Model

3.1.1 Attitude

Attitude refers to "the degree to which a person has a favorable or unfavorable evaluation or appraisal of the behavior in question" (Ajzen, 1991). Internal factors, as well as the insider's perception of external factors, affect the insider's attitude toward what he or she is doing or plans to do. From the previous literature, factors affecting an insider's attitude include both internal and external factors:

3.1.1.1 Internal Factors

(1) Self-image (Loch & Conger, 1996; Randall, 1989)
(2) Deindividuation (Lee & Lee, 2002; Loch & Conger, 1996)
(3) Commitment to organization (Dugo, 2007; Lee & Lee, 2002; Li et al., 2010)
(4) Beliefs (Loch & Conger, 1996; Vance et al., 2012)
(5) Psychological indicators (Greitzer & Frincke, 2010; Moore et al., 2008)

3.1.1.2 External Factors or the Insider's Perception of External Factors

(1) Perceived punishment severity and perceived punishment certainty (Cox, 2012; Dugo, 2007; Ifinedo, 2012; Li et al., 2010; Peace et al., 2003; Peach et al., 2010; Son, 2011; Vance et al., 2012)
(2) Security culture (Hu et al., 2012)
(3) Organizational culture (Cox, 2012; Hu et al., 2012)

3.1.2 Subjective Norm

Subjective norm is a social factor and refers to the "perceived social pressure to perform or not to perform the behavior" (Ajzen, 1991). In the context of insider threat research, subjective norm has been specified as how coworkers and senior workers feel about insiders' actions (Lee & Lee, 2002). In this research, however, we extend Lee's scope of subjective norm to include influences from family or any other source, not limited to influences exerted in the workplace.

3.1.3 Perceived Behavioral Control

Perceived behavioral control refers to people's perception of the ease or difficulty of performing the behavior of interest (Ajzen, 1991). Although in the theory of planned behavior perceived behavioral control is one predictor of the intention to behave, the actual level of behavioral control has also been used as a predictor (Bulgurcu, Cavusoglu, & Benbasat, 2010). In Bulgurcu's research, the number of security staff and the number of security systems in use are used as predictors of IT security policy compliance behavior. The predictive power of actual behavioral control is also confirmed by Ajzen (1991), one of the builders of the theory of planned behavior, who argues that perceived behavioral control serves as a proxy for actual behavioral control. Factors affecting perceived behavioral control in the existing research include:

(1) Punishment certainty (Peace et al., 2003)
(2) Security policy and security systems (Lee & Lee, 2002)
(3) Locus of control (Cox, 2012)

3.2 Definition of Constructs

3.2.1 Self-image

In previous research, the demographic characteristics of insiders include gender, age, education, socioeconomic status, religion, marital status, profession and position in the organization (Randall, 1989). Self-image is the characteristic by which an individual defines himself or herself (Loch & Conger, 1996). As argued by Loch and Conger, if an individual defines himself or herself by religion, he or she is most likely to comply with the rules of that religion. Therefore, the characteristic an individual uses for self-definition serves as one measure of his or her attitude.

3.2.2 Deindividuation

Deindividuation was first defined as a feeling of "being estranged or separated from others that can lead to behavior violating established norms of appropriateness" (Zimbardo, 1969).
Moreover, it is widely used in insider threat research as an antecedent of insider threat (Lee & Lee, 2002; Loch & Conger, 1996). People experiencing deindividuation have less interaction with others and are less likely to perform socially accepted behaviors.

3.2.3 Perceived Severity/Certainty of Punishment

If an individual's perceived punishment severity is high and his or her perceived probability of being discovered is high, he or she will perceive a high level of behavioral control.

3.2.4 Commitment to Organization/Beliefs

Commitment to organization means that "one is committed to conformity by not only what one has but also what one hopes to attain" (Hirschi, 2002), and beliefs refer to the strength of an individual's feeling about whether he or she should comply with organizational rules. The more an individual is committed to the organization, the less likely he or she is to pose a malicious threat to it (Dugo, 2007; Lee & Lee, 2002).

3.2.5 Organizational Culture

Organizational culture here refers to whether the organization is goal-oriented or rule-oriented. A goal-oriented organization expects insiders to comply by fulfilling organizational goals, whereas a rule-oriented organization requires insiders to comply with procedures and regulations (Cox, 2012; Hu et al., 2012).

3.2.6 Security Policy and Systems

Security policy and security systems refer to the official and implemented security policies in an organization. The quality of stated security policy (whether it covers areas of emerging risk) as well as of implemented policy (how many security systems are in use) will affect the organizational security level (Hu et al., 2012).

3.2.7 Psychological Indicators

Twelve psychological indicators are suggested by Greitzer and Frincke (2010), as shown in Table 1:

Table 1: Psychological Indicators

Disgruntlement: Employee observed to be dissatisfied in current position.
Accepting Feedback: The employee is observed to have a difficult time accepting criticism.
Anger Management Issues: The employee often allows anger to get pent up inside.
Disengagement: The employee keeps to self, is detached, withdrawn and tends not to interact with individuals or groups; avoids meetings.
Disregard for Authority: The employee disregards rules, authority or policies.
Performance: The employee has received a corrective action based on poor performance.
Stress: The employee appears to be under physical, mental or emotional strain or tension that he or she has difficulty handling.
Confrontational Behavior: Employee exhibits argumentative or aggressive behavior or is involved in bullying or intimidation.
Personal Issues: Employee has difficulty keeping personal issues separate from work.
Self-Centeredness: The employee disregards needs or wishes of others, concerned primarily with own interests and welfare.
Lack of Dependability: Employee is unable to keep commitments or promises; unworthy of trust.
Absenteeism: Employee has exhibited chronic unexplained absenteeism.

4. Methodology

As noted above, previous research often focuses on preventing insider incidents rather than actually identifying malicious insiders. Further, while some malicious insider characteristics have been proposed, many have not had to stand the scrutiny of empirical investigation. We aim to close these gaps by using text mining and classification to examine third-party data, namely past reports on captured malicious insiders, and empirically examine their characteristics. We then intend to use those empirically supported characteristics in an attempt to better predict and identify potential malicious insiders.
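The extraction step of this approach can be sketched as follows. This is a naive illustration, not the study's actual pipeline: the keyword lists are invented for demonstration, and real work would rely on trained text-mining and classification models rather than literal substring matching. The sketch scans a case report for cues of predefined indicators and counts the observations per case.

```python
# Hypothetical sketch of mapping case-report text onto predefined
# indicators; the cue keywords below are invented for illustration.
INDICATOR_CUES = {
    "Disgruntlement": ["dissatisfied", "passed over", "grievance"],
    "Disregard for authority": ["ignored policy", "violated rules"],
    "Absenteeism": ["unexplained absence", "absent"],
}

def extract_indicators(report_text):
    """Map one case report to {indicator: observation_count}."""
    text = report_text.lower()
    counts = {}
    for indicator, cues in INDICATOR_CUES.items():
        n = sum(text.count(cue) for cue in cues)
        if n:
            counts[indicator] = n
    return counts

report = ("The employee was dissatisfied after being passed over for "
          "promotion and repeatedly ignored policy on data handling.")
obs = extract_indicators(report)
```

Aggregating such per-case counts across many historical reports would yield the empirical frequencies needed to validate (or reject) each proposed indicator.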
4.1 Data Sample
Data used in this study are drawn mainly from two sources: public reports and previous research. We will begin by text-mining public reports for the names of insiders involved in discovered insider incidents. Once we identify a satisfactory number of cases, we will then text-mine for indicators of the characteristics posited by previous research.

4.2 Research Methods
The method we propose to use in this research is based on the process introduced by Greitzer and Frincke (2010). In this process, collected data is first refined into observations, and these observations are then clustered into different indicators. In this section, we first briefly introduce Greitzer and Frincke's information extraction model and then specify the process and method we will use in this research.

4.2.1 Introduction of the Method
In Greitzer and Frincke's (2010) approach, data in the form of text-based reports is collected. Data represents directly available information about the activities of individuals, such as timecard records, VPN login records, and so on. Once these data are collected, algorithms are employed to calculate observations. Fuller et al. (2009) demonstrated how decision trees, neural networks, and logistic regression can be used in similar law enforcement cases. Once the data is mined and classified, observations can be made. Observations are inferences from data that reflect a certain state. In the previous example, timecard records (data) and VPN login records could be used to calculate time at work (observation). From observations, indicators can be derived. Indicators are actions or events that are precursors of a certain behavior. In the previous example, unusually late work hours (indicator) could be derived from time at work (observation).

4.2.2 Extension of Previous Method
In this research, our interest takes a wider perspective, including but not limited to the psychological indicators of malicious insiders that Greitzer and Frincke (2010) examined.
Therefore, we extend the scope in terms of data, observations, and indicators, but stay within the framework. We intend to use direct descriptions of extant malicious insiders from public reports, national or local media, and previous research as the raw data in our current study. These unstructured data are processed into structured observations using information-extraction text mining. In this process, heuristic methods are employed: we mine and extract descriptions of malicious insiders and refine them into observations (a reflection of a certain characteristic or state of the insider), then the next piece of data is processed. If the observation extracted from the data already exists (i.e., has already been identified), then a new record of that observation is added to the others. However, if the item has not yet been observed, then a new observation is created and recorded. A major difference between our method and Greitzer's method is that indicators in our research are not refined and extracted from observations; instead, they are predefined by previous research. Therefore, observations are clustered into the indicators specified in Section 3 using clustering text-mining techniques. However, we note that having predefined indicators will not preclude us from identifying potential new observations, and we expect to find some. Modern text-mining and classification techniques are quite powerful and can yield results not found by human observation (Fuller et al., 2011).

Conclusion
The problem of malicious insider threat is of concern to practitioners and academics alike, yet the phenomenon has yet to be significantly examined beyond the domains of psychology and criminal science, where human observation is essentially the only means of data collection. We aim to employ a form of data mining, namely text mining along with observation classification, to expand and aggregate findings from multiple historical insider threat cases.
In doing so, we believe we can develop better indicators for identifying the characteristics of malicious insiders. However, our work has just begun. Our next step is to collect cases of malicious insider threat and all reports and articles covering each case. Then, we will employ the text-mining and classification techniques identified above to validate the indicators derived from previous research and possibly identify additional indicators. Finally, we hope to build a common set of validated indicators of malicious insiders. We hope this proposed method will garner discussion and endorsement from other researchers pursuing a solution to the insider threat problem.

REFERENCES
[1] Adkins, M., Twitchell, D. P., Burgoon, J. K., & Nunamaker Jr, J. F. (2004). Advances in automated deception detection in text-based computer-mediated communication. Paper presented at Defense and Security.
[2] Ajzen, I. (1985). From intentions to actions: A theory of planned behavior. Springer.
[3] Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50(2), 179-211.
[4] Anderson, R. H. (1999). Research and Development Initiatives Focused on Preventing, Detecting, and Responding to Insider Misuse of Critical Defense Information Systems. DTIC Document.
[5] Bellovin, S. M. (2008). The insider attack problem nature and scope. In Insider Attack and Cyber Security (pp. 1-4). Springer.
[6] Bishop, M., Engle, S., Peisert, S., Whalen, S., & Gates, C. (2009). Case studies of an insider framework. Paper presented at the 42nd Hawaii International Conference on System Sciences (HICSS'09).
[7] Bishop, M., & Gates, C. (2008). Defining the insider threat. Paper presented at the Proceedings of the 4th Annual Workshop on Cyber Security and Information Intelligence Research.
[8] Bishop, M., Gollmann, D., Hunker, J., & Probst, C. W. (2008).
Countering insider threats. Paper presented at the Dagstuhl Seminar.
Brackney, R. C., & Anderson, R. H. (2004). Understanding the Insider Threat: Proceedings of a March 2004 Workshop. DTIC Document.
[9] Bulgurcu, B., Cavusoglu, H., & Benbasat, I. (2010). Information security policy compliance: an empirical study of rationality-based beliefs and information security awareness. MIS Quarterly, 34(3), 523-548.
[10] Chinchani, R., Iyer, A., Ngo, H. Q., & Upadhyaya, S. (2005). Towards a theory of insider threat assessment. Paper presented at the International Conference on Dependable Systems and Networks (DSN 2005).
[11] Cox, J. (2012). Information systems user security: A structured model of the knowing-doing gap. Computers in Human Behavior, 28(5), 1849-1858.
[12] De Cremer, D. (2006). Unfair treatment and revenge taking: The roles of collective identification and feelings of disappointment. Group Dynamics: Theory, Research, and Practice, 10(3), 220.
[13] Dugo, T. (2007). The insider threat to organizational information security: a structural model and empirical test.
[14] Flegel, U., Vayssiere, J., & Bitz, G. (2010). A state of the art survey of fraud detection technology. In Insider Threats in Cyber Security (pp. 73-84). Springer.
[15] Fuller, C. M., Marett, K., & Twitchell, D. P. (2012). An examination of deception in virtual teams: Effects of deception on task performance, mutuality, and trust. IEEE Transactions on Professional Communication, 55(1), 20-35.
[16] Fuller, C. M., Biros, D. P., & Wilson, R. L. (2009). Decision support for determining veracity via linguistic based cues. Decision Support Systems, 46, 695-703.
[17] Fuller, C., Biros, D., & Delen, D. (2011). Data and text mining methods applied to the task of detecting deception in real world crime investigation records. Expert Systems with Applications, June 2011.
[18] Gelles, M. (2005). Exploring the mind of the spy. Employees' guide to security responsibilities: Treason, 101.
[19] Greitzer, F.
L., & Frincke, D. A. (2010). Combining traditional cyber security audit data with psychosocial data: towards predictive modeling for insider threat mitigation. In Insider Threats in Cyber Security (pp. 85-113). Springer.
Hayden, M. (1999). The insider threat to US government information systems. DTIC Document.
[20] Herath, T., & Rao, H. R. (2009). Protection motivation and deterrence: a framework for security policy compliance in organisations. European Journal of Information Systems, 18(2), 106-125.
[21] Hirschi, T. (2002). Causes of delinquency. Transaction Publishers.
Hollinger, R. C. (1993). Crime by computer: Correlates of software piracy and unauthorized account access. Security Journal, 4(1), 2-12.
[22] Hu, Q., Dinev, T., Hart, P., & Cooke, D. (2012). Managing employee compliance with information security policies: The critical role of top management and organizational culture. Decision Sciences, 43(4), 615-660.
[23] Hunker, J., & Probst, C. W. (2011). Insiders and insider threats: an overview of definitions and mitigation techniques. Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications, 2(1), 4-27.
[24] Kankanhalli, A., Teo, H.-H., Tan, B. C., & Wei, K.-K. (2003). An integrative study of information systems security effectiveness. International Journal of Information Management, 23(2), 139-154.
[25] Krofcheck, J., & Gelles, M. (2005). Behavioral consultation in personnel security: Training and reference manual for personnel security professionals. Yarrow Associates, Fairfax, Virginia.
[26] Lee, J., & Lee, Y. (2002). A holistic model of computer abuse within organizations. Information Management & Computer Security, 10(2), 57-63.
[27] Lee, S. M., Lee, S.-G., & Yoo, S. (2004). An integrative model of computer abuse based on social control and general deterrence theories. Information & Management, 41(6), 707-718.
[28] Li, H., Zhang, J., & Sarathy, R. (2010).
Understanding compliance with internet use policy from the perspective of rational choice theory. Decision Support Systems, 48(4), 635-645.
[29] Loch, K. D., & Conger, S. (1996). Evaluating ethical decision making and computer use. Communications of the ACM, 39(7), 74-83.
[30] Moore, A. P., Cappelli, D. M., & Trzeciak, R. F. (2008). The "big picture" of insider IT sabotage across US critical infrastructures. Springer.
[31] Nance, K., & Marty, R. (2011). Identifying and visualizing the malicious insider threat using bipartite graphs. Paper presented at the 44th Hawaii International Conference on System Sciences (HICSS).
[32] Parker, D. B. (1976). Crime by computer. Scribner, New York.
[33] Peace, A. G., Galletta, D. F., & Thong, J. Y. (2003). Software piracy in the workplace: A model and empirical test. Journal of Management Information Systems, 20(1), 153-178.
[34] Pfleeger, C. P. (2008). Reflections on the insider threat. In Insider Attack and Cyber Security (pp. 5-16). Springer.
[35] Predd, J., Pfleeger, S. L., Hunker, J., & Bulford, C. (2008). Insiders behaving badly. IEEE Security & Privacy, 6(4), 66-70.
[36] Probst, C. W., Hunker, J., Gollmann, D., & Bishop, M. (2010). Aspects of insider threats. In Insider Threats in Cyber Security (pp. 1-15). Springer.
[37] Randall, D. M. (1989). Taking stock: Can the theory of reasoned action explain unethical conduct? Journal of Business Ethics, 8(11), 873-882.
[38] Randazzo, M. R., Keeney, M., Kowalski, E., Cappelli, D., & Moore, A. (2005). Insider threat study: Illicit cyber activity in the banking and finance sector. DTIC Document.
Salem, M. B., Hershkop, S., & Stolfo, S. J. (2008). A survey of insider attack detection research. In Insider Attack and Cyber Security (pp. 69-90). Springer.
[39] Shaw, E. D., Post, J. M., & Ruby, K. G. (1999). Inside the mind of the insider. Security Management, 43(12), 34.
[40] Skinner, W. F., & Fream, A. M. (1997). A social learning theory analysis of computer crime among college students.
Journal of Research in Crime and Delinquency, 34(4), 495-518.
[41] Stanton, J. M., Stam, K. R., Mastrangelo, P., & Jolton, J. (2005). Analysis of end user security behaviors. Computers & Security, 24(2), 124-133.
[42] Stolfo, S. J., Bellovin, S. M., Hershkop, S., Keromytis, A. D., Sinclair, S., & Smith, S. (2008). Insider attack and cyber security: beyond the hacker (Vol. 39). Springer.
[43] Straub, D. W., & Welke, R. J. (1998). Coping with systems risk: security planning models for management decision making. MIS Quarterly, 441-469.
[44] Vance, A., Siponen, M., & Pahnila, S. (2012). Motivating IS security compliance: insights from habit and protection motivation theory. Information & Management, 49(3), 190-198.
[45] Wood, B. (2000). An insider threat model for adversary simulation. SRI International, Research on Mitigating the Insider Threat to Information Systems, 2, 1-3.
[46] Zimbardo, P. G. (1969). The human choice: Individuation, reason, and order versus deindividuation, impulse, and chaos. Paper presented at the Nebraska Symposium on Motivation.
[47] Dimkov, T., Pieters, W., & Hartel, P. (2011). Portunes: representing attack scenarios spanning through the physical, digital and social domain. In Automated Reasoning for Security Protocol Analysis and Issues in the Theory of Security (pp. 112-129). Springer Berlin Heidelberg.
[48] Probst, C. W., & Hansen, R. R. (2009, May). Analysing access control specifications. In Systematic Approaches to Digital Forensic Engineering, 2009 (SADFE'09), Fourth International IEEE Workshop on (pp. 22-33). IEEE.

PHISHING INTELLIGENCE USING THE SIMPLE SET COMPARISON TOOL
Jason Britt, Dr. Alan Sprague, Gary Warner
University of Alabama at Birmingham
Computer and Information Sciences
Birmingham, AL 35233
[email protected]

ABSTRACT
Phishing websites, phish, attempt to deceive users into exposing their passwords, user IDs, and other sensitive information by imitating legitimate websites, such as banks, product vendors, and service providers.
Phishing investigators need fast automated tools to analyze the volume of phishing attacks seen today. In this paper, we present the Simple Set Comparison tool, a fast automated tool that groups phish by imitated brand, allowing phishing investigators to quickly identify and focus on phish targeting a particular brand. The Simple Set Comparison tool is evaluated against a traditional clustering algorithm over a month's worth of phishing data, 19,825 confirmed phish. The results show clusters of comparable quality, but created more than 37 times faster than by the traditional clustering algorithm.

Keywords: phishing, phish kits, phishing investigation, data mining, parallel processing

1. INTRODUCTION
Phishing websites, phish, attempt to deceive users into exposing their passwords, user IDs, and other sensitive information by imitating legitimate websites, such as banks, product vendors, and service providers. Phish range widely in quality, from simple HTML files to complex replicas indistinguishable from the actual website. Organizations such as the Anti-Phishing Working Group (APWG), founded in 2003, and PhishTank, founded in 2005, have been fighting phishing for years [1, 2]. Today, phishing is still a problem. A 2013 Kaspersky Lab report places phishing attacks as one of the three most prevalent external threats facing corporations [3]. Between April and June of 2014, APWG reported observing 128,378 new phishing attacks, the second highest phishing attack volume observed in a three-month period by APWG [4]. Phishing attack volumes are large and have increased over the years. Quickly identifying similar phish imitating a particular brand can be useful for corporate and law enforcement phishing investigators. An entity that can quickly identify itself as the target of a phishing attack can take timely and appropriate responses.
It can also tailor its response to the phishing attack that is targeting it. During an investigation, law enforcement gains the ability to focus on phishing attacks against a particular brand, and to quickly identify and aggregate all of the phishing attacks against that brand. The phishing attack volumes currently observed make manual phish analysis uneconomical; fast and scalable automated methods are needed to assist corporate and law enforcement phishing investigators. Clustering algorithms, which sort items into groups of similar items, can be used to generate groups or clusters of similar phishing websites. Clustering unbranded phish with branded phish can be used to apply a brand label to the unbranded phish: if a phish cluster contains a branded phish, the same brand can be applied to the unbranded phish in the cluster with some amount of confidence. Traditional clustering algorithms such as k-means, SLINK, and DBSCAN can be used, but are relatively slow when operating on large data sets [5]. This paper presents the Simple Set Comparison Tool, which clusters large phishing data sets faster than traditional clustering algorithms, quickly sorting phishing sites into groups of similar phish by brand. The Simple Set Comparison Tool uses a divide-and-conquer approach to quickly cluster large chronological data sets: the large chronological data set is subdivided, or partitioned, into many smaller data sets by date and time. Because the data set is partitioned, the most computationally expensive clustering work can be performed on these partitions in parallel by multiple machines. After the computationally expensive work has been performed, the smaller partitions are rejoined to form a clustering of the original large data set, taking less runtime than traditional methods. Another key feature of the Simple Set Comparison Tool is its adaptability.
It can make use of most methods for comparing phishing websites and can adapt to use most clustering algorithms with very few restrictions. The tool is evaluated using manually reviewed real-world phishing data consisting of 19,825 phish covering 245 brands, collected from September 1st 2014 to September 30th 2014. The real-world phish have been reviewed by a security company and assigned a brand label representing the brand each phish imitates. The tool is evaluated using the brand labels as ground truth and several common clustering evaluation metrics. The Simple Set Comparison Tool's clustering quality and runtime are compared to a traditional clustering algorithm's runtime and clustering quality over the same dataset. The results show the Simple Set Comparison Tool produces a similarly high-quality clustering compared to the traditional clustering algorithm, but runs more than 37 times faster. The Simple Set Comparison Tool makes the following contributions:
1. Aggregates phishing attacks against brands.
2. Parallelizes most clustering tasks, resulting in a dramatic runtime improvement over traditional clustering algorithms.
3. Has the adaptability to use a wide variety of phish similarity distance metrics and a wide variety of clustering algorithms.
The rest of the paper is laid out as follows: Section 2 discusses related work, Section 3 describes the data set used for evaluation, Section 4 covers the algorithms used in the Simple Set Comparison Tool, Section 5 presents and discusses the comparative results between the Simple Set Comparison Tool and a traditional clustering algorithm, Section 6 presents the conclusions drawn, and Section 7 covers future work.

2. RELATED WORK
Phishing researchers have mainly focused on distinguishing phish from non-phish websites, also known as binary classification. There is some research attempting to classify phish into more than two categories, such as by brand or phish author.
However, to the authors' knowledge there is no research attempting to classify phish into brand categories using a parallelizable approach with the adaptability to use most distance metrics and clustering algorithms. Phishing researchers have presented a number of classification methods for binary classification. These methods can be categorized into three general groups: approaches based on the emails advertising phish, on URLs, and on website content. Some email-based approaches classify the words in a spam email's body to determine the legitimacy of the email [6]. Other email-based approaches use features derived from the email message, such as the sender email, sender IP address, and non-matching URLs between the hyperlink and anchor tag [7]. These features are used to classify the email through machine learning algorithms [7, 8]. URL-based approaches have also been explored: Gyawali et al. [9] and Ma et al. [10] proposed solutions to phishing identification using features that can be derived from a URL. These researchers demonstrated that URL-based methodologies can identify phishing URLs with high accuracy; however, such techniques can be evaded, lowering detection rates, by shortening the phishing URLs or otherwise randomizing the URL. Content-based approaches use the content of the phishing website for detection. Dunlop et al. [11] present a method for determining the visual similarity between screenshots of phishing websites. Other researchers have used components within the source code [12, 13]. A number of researchers have also used combinations of all three categories for detection [14, 15, 16]. All of the binary classification techniques lack the ability to identify the brand targeted by the phish, and they are not scalable, as the techniques and algorithms used are not parallelizable. Also, most of the techniques lack adaptability and can be evaded by an attacker adapting the phish, as seen with the URL-based approaches.
There are several different techniques presented in research to classify phish into more than binary categories, or to cluster phish. Phish clustering has been an area of interest for researchers proactively trying to determine the criminals behind phishing attacks [12, 17, 18]. Criminals used to create domains on the same IP blocks, which Weaver and Collins leverage in a clustering algorithm using the IP address or hosting network to cluster phish [13]. Similar attack patterns against domains have been used to attribute phishing attacks to particular criminals [19]. IP and hosting network based solutions have been evaded by criminals adapting to use botnets or compromised webservers spread over different IPs and hosting networks. The Deep MD5 technique has been used to cluster phish into groups of related phish using local domain files [20, 21]. The Deep MD5 technique could be evaded by slightly changing phish files from phish to phish, although this behavior has not yet been seen in the wild [20, 21, 22]. The most recent method, called syntactical fingerprinting, uses structural components to cluster phishing websites by brand and by criminal [22]. Email addresses receiving phished credentials, found in the kits used to create the phish, have also been used to cluster phishing websites [23]. The non-binary classification techniques take unique approaches to classifying phish, but none can be performed in parallel. Also, all of the classification techniques lack adaptability, as each relies on a particular phish comparison metric or clustering algorithm. In short, all of the non-binary classification techniques presented lack scalability and adaptability. Existing research is mainly focused on binary classification; some research addresses non-binary phish classification, such as classifying phish by brand or criminal, but there is a lack of techniques that can perform phish classification using scalable methods such as parallelization.
Also, there is no adaptable and scalable phish classification technique that can freely use different phish comparison methods and different classification algorithms should criminals adapt their phishing attacks.

3. DATA SET
The phishing URLs are gathered from a large spam-based URL provider, a large anti-phishing company, and a number of other feeds, including private companies, security companies, and financial institutions. The source of the URLs is either URLs contained in spam advertising phish or URLs reported by the public to fraud alert email addresses. The data set favors financial institutions and under-represents gaming and social media phish when compared to other phishing collections. A number of methods are used in the industry to count phish. Some methods count distinct URLs; if there is any randomization in the host name, directory path, or arguments, this leads to over-counting. Cases where this occurs include wild-card DNS entries, per-user customized URLs, and virtual hosts allowing the same directory path on multiple domains to resolve to a single IP address. A conservative counting approach is used that attempts to de-duplicate URLs leading to the same phishing content. The phishing data consists of all files referenced by the potential phish. The website files are fetched using an automated web crawler that makes use of a Firefox mechanization tool [24]. After the files are downloaded, a hash value is generated for each file using the MD5 hashing algorithm. While MD5 is not a cryptographically secure algorithm, it is not being used for a cryptographic purpose, but rather to identify individual files. MD5 hash values can be changed by slightly altering a file's content each time it is deployed in a phish. However, phish authors would have to create a phish kit to automate the file changes needed to do this for every file every time a phish is deployed. So far this behavior has not been observed in the wild.
For this reason, the authors feel the MD5 hashing algorithm is acceptable to use for file identification in this instance. Screenshots and domain information are manually reviewed to determine whether each potential phish is a phish.

Brand | Count
Tech Company 1 | 3,815
Telecom Company 1 | 1,720
Tech Company 2 | 1,484
Financial Institution 2 | 1,435
Financial Institution 3 | 829
Tech Company 3 | 786
Tech Company 4 | 709
Financial Institution 4 | 657
Tech Company 5 | 589
Financial Institution 5 | 529
Figure 1. Ten Most Numerous Brands Phished

The data set consists of 19,825 confirmed phishing sites collected between September 1st 2014 and September 30th 2014. There are a total of 245 different brands. Figure 1 shows a listing, anonymized by sector, of the 10 most phished brands in the data set.

4. ALGORITHMS
The Simple Set Comparison Tool consists of four broadly defined steps:
1. Creating Time Windows
2. Clustering Time Windows
3. Comparing Time Windows
4. Merging Time Windows
In the first step, user-specified single time windows are created, and a cross time window is created for every combination of user-specified single time windows. In the second step, the single and cross time windows are clustered independently of one another; all single and cross time window clustering processes can be run in parallel. In the third step, single time windows are compared to overlapping cross time windows based upon shared cluster members, resulting in a cluster similarity graph. The time window comparisons can be run in parallel. In the fourth step, a clustering algorithm is run over the cluster similarity graph to merge similar clusters. The result is a clustering of the entire data set. The Simple Set Comparison Tool takes advantage of parallel processing in the second and third steps. To process a large data set in parallel, the data set must first be subdivided or partitioned. The phish data is tagged with a received time, which allows partitioning on a chronological basis.
The parallel processing in steps two and three is the key to reducing the runtime compared to traditional clustering algorithms.

4.1 Creating Time Windows
The data set is subdivided into four different time windows of approximately seven days each, consisting of the following date ranges: 09/01/2014 to 09/08/2014, 09/09/2014 to 09/15/2014, 09/16/2014 to 09/22/2014, and 09/23/2014 to 09/30/2014. Throughout the rest of the paper, these date ranges will be referred to by their window number as presented in Figure 2.

Begin Date | End Date | Window Number
9/1/2014 | 9/8/2014 | 1
9/9/2014 | 9/15/2014 | 2
9/16/2014 | 9/22/2014 | 3
9/23/2014 | 9/30/2014 | 4
Figure 2. Date Ranges and Their Window Numbers

In addition to the single date ranges (single time windows), multi-date ranges (cross time windows) are created. Cross time windows are created by merging the data from two single time windows; this is performed for every combination of single time windows. In this case, creating combinations of the four single time windows results in the following six cross time windows: 1:2, 1:3, 1:4, 2:3, 2:4, and 3:4. Throughout the rest of the paper, cross time windows will be referred to by their cross time window number; for example, the cross time window that crosses time windows one and two is 1:2.

Figure 3. Single and Cross Time Windows and the Number of Phish in Each (bar chart of phish counts, 0 to 12,000, for the four single and six cross windows)

The number of phish in each window is depicted in Figure 3. The four single time windows consist of approximately 4,000 to 5,000 phish each. The six cross time windows contain between 9,000 and almost 11,000 phish. The cross time window data sets are about twice as large as the single time window data sets.
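The window construction described above can be sketched as follows. The date boundaries come from the paper; the (phish_id, received_date) record format is our illustrative assumption.

```python
from datetime import date
from itertools import combinations

# Single time window boundaries from the paper (inclusive).
SINGLE_WINDOWS = {
    1: (date(2014, 9, 1), date(2014, 9, 8)),
    2: (date(2014, 9, 9), date(2014, 9, 15)),
    3: (date(2014, 9, 16), date(2014, 9, 22)),
    4: (date(2014, 9, 23), date(2014, 9, 30)),
}

def partition_phish(phish):
    """Partition (phish_id, received_date) records into single time windows."""
    windows = {n: [] for n in SINGLE_WINDOWS}
    for phish_id, received in phish:
        for n, (begin, end) in SINGLE_WINDOWS.items():
            if begin <= received <= end:
                windows[n].append(phish_id)
                break
    return windows

def cross_windows(windows):
    """Merge every pair of single windows into a cross time window (e.g. '1:2')."""
    return {f"{a}:{b}": windows[a] + windows[b]
            for a, b in combinations(sorted(windows), 2)}

phish = [("p1", date(2014, 9, 2)), ("p2", date(2014, 9, 10)), ("p3", date(2014, 9, 28))]
singles = partition_phish(phish)
crosses = cross_windows(singles)
print(sorted(crosses))  # ['1:2', '1:3', '1:4', '2:3', '2:4', '3:4']
```

Because each single and cross window is an independent data set, the subsequent clustering step can run on each one in parallel, which is the point of the partitioning.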
4.2 Clustering Time Windows
The single time windows and cross time windows are clustered by comparing phishing websites using the Deep MD5 method as a similarity score and a SLINK clustering algorithm to sort the phish into groups based upon their similarity scores [25]. Deep MD5 generates a score based upon file set similarity, using the count of candidate one's files (count1), the count of candidate two's files (count2), and the number of matching file MD5 values between candidate one and candidate two (overlap):

Kulczynski 2 Coefficient = 0.5 (overlap / count1) + 0.5 (overlap / count2)    (1)

The Kulczynski 2 coefficient, equation 1, is applied to count1, count2, and overlap to generate the Deep MD5 score, a value between 0.0 and 1.0. For example, two websites, X and Y, could be compared using Deep MD5. If website X consists of files {a,b,c,d,e} and website Y consists of files {a,b,f,g}, then the overlap between the two websites' file sets is two (overlap), website X's file count is five (count1), and website Y's file count is four (count2). The Deep MD5 score is then 0.5(2/5) + 0.5(2/4), or 0.45. After the Deep MD5 similarity scores are generated, the results are fed to a SLINK clustering algorithm. The SLINK clustering algorithm is a graph-theoretic clustering algorithm: the graph has phishing websites as vertices, and for each pair of vertices there exists a Deep MD5 similarity score. Edges whose similarity score meets or exceeds a threshold are kept, and edges not meeting the minimum threshold are discarded. An analysis of Deep MD5 scores between phish showed good matching between phish with the same brand for threshold values ranging from 0.5 to 0.75, with very little change across that range [26]. A value of 0.6 is chosen as a middle ground between the high and low end threshold values. After all edges have been pruned, the SLINK clustering algorithm turns connected components into clusters.
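A minimal sketch of this pipeline, assuming Deep MD5 scoring via the Kulczynski 2 coefficient, edge pruning at the 0.6 threshold, and connected components as clusters; the small file sets stand in for per-file MD5 hash sets, and the site names are illustrative:

```python
from itertools import combinations

def deep_md5_score(files1, files2):
    """Kulczynski 2 coefficient (equation 1) over two sets of per-file MD5 hashes."""
    if not files1 or not files2:
        return 0.0
    overlap = len(files1 & files2)
    return 0.5 * (overlap / len(files1)) + 0.5 * (overlap / len(files2))

def slink_clusters(sites, threshold=0.6):
    """Keep edges scoring >= threshold, then return connected components as clusters."""
    parent = {name: name for name in sites}  # union-find over site names

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sites, 2):
        if deep_md5_score(sites[a], sites[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in sites:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())

# The paper's example: X and Y share files a and b, scoring 0.45 (below 0.6).
sites = {"X": {"a", "b", "c", "d", "e"},
         "Y": {"a", "b", "f", "g"},
         "Z": {"a", "b", "c", "d"}}
print(round(deep_md5_score(sites["X"], sites["Y"]), 2))  # 0.45
```

With these sets, X and Z score 0.9 and fall into one cluster, while Y (scoring below 0.6 against both) remains a singleton.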
Figure 4 Similarity Score Generation for Single Time Windows A and B Each single time window is clustered by generating similarity scores for all phish from a single window and then applying a SLINK clustering algorithm. Figure 4 shows similarity scores being generated for two single time windows before the clustering algorithm is applied. Figure 5 Similarity Score Generation for Cross Time Window A:B The six cross time windows are clustered by generating a similarity score for all phish from one window compared to another window and applying a SLINK clustering algorithm. Phish from the same time window are not compared, only phish from different windows are compared. Figure 5 shows similarity scores being generated for a single cross time window before the clustering algorithm is applied. Clustering the four single time windows and six cross time windows can be performed independently of one another, allowing all clustering processes to be run in parallel instead of in sequence. 4.3 Comparing Time Windows The cross time window clusters are used to merge the individual clusters from different time windows. Clusters from single time windows are compared to clusters from overlapping cross time windows. The clusters are compared by counting the number of phish shared between the two clusters (overlap) divided by the total number of phish in the single time window cluster (count1) resulting in a score between 0.0 and 1.0. 𝑜𝑣𝑒𝑟𝑙𝑎𝑝 𝑐𝑜𝑢𝑛𝑡1 The cross time window’s size is not included as it will dilute the similarity score. Because the cross time window clusters can incorporate phish from two time windows they are generally much larger. Each single time window to cross time window comparison can be run independently of one another. Comparing time window clusters can be performed in parallel. Comparing time windows based upon shared cluster members results in a cluster similarity graph for all clusters from all time windows. 
4.4 Merging Clusters

A clustering algorithm is then run over the cluster similarity graph to merge similar clusters from different time windows. A SLINK clustering algorithm is used to determine the cluster merges. The SLINK clustering algorithm is chosen for its simplicity, as this is an initial investigation into the effectiveness of the Simple Set Comparison Tool. The clustering algorithm used in this step of the Simple Set Comparison Tool is interchangeable; the only requirement is that the clustering algorithm take an edge-based representation of a graph and produce non-overlapping clusters.

5. RESULTS

The Simple Set Comparison Tool is compared to a traditional clustering algorithm, a SLINK clustering algorithm, run over the same data. The traditional SLINK clustering algorithm uses a Deep MD5 similarity score with a threshold of 0.6, and the Simple Set Comparison Tool uses a SLINK clustering algorithm with a Deep MD5 threshold of 0.6 for its first step; matching these choices provides an apples-to-apples comparison with the Simple Set Comparison Tool's results. The Simple Set Comparison Tool and the traditional clustering both ran on the same hardware: a 64-bit Windows 7 Enterprise desktop with an Intel Core2 Quad CPU running at 2.83 GHz and 8 GB of RAM. Both methods use the same Java implementation of the SLINK clustering algorithm. All results are stored to the same PostgreSQL database on the local machine.

The first subsection compares the clustering quality produced by different runs of the Simple Set Comparison Tool with varying cluster merging thresholds, used in step four. A single merging threshold value is chosen after analysis of the clustering quality and is used to compare against the traditional clustering algorithm.
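Both the initial window clustering and this merge step reduce to the same operation: drop edges below the threshold and treat each connected component as a cluster. A minimal illustrative sketch of that operation (our own Python, not the tool's Java implementation):

```python
from collections import defaultdict

def threshold_clusters(nodes, scored_edges, threshold=0.6):
    """Single-link clusters at a fixed threshold: keep edges whose
    similarity meets the threshold, then return the connected
    components of the pruned graph as non-overlapping clusters."""
    graph = defaultdict(set)
    for a, b, score in scored_edges:
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)

    seen, clusters = set(), []
    for node in nodes:
        if node in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [node], set()
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            component.add(cur)
            stack.extend(graph[cur] - seen)
        clusters.append(component)
    return clusters

edges = [("x", "y", 0.9), ("y", "z", 0.7), ("z", "w", 0.3)]
clusters = threshold_clusters(["x", "y", "z", "w"], edges, threshold=0.6)
# Two clusters: {'x', 'y', 'z'} joined by strong edges, and the singleton {'w'}
```

The same routine serves step one (vertices are phish, edges are Deep MD5 scores) and step four (vertices are clusters, edges are cluster membership similarity scores).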
The second subsection compares the quality of the clustering produced by the Simple Set Comparison Tool and the traditional SLINK clustering using three different cluster quality measures. The third subsection presents an anecdotal comparison of the ten largest clusters generated by the Simple Set Comparison Tool and the traditional clustering algorithm. The fourth subsection computes the runtime of the Simple Set Comparison Tool and compares it to that of the traditional clustering algorithm. The fifth subsection discusses the algorithms used in the Simple Set Comparison Tool. The sixth subsection discusses issues that may cause runtime performance to decrease.

5.1 Comparing Cluster Merging Thresholds

The clustering quality is measured using three different entropy-based metrics: homogeneity, completeness, and V-measure [27]. All three measures evaluate the clustering results against a ground truth label assigned to every data point; the ground truth labels represent a perfect clustering of the dataset, and the three measures evaluate how close the produced clustering comes to that perfect clustering. The ground truth label used here is the phish brand. Homogeneity evaluates how well the clustering keeps members that belong together in the same cluster; a perfect homogeneity score is achieved when every cluster contains only members with the same label. Completeness evaluates how close the clustering comes to the correct number of clusters; a perfect completeness score is achieved when there is exactly one cluster for each label. V-measure, the harmonic mean of the homogeneity and completeness scores, blends the two. Also included is the number of clusters created. The Simple Set Comparison Tool is evaluated over eleven different thresholds ranging from 0.001 to 1.0.
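The three measures can be computed directly from their conditional-entropy definitions. A compact sketch, assuming the ground truth labels and cluster assignments are given as parallel lists (names and structure are ours):

```python
from collections import Counter
from math import log

def _cond_entropy(values, given):
    """H(values | given) over two parallel assignment lists."""
    n = len(values)
    joint = Counter(zip(values, given))
    given_sizes = Counter(given)
    return -sum(c / n * log(c / given_sizes[g]) for (_, g), c in joint.items())

def _entropy(values):
    n = len(values)
    return -sum(c / n * log(c / n) for c in Counter(values).values())

def v_measure(truth, predicted):
    """Homogeneity, completeness, and their harmonic mean (V-measure)."""
    h_c, h_k = _entropy(truth), _entropy(predicted)
    homogeneity = 1.0 if h_c == 0 else 1 - _cond_entropy(truth, predicted) / h_c
    completeness = 1.0 if h_k == 0 else 1 - _cond_entropy(predicted, truth) / h_k
    v = (0.0 if homogeneity + completeness == 0
         else 2 * homogeneity * completeness / (homogeneity + completeness))
    return homogeneity, completeness, v

h, c, v = v_measure([0, 0, 1, 1], ["a", "a", "b", "b"])
print(h, c, v)  # a perfect clustering scores 1.0 on all three measures
```

Putting every phish into one giant cluster scores perfect completeness but poor homogeneity, which is why V-measure is reported alongside the other two.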
These endpoints are chosen as the smallest and largest thresholds to evaluate because the smallest cluster similarity score found above 0.0 is approximately 0.0094 and the largest cluster similarity score is 1.0. The remaining thresholds range from 0.1 to 0.9 in 0.1 increments, giving broad coverage without an exhaustive search of all threshold values.

Threshold   Clusters   Homogeneity   Completeness   V-Measure
0.001       1,248      0.9871        0.6778         0.8037
0.1         1,281      0.9872        0.6761         0.8025
0.2         1,321      0.9872        0.6739         0.8010
0.3         1,324      0.9872        0.6729         0.8003
0.4         1,332      0.9872        0.6716         0.7994
0.5         1,332      0.9872        0.6716         0.7994
0.6         1,335      0.9873        0.6702         0.7984
0.7         1,368      0.9866        0.6683         0.7969
0.8         1,377      0.9851        0.6641         0.7934
0.9         1,386      0.9845        0.6606         0.7907
1.0         1,416      0.9850        0.6352         0.7723

Figure 6: Simple Set Comparison Tool Clustering Quality Measures

The cluster quality measures stay very consistent across the thresholds. As the threshold increases, the homogeneity score changes only in the third and fourth decimal places. Oddly, the homogeneity scores rise slightly until the 0.6 threshold and then fall slightly toward the 1.0 threshold; this may be due to an unusual breakdown of good similarity clusters at very high thresholds. The completeness score decreases slightly but consistently from the 0.001 threshold to the 1.0 threshold, and the V-measure score, which captures the tradeoff between homogeneity and completeness, decreases slightly over the same range. The degradation of completeness without a corresponding improvement in homogeneity begins at 0.3, and the relative degradation of the clustering is also reflected in the declining V-measure score from the 0.3 to 0.4 thresholds. The clustering result produced by the 0.3 merging threshold is therefore chosen as the best representative to compare to the traditional clustering algorithm on clustering quality.
5.2 Comparing Clustering Quality Results

The 0.6 threshold value used for traditional clustering is the same value used in the Simple Set Comparison Tool during the initial clustering of the time windows, so the 0.6 threshold should serve as a good benchmark. As noted in a Deep MD5 evaluation paper [26], there is little difference between the 0.5 and 0.75 Deep MD5 threshold values.

Figure 7: Traditional Clustering and Simple Set Comparison Tool Quality Measures (bar chart comparing homogeneity, completeness, and V-measure scores for SLINK and the Simple Set Comparison Tool, on a 0 to 1 scale)

Comparing the clustering quality measures shows almost no differences, as the homogeneity, completeness, and V-measure scores are all relatively similar. Traditional clustering produces a slightly better completeness score, but the differences between the quality of the clusterings produced are negligible.

5.3 Anecdotal Comparison

An anecdotal comparison between the ten largest clusters generated by traditional clustering and the Simple Set Comparison Tool is presented below.

Cluster Brand             Traditional Clustering Size   Simple Set Comparison Tool Size
Telecom Company 1         1,291                         1,291
Tech Company 1            915                           915
Financial Institution 3   794                           794
Financial Institution 2   770                           770
Tech Company 3            637                           637
Tech Company 2            567                           567
Tech Company 1            365                           365
Financial Institution 6   303                           303
Telecom Company 1         303                           303
Financial Institution 4   302                           302

Figure 8: Ten Largest Clusters Produced by Traditional and Simple Set Comparison Tool Clustering

Traditional clustering and the Simple Set Comparison Tool both produced the same ten largest clusters: both sets of ten are perfectly homogeneous, have exactly the same cluster sizes, and carry the same brand labels. The individual phish that make up the two sets of ten were not compared to determine whether each corresponding cluster contains exactly the same phish.
5.4 Runtime Comparison

The total runtime for the Simple Set Comparison Tool is computed by adding the runtimes of each step. No runtime is spent on the first step, as the chronological dividing points for each time window are chosen before the tool is run. The second step is run in parallel, ideally with each clustering process on a separate machine, so the runtime for step two is the longest runtime out of the group. The third step is likewise run in parallel, ideally with each comparison process on a separate machine, so its runtime is the longest comparison runtime out of all single to cross time window comparisons. The fourth step is not parallelized; its runtime is the time it takes to assemble a global clustering out of the cluster similarity graph generated in step three.

The parallel clustering processes run in step two have the largest runtimes of all the steps; their runtimes are presented in Figure 9.

Figure 9: Clustering Runtimes for Single and Cross Time Windows (bar chart of runtimes in milliseconds for single windows 1-4 and cross windows 1:2, 1:3, 1:4, 2:3, 2:4, and 3:4, on a scale of 0 to 900,000 milliseconds)

Clustering the single time windows takes between three minutes for the fastest and twelve minutes for the slowest. Clustering the cross time windows takes between almost seven minutes and almost fourteen minutes. The longest runtime in the group is cross window 1:2 at almost fourteen minutes, 836,108 milliseconds.

Step three, comparing single to cross time window clusters, took very little time. All twelve of the comparisons took only 640 milliseconds combined; the longest comparison took 93 milliseconds and the shortest took 31 milliseconds.
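The runtime bookkeeping reduces to a simple rule: each parallel step contributes only its slowest job, while the serial merge step adds its full runtime. A sketch using the figures reported in this evaluation:

```python
# Runtimes from the evaluation (milliseconds).
step2_slowest = 836_108   # cross window 1:2, slowest of the 10 clustering jobs
step3_slowest = 93        # slowest of the 12 cluster comparisons
step4_merge = 1_324       # serial merge over the cluster similarity graph

total = step2_slowest + step3_slowest + step4_merge
print(total)  # 837525 ms, almost fourteen minutes

# Traditional single SLINK run over the whole month, for comparison.
traditional = 31_044_322
print(round(traditional / total, 1))  # roughly a 37x speedup
```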
The longest runtime out of the twelve comparisons is 93 milliseconds. Step four, merging time windows, is not parallelized and has a single runtime of 1,324 milliseconds. Adding the longest runtime for step two (836,108 milliseconds), the longest runtime for step three (93 milliseconds), and step four's runtime (1,324 milliseconds) results in a total runtime of almost fourteen minutes (837,525 milliseconds). The biggest contributor to total runtime is step two, clustering the single and cross time windows, and in particular clustering cross time window 1:2. The traditional clustering algorithm took over eight and a half hours, 31,044,322 milliseconds, to complete. There is almost no difference between the quality of the clusterings produced by the Simple Set Comparison Tool and traditional clustering; the runtime is the biggest difference between the two. The Simple Set Comparison Tool is more than 37 times faster at producing results for the monthly dataset.

5.5 Interchangeable Algorithms

The Simple Set Comparison Tool has three interchangeable pieces: the distance metric used for comparing phish in the first step, the clustering algorithm used to cluster phish in the first step, and the clustering algorithm used when merging similar clusters in the third step. The Simple Set Comparison Tool can make use of a variety of phish similarity metrics; it only requires that the similarity measure be numeric with an upper and lower bound. It can likewise make use of a variety of clustering algorithms; the tool requires only that a clustering algorithm take an edge representation of a graph as input and produce non-overlapping clusters. These requirements allow the Simple Set Comparison Tool to use a variety of existing clustering algorithms.
The Deep MD5 metric is used as the similarity measure in the first step because it has been shown to be useful for clustering phishing websites [26] [21] [20]. The Deep MD5 metric is not foolproof, as it relies on file reuse by phish: small changes to a file change that file's MD5 value. If a phishing author were so inclined, all content files referenced by a phish could be slightly changed each time a particular phish was created. The result would be a Deep MD5 score of 0.0 between two phish created by the same author that target the same brand with the same functionality and appearance. However, this behavior has not been observed to be prevalent in the wild at this time. If it does occur at some future date, the similarity metric used by the Simple Set Comparison Tool is interchangeable, and another, more suitable phish similarity metric can be used in place of Deep MD5. The only requirement the Simple Set Comparison Tool places on a comparison metric is that it produce a single numerical value within a defined upper and lower bound. The Deep MD5 similarity metric serves here as an example similarity metric that is currently effective in this particular use case.

The SLINK clustering algorithm is used in step one for clustering time windows and in step three when merging clusters. The same clustering algorithm does not have to be used in both steps; indeed, there may be circumstances where using different clustering algorithms in steps one and three produces better results. In this particular case, however, using the SLINK clustering algorithm both to cluster the time windows in step one and to merge clusters in step three is effective, as it produces a clustering of similarly high quality to traditional clustering. The SLINK clustering algorithm is not the best or newest clustering algorithm, but it is a simple algorithm that has been shown to produce good results when applied to clustering phish [26] [21] [23] [27].
Like the similarity metric, the clustering algorithm used by the Simple Set Comparison Tool is interchangeable. The Simple Set Comparison Tool only requires that a clustering algorithm take an edge-based representation of a graph as input and produce non-overlapping clusters within a single data set.

5.6 Performance Discussion

The Simple Set Comparison Tool's results show a drastic runtime gain over the traditional clustering algorithm's runtime. However, issues can arise that would increase runtime. Steps two and three are performed in parallel and are scalable; step four is not run in parallel, and its runtime could become a problem under certain circumstances. The key driving factor for step four's runtime is the size of the similarity graph, that is, the number of clusters, produced by step three. There are two ways the number of clusters created could increase. The first is by increasing the number of time windows: for example, if an existing time window is subdivided into two time windows, the number of clusters produced would approximately double, assuming each of the clusters generated from the original time window would effectively be split in half. The second is if the clustering algorithm used in step two has a low completeness score, thus producing many more clusters than needed. The Simple Set Comparison Tool's runtime improvement over the traditional clustering algorithm is achieved on the month-long data set by using appropriately sized time windows containing many phish and by using a clustering algorithm in step two with a sufficient completeness score. While the window sizes used in this evaluation have not been optimized through an exhaustive search, the selected window sizes achieve a large performance gain and generate a clustering of quality equivalent to the traditional clustering algorithm.

6.
CONCLUSIONS

The clustering quality metrics show the Simple Set Comparison Tool's results are essentially equivalent to the traditional clustering output, and both are good, with cluster homogeneity scores above 0.98. The Simple Set Comparison Tool's runtime is drastically better than the traditional clustering runtime. The runtime improvement is due to the Simple Set Comparison Tool partitioning the dataset and performing the majority of its clustering in parallel. The Simple Set Comparison Tool works well with the Deep MD5 comparison metric and the SLINK clustering algorithm when clustering phish data, and it is adaptable enough to make use of many different comparison metrics and clustering algorithms. It can quickly create quality phish clusters that phishing investigators can use for phish identification or aggregation.

7. FUTURE WORK

Further evaluations need to be performed on different data sets to determine the general applicability of the Simple Set Comparison Tool. A future goal is to evaluate the tool's ability to deal with heterogeneous data; one example would be creating clusters consisting of phish, spam advertising phish, and kits used to create phish. Incorporating two more sources of data, especially spam, would significantly increase the amount of data to cluster. It would also require the use of multiple similarity metrics: Deep MD5 can only compare two phish and cannot be used to relate URLs found in spam to phishing websites, so other similarity metrics will have to be developed. The inclusion of multiple similarity metrics over heterogeneous data may necessitate more sophisticated clustering algorithms. Each similarity metric would represent a different type of relationship between data points, such as phish and spam email versus phish and phish kits, and the different types of relationships may have different value ranges and distributions over those ranges.
A more locally adaptable clustering algorithm may be required to generate adequate clusters over such heterogeneous data.

REFERENCES

[1] APWG, "About the APWG," APWG, 2014. [Online]. Available: https://apwg.org/about-APWG/. [Accessed 11 December 2014].
[2] PhishTank, "PhishTank FAQ," PhishTank, [Online]. Available: http://www.phishtank.com/faq.php#whatisphishtank. [Accessed 11 December 2014].
[3] Kaspersky Lab, "Kaspersky Lab," Kaspersky Lab, 8 November 2013. [Online]. Available: http://www.kaspersky.com/about/news/virus/2013/Malware_spam_and_phishing_the_threats_most_commonly_encountered_by_companies. [Accessed 11 December 2014].
[4] G. Aaron and R. Manning, "Phishing Activity Trends Report 2nd Quarter 2014," 29 August 2014. [Online]. Available: http://docs.apwg.org/reports/apwg_trends_report_q1_2014.pdf. [Accessed 11 December 2014].
[5] J. Han and M. Kamber, Data Mining: Concepts and Techniques, San Diego, CA: Academic Press, 2001, p. 337.
[6] A. Saberri, M. Vahidi and B. M. Bidgoli, "Learn to Detect Phishing Scams Using Learning and Ensemble Methods," in Web Intelligence and Intelligent Agent Technology Workshops, Silicon Valley, CA, 2007.
[7] S. Abu-Nimeh, D. Nappa, X. Wang and S. Nair, "A Comparison of Machine Learning Techniques for Phishing Detection," in eCrime Researchers Summit, Pittsburgh, PA, 2007.
[8] I. Fette, N. Sadeh and A. Tomasic, "Learning to Detect Phishing Emails," in Proceedings of the 16th International Conference on World Wide Web, Banff, Alberta, Canada, 2007.
[9] B. Gyawali, T. Solorio, M. Montes-y-Gomez, B. Wardman and G. Warner, "Evaluating a Semisupervised Approach to Phishing URL Identification in a Realistic Scenario," in Conference on Email and Anti-Spam, Perth, Australia, 2011.
[10] J. Ma, L. Saul, S. Savage and G. Voelker, "Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs," in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, 2009.
[11] M. Dunlop, S. Groat and D. Shelly, "GoldPhish: Using Images for Content-Based Phishing Analysis," in The Fifth International Conference on Internet Monitoring and Protection, 2010.
[12] R. Basnet, S. Mukkamala and A. H. Sung, "Detection of Phishing Attacks: A Machine Learning Approach," in Studies in Fuzziness and Soft Computing, 2008, pp. 373-383.
[13] R. Suriya, K. Saravanan and A. Thangavelu, "An Integrated Approach to Detect Phishing Mail Attacks: A Case Study," in Proceedings of the 2nd International Conference on Security of Information and Networks, North Cyprus, Turkey, 2009.
[14] C. Whittaker, B. Ryner and M. Nazif, "Large-Scale Automatic Classification of Phishing Pages," in Network and Distributed Systems Security Symposium, San Diego, CA, 2010.
[15] G. Xiang and J. Hong, "A Hybrid Phish Detection Approach by Identity Discovery and Keywords Retrieval," in Proceedings of the 18th International Conference on World Wide Web, Madrid, Spain, 2009.
[16] Y. Zhang, J. Hong and L. Cranor, "CANTINA: A Content-Based Approach to Detecting Phishing Web Sites," in International Conference on World Wide Web, Banff, Alberta, Canada, 2007.
[17] D. Irani, S. Webb, J. Griffon and C. Pu, "Evolutionary Study of Phishing," in eCrime Researchers Summit, Atlanta, GA, 2008.
[18] R. Weaver and M. Collins, "Fishing for Phishes: Applying Capture-Recapture Methods to Estimate Phishing Populations," in Proceedings of the Anti-Phishing Working Group's 2nd Annual eCrime Researchers Summit, Pittsburgh, PA, 2007.
[19] B. Wardman, G. Shukla and G. Warner, "Identifying Vulnerable Websites by Analysis of Common Strings in Phishing URLs," in eCrime Researchers Summit, Tacoma, WA, 2009.
[20] J. Britt, B. Wardman, A. Sprague and G. Warner, "Clustering Potential Phishing Websites Using DeepMD5," in Proceedings of the 5th USENIX Conference on Large-Scale Exploits and Emergent Threats, 2012.
[21] B. Wardman, G. Warner, H. McCalley, S. Turner and A. Skjellum, "Reeling in Big Phish with a Deep MD5 Net," Journal of Digital Forensics, Security and Law, vol. 5, no. 3, pp. 33-55, 2010.
[22] B. Wardman, J. Britt and G. Warner, "New Tackle to Catch a Phisher," International Journal of Electronic Security and Digital Forensics, vol. 6, no. 1, pp. 62-80, 2014.
[23] S. Zawoad, A. Dutta, A. Sprague, R. Hasan, J. Britt and G. Warner, "Phish-Net: Investigating Phish Clusters Using Drop Email Addresses," in APWG eCrime Researchers Summit, San Francisco, CA, 2013.
[24] M. Maischein, "WWW::Mechanize::Firefox," [Online]. Available: http://search.cpan.org/dist/WWW-Mechanize-FireFox/. [Accessed 1 January 2013].
[25] R. Sibson, "SLINK: An optimally efficient algorithm for the single-link cluster method," The Computer Journal, vol. 16, no. 1, pp. 30-34, 1973.
[26] B. Wardman, T. Stallings, G. Warner and A. Skjellum, "High-Performance Content-Based Phishing Attack Detection," in eCrime Researchers Summit, San Diego, CA, 2011.
[27] A. Rosenberg and J. Hirschberg, "V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure," in EMNLP-CoNLL, pp. 410-420, 2007.
[28] B. Wardman, J. Britt and G. Warner, "New Tackle to Catch a Phisher," International Journal of Electronic Security and Digital Forensics, vol. 6, no. 1, pp. 62-80, 2014.

TOWARDS A DIGITAL FORENSICS COMPETENCY-BASED PROGRAM: MAKING ASSESSMENT COUNT

Rose Shumba
University of Maryland University College
Department of Cybersecurity and Information Assurance
Largo, MD
[email protected]

ABSTRACT

This paper describes an approach that UMUC has initiated to revise its graduate programs to a Competency-Based Education (CBE) curriculum.
The approach, which is Learning Demonstration (LD) centric, includes the identification of learning goals and competences, identification and description of the LDs, mapping of the LDs to the competences, scripting the LDs, placing the LDs into the respective courses, validating the developed materials, and developing the open learning resources. Programs in the Cybersecurity and Information Assurance Department, including the Digital Forensics and Cyber Investigations program, are being revised. An LD centric approach to curriculum development helps align programs with the needs of employers and the standards of accreditation bodies. The rationale behind this paper is twofold: to support course development by providing a reusable competency inventory, an LD inventory, and open resources, and to provide assessment by defining the competences of an individual as a function of knowledge and skills. This is a work in progress.

Keywords: learning goal, digital forensics, competences, competency-based education, learning demonstration

1. INTRODUCTION

Competency-Based Education (CBE) has become a mainstream topic in higher education. However, CBE is not new. In the 1970s, institutions including Empire State College, DePaul University and Thomas Edison State College developed systems for competency identification and validation. The basic building blocks for CBE include the identification of competences and the associated competency-based assessment. Traditional models of learning focus on what academics believe graduates need to know. These models rely heavily upon the assumption that successful completion of a series of courses, assignments, quizzes, exams, tests and time in class results in achieving the learning outcomes (Klein-Collins, 2013). CBE is an outcomes-based approach to education where the emphasis is on what graduates know and what they can do.
The learning experience is dependent upon standardized and agreed upon definitions of skills, abilities, knowledge, competences and demonstrations. Students learn at their own pace and have to demonstrate mastery of the content and associated skills required for a particular course or context, regardless of how long it takes. Competences are identified; the content, readings and assignments are then selected to support student attainment of the competences. A competency may be defined as a combination of skills, knowledge and attitude that enables an individual to perform a task in a specific context (student and/or workplace focused) to a defined level of proficiency (Chacko, 2014). CBE measures learning rather than time spent (Mendenhall, 2012). As such, assessment of learning plays a critical role in many CBE programs and helps provide the needed support and mastery (Learning in the News, 2010, American Progress). In a CBE model, students know the content they are expected to study and the activities to perform for assessment (Colloquium, 2008).

A figure from the National Postsecondary Education Cooperative report (NPEC, 2002) describes a CBE model. The four main tiers of the model are traits and characteristics; skills, abilities and knowledge; competences; and demonstrations. Traits and characteristics are the foundation of learning and the natural makeup of individuals upon which further experiences may be built. Differences in traits and characteristics help explain why people pursue different learning experiences. Through the learning experience, which includes formal education, work and participation in community affairs, skills, abilities and knowledge are acquired. The unique combination of skills, abilities and knowledge that one has acquired defines the competences that an individual possesses, and those competences are combined in carrying out different demonstrations or tasks.
CBE can be very accommodating to adult learners who have some college but no degree, and who have prior learning. Competency methodology is not limited to the domain knowledge of a degree but also covers the critical analysis and decision-making capability of an individual. To meet the professional needs of working adult learners, whose responsibilities may include jobs, family and military service, the UMUC Graduate School is revising four graduate programs in the Cybersecurity and Information Assurance Department to CBE: Cybersecurity Technical (UMUC, 2014), Cybersecurity Policy (UMUC, 2014), Digital Forensics and Cyber Investigations (UMUC, 2014), and Information Technology with Information Assurance Specialization (UMUC, 2014). This paper focuses on the work done so far towards the revision of the Digital Forensics and Cyber Investigations (DF) program to a CBE curriculum. Section 2 gives an overview of different CBE models. Section 3 describes the UMUC Digital Forensics and Cyber Investigations program. Section 4 describes the LD centric process being used in the program revision. Section 5 presents the lessons learnt so far. This is a work in progress.

2. A BRIEF OVERVIEW OF THE CBE MODELS

Three CBE curriculum models are evolving: assessment-based, structured instruction, and integrated models (Wax, 2014). In an assessment-based program, the degree is based on the student demonstrating a predetermined set of competences. Students learn through a variety of modes, faculty serve as assessors, students complete assessments at their own pace, students may start at any time, and assessment is competency based. There may also be an option for a face-to-face capstone requirement. Western Governors University is an example of a university with an assessment-based program (West Governor, 2012).
The structured instruction model includes courses with modules developed around competences, a variety of learning modes, assessment embedded throughout the curriculum, faculty members acting as mentors and advisors, different faculty members acting as assessors, a pre-test that allows students to skip to the next module for subjects already mastered (Personalized Learning Assessment), and a post-test that validates learning. Examples of colleges using this model are Capella FlexPath, Kentucky Learn on Demand, Northern Arizona PL, and Texas Affordable Bachelor's.

The integrated model, currently followed by the DePaul University School for New Learning, includes multiple pathways to a credential through competency-based assessment, plus on-the-ground traditional courses and prior learning assessment. There is also conversion of competences to credit hours. UMUC currently has some of these elements in the undergraduate program: prior learning assessment and traditional courses.

The UMUC Graduate School is adopting the integrated model. It is not yet clear which elements will be in the model. We envision the role of faculty changing significantly: faculty will serve more as student guides and less as authoritarian figures setting the pace of learning and dictating grades (Krathwohl, 2002). The semester will be 11 weeks long as opposed to 12 weeks. Courses will be translated into credit hours. There will be no flexi-path; any student who finishes early must wait for the next semester. The logistics of how this will eventually work are not finalized. This is a work in progress.

3. THE UMUC DIGITAL FORENSICS PROGRAM

Digital forensic examiners are in demand to mitigate the growing vulnerabilities of the digital universe. Based on projections, the field cannot meet the demand for digital forensic professionals in the near future.
The profession is expected to grow at a double-digit rate with the increasing demand for cybersecurity from public and private entities (Withycombe, 2014) (Gannett, 2012). The Bureau of Labor Statistics estimates computer forensics jobs will grow more than 13% in the next several years. The National Security Agency is planning to hire 3,000 specialists to combat the thousands of cyber-attacks in the US, and the Department of Homeland Security is expected to hire about 1,000 more cybersecurity specialists (Gannett, 2012). Many colleges and universities are adding forensics courses to their curricula to meet the demand for forensic specialists (CriminalJusticeSchool, 2015), (EndicottPopovsky, Frinckle, 2006), (Nelson, Phillips, 2008). Given the vital need for qualified digital forensic professionals and the steady rise in the number of colleges and universities offering digital forensics courses, there is a great need to align programs with the needs of employers and the standards of the accreditation bodies. There is a need for programs that emphasize "doing," that is, programs that meet the higher-layer objectives of Bloom's taxonomy of cognitive learning (Krathwohl, 2002).

UMUC offers an online Digital Forensics and Cyber Investigations Graduate Masters and Graduate Certificate program (UMUC graduate, 2014). The Graduate Masters of Digital Forensics and Cyber Investigations program requires that students complete six 6-credit courses (36 credits total):

CSEC 610: Cyberspace and Cybersecurity
CSEC 620: Human Aspects in Cybersecurity: Ethics, Legal Issues and Psychology
CSEC 650: Cyber Crime Investigations and Digital Forensics
CSEC 661: Digital Forensics Investigation
CSEC 662: Cyber Incident Analysis and Response
CSEC 670: Capstone

The Graduate Certificate program has three courses: CSEC 650, CSEC 661 and CSEC 662. The majority of the students coming into the UMUC program are career changers; we therefore require that all our students take the CSEC 610 and CSEC 620 courses.
The CSEC 610 course is a study of the fundamentals of cyberspace and cybersecurity: cyber architecture, cyber services, protocols, algorithms, hardware components, software components, programming languages, various cybersecurity mechanisms, business continuity planning, security management practices, security architecture, operations security, physical security, cyber terrorism, and national security. CSEC 620 covers an examination of the human aspects of cybersecurity: ethics, relevant laws, regulations, policies, standards, psychology, and hacker culture. Emphasis is on the human element and the motivations for cyber-crimes. Analysis covers techniques to prevent intrusions and attacks that threaten organizational data. The CSEC 650 course covers the theory and practice of digital forensics. Topics include computer forensics, network forensics, mobile forensics, and other types of digital forensics. Discussion also covers identification, collection, acquisition, authentication, preservation, examination, analysis, and presentation of evidence for prosecution purposes. CSEC 661 covers the processes and technologies used in the collection, preservation, and analysis of digital evidence in local, networked, and cloud environments. CSEC 662 is an examination of policies and procedures related to security incidents, exposures, and risks, and the technologies used to respond to such threats. Topics include dynamic vulnerability analysis, intrusion detection, attack response, evidence protection, and business continuity. Discussion also covers types and modes of computer-facilitated attacks, readiness, and evidence scope, as well as the role of computer emergency response teams. All students within the program are required to take a Cybersecurity Capstone course, CSEC 670.
The CSEC 670 course is a study of and an exercise in developing, leading, and implementing effective enterprise and national-level cybersecurity programs. Focus is on establishing programs that combine technological, policy, training, auditing, personnel, and physical elements. Challenges within specific industries (such as health, banking, finance, and manufacturing) are explored (UMUC graduate, 2014).

4. LEARNING DEMONSTRATION CENTRIC PROCESS

A key component of the new CBE curriculum is the use of authentic assessments that students employ both to learn and to demonstrate learning. These are referred to as Learning Demonstrations (LDs) in the process. Authentic assessment is based on doing and not just knowing, which exercises the higher-order thinking skills of Bloom's Taxonomy of learning (Learning in the News, 2014): evaluation, synthesis, analysis, and application. Students advance based on their ability to master a skill or a competency. Large skill sets are broken down into competences, which have sequential levels of mastery. The LD-centric process involved the identification of the learning goals and competences, development of the LDs, scripting of the LDs, placement of the LDs into respective courses, validation of the developed materials, and the development of open resources to support the developed LDs. Once again, this is a work in progress.

4.1 Identification of the learning goals and competences

A learning goal is a very broad statement of what students should know or be able to accomplish. The purpose of crafting a set of learning goals was to provide a brief and broad picture of what the program expects its students to know and be able to do upon graduation (outcomes). The input sources for identification of the learning goals and competences included the National Cybersecurity Workforce Framework (NICE, 2013), the NSA Centers of Excellence in Information Assurance and
Cyber Defense Knowledge Units (NSA, 2013), the Air Force Office of Special Investigations Defense Cyber Crime Center CDFAE (CDFAE, 2012), DoD 8570 (DoD, 2010), and Subject Matter Experts. Five broad digital forensics specific learning goals were identified:

1. Learners interpret and utilize laws, policies, procedures, and governance in digital forensic and incident response situations.
2. Learners demonstrate the appropriate use of multiple digital forensic tools and technologies in a variety of criminal and security breach situations.
3. Learners design and implement strategies for proper seizure, evidence handling, investigation, and analysis of digital artifacts, including preparing reports and presenting findings.
4. Learners adapt proper professional, legal, and ethical frameworks to govern their forensic activities in local, national, and global environments.
5. Learners assess an information architecture for potential security threats and evidentiary value.

In addition to the above five learning goals, the UMUC Graduate School has four learning goals which all students in the Graduate School need to master:

1. Communication: learners demonstrate the ability to communicate clearly both orally and in writing.
2. Critical thinking: learners demonstrate the ability to apply logical, step-by-step decision-making processes to formulate clear, defensible ideas and to draw ethical conclusions.
3. Quantitative reasoning: learners demonstrate the ability to use mathematical operations and analytical concepts and operations to address problems and to inform decision-making.
4. Leadership, facilitation, and collaboration: learners lead, facilitate, and collaborate with a variety of individuals and diverse teams to achieve organizational objectives.

The Graduate School provided the competences for the four learning goals above to all the departments revising their programs.
Appendix A presents the identified competences for learning goal 2 for the Digital Forensics and Cyber Investigations program.

4.2 Identifying and mapping the LDs

Once the learning goals and competences were identified, LDs were identified and described. LDs are the real tasks that someone in a given context will need to be able to do. Identification of the LDs requires a thorough understanding of the work products that professionals in the industry are required to produce on a daily basis. Such work products vary greatly in complexity and scope and include memos, reports, studies, oral presentations, speeches, digital/multimedia presentations, audio recordings, print or online publications, social media presence, scientific experiments, graphical models, quantitative models, and many other types of deliverables in addition to research. Working with Subject Matter Experts (SMEs), 20 LDs for the Digital Forensics and Cyber Investigations program were identified and described. The LDs were arranged in a very specific order to enable students to build capacity as they progress through the program: learning something and then building on it as they move forward. The topics and content students need to master in order to complete a given LD were then selected. A subset of the identified LDs, with a brief description and the associated content and topics for each, is presented in Appendix B. The next step was the mapping of the program competences to the identified LDs. The goal was to ensure that, over the course of the program, students would learn and demonstrate mastery of every competency in the program a sufficient number of times. Appendix C presents a partial grid mapping the competences for goal 2 of the Digital Forensics and Cyber Investigations program to LDs 5 to 8. The vertical axis shows the competences.
The top horizontal axis has a partial list of the identified LDs (LDs 5 to 8). The yellow shading shows the mapping of the competences to the LDs. After the mapping was completed, the next step was to assess whether any competences were over-evaluated, that is, being evaluated too often. This was achieved by looking for gaps and overlaps in the yellow shading. Any unmapped competences were deleted. Any LDs with lower than reasonable mapping were quarantined for further consideration. The identified LDs were then scripted and placed into the courses. Scripting involved expanding what the student is being required to demonstrate: fully developing the base scenarios and details of each LD to add to their realistic nature and context. The Graduate School provided the elements of the scripting process:

o Describing the role the student is playing: is she/he acting or being acted upon?
o Explaining what he/she is being asked to do in that role
o Explaining the deliverable within the context of the story
o Determining the timeline, with start and end dates for the story/situation: is there a crucial moment in the sequence of events?
o Explaining the critical issues, events, or problems under consideration and their consequences
o Providing relevant data, in an appropriate format
o Suggesting where other types of relevant data might be found
o Concluding by integrating the learning demonstration into the template

After full scripting we will have a fully verified set of LDs, their mapping to competences, and a final set of program competences. The fully scripted LDs will then be placed into courses, giving us the core blocks of the program. With the revision of the program nearing completion, a focus group of three to five adjunct faculty will be held to review and validate the placement of the LDs into the courses.

5. LESSONS LEARNT

The process has been quite an engaging experience.
We have fully designed the core blocks of an academic program from the competence level, something very few Program Chairs in higher education get to do in their careers. The process has enabled us to assess the complexity and effort required in transitioning a traditional curriculum offering to a CBE one in digital forensics. The CBE project started in early March 2014. It is expected to be completed during the Spring of 2016; implementation will immediately follow. There has been a series of workshops, and a Canvas online class was created to guide the Program Directors through the process. The online class also provided for submission of completed work and discussion forums for Program Director interaction. Completed product documents for learning goals and competences, LD descriptions, and the mapping of competences to LDs were submitted through the Canvas class. Revisiting and revising programs will help make them more relevant to the market we serve: working adults. A CBE curriculum will provide students with a career-relevant experience and offer a practical approach to learning. An outcome from this project will be a "knowledge cloud" containing the collection of competences, LDs, and resources that may be shared or reused by related programs. This is a very time-consuming process which requires good management and leadership. To kick off the project, roles were defined for the Program Director, coach, and Subject Matter Expert. The project leadership includes the Dean and the Associate Dean. The role of the Program Director was to work with an SME to identify goals and competences, map LDs to competences, and develop the learning resources. The coaches worked with the Program Directors, meeting with them biweekly, reviewing work, and coordinating and dialoguing with departmental Chairs. SMEs were selected by the Program Chairs and assisted with content and design as specified in the contract.

6. CONCLUSION

This paper has presented the approach that UMUC is using to revise its graduate programs into a CBE curriculum. The approach, which is LD centric, involves the identification of the learning goals and related competences, development of the LDs, mapping of the LDs to the competences, scripting of the LDs, placing the LDs into courses, and validation of the developed work. We envision the use of open source resources for teaching the courses. The pilots of the project are planned for Spring 2016. It is anticipated that this project will produce a "knowledge cloud" containing a collection of competences, LDs, and open resource content that may be shared with related courses.

REFERENCES

1. Endicott-Popovsky, B., & Frincke, D. (2006). Embedding forensic capabilities into networks: addressing inefficiencies in digital forensics investigations. In: Proceedings of the IEEE Workshop on Information Assurance: Computer Forensics. West Point, NY: United States Military Academy, June 2006.
2. Nelson, B., Phillips, A., Enfinger, F., & Steuart, C. (2008). Guide to Computer Forensics and Investigations, 4th Edition. Course Technology.
3. UMUC (2014). Cybersecurity Technical Program. Retrieved from: http://www.umuc.edu/academicprograms/mastersdegrees/cybersecurity.cfm
4. UMUC (2014). Cybersecurity Policy Program. Retrieved from: http://www.umuc.edu/academicprograms/mastersdegrees/cybersecurity-policy.cfm
5. UMUC (2014). Digital Forensics and Cyber Investigation Program. Retrieved from: http://www.umuc.edu/academicprograms/masters-degrees/digitalforensics-and-cyberinvestigations.cfm
6. UMUC (2014). Information Technology with Information Assurance Specialization Program. Retrieved from: http://www.umuc.edu/academicprograms/mastersdegrees/information-technology-withinformation-assurancespecialization.cfm
7. Mendenhall, R. (2012). What Is Competency-Based Education? Retrieved from: http://www.huffingtonpost.com/drrobert-mendenhall/competency-basedlearning-b_1855374.html
8.
Western Governors University (2014). Why WGU: Competency-Based Learning. Retrieved from: http://www.wgu.edu/why_WGU/competency_based_approach
9. University of Wisconsin. University of Wisconsin Flexible Option FAQs. Retrieved from: http://www.wisconsin.edu/news/2012/11-2012/FAQ_FlexOption.pdf
10. Krathwohl, D. (2002). A Revision of Bloom's Taxonomy. Theory into Practice, 2002.
11. National Initiative for Cybersecurity Education (NICE). National Cybersecurity Workforce Framework. Retrieved from: http://csrc.nist.gov/nice/framework/national_cybersecurity_workforce_framework_03_2013_version1_0_for_printing.pdf
12. NSA (2012). Centers of Excellence in Information Assurance and Cyber Defense Education. Retrieved from: https://www.nsa.gov/ia/academic_outreach/nat_cae
13. CDFAE (2012). Special Investigations Defense Cyber Crime Center, Air Force. The National Centers of Digital Forensics Academic Excellence (CDFAE) Program. Retrieved from: http://www.dc3.mil/cybertraining/cdfae
14. DoD (2010). DoD 8570, Information Assurance Support Environment. Retrieved from: http://iase.disa.mil/Pages/index.aspx
15. Klein-Collins, R. (2013). National Institute for Learning Outcomes Assessment. Retrieved from: http://www.learningoutcomeassessment.org/documents/Occasional%20Paper%2020.pdf
16. Learning in the News (2014). Designing Competency-based Training with Bloom's Taxonomy. Retrieved from: http://mbmtraining.wordpress.com/2010/12/03/designing-competency-basedtraining-with-blooms-taxonomy/
17. The Colloquium (2008). Retrieved from: http://www.thecolloquium.com/Page5CoreModel.htm
18. Wax, D. (2014). When Assessment of Learning Counts: Competency-based Degree Programs in the USA. A presentation at UMUC.
19. National Postsecondary Education Cooperative (NPEC) (2002). Defining and Assessing Learning: Exploring Competency-Based Initiatives. Retrieved from: http://nces.ed.gov/pubs2002/2002159.pdf
20. Global Media Center (2014).
UMUC to Receive World Affairs Council's "Educator of the Year" Award. Retrieved from: http://www.umuc.edu/globalmedia/recognition-of-global-education.cfm
21. Withycombe, C. (2014). Deschutes' Digital Forensics Lab Stretched. The Bulletin. Retrieved from: http://www.bendbulletin.com/localstate/2731689-151/digital-detectives#
22. Gannett, A. (2012). Want CSI Without the Blood? Investigate Computer Forensics. USA Today. Retrieved from: http://usatoday30.usatoday.com/money/jobcenter/workplace/bruzzese/story/2012-01-31/profession-that-huntscybercriminals/52909566/1
23. Chacko, T. (2014). Moving toward competency-based education: Challenges and the way forward. Archives of Medicine and Health Science, Volume 2, Issue 2.

Appendix A: Goal 2 for the Digital Forensics Program and Associated Competences

Learners demonstrate the appropriate use of multiple digital forensic tools and technologies in a variety of criminal and security breach situations.

6.1 Use Forensic Tools and Techniques
6.1.1 Utilize tools such as EnCase, FTK, Open Source, Imaging
6.2 Evaluate sources of Forensic artifacts
6.2.1 Analyze computer components
6.2.2 Analyze electronic devices
6.2.3 Examine media
6.2.4 Examine memory (RAM)
6.2.5 Analyze mobile devices
6.2.6 Analyze network artifacts
6.3 Perform Malware Analysis
6.3.1 Detect unauthorized devices
6.3.2 Perform static malware analysis
6.3.3 Perform dynamic malware analysis
6.4 Investigate Mobile Technology
6.4.1 Examine hardware and communication
6.4.2 Evaluate 2G-5G
6.4.3 Examine wireless security
6.4.4 Analyze encryption
6.5 Investigate Multimedia Technologies
6.5.1 Examine audio, pictures, video
6.5.2 Analyze digital fingerprints/device fingerprints
6.5.3 Utilize various file formats
6.5.4 Examine metadata
6.6 Analyze Social Media Artifacts
6.6.1 Examine social networks
6.6.2 Utilize analysis techniques (closeness, etc.)
6.7 Utilize Visual Analysis
6.7.1 Apply link analysis
6.7.2 Incorporate time analysis
6.7.3 Utilize filtering
6.8 Use GREP
6.8.1 Construct GREP searches for file headers
6.8.2 Construct GREP searches for file signature analysis
6.9 Analyze online and console based gaming
6.9.1 Identify evidence sources and challenges
6.9.2 Incorporate cloud investigation techniques

Appendix B: Four of the identified LDs

LD 5: Develop an Incident Response Framework
Description: As head of the Incident Response team, you are tasked to develop an Incident Response Framework to be used as a guide in responding to various types of incidents that an organization may be faced with. (Report)
Topics: Project plan, project budget; establish, execute, and monitor a project plan of action; incident response and management techniques; evaluating risks; risk management models; categorizing risk.

LD 6: Incident Response: Imaging
Description: Utilize proper procedures and tools such as EnCase, FTK, and open source to image various digital artifacts. These artifacts include computer disks and RAM (for Windows, Linux and Mac systems, cloud data, RAID, SAN, NAS). (Lab exercise)
Topics: Sources of forensics evidence; report writing; affidavits; investigation planning; multimedia technologies: examining audio, pictures and videos; examining metadata; investigative techniques; scripting languages; file systems; data storage and transport technologies; hexadecimal and ASCII; operating systems.

LD 7: Conduct Windows Investigations Utilizing EnCase
Description: You are presented with evidence from a crime scene and are required to carry out a number of Windows forensics investigations using EnCase (EnCase features, browser forensics, use of GREP, scripting, memory forensics, email forensics, GREP and file analysis, other Windows artifacts).
Topics: Sources of forensics evidence; report writing; affidavits; investigation planning; multimedia technologies: examining audio, pictures and videos; examining metadata; investigative techniques; scripting languages; file systems; data storage and transport technologies; hexadecimal and ASCII; operating systems.

LD 8: Conduct Windows Investigations Utilizing FTK
Description: You are presented with evidence from a crime scene and are required to carry out a number of Windows forensics investigations using FTK (FTK features, registry forensics, and anti-forensics).
Topics: Sources of forensics evidence; multimedia technologies: examining audio, pictures and videos; examining metadata; investigative techniques; scripting languages; file systems; data storage and transport technologies; hexadecimal and ASCII; operating systems.

Appendix C: Partial mapping grid (competences for goal 2 to LDs 5 to 8)

6 Learners demonstrate the appropriate use of multiple digital forensic tools and technologies in a variety of criminal and security breach situations.
6.1 Use Forensic Tools and Techniques
6.1.1 Utilize tools such as EnCase, FTK, Open Source, Imaging
6.1.2 Image a live system running Windows/Linux
6.1.3 Image using hardware-based tools
6.1.4 Prepare collection media for compatibility with Windows, Linux and Mac
6.1.5 Acquire a RAID
6.1.6 Acquire and analyze data from NAS and SAN
6.2 Evaluate sources of Forensic artifacts
6.2.1 Analyze computer components (include booting in a controlled environment)
6.2.2 Analyze electronic devices
6.2.3 Compare different SCSI (Small Computer System Interface) specifications
6.2.4 Examine media including viewing EXIF data
6.2.5 Examine memory (RAM)
6.2.6 Analyze mobile devices (Android, iOS, Blackberry, PDAs, and data from flash media)
6.2.7 Analyze network artifacts
6.2.8 Analyze Web artifacts
6.2.9 Collect data in the cloud
6.3 Perform Malware Analysis
6.3.1 Detect unauthorized devices
6.3.2 Perform static malware analysis
6.3.3 Perform dynamic malware analysis
6.4 Investigate Mobile Technology
6.4.1 Examine hardware and communication
6.4.2 Evaluate 2G-5G
6.4.3 Examine wireless security (include use of a wireless network scanner to identify local wireless access points)
6.4.4 Examine how to connect to an encrypted wireless access point
6.4.5 Analyze encryption
6.5 Investigate Multimedia Technologies
6.5.1 Examine audio, pictures, video
6.5.2 Analyze digital fingerprints/device fingerprints
6.5.3 Utilize various file formats
6.5.4 Examine metadata
6.6 Analyze Social Media Artifacts
6.6.1 Examine social networks
6.6.2 Utilize analysis techniques (closeness, etc.)
6.6.3 Examine IM client logs, configuration files, chat logs
6.7 Utilize Visual Analysis
6.7.1 Apply link analysis
6.7.2 Incorporate time analysis
6.7.3 Utilize filtering
6.8 Use GREP
6.8.1 Construct GREP searches for file headers
6.8.2 Construct GREP searches for file signature analysis
6.9 Analyze online and console based gaming
6.9.1 Identify evidence sources and challenges
6.9.2 Incorporate cloud investigation techniques

Learning Demonstrations 5, 6, 7, 8 (grid columns; shading not reproduced)

TRACKING CRIMINALS ON FACEBOOK: A CASE STUDY FROM A DIGITAL FORENSICS REU PROGRAM

Daniel Weiss, University of Arizona, [email protected]
Gary Warner, University of Alabama at Birmingham, Director of Research in Computer Forensics, [email protected]

ABSTRACT

The 2014 Digital Forensics Research Experience for Undergraduates (REU) Program at the University of Alabama at Birmingham (UAB) focused its summer efforts on tracking criminal forums and Facebook groups. The UAB-REU Facebook team was provided with a list of about 60 known criminal groups on Facebook, with a goal to track illegal information posted in these groups and ultimately store the information in a searchable database for use by digital forensic analysts. Over the course of about eight weeks, the UAB-REU Facebook team created a database with over 400 Facebook groups conducting criminal activity and over 100,000 unique users within these groups. As of November 2014, students involved in the research project with advisor Gary Warner at UAB have continued running the automated fetchers since the summer project completed. Working with U.S. federal law enforcement agencies, there have been at least nine confirmed arrests of individuals associated with the illegal activities tracked on Facebook. This paper will discuss the methods used to collect the information, store it in a database, and analyze the data. The paper will also present possible future uses of the Facebook criminal activity-monitoring tool.
Keywords: social media, criminal organizations, online crime, social network monitoring

1. INTRODUCTION

For the past five years, the UAB Computer Forensics Research Lab has participated in the National Science Foundation Research Experience for Undergraduates (REU) program. During the summer of 2014, the Digital Forensics REU team focused on developing tools for automating the gathering and analysis of the communications between criminals in online forums and in Facebook groups. The UAB-REU summer 2014 research project created a searchable database that keeps track of the growing criminal activity on Facebook. Our database can keep track of everything in a Facebook group, from posts, comments, and likes to the user who posted each item, the time it was posted, and any image posted in a post or comment. This data can be used to draw connections between active users within different groups and lead to arrests if criminal acts are proven. Many of the messages that we stored within the database contained credit card numbers associated with other personal information as well.

2. LITERATURE REVIEW

Previous REU cohorts have examined the methods by which criminals learn and encourage one another's criminal behavior through online social interaction in the area of phishing (Levin, Richardson, Warner, & Kerley, 2012). Others have explored the role of online social media networks in the creation and execution of large international markets for stolen data and identities. Several researchers have examined online web forums that were designed primarily to support international trade in stolen goods and identities.
(Holt & Smirnova, 2014; Motoyama, McCoy, Levchenko, Savage, & Voelker, 2011; Merces, 2011). As criminals and terrorists grow more brazen, they have realized that the use of secretive online forums is not necessary when Facebook traffic is largely unregulated and unmoderated and represents minimal risk of prosecution or incarceration. The House Homeland Security Committee held hearings on "Jihadist Use of Social Media" in 2011, where testimony included "The Antisocial Network," which remarked how little concern adversaries have about discovery (Kohlman, 2011). The law reviews and journals are beginning to fill with articles about the use of evidence from social media in the courts. Many of the opinions expressed in those articles helped to make the case for the existence of this project. One current trend in this debate is whether messages shared "quasi-privately," only to a chosen community of friends, withstand Fourth Amendment challenges regarding expectations of privacy (Sholl, 2013). Others have argued about the admissibility of such evidence, partly with regard to whether it constitutes hearsay under the Federal Rules of Evidence (Holt & San Pedro, 2014). Still others argue about the authentication of the evidence and how to prove the origins and identity of the poster (Griffith, Winter 2012). To address all of these concerns, evidence would need to be gathered in a repeatable and automated way that preserves the timestamp and 'userid' of the creator of the evidence, and only from pages that could be shown to be publicly "Open."

3. FACEBOOK AS OPEN SOURCE INTELLIGENCE

3.1. Problem Statement Summary

The UAB-REU Facebook team was given a list of known criminal groups on Facebook and was asked to track these groups over the summer of 2014. Specifically, the following was to be accomplished by the end of the summer.
Can we quickly decide whether a Facebook group is discussing criminal activity and, if so, can we characterize what types of activities its members engage in or what targets they are after? For example, is the criminal activity credit card fraud, stolen electronics, shipping of illegal or stolen items, viruses, malware, botnets, spamming, or even terrorist organizations or supporters of terrorists? We also wanted to be able to identify the most influential, most important, and most active users within a group. By the end of the summer, our goal was to be tracking at least 200 criminal Facebook groups. With these goals in mind, we set out to develop code to request and retrieve the wanted information from Facebook and store it in a searchable database where we could easily query the data for further investigations.

3.2. Facebook's Graph Application Programming Interface (API)

The API is on the developer side of Facebook and was a great tool for our summer project. "The Graph API is the primary way to get data in and out of Facebook's social graph (network)."1 Essentially, the Graph API allows a user to post, delete, and also get information to and from Facebook. The Graph API was a tremendous asset for our team because it allowed us to run many useful searches directly without having to perform many iterations to gather the wanted information; however, an Access Token was required to do so.

3.2.1. The Basics

The Graph API is a representation of the information on Facebook, which is composed of nodes, edges, and fields. Nodes are basically the "things" on Facebook, e.g., users, photos, posts. Edges are the connections between nodes, such as a comment or a like on a photo. Fields are the information about nodes. For example, a node that is a user can have a field such as their birthday or hometown.

3.2.2. Using the Graph API to find more criminal groups

To find more criminal Facebook groups, we used the Graph API and searched for groups with specific keywords.
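A keyword screen of this kind can be sketched in a few lines; the word list and function name below are our illustration only, since the team's actual keyword list is not given in full in the paper:

```python
# Illustrative watchlist; the UAB-REU team's actual "Bag of Words"
# is not reproduced in the paper.
BAG_OF_WORDS = {"hacker", "hacking", "cvv", "carder", "spam"}

def looks_criminal(group_name: str) -> bool:
    """Return True if any watchlist keyword appears in the group name."""
    name = group_name.lower()
    return any(word in name for word in BAG_OF_WORDS)

candidates = ["Genius Hackers", "Knitting Circle", "CVV STRONG CARDS"]
print([g for g in candidates if looks_criminal(g)])
# ['Genius Hackers', 'CVV STRONG CARDS']
```

A name match like this only nominates a group for tracking; as described next, the database queries decide whether the group is actually criminal.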
Group names that contained a word such as "Hacker" or "CVV" were added to our list of criminal groups. Even though it was not certain that these groups were criminal, our database queries would later tell us. Figure 1 below shows the Graph API searching for all groups with the word "Hacking" in their name. Our team developed a "Bag of Words," which essentially was a list of keywords that we used to find new Facebook groups.

Figure 1: Graph API Search. Source: https://developers.facebook.com

3.2.3. Facebook Privacy

The Graph API is a very handy tool that Facebook has allowed the public to use. However, Facebook privacy still comes into play when using the API. Facebook groups that have a privacy status of Open, meaning anyone can see the group and join it, or a status of Closed, meaning anyone can see the group but must request to join it, can be seen through the Graph API. A group that is Secret will not show up on the API. A Secret group has no record of existing through any means of searches; the only way to be in a Secret group is by being invited to join it. Of course, being in a Closed or Secret group allows users to see everything going on within the group, making the group "Open" to the users within. Figures 2 and 3 below are examples of an Open and a Closed group; notice the difference in the amount of information between the two. Figure 4 below is an example of a Closed group of which the current Facebook user on the Graph API was a member; notice that it now looks like an Open group.

Figure 2: Open Group Example. Source: https://developers.facebook.com
Figure 3: Closed Group Non-Member Example. Source: https://developers.facebook.com
Figure 4: Closed Group Member Example. Source: https://developers.facebook.com

1 Graph API Overview: https://developers.facebook.com/docs/graph-api/overview

3.2.4.
Aliases

When collecting group ID numbers to run through the fetcher, we realized that we were only able to pull information from a group that was Open. To fix this issue, we created Facebook aliases that looked like cyber criminals. We made two main accounts in particular and tried joining as many of the Closed groups that we had found through the Graph API as possible. As a matter of fact, it was not very hard to get accepted into a number of these groups. Once accepted, we would run the Graph API with our alias's Access Token and then run the fetcher. This was a huge step in our summer research, as it allowed us to gather a considerably larger amount of data.

4. CODE IMPLEMENTATION

4.1. Automation of the Graph API

The program for extracting the information from Facebook was written in Java. The code used a library package called RestFB, which allowed for direct access to the Graph API from within Java. We would supply the Graph API with a group ID number and then retrieve all of the group's members, posts, comments, likes, pictures, etc.

4.2. The Database

Our team created a SQL database to store the data retrieved from Facebook and make it easy to search for wanted results. In SQL, several different tables were created to easily make connections between users and groups. From Java, for example, we wrote all of the comments into their own searchable SQL table; similar tables were used to store information for images, posts, and groups. We also created a user-groups table that allowed us to connect users to multiple groups, because there were many instances where the same user belonged to more than one group in our database.

5. RESULTS

Within the database, we ran queries to achieve the goals we set out at the beginning of the summer. We were able to determine whether a Facebook group was talking about criminal activity, what kind of activity, and who the 'big' players within those groups were.
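The paper does not show the actual schema or SQL, so the following is an illustration only: the table name, column names, and sample rows are our invention, and Python's sqlite3 stands in for the team's database, but the two queries mirror the keyword count and card-number searches reported below.

```python
import re
import sqlite3

# Hypothetical, simplified version of the posts table described in
# Section 4.2; the real schema and column names are not given in the paper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (groupid TEXT, userid TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        ("g1", "u1", "watch out for the fbi"),
        ("g1", "u2", "the fbi is monitoring this group"),
        ("g2", "u3", "fresh cvv for sale 4111111111111111"),
    ],
)

# Keyword query in the style of Table 1: count posts mentioning 'fbi'
# per group, most active groups first.
rows = conn.execute(
    "SELECT COUNT(*) AS n, groupid FROM posts "
    "WHERE message LIKE '%fbi%' GROUP BY groupid ORDER BY n DESC"
).fetchall()
print(rows)  # [(2, 'g1')]

# Card-number identifier in the style of Table 2: the digit 4 followed
# by another 15 digits (a Visa-shaped number).
visa = re.compile(r"4\d{15}")
hits = []
for groupid, userid, message in conn.execute(
        "SELECT groupid, userid, message FROM posts"):
    m = visa.search(message)
    if m:
        hits.append((groupid, m.group()))
print(hits)  # [('g2', '4111111111111111')]
```

In the team's actual system, queries of this shape ran against the SQL tables populated by the Java fetcher; the sketch above only illustrates the form of the searches.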
By the end of the summer, after about two weeks of data collection, the database held over 400 criminal groups that we were tracking and fetching information from. Within those 400 groups there were over 100,000 unique users, about 50,000 posts, and about 40,000 comments on posts.

The following query looked for messages within the groups' posts that contained a certain keyword. The query searched for posts containing the word 'fbi', counted the number of occurrences, and displayed the top ten groups. Many other related queries searched for posts containing words such as 'cia' or 'vbv' (Verified By Visa, a common term used by credit card criminals).

Table 1: Results for the 'fbi' query

  count  groupid            name
  19     183381435133188    SPAMER’s
  11     229655640389234    KING OF HACKER
  10     505516012807000    DDOS
  9      230749693747529    ! P4K OR4Kz4I H4CkERX !
  8      465238643517306    Bestbikes Grupo ventas Nacional
  8      165155633573484    WESTERN UNION
  7      290630927627110    Genius Hackers
  6      126115430891994    SaDaM Khakwani All Hacking TrickXx & Tip$
  6      112852328867059    HACKERS SPOTTED :)))
  6      14929934514034     Hack With Stylee (Hacking Zone)

The following query searched for messages that contained a string of 15 or 16 digits, which was our identifier for credit card numbers when groups were sharing stolen cards with one another. The query below shows the results for the top four groups sharing Visa credit cards; it searched for the digit four followed by another 15 digits (0-9).

Table 2: Results for the Visa query

  count  groupid             name
  432    435715723187958     PRO SHOPPER”S TUT AND BINS AND STORES
  402    563652277096630     REEF GH ***CCV STRONG CARDS*** KILL THEM ALL
  376    384945978297975     PRO SHOPPERS ***KILL, WAL,KMAR,SEAR, AND BEST
  256    1422518178033504    *** KILL CREDIT CARD***

The following query took a group that talked about Visa credit card numbers frequently and displayed each matching message along with the user who posted it. (Card numbers have been altered for privacy.)
Table 3: Results for the Visa query

  userid           name                     substring
  100008366380917  Nana Less                4266841341509999 02/17 597 Sue Lowe 123 sixth street Calvin LA 71410
  100008366380917  Nana Less                4185866411539999 06/16 417 Debra Duhon 300 Big Pasture Rd Lake Charles LA
  100000835312440  Okoeokoso More-vimlated  Vim-carders high balance cc 4347696620159999 1016 919 Cynthia Kroeker 11817 SW 1st Yukon OK 73099
  100005869085570  Undergrad Carder         428208712259999 1014 578 Martin Ibarra 1108 E ORTEGA ST Santa Barbara C

6. FUTURE USES

After just a short eight weeks in the REU program, and after only two weeks of actual data collection, the results were substantial. As of November 2014, students involved in the research project with advisor Gary Warner at UAB have continued running the automated fetchers since the summer project was completed. After the REU program ended for the summer, the tool became the anchor of a new open source intelligence effort within the lab. The database now contains over half a million Facebook messages and replies and is monitoring more than 900 Facebook groups. The most prolific of the groups found to be dedicated to criminal activity have each logged well over 5,000 messages from as many as 1,800 distinct Facebook users.

The tool has been used to learn more about criminal groups in many federal, state, and local law enforcement investigations. Originally conceived to assist in cybercrime cases, investigations have included tracking of many types of Facebook groups, including "carders" (criminals who steal and trade credit cards), "booters" (criminals who sell DDoSing services), online sexual harassment via webcam-controlling botnets, street gangs selling illegal drugs and weapons, and counter-terrorism investigations. Hundreds of Facebook groups have been reported and terminated, while others are left intact to identify ringleaders and, working with major US-based shipping companies and retailers, to intercept the shipment of stolen packages.
Working with an inter-agency task force on violent crime, Facebook evidence from this project was used to document relationships between criminals, as well as proof of weapons and drug possession from photos shared on Facebook, in support of a RICO case that led to nine felony arrests.

The project has also led to additional publications focused on image analysis of profile pictures. Hackers often use Guy Fawkes masks in their profile pictures, carders often have images of credit cards, and jihadists often have Islamic State flags. In addition to keyword clues, these new image analysis tools allow a group to be quickly categorized even when the language used in its messages is not understood by the analyst. Implementation of a tool like this would have a great impact on the cyber world, as it would aid in the capture of cyber criminals.

7. CONCLUSION

The 2014 Digital Forensics REU program at UAB provided students with the opportunity to develop real-world applications with valuable outcomes. Our 2014 project identified criminal activity on Facebook, collected evidence, and ultimately helped prosecute and punish criminals. The UAB REU Facebook team created a searchable database that could be used by law enforcement and intelligence agencies, as well as private-sector shipping companies, banks, and credit card companies, to identify criminal activity and work with law enforcement to prosecute those responsible for the illegal activity.
Subscription Information

The Proceedings of the Conference on Digital Forensics, Security and Law is a publication of the Association of Digital Forensics, Security and Law (ADFSL). The proceedings are published on a non-profit basis. In the spirit of the JDFSL mission, individual subscriptions are discounted. However, we do encourage you to recommend the proceedings to your library for wider dissemination.

The proceedings are published in both print and electronic form under the following ISSNs:

ISSN: 1931-7379 (print)
ISSN: 1931-7387 (online)

Subscription rates for the proceedings are as follows:

Institutional - Print: $120
Institutional - Online only: Open Access (1 issue)
Individual - Print: $25
Individual - Online only: Open Access (1 issue)

Subscription requests may be made to the ADFSL. The offices of the Association of Digital Forensics, Security and Law (ADFSL) are at the following address:

Association of Digital Forensics, Security and Law
1642 Horsepen Hills Road
Maidens, Virginia 23102
Tel: 804-402-9239
Fax: 804-680-3038
E-mail: [email protected]
Website: http://www.adfsl.org

Contents

Committee ........ 4
Schedule ........ 5
Keynote Speaker: Jeff Salyards, Executive Director of the Defense Forensic Science Center ........ 9
Keynote Speaker: Craig Ball, Board Certified trial lawyer, certified computer forensic examiner, law professor and electronic evidence expert ........ 9
Invited Speaker: Mohamed Chawki, Chief Judge of the Council of State, Egypt ........ 11
Invited Speaker: Gareth Davies, Senior Lecturer at the University of South Wales, UK ........ 11
Invited Speaker: Philip Craiger, Daytona State College ........ 11
Invited Paper: Potential Changes to eDiscovery Rules in Federal Court: A Discussion of the Process, Substantive Changes and Their Applicability and Impact on Virginia Practice ........ 13
    Joseph J. Schwerha, IV, Attorney at Law, Pennsylvania, USA
Invited Paper: A Profile of Prolonged, Persistent SSH Attack on a Kippo Honeynet ........ 23
    Craig Valli, Director of the Security Research Institute at Edith Cowan University, Australia
Two Challenges of Stealthy Hypervisors Detection: Time Cheating and Data Fluctuations ........ 33
    Igor Korkin*
An Empirical Comparison of Widely Adopted Hash Functions in Digital Forensics: Does the Programming Language and Operating System Make a Difference? ........ 57
    Satyendra Gurjar, Ibrahim Baggili*, Frank Breitinger and Alice Fischer
Investigating Forensics Values of Windows Jump Lists Data ........ 69
    Ahmad Ghafarian*
A Survey of Software-based String Matching Algorithms for Forensic Analysis ........ 77
    Yi-Ching Liao*
A New Cyber Forensic Philosophy for Digital Watermarks in the Context of Copyright Laws ........ 87
    Vinod Polpaya Bhattathiripad, Sneha Sudhakaran* and Roshna Khalid Thalayaniyil
A Review of Recent Case Law Related to Digital Forensics: The Current Issues ........ 95
    Kelly Anne Cole, Shruti Gupta, Dheeraj Gurugubelli* and Marcus K Rogers
On the Network Performance of Digital Evidence Acquisition of Small Scale Devices over Public Networks ........ 105
    Irvin Homem* and Spyridon Dosis
Measuring Hacking Ability Using a Conceptual Expertise Task ........ 123
    Justin Giboney*, Jeffrey Gainer Proudfoot, Sanjay Goel* and Joseph S. Valacich
HTML5 Zero Configuration Covert Channels: Security Risks and Challenges ........ 135
    Jason Farina*, Mark Scanlon, Stephen Kohlmann, Nhien-An Le-Khac and Tahar Kechadi
Continuous Monitoring System Based on Systems' Environment ........ 151
    Eli Weintraub* and Yuval Cohen
Identifying Common Characteristics of Malicious Insiders ........ 161
    Nan Liang* and David Biros*
Phishing Intelligence Using The Simple Set Comparison Tool ........ 177
    Jason Britt*, Dr. Alan Sprague and Gary Warner
Towards a Digital Forensics Competency-Based Program: Making Assessment Count ........ 193
    Rose Shumba*
Case Study: 2014 Digital Forensics REU Program at the University of Alabama at Birmingham ........ 205
    Daniel Weiss* and Gary Warner

* Author Presenting and/or Attending