Document 6534887
Data Spillage in Hadoop Clouds: An Overview
Adviser: Dr. Brandeis Marshall
Student: Dri Torres

Problem Statement
What are possible data spill solutions in Hadoop, and can they integrate with the current data spill procedure?

Motivation
Economic: data spill recovery costs millions.
Address security weaknesses in Hadoop.

Significance
To harness the full potential of Big Data, data spillage must be addressed.

What is Data Spillage?
The transfer of classified data onto an unclassified information system.
Once classified data has leaked, it is dirty.

What is Big Data?
A buzzword.
Less than 50% think it is clearly defined.
Volume, Velocity, Variety (the 3 Vs).

Hadoop
Stores and processes Big Data.
Processes terabytes of data in minutes.
32% of companies use Hadoop.

NIST Procedure for Data Spillage
The problem with Hadoop is “it’s a moving target.”

Spillage and Cloud Computing

Cloud Environments
• Public
• Private
• Separate

A Top-Down Approach
Standardization
Information Sharing
Governments should establish an advanced analytics agency.
Standardized software and hardware lead to information security.
Standard skilled and professional staff.
Always report a spill.
Fully document and publish work.

Cloud Technology
• MapReduce
• YARN (Yet Another Resource Negotiator, MapReduce v2)
• Adopted 2013
• Compatible with more than MapReduce

Long-Term Storage
Big Data Analytics
Knowledge Systems for Metadata
Inter-Agency Collaboration

Solutions
Establish Big Data control centers.
Private Hadoop networks.
Sample cases for analyzing spills.

References
Gang-Hoon Kim, Silvana Trimi, and Ji-Hyong Chung. 2014. Big-data applications in the government sector. Commun. ACM 57, 3 (March 2014), 78-85. DOI=10.1145/2500873 http://doi.acm.org/10.1145/2500873
Lindner, Felix FX, and Sandro Gaycken. "Back to Basics: Beyond Network Hygiene." Best Practices in Computer Network Defense: Incident Detection and Response (2014): 54-64. Recurity Labs. IOS Press, 10 Feb. 2014. Web. 17 July 2014. <http://recurity-labs.com>.
Michael Stonebraker and Judy Robertson. 2013. Big data is 'buzzword du jour;' CS academics 'have the best job'. Commun. ACM 56, 9 (September 2013), 10-11. DOI=10.1145/2500468.2500471

This material is based upon work supported by the National Science Foundation under grant #1062970. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

INSuRE is training students in information security research. Sponsored by the National Science Foundation with problems provided by the National Security Agency.