Big Data
Big Data: Perspectives for Germany
Seize the Opportunity

Prof. Dr. Stefan Wrobel
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
Fraunhofer Big Data Initiative
www.iais.fraunhofer.de
bigdata.fraunhofer.de
© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Fraunhofer IAIS: Intelligent Analysis and Information Systems
Do more with data
"From sensor data to business intelligence, from media analysis to visual information systems: our technology allows enterprises to do more with data."
• 200+ employees on the Birlinghoven Castle campus near Bonn
• Research areas: Machine Learning and Data Mining, Multimedia Pattern Recognition, Visual Analytics, Process Intelligence, Autonomous Systems

Fraunhofer: From innovation to market
Diagram labels: Basic research, Core competences, Machine Learning, Visual Analytics, Big Data Infrastructure, Customers

Big Data everywhere…

Big Data Trends
• Convergence
• Ubiquitous Intelligent Systems
• User Content (www)
• Open Data

Zetta, Zebi, Yotta and Yobi [Wikipedia, 2011]

Stored data at US enterprises [McKinsey, 2011]
For example:
• 1.5 billion new entries per month at Tesco
• 2.5-petabyte data warehouse at Walmart

Open Data: Examples of publicly available data sources
• 6 billion web pages
• 400 million facts
• 200 TB of genomic data
• More than 4.1 million English articles
• 86 billion n-grams
• 270 data catalogues

Big Data: The view of BITKOM, the German IT association
• Volume: number of records and files; terabytes, petabytes, exabytes, zettabytes, yottabytes
• Variety: external data (web, open data, etc.); company data; unstructured, semi-structured and structured data; presentations, text, video, images, tweets, blogs; machine-to-machine communication
• Velocity: high-speed data generation; constant transmission of generated data in real time; milliseconds, seconds, minutes, hours
• Analytics: discovery of relationships, patterns and meaning; prediction models; data mining; text mining; image analytics; visualization; real-time analytics
Source: BITKOM Big Data Leitfaden, 2012. BITKOM AK Big Data.

Big Data: A definition attempt
Big Data in general refers to:
• The trend towards the availability of ever more detailed data, ever closer to real time
• The switch from a model-driven to a model- and data-driven approach
• The economic potential that results from the analysis and use of big data when properly integrated into company processes
Big Data currently focuses technically on the following aspects:
• Volume, Variety, Velocity
• In-memory computing, Hadoop, etc.
• Real-time analysis and effects of scale
Big Data must take its implications for society into account.

Sources: http://m.sybase.com/detail?id=1095954 and McKinsey study, 2011

Innovation study Big Data
• Desk research (current state of affairs)
  - Detailed overview of the national and international Big Data landscape
  - More than 50 systematic Big Data business cases
• In-depth workshops for industry sectors (qualitative study)
  - Expert workshops: Finance, Telecom, Market research, E-Commerce, Insurance
• Online survey (quantitative study)
  - 82 high-ranking executives from small and large companies
Period: 1 October 2012 to 30 November 2012

Sector workshops Big Data
Finance, Telecom, Insurance, Market research, E-Commerce

Survey charts (titles only):
• Characteristic areas of companies for Big Data applications, by sector
• Most frequent goals: increased revenue and cost savings
• Per-sector view of tasks for Big Data applications
• Real-time or non-real-time and automated versus non-automated analysis

Overall view
• 69% of all respondents are striving to gain strategic advantages from Big Data.
• 78% of respondents say that they need to improve human resources for Big Data.
• 67% of respondents say that the budget for Big Data topics (technologies, analyses, data sources; excluding personnel) must increase.
• Only 8% of respondents say that there are no barriers to Big Data success.
• These results hold across all sectors.

Insights from the qualitative per-sector workshops
• More efficiency from intelligent information systems
• Mass individualization of products and services
• Intelligent products adapt while in use

Big Data in sales forecasting
More efficiency from intelligent information systems
• Idea: "Predict sales at the article level more precisely"
• Big Data: more than 100 million records are added to the system per week
• Benefits: higher availability and greater economic efficiency
http://www.blue-yonder.com

Suppliers and technologies in the context of Big Data (selection)

Challenges for realization
Respondents see the main problems in the following areas:
• Data security and privacy (49%)
• Budget and priorities (45%)
• Technical challenges of data management (38%)
• Expertise (36%)
• Insufficient knowledge about Big Data possibilities (35%)
To remedy the current deficits, 95% of respondents are looking for best practices, training, surveys of suppliers and solutions, and improved privacy regulations.

Fraunhofer initiative Big Data
Joint competences in a "Big Data Factory" for Germany
Strategies, Solutions and Successes
• 20 Fraunhofer institutes, one central coordination point
• Synchronized and broad competence portfolio with many years of expertise in big data in different sectors
• Best-of-class Big Data solutions for individual projects, consulting and qualification of personnel
Fraunhofer initiative Big Data: Benefit from the future today!
bigdata.fraunhofer.de

Big Data, realized by Fraunhofer
• Visual Analytics for more security
• Reliable supply chains
• Fraud recognition in finance data
• Efficient production

Visual Analytics for enhanced security
React faster
• Visual Analytics systems support decision makers live in the process of evaluating, understanding and acting on security risks in distributed infrastructures.
• Fraunhofer solutions increase the security and stability of critical infrastructures such as power or communication networks.
• Leading suppliers and operators manage, monitor and optimize their networks with "Visual Analytics" applications.

Reliable supply chains
Control logistics processes while they run
• Sensor-based information systems deliver real-time situation assessments and recognize disturbances in the supply chain in a productive manner.
• Fraunhofer assistance systems protect against unexpected supply problems and increase resource efficiency.
• The info broker software ensures the success of all companies in the supply chain, from the original supplier all the way to the manufacturer.

Fraud recognition in finance
Recognize fraudsters in real time
• Big Data algorithms recognize fraudulent credit card transactions in milliseconds.
• Fraunhofer software protects credit card companies and their customers.
• The software is in day-to-day use at a leading European payment transaction company and protects a portfolio of several million credit cards.

Increased production efficiency
Optimize production at the push of a button
• Big Data information systems condense millions of individual messages into smart indicators.
• Fraunhofer software protects against standstills, increases efficiency and ensures the quality of production.
• Our manufacturing intelligence system is in use at an international automobile company.
• Scalable technology for the "Internet of Things"

Fraunhofer Living Lab Big Data: A Core Architecture for Scalable and Real-Time Analytics
… and the basis for our training course "Data Scientist Big Data"
• Batch application: analysis of customer feedback
• Real-time application: Big Data research monitor
• Selected technologies
• Use cases
• Big Data data set: 5 billion web pages (Q1/2012), about 20 TB of text alone

What's happening on the internet?
• Consumers are becoming networked in ways never seen before.
• The number of postings about products and brands is growing disproportionately.
• Soon more than 6 billion consumers will use mobile devices at the point of sale to look things up on the internet and use them in their purchase decisions.
Source: ITU, International Telecommunication Union

Recognize important customer feedback among millions of postings and web pages
Big Data process chain:
• Collection of requirements, specification
• Collection of data, data pool
• Customization and operation of the system, running system -> permanent flow of relevant information
• Data validation, consulting, running service
Online EmotionsRadar

Mobility Mining for outdoor advertising
Use of mobility data to predict the effectiveness of media advertising
Question:
• How many people pass a given poster board on any given day?
• What is the distribution between public transport, cars and pedestrians?
What is special about the model?
• First model for 6.9 million street segments in Germany
• Central element in Germany for determining the reach of outdoor advertising
• Basis for all traffic-related questions in market research

Information source: mobility data
Mobility Mining helps to understand cell phone data. Cell phone data are indicators of mobility.
Quality of cell phone data:
• High coverage of the population
• No cost-intensive data collection
• Allow a view of the spatial and temporal dynamics of mobility at different levels
• Can be processed in real time
Figure: Munich, indication of cell load from GSM data (© Fraunhofer)
Our research expertise:
• 2005 GeoPKDD (EU, FET)
• 2010 MODAP (EU, CA)
• 2011 LIFT (EU, FET)
• 2011 DATASIM (EU, FET)

Cell phone example: Allianz Arena
Chart with one value per hour. Annotated events:
• Audi Cup, 29 and 30 July 2009
• Champions League vs. AC Florenz
• Champions League vs. Juventus Turin
• Champions League vs. Lyon
• Champions League vs. Manchester
• DFB-Pokal vs. Eintracht Frankfurt
• International match Germany vs. Argentina
• Bundesliga home games of FC Bayern München

Our approach: Integration of heterogeneous data sources
Diagram labels: GPS, cell phone data (GSM), interviews (CATI), household database, geodata, Dynamic Mobility Model, Frequency Map

Privacy-preserving Data Mining
Reconciles Data Mining and data privacy
• Legal questions and public opinion
• Also: protection of company interests in distributed Data Mining
Privacy by Design
• Development of privacy-compatible analytics
• Guaranteed anonymity, guaranteed results
Project examples
• Data Mining in fraud detection for
  - Banco Bilbao Vizcaya Argentaria (BBVA)
  - Arvato Infoscore
• LIFT "Safe Zone" technology

Big Data: Big Opportunities
• Data are a resource that will be decisive in competition.
• Big Data technologies allow the intelligent analysis and linking of large and heterogeneous data in real time.
• With the right approach, Big Data and privacy are not a contradiction.
• New perspectives for better products, more efficient production and resource-efficient action
• Companies become "data-driven enterprises".
The challenge: technologies and business know-how must be integrated into business and production processes in order to create value.