Big Data

Big Data – Perspectives for Germany
Seize the Opportunity
Prof. Dr. Stefan Wrobel
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
Fraunhofer Big Data Initiative
www.iais.fraunhofer.de
bigdata.fraunhofer.de
© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
Fraunhofer IAIS: Intelligent Analysis and Information Systems
Do more with data
"From sensor data to business intelligence, from media analysis to visual information
systems: our technology allows enterprises to do more with data."
200+ employees at the Birlinghoven castle campus close to Bonn
Research areas
• Machine Learning and Data Mining
• Multimedia Pattern Recognition
• Visual Analytics
• Process Intelligence
• Autonomous Systems
Fraunhofer – From innovation to market
[Diagram: from basic research to customers]
Core competences: Big Data Infrastructure, Visual Analytics, Machine Learning
Big Data everywhere…
Big Data Trends
• Convergence
• Ubiquitous Intelligent Systems
• User Content (www)
• Open Data
Zetta, Zebi, Yotta and Yobi
[Wikipedia, 2011]
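The four prefixes in the slide title fall into two families: zetta and yotta are decimal (SI) prefixes, while zebi and yobi are their binary (IEC) counterparts. The following Python sketch shows how far the two families drift apart at this scale; the dictionary and function names are illustrative and not part of the talk.

```python
# Decimal (SI) prefixes are powers of 1000; binary (IEC) prefixes are powers of 1024.
# The names below are illustrative, not taken from the slides.
SI_PREFIXES = {"zetta": 10**21, "yotta": 10**24}
IEC_PREFIXES = {"zebi": 2**70, "yobi": 2**80}


def binary_to_decimal_ratio(binary_prefix: str, decimal_prefix: str) -> float:
    """Return how many times larger the binary prefix is than the decimal one."""
    return IEC_PREFIXES[binary_prefix] / SI_PREFIXES[decimal_prefix]


if __name__ == "__main__":
    for binary, decimal in (("zebi", "zetta"), ("yobi", "yotta")):
        ratio = binary_to_decimal_ratio(binary, decimal)
        print(f"1 {binary}byte = {ratio:.4f} {decimal}bytes")
```

A zebibyte is roughly 18% larger than a zettabyte, and a yobibyte roughly 21% larger than a yottabyte, so the choice of prefix matters when data volumes of this size are compared.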
Stored data at US enterprises
• For example, 1.5 billion new entries per month at Tesco
• 2.5-petabyte data warehouse at Walmart
[McKinsey, 2011]
Open Data – Examples of publicly available data sources
• 6 billion web pages
• 400 million facts
• 200 TB of genomic data
• More than 4.1 million English articles
• 86 billion n-grams
• 270 data catalogues
Big Data
The view of BITKOM, the German IT association
Volume
• Number of records and files
• Yottabytes | Zettabytes | Exabytes | Petabytes | Terabytes
Variety
• External data (web open data, etc.)
• Company data
• Unstructured, semistructured, structured data
• Presentations | text | video | images | tweets | blogs
• Machine-to-machine communication
Velocity
• High-speed data generation
• Constant transmission of generated data in realtime
• Milliseconds
• Seconds | minutes | hours
Analytics
• Discovery of relationships, patterns, meaning
• Prediction models
• Data Mining
• Text Mining
• Image Analytics | Visualization | Realtime
Source: BITKOM Big Data Leitfaden, 2012. BITKOM AK Big Data
Big Data
A definition attempt
Big Data in general refers to
• The trend towards the availability of ever more detailed data, ever closer to realtime
• The switch from a model-driven to a model- and data-driven approach
• The economic potential that results from the analysis and use of big data when properly integrated into company processes
Big Data currently focuses technically on the following aspects
• Volume, Variety, Velocity
• In-memory computing, Hadoop etc.
• Real-time analysis and effects of scale
Big Data must take its implications for society into account
Sources: http://m.sybase.com/detail?id=1095954 and McKinsey study, 2011
Innovation study Big Data
Desk research (current state of affairs)
• Detailed overview of the national and international Big Data landscape
• More than 50 systematic Big Data business cases
In-depth workshops for industry sectors (qualitative study)
• Expert workshops
• Finance, Telecom, Market research, E-Commerce, Insurance
• 1.10.2012 to 30.11.2012
Online survey (quantitative study)
• 82 high-ranking executives from small and large companies
Sector workshops Big Data
Finance
Telecom
Insurance
Market research
E-Commerce
Characteristic areas of companies for Big Data applications according to sector
Most frequent goals: Increased revenue and cost-savings
Per sector view of tasks for Big Data applications
Realtime or non-realtime and automated versus non-automated analysis
Overall view
• 69% of all respondents are striving to gain strategic advantages from Big Data.
• 78% answer that they need to improve human resources for Big Data.
• 67% of respondents say that the budget for Big Data topics (technologies, analyses, data sources, excluding personnel) must increase.
• Only 8% of respondents say that there are no barriers to Big Data success.
• These results hold across all sectors.
Insight from qualitative per-sector workshops
• More efficiency from intelligent information systems
• Mass individualization of products and services
• Intelligent products adapt while in use
Big Data in sales forecasting
More efficiency from intelligent information systems
• Idea: "Predict sales at the article level more precisely"
• Big Data: more than 100 million records per week added to the system
• Benefits: higher availability and greater economic efficiency
http://www.blue-yonder.com
Suppliers and technologies in the context of Big Data (Selection)
Challenges for realization
Respondents see the main problems in the following areas:
• Data security and privacy (49%)
• Budget and priorities (45%)
• Technical challenges of data management (38%)
• Expertise (36%)
• Insufficient knowledge about Big Data possibilities (35%)
To remedy the current deficits, 95% of respondents are looking for:
• Best practices, training, supplier and solution surveys, and improved privacy regulations
Fraunhofer initiative Big Data
Joint competences in a »Big Data Factory« for Germany
Strategies, Solutions and Successes
20 Fraunhofer institutes – one central coordination point
Synchronized and broad competence portfolio with many years of expertise
in big data in different sectors
Best of class Big Data solutions for individual projects, consulting and
qualification of personnel
Fraunhofer initiative Big Data –
Benefit from the future today!
bigdata.fraunhofer.de
Big Data – realized by Fraunhofer
Visual Analytics for more security
Reliable supply chains
Fraud recognition in finance data
Efficient production
Visual Analytics for enhanced security
React faster
Visual Analytics systems support decision makers live in the
process of evaluating, understanding and acting on security
risks in distributed infrastructures
Fraunhofer solutions increase the security and stability of critical
infrastructures such as power or communication networks
Leading suppliers and operators manage, monitor and optimize their networks with
"Visual Analytics" applications
Reliable supply chains
Control logistic processes while they run
Sensor-based information systems deliver realtime situation
assessments and recognize disturbances in the
supply chain in a productive manner
Fraunhofer assistance systems protect from unexpected supply problems and increase
resource efficiency
The info broker software ensures the success of all companies in the supply chain,
from the original supplier all the way to the manufacturer
Fraud recognition in finance
Recognize fraudsters in realtime
Big Data algorithms recognize fraudulent credit card
transactions in milliseconds
Fraunhofer software protects credit card
companies and their customers
The software is in day-to-day use at a leading European payment transaction company
and protects a portfolio of several million credit cards
Increased production efficiency
Optimize production with a push of a button
Big Data information systems condense millions of individual messages into smart indicators
Fraunhofer software protects against standstills, increases efficiency and ensures the
quality of production
Our manufacturing intelligence system is in use at an international automobile company
Scalable technology for the "Internet of Things"
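Condensing many individual machine messages into a few indicators can be pictured as a simple aggregation. The following Python sketch is only an illustration under stated assumptions: the message format (machine id plus status), the status values and the chosen indicator (share of error messages per machine) are hypothetical and do not describe the Fraunhofer system.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical message format: (machine_id, status), where status is "ok" or "error".
Message = Tuple[str, str]


def error_rate_per_machine(messages: Iterable[Message]) -> Dict[str, float]:
    """Condense a stream of messages into one indicator per machine:
    the fraction of its messages that report an error."""
    totals: Counter = Counter()
    errors: Counter = Counter()
    for machine, status in messages:
        totals[machine] += 1
        if status == "error":
            errors[machine] += 1
    return {machine: errors[machine] / totals[machine] for machine in totals}


if __name__ == "__main__":
    sample = [
        ("press-1", "ok"),
        ("press-1", "error"),
        ("robot-7", "ok"),
        ("robot-7", "ok"),
    ]
    print(error_rate_per_machine(sample))
```

The function reads the messages in a single pass and keeps only two counters per machine, so the same idea carries over to a stream of messages that is too large to hold in memory.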
Fraunhofer Living Lab Big Data – A Core Architecture for Scalable and Real-Time Analytics
… and basis for our training course "Data Scientist Big Data"
Batch application
Analysis of customer feedback
Realtime application
Big Data research monitor
Selected technologies
Use cases
Big Data dataset
5 billion web pages (Q1/2012)
~20 TB of text alone
What's happening on the internet?
Consumers get networked in ways never seen before
The number of postings about products and brands is growing disproportionately
Soon more than 6 billion consumers will use mobile devices at the point of sale to look up
information on the internet and use it for their purchase decisions
ITU International Telecommunication Union
Recognize important customer feedback among millions of postings and web pages
Big Data process chain
• Collection of requirements, specification
• Collection of data, data pool
• Customization and operation of the system, running system
• Data validation, consulting, running service
-> Permanent flow of relevant information
Online EmotionsRadar
Mobility Mining – for outdoor advertising
Use of mobility data to predict effectiveness of media advertising
Question
• How many people pass a given poster board on any given day?
• What is the distribution between public transport, cars and pedestrians?
What is special about the model?
• First model for 6.9 million street segments in Germany
• Central element in Germany for determining the reach of outdoor advertising
• Basis for all traffic-related questions in market research
Information source mobility data
Mobility Mining helps understand cell
phone data
Quality of cell phone data
• … high coverage of the population
• … no cost-intensive data collection
• … allow a view of the spatial and temporal dynamics of mobility at different levels
• … can be processed in realtime
Cell phone data are indicators for mobility
Munich: Indication of cell load from GSM data (© Fraunhofer)
Our research expertise:
2005 GeoPKDD – EU – FET
2010 MODAP – EU – CA
2011 LIFT – EU – FET
2011 DATASIM – EU – FET
Cell phone example Allianz Arena
[Figure: cell phone data at the Allianz Arena, one value per hour. Annotated events: Audi Cup (29&30.7.9); Champions League vs. AC Florenz, vs. Juventus Turin, vs. Lyon and vs. Manchester; DFB-Pokal vs. Eintracht Frankfurt; Bundesliga home matches of FC Bayern München; international match Germany vs. Argentina]
Our approach: Integration of heterogeneous data sources
Frequency Map
Dynamic Mobility Model
Data sources: GPS, cell phone data (GSM), interviews (CATI), household database, geodata
Privacy-preserving Data Mining
• Reconciles Data Mining and data privacy
  - Legal questions and public opinion
  - Also: Protection of company interests in distributed Data Mining
• Privacy by Design
  - Development of privacy-compatible analytics
  - Guaranteed anonymity, guaranteed results
• Project examples
  - Data Mining in fraud detection for
    - Banco Bilbao Vizcaya Argentaria (BBVA)
    - Arvato Infoscore
  - LIFT "Safe Zone" technology
Big Data – Big Opportunities
• Data are a resource that will be decisive in competition
• Big Data technologies allow the intelligent analysis and linking of big and heterogeneous data in realtime
• With the right approach Big Data and privacy are no contradiction
• New perspectives for better products, more efficient production and resource-effective action
• Companies become "Data-driven Enterprises"
The challenge: Technologies and Business Know-how must be integrated in
business and production processes in order to create value