Geert-Jan Houben

Social Data Science for Intelligent Cities
The Role of Social Media for Sensing Crowds
Prof.dr.ir. Geert-Jan Houben
TU Delft
Web Information Systems & Delft Data Science
WIS - Web Information Systems
http://blog.electricbricks.com/?attachment_id=16621
“Why”
The Most Complex System
Created By People
Populated by people
Kowloon Walled City, Hong Kong
But, what about the Web?
Largest human-made artefact
Credits: http://blogs.microsoft.com/blog/2014/04/15/a-data-culture-for-everyone/
Intelligent = IT
• Prescriptive, centralised design
• Environment should fit the software
• Data should fit the software
• Users should fit the software
Computing science => Efficiency
▪ Efficient Software => Efficient Systems
A Web-driven Paradigm Shift
Decentralisation
Openness & Linking
Personalisation
Adaptation
Utility
Intelligent = Data
Data
Scale
Machines
Speed
Sustainability
Intelligent = Data Semantics
Data
Scale
Machines
Speed
Sustainability
Intelligent = Social Data Semantics
People
Create
Analyse
Interpret
Data
Engage & Retain
Describe
People
Machines
Social Data Science
Creation
Annotations
Implicit vs. Explicit
Organically vs. On-Demand
To Train Machines
Analysis
When Machines Cannot
Sources
Mobile Phones
Social Media
(Personal) Sensors
Interpretation
Multiple Domains
Knowledge Generation
Well-being
Environment
City Life
In Real World: In-Situ
Culture, Context
The World is My Lab
HCI
Network Analysis
Sociology
Cognitive Psychology
Knowledge Discovery
Data Mining
Behavioural Economics
Collective Intelligence
Security & Privacy
Software Engineering
Domain Specific Expertise
Urban Reality Mining
Infer human relationships and behaviour in an urban environment
Qualitative
Cheap & Fast (w/ infrastructure)
Semantic By Design
Scalable
Truthful
Sustainable
Expensive
Very Expensive (w/o infrastructure)
Not Scalable
Biased
Not Sustainable
Untrustworthy
Can we combine the benefits, minimising the issues?
Urban Reality Mining w/ Social Data
High Spatial-Temporal Resolution
High Technology Penetration
Truthful
Expensive
No Semantics
Open (Linked Data)
Physical Sensors
Social Media
Social Glass Video
https://vimeo.com/120564204
Social Glass Features
System Architecture
Harnessing Heterogeneous Social Data to Explore, Monitor, and Visualize Urban Dynamics
Use Case: City Scale Events
▪ Thousands of events, hosted in hundreds of venues, attracting hundreds of thousands of people
• Event sponsors want to quantify the return on investment
• Manual assessment of event popularity is effective, but expensive
> 600 Venues, > 1000 Events
6 Days, > 500,000 attendees
Goals
Anomaly detection (mobile phone and social media data)
Topical characterisation and sentiment analysis
Visitor characterisation and engagement
Event recommendation
Milan Design Week
Anomaly Detection & Topical Characterization
Credits: http://citydatafusion.org - Emanuele Della Valle
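The anomaly detection goal can be illustrated with a minimal sketch. The slides do not say which method was used, so the rolling z-score below is a swapped-in, generic technique; the function name, the window size, the threshold, and the `min_sigma` floor are all assumptions made for illustration.

```python
from statistics import mean, stdev

def find_anomalies(counts, window=24, threshold=3.0, min_sigma=1.0):
    """Return indices whose count deviates strongly from the preceding window.

    counts: activity per time slot (e.g. posts or phone calls per hour).
    All parameters are hypothetical defaults, not values from the slides.
    """
    anomalies = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        # Floor the spread so a perfectly flat history does not divide by zero.
        sigma = max(stdev(history), min_sigma)
        if abs(counts[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Made-up hourly counts with one burst of activity at index 8.
hourly_posts = [10, 12, 11, 9, 10, 11, 10, 12, 60, 11]
print(find_anomalies(hourly_posts, window=6))
```

A slot is flagged only against the activity that preceded it, so the same function could in principle run on a live stream as well as on archived data.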
Event Recommendation
Balduini, M.; Bozzon, A.; Della Valle, E.; Huang, Y.; Houben, G.-J., "Recommending Venues Using Continuous Predictive Social Media Analytics," IEEE Internet Computing, vol. 18, no. 5, pp. 28-35, Sept.-Oct. 2014
> 50 artworks ~ 8 Weeks
Amsterdam Light Festival 2014-2015
Popularity of light artworks according to Instagram posts: some artworks seem to be more popular than others
Popularity of light artworks according to Instagram posts made by RESIDENTS: here the popularity is more uniform
Popularity of light artworks according to Instagram posts made by TOURISTS: differences are more evident, and the popular artworks are sometimes in the proximity of tourist points of interest
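The resident/tourist comparison above can be sketched as a simple count per artwork and per visitor group. The slides do not explain how residents were told apart from tourists, so this sketch assumes a known set of resident user ids; the function name, the ids, and "Artwork B" are hypothetical.

```python
from collections import Counter

def popularity_by_group(posts, resident_ids):
    """Count posts per artwork, split into residents and tourists.

    posts: iterable of (user_id, artwork_name) pairs.
    resident_ids: set of user ids assumed to belong to residents.
    """
    counts = {"residents": Counter(), "tourists": Counter()}
    for user_id, artwork in posts:
        group = "residents" if user_id in resident_ids else "tourists"
        counts[group][artwork] += 1
    return counts

# Made-up sample data for illustration only.
sample_posts = [
    ("u1", "I/O underflow"),
    ("u1", "Artwork B"),
    ("u2", "I/O underflow"),
    ("u3", "I/O underflow"),
]
result = popularity_by_group(sample_posts, resident_ids={"u1"})
print(result["residents"].most_common())
print(result["tourists"].most_common())
```

Comparing the two counters side by side shows whether one group's attention is concentrated on fewer artworks than the other's.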
Most common paths according to Instagram
Timeline for the number of posts for the I/O underflow light artwork according to Instagram posts: more activity during Christmas holidays
Timeline for the number of posts for the I/O underflow light artwork according to Instagram posts from RESIDENTS: the activity is generally more spread out
Timeline for the number of posts for the I/O underflow light artwork according to Instagram posts from TOURISTS: the activity increases clearly during the Christmas holidays
Online Education & Inclusion
analytics to make online education truly learner-centric and to adapt to the students & their backgrounds
massive online education is about massively adapting to the context of use
with increasing diversity comes the importance of social and cultural features: inclusion
People In The Loop
People as sensing and computational units
Social Sensing
Social Environmental Sensing
Current Challenge: Social Data Veracity
Social Data is
• nuanced by culture, context, background
• uncertain in expression and content
• inconsistent, ambiguous, deceptive
Lack of Veracity is a challenge
• Hampers reliability of analysis
• Supports wrong interpretations
But often it is an opportunity
• Reality can be perceived in different ways
• Bias and diversity can be desirable data properties
Veracity By Design
Crowds & Niches for Crowd Annotation
involving general crowds and qualified niches of domain experts for annotating large collections of (heritage) objects with domain-specific expertise
Semantics
how to make sense of social data with its variety, accuracy, and diversity?
Scale
how to handle large volumes of data and engage large groups of people?
Speed
how to create and interpret in real-time and in changing contexts?
Sustainability
how to ensure sustained functioning of people-enhanced systems?
scientific agenda: towards theory and technology for (software and human-enhanced) machines to create value out of data
Social data gives us one of the largest reflections of the world, but/and it is a man-made reflection
‘unique opportunity turning into interesting research problem’
The power of what machines can do with the data needs to be well understood and transparent for solid engineering and uptake
‘what machines can do and what they cannot do’
Science and technology follow the principles of the Web
‘fundamental & experimental’
Sense & value come from big data, but even more so from what (software and human-enhanced) machines can make of the data
‘V = M * D’
Acknowledgments
social-glass.org
Geert-Jan Houben
Web Information Systems
wisdelft.nl
Delft Data Science
delftdatascience.tudelft.nl
gjhouben.nl