Geert-Jan Houben
Social Data Science for Intelligent Cities
The Role of Social Media for Sensing Crowds
Prof.dr.ir. Geert-Jan Houben
TU Delft, Web Information Systems & Delft Data Science
WIS - Web Information Systems

http://blog.electricbricks.com/?attachment_id=16621
“Why”

The Most Complex System Created By People
Populated by people
Kowloon Walled City, Hong Kong

But, what about the Web?
Largest human-made artefact
Credits: http://blogs.microsoft.com/blog/2014/04/15/a-data-culture-for-everyone/

Intelligent = IT
• Prescriptive, centralised design
• Environment should fit the software
• Data should fit the software
• Users should fit the software
Computing science => Efficiency
▪ Efficient Software => Efficient Systems

A Web-driven Paradigm Shift
• Decentralisation
• Openness & Linking
• Personalisation
• Adaptation
• Utility
Credits: http://blogs.microsoft.com/blog/2014/04/15/a-data-culture-for-everyone/

Intelligent = Data
Data, Scale, Machines, Speed, Sustainability

Intelligent = Data
Semantics, Data, Scale, Machines, Speed, Sustainability

Intelligent = Social Data
Semantics, People, Create, Analyse, Interpret, Data, Engage & Retain, Describe, People, Machines

Social Data Science
Creation
• Annotations
• Implicit vs. Explicit
• Organically vs. On-Demand
• To Train Machines
Analysis
• When Machines Cannot
• Sources: Mobile Phones, Social Media, (Personal) Sensors
Interpretation
• Multiple Domains
• Knowledge Generation
• Well-being, Environment, City Life
• In Real World: In-Situ
• Culture, Context

The World is My Lab
HCI, Network Analysis, Sociology, Cognitive Psychology, Knowledge Discovery, Data Mining, Behavioural Economics, Collective Intelligence, Security & Privacy, Software Engineering, Domain Specific Expertise

Urban Reality Mining
Infer human relationships and behaviour in an urban environment
Benefits: Qualitative, Cheap & Fast (w/ infrastructure), Semantic By Design, Scalable, Truthful, Sustainable
Issues: Expensive, Very Expensive (w/o infrastructure), Not Scalable, Biased, Not Sustainable, Untrustworthy
Can we combine the benefits, minimising the issues?
Urban Reality Mining w/ Social Data
High Spatial-Temporal Resolution, High Technology Penetration, Truthful, Expensive, No Semantics, Open (Linked Data)
Physical Sensors, Social Media

Social Glass
Video: https://vimeo.com/120564204

Social Glass Features

System Architecture
Harnessing Heterogeneous Social Data to Explore, Monitor, and Visualize Urban Dynamics

Use Case: City Scale Events
▪ Thousands of events, hosted in hundreds of venues, attracting hundreds of thousands of people
• Event sponsors want to quantify the return on investment
• Manual assessment of event popularity is effective, but expensive
> 600 Venues, > 1000 Events, 6 Days, > 500,000 attendees

Goals
• Anomaly detection (mobile phone and social media data)
• Topical characterisation and sentiment analysis
• Visitor characterisation and engagement
• Event recommendation

Milan Design Week
Anomaly Detection & Topical Characterization
Credits: http://citydatafusion.org - Emanuele Della Valle

Event Recommendation
Balduini, M.; Bozzon, A.; Della Valle, E.; Yi Huang; Houben, G.-J., "Recommending Venues Using Continuous Predictive Social Media Analytics," IEEE Internet Computing, vol. 18, no. 5, pp. 28-35, Sept.-Oct. 2014

Amsterdam Light Festival 2014-2015
> 50 artworks, ~ 8 Weeks
• Popularity of light artworks according to Instagram posts: some artworks seem to be more popular than others
• Popularity of light artworks according to Instagram posts made by RESIDENTS: here the popularity is more uniform
• Popularity of light artworks according to Instagram posts made by TOURISTS: differences are more evident, sometimes in the proximity of tourist points of interest
• Most common paths according to Instagram
• Timeline of the number of posts for the I/O underflow light artwork according to Instagram posts: more activity during the Christmas holidays
• Timeline of the number of posts for the I/O underflow light artwork according to Instagram posts from RESIDENTS: the activity is generally more spread out
• Timeline of the number of posts for the I/O underflow light artwork according to Instagram posts from TOURISTS: the activity clearly increases during the Christmas holidays

Online Education & Inclusion
• analytics to make online education truly learner-centric and to adapt to the students & their backgrounds
• massive online education is about massively adapting to the context of use
• with increasing diversity comes the importance of social and cultural features: inclusion

People In The Loop
People as sensing and computational units
Social Sensing
Social Environmental Sensing

Current Challenge: Social Data Veracity
Social Data is
• nuanced by culture, context, background
• uncertain in expression and content
• inconsistent, ambiguous, deceptive
Lack of Veracity is a challenge
• Hampers reliability of analysis
• Supports wrong interpretations
But often it is an opportunity
• Reality can be perceived in different ways
• Bias and diversity can be desirable data properties

Veracity By Design

Crowds & Niches for Crowd Annotation
involving general crowds and qualified niches of domain experts for annotating large collections of (heritage) objects with domain-specific expertise
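
The resident versus tourist comparison of Instagram posts in the Amsterdam Light Festival slides can be illustrated with a simple grouped count. The following is a minimal Python sketch, not the method used in the talk: the record layout, the artwork identifiers, the group labels, and the function names are all hypothetical, and the sample records are made up purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical records: (artwork_id, user_group). These are NOT festival data.
posts = [
    ("artwork_a", "resident"),
    ("artwork_a", "tourist"),
    ("artwork_a", "tourist"),
    ("artwork_b", "resident"),
    ("artwork_b", "resident"),
    ("artwork_c", "tourist"),
]


def popularity_by_group(records):
    """Count posts per artwork, separately for each user group."""
    counts = defaultdict(Counter)
    for artwork_id, group in records:
        counts[group][artwork_id] += 1
    return counts


def shares(counter):
    """Normalise raw counts to fractions of that group's total posts."""
    total = sum(counter.values())
    if total == 0:
        return {}
    return {artwork: n / total for artwork, n in counter.items()}


counts = popularity_by_group(posts)
resident_shares = shares(counts["resident"])
tourist_shares = shares(counts["tourist"])

# Print both distributions side by side for every artwork seen in the data.
for artwork in sorted({a for a, _ in posts}):
    print(artwork,
          round(resident_shares.get(artwork, 0.0), 2),
          round(tourist_shares.get(artwork, 0.0), 2))
```

Normalising to shares rather than raw counts makes the two groups comparable even when one group posts far more than the other, which is what a statement such as "popularity is more uniform among residents" relies on.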
Semantics: how to make sense of social data with its variety, accuracy, and diversity?
Scale: how to handle large volumes of data and engage large groups of people?
Speed: how to create and interpret in real-time and in changing contexts?
Sustainability: how to ensure sustained functioning of people-enhanced systems?

scientific agenda: towards theory and technology for (software and human-enhanced) machines to create value out of data

Social data gives us one of the largest reflections of the world, but/and it is a man-made reflection
‘unique opportunity turning into interesting research problem’

The power of what machines can do with the data needs to be well understood and transparent for solid engineering and uptake
‘what machines can do and what they cannot do’

Science and technology follow the principles of the Web
‘fundamental & experimental’

Sense & value come from big data, but even more so from what (software and human-enhanced) machines can make of the data
‘V = M * D’

Acknowledgments
social-glass.org

Geert-Jan Houben
Web Information Systems: wisdelft.nl
Delft Data Science: delftdatascience.tudelft.nl
gjhouben.nl