Big Data
Big Data & Analytics
Welcome to the IBM Roadshow: SPSS Predictive Analytics meets Big Data
Think Big, Start Small, Act Fast
October/November 2014: Hamburg, Berlin, Stuttgart, München, Düsseldorf, Wien, Zürich
© 2014 IBM Corporation

Agenda
08:30 - 09:00: Registration
09:00 - 09:15: Big Data Analytics - hype or reality?
09:15 - 09:45: Big Data overview
• Introduction to the IBM Big Data platform
• Architecture and components
• Big Data use cases and references
09:45 - 10:30: Predictive Analytics & Big Data - the scalable IBM SPSS solution, now also on Hadoop
• Typical use cases and current challenges for mid-sized companies
• References and examples from mid-sized companies and various industries
• Overview of the SPSS Predictive Analytics product portfolio and architecture
10:30 - 11:00: Break
11:00 - 12:00: InfoSphere BigInsights - Hadoop made fit for enterprise use - live demonstration
• Exploratory preparation, analysis and visualization of data with BigSheets
• Automating processes with applications and workflows
• Integrating Hadoop into analytical environments with Big SQL
12:00 - 13:15: IBM SPSS Predictive Analytics live demonstration, using examples from customer data analysis and the analysis of production/manufacturing data in industry
• IBM SPSS Modeler & Analytic Server: data mining on big data for analysts and data mining experts
• Watson Analytics: introduction and outlook on new IBM technologies and trends
13:15: Summary, followed by a lunch buffet

Big Data Analytics: hype or reality?
Big Data offers many opportunities in every area
• Variety: data in many different formats
• Velocity: batch and streaming
• Veracity: credibility versus doubt
• Volume: terabytes and zettabytes of data
Big Data, Big Value
• 49% of customers use two or more technologies to shop, and 53% of active adult social networkers follow a brand
• Billions of customer preferences and ratings exist in call centers, websites, transaction data, human-machine interactions and social media
• Only 6.8% of marketers believe that social media is integrated into their strategy
• 7.6% of the marketing budget is earmarked for social media
• 1 in 5 online minutes is spent on social networks
• 400 million tweets are sent every day
• More than 5 billion mobile phones worldwide
• 55% use their mobile phone for price comparisons
• 34% scan QR codes

Gartner: Hype Cycle for Big Data 2014

Use cases: companies that use recommendation systems
Goal: address customers individually in order to increase revenue and usage.
Use case details
“We know what people watch on Netflix and we’re able with a high degree of confidence to understand how big a likely audience is for a given show based on people’s viewing habits.” (Jonathan Friedland, Netflix Communications Director)
• More than 50 million users
• More than 60 million film views per day (Netflix records pauses, fast-forwarding and rewinding)
• More than 6 million ratings per day
• More than 4 million searches per day
• Geolocation data
• Device information (mobile phone, tablet, TV, …)
• Time and date / day of the week (it has been shown that users watch more TV shows during the week and more films at the weekend)
• Metadata from third-party providers such as Nielsen
• Social media data from Facebook and Twitter
• More than US$1 billion in revenue in the first quarter of 2014

Big Data overview

The Big Data Paradox: More Data, Less Confidence
• 1 in 3 business leaders frequently make decisions based on information they don’t trust, or don’t have
• 1 in 2 business leaders say they don’t have access to the information they need to do their jobs
• 60% have more data than they can use
• 40% of the time on each big data project is spent understanding the information
• 2.5 billion new gigabytes every day
• 1 trillion connected things by 2015
• 3 times increase in transistors per human by 2017

IBM reference architecture for information management and analytics
• Data: transaction and application data
• Information ingestion and integration zone
• Enterprise warehouse and data mart zone
• Why did it happen? Reporting, analysis, content analytics
• Information governance zone

IBM reference architecture for information management and analytics (extended)
• All data: transaction and application data; machine and sensor data; enterprise content; image and video; social data; third-party data
• Zones: real-time analytics zone; information ingestion and integration zone; exploration, landing and archive zone; enterprise warehouse and data mart plus analytics databases zone; information governance zone
• What is happening? Discovery and exploration
• Why did it happen? Reporting, analysis, content analytics
• What could happen? Predictive analytics and modeling
• What action should I take? Decision management
• What did I learn, what’s best? Cognitive
• New/enhanced applications: new business models; financial performance; risk; operations and fraud; customer experience; IT economics
• IBM Big Data & Analytics Infrastructure: systems, security, storage; on premise, cloud, as a service

Battelle: boosts power grid reliability and helps consumers and businesses cut their energy costs
• Processes up to 10 PB of data in real time to deliver game-changing insights
• Boosts grid efficiency through the introduction of innovative transactive control mechanisms
• Provides price incentives, helping consumers make informed choices about energy usage
Solution components
• IBM® InfoSphere® Streams
• IBM PureData™ System for Analytics (powered by IBM Netezza® technology)
The transformation: Capturing and analyzing huge volumes of data from smart energy meters, and applying appropriate algorithms, helps set electricity prices based on changing demand factors.
“This project will help optimize the system and better integrate renewable resources.” (Ronald Melton, PhD, project director for the Pacific Northwest Smart Grid Demonstration Project, led by Battelle)
Reference architecture zones shown with the case studies: real-time analytics zone; exploration, landing and archive zone; enterprise warehouse and data mart plus analytics databases zone

Dublin City Centre: monitors citywide traffic intelligently to optimize public transit systems
• Monitors 600 buses across 150 routes and 5,000 bus stops daily
• Offers 50 updates a second for real-time visualization of bus locations and arrival times
• Estimates arrival times and transit times, and flags likely delays
Solution components
• IBM® InfoSphere® Streams
The transformation: Measuring speed and traffic flow across public transit routes enables new insight for intelligent decisions that enhance system performance and reliability.
“IBM solutions have enabled our traffic managers to make more accurate decisions.” (Brendan O'Brien, Head of Technical Services, Dublin City Council Traffic Division)

Vestas: turns climate into capital with big data using IBM InfoSphere BigInsights
• 97% decrease in response times for wind forecasting information
• Cuts cost per kilowatt hour, increasing customers’ return on investment
• 40% reduction in energy consumption, reducing IT footprint while increasing power
Solution components
• IBM® InfoSphere® BigInsights™ Enterprise Edition
The transformation: Analyzing petabytes of wind data to pinpoint optimal turbine placement maximizes power generation and reduces energy costs.
“We can now show our customers how the wind behaves and provide a solid business case that is on par with any other investment that they may have.” (Lars Christian Christensen, vice president, Vestas Wind Systems)

InfoSphere BigInsights is based on Apache™ Hadoop
Apache™ Hadoop® is an open source software project that enables the distributed processing of large data sets across clusters of commodity servers.
• MapReduce: the framework that understands and assigns work to the nodes in a cluster.
• HDFS: a file system that spans all the nodes in a Hadoop cluster for data storage.
• Scalable: new nodes can be added as needed, without needing to change data formats, how data is loaded, how jobs are written, or the applications on top.
• Cost effective: Hadoop brings massively parallel computing to commodity servers.
• Flexible: Hadoop is schema-less, and can absorb any type of data, structured or not, from any number of sources.
• Fault tolerant: when you lose a node, the system redirects work to another location of the data and continues processing without missing a beat.
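The MapReduce idea described on the Hadoop slide can be illustrated with a small word count. This is only a single-process sketch in plain Python, not the Hadoop or BigInsights API; the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample input are assumptions made for illustration. On a real cluster the map and reduce steps would run in parallel on the nodes that hold the data.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce over a list of input splits."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle(mapped))


# Hypothetical input: each string stands in for one block of a large file.
splits = ["big data big value", "big data analytics"]
counts = word_count(splits)
print(counts)
```

Each split is mapped independently, which is what allows a framework such as Hadoop to distribute the work across nodes and to rerun a single failed task elsewhere.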
InfoSphere BigInsights for Hadoop includes the latest open source components, enhanced by enterprise components
Diagram: IBM InfoSphere BigInsights for Hadoop (legend: Open Source, IBM; * in beta). Labels as transcribed: Governance; GPFS FPO; Data Privacy for Hadoop; HDFS; Data Matching; File System; Flexible Scheduler; HBase; Audit & History; Data Store; Adaptive MapReduce; Enterprise Search; MapReduce; Data Masking; Big SQL; Security; Pig; Data Security for Hadoop; Sqoop; LDAP; Hive; Kerberos; ETL; YARN*; HCatalog; Monitoring; Flume; Search; Jaql; Resource Management & Administration; Streams; Solr/Lucene; Runtime; Text Analytics; Oozie; Stream Computing; Console; Big R; Data Access; Text Analytics Extractors; Dashboard; Advanced Analytics; R; BigSheets Reader and Macro; Eclipse Tooling (MapReduce, Hive, Jaql, Pig, Big SQL, AQL); BigSheets Charting; Applications & Development; ZooKeeper; Visualization & Ad Hoc Analytics

IBM InfoSphere BigInsights: enterprise-ready Hadoop solution
performance gain on average over open source Hadoop
Diagram labels as transcribed: InfoSphere BigInsights; Accelerators; Visualization & Exploration; Analytics; Development Environment; Development Tools; Included in BigInsights Enterprise Edition (limited use license); Big SQL; Enterprise capabilities; Extractors and APIs; Streams; Analytics Extraction Engine; Data Explorer; Cognos BI; Connectors; Workload Management; Security; Workload Optimization (MapReduce/SQL); Administration & Security; open source based components (IBM tested & supported open source components)

Collaborative Big Data for many roles
• Business users can get their hands on big data and use big data applications and BigSheets to get insights into their data
• Data scientists can perform deeper analysis and get richer insights
• Administrators are empowered to be more agile through better controls and views into key performance indicators
• Developers can leverage unified tooling in a Big Data application development lifecycle and are able to create and deploy new types of applications, with enhancements that simplify even complex workflows

BigSheets to analyze and visualize
• Model “big data” collected from various sources in spreadsheet-like structures
• Filter and enrich content with built-in functions
• Combine data in different workbooks
• Visualize results through spreadsheets and charts
• Export data into common formats (if desired)
• No programming knowledge needed!

Big SQL: architected for performance
Leverage IBM's rich SQL heritage, expertise, and technology
• Modern SQL:2011 capabilities
• DB2 compatible SQL PL support: SQL bodied functions and stored procedures; application logic/security encapsulation
Architected from the ground up for performance: low latency and high throughput
MapReduce replaced with a modern MPP architecture
• Compiler and runtime are native code (not Java)
• Big SQL worker daemons live directly on the cluster
• Continuously running (no startup latency)
• Processing happens locally at the data
Operations occur in memory with the ability to spill to disk
• Supports aggregations and sorts larger than available RAM
Integration with BigSheets (source & target)
Diagram labels: SQL-based application; IBM Data Server Client; Big SQL; SQL MPP runtime; data sources (Parquet, CSV, Seq, RC, Avro, ORC, JSON, custom); InfoSphere BigInsights

IBM reference architecture for information management and analytics (diagram repeated from the overview section)

Predictive Analytics & Big Data - the scalable IBM SPSS solution, now also on Hadoop

IBM reference architecture for information management and analytics (diagram repeated from the overview section)

What is predictive analytics?
Predictive analytics turns data into operational actions by drawing reliable conclusions about the current situation and forecasting future events. The three most common application areas are customer relationships, operations and risk.

Applying predictive analytics at the point of interaction
Helping people take the best possible action: “What should I do now?” “Do this!”
Helping systems take the best possible action: “What should I do now?” “Do this!”

Several goals must be considered at the same time
Goals:
• Detection and prevention of fraud
• Acquisition of “good” customers
• Retention of profitable customers
• Risk minimization
• Increase in customer value
Questions:
• “How likely is a customer to respond?”
• “Which customers are at risk of churning?”
• “Which activities are suspected fraud?”
• “What is the most interesting next product offer?”
• “Which customers will probably run into payment difficulties?”

Predictive Analytics - areas of application
Controlling & production
• Analysis of defect sources
• Scrap minimization
• Intelligent maintenance
• Revenue forecasts
• Site analysis and planning
• Inventory analysis and forecasts
• Risk analysis
• Variance analysis: cause and effect
• Purchasing optimization
• Resource planning
Marketing
• Analytical customer relationship management
• Campaign optimization / target group segmentation
• Customer retention management
• Customer value analysis
• Market basket analysis
• Cross-/upsell
• Behavioral analysis / clickstream analysis
• Sentiment analysis
• Churn analysis

Predictive Analytics - areas of application
Sales and service
• RFM analysis
• Next best action
• Spare parts forecasting and supply
• Preventive replacement of faulty parts in customer service
• Optimization of recall campaigns
• Warranty analysis
Public sector and healthcare
• Detection of unusual cases (tax fraud, social benefits fraud, money laundering, EU subsidies)
• Forecasting population migration between city and countryside
• Traffic jam forecasting, intelligent traffic flow control
• Vehicle planning and utilization in local public transport
• Choosing the appropriate form of therapy through efficacy predictions
• Police: staff deployment planning, prediction of hotspots
• Import/export controls
• Forensic psychology: predicting the effectiveness of treatment concepts, predicting the probability of reoffending

Example: targeted marketing for inbound phone contacts
Customer: “I'm calling because I wanted to ask about my download limit. How much of it have I already used?”
Agent: “Of course, Ms. Burghardt. Let me check…”
Agent: “Ms. Burghardt, you are just short of your 10 GB limit. As a valued long-standing customer, however, we can offer you a switch to our attractive Breitband-Unlimited plan.”
Next best action: recommend Breitband-Unlimited

Predictive analytics at the “moment of truth”
Screenshot of an agent screen showing churn risk, a customer value indicator and a cross-selling suggestion (sample suggestion in Dutch: “Het is voor U voordeliger om een sms voorraad 50 aan te sluiten.”, i.e. an SMS bundle of 50 would be cheaper for the customer)

E-commerce retailer optimizes the customer experience and increases success through optimized marketing campaigns
Challenges
• Understanding website behavior, deriving targeted measures and optimizing campaigns.
• Consolidating data from different systems in near real time.
Solution
• An analytical solution combines data from 17 business units and more than 70 brand websites into a complete, comprehensive customer view.
• Statistical and predictive analytics analyze a near-real-time stream of customer data, transactions, click streams, mobile app usage, online survey results and store transactions.
• Pattern recognition in customer behavior and support for the next best action.
Benefits
• The project paid for itself in seven months with 122% ROI.
• Cost reduction of more than EUR 500,000 and increased revenue.
• 90% reduction in effort and time for campaign management and data processing.
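Both the inbound-call example and the retailer case rely on a next-best-action decision. The following is a minimal rule-based sketch of such a decision, assuming a fixed usage threshold and a minimum tenure; the `Customer` fields, the thresholds and the action strings are hypothetical and are not taken from any IBM product. In practice the fixed rule would typically be replaced or complemented by scores from a predictive model.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical customer record used only for this illustration."""
    name: str
    used_gb: float
    limit_gb: float
    tenure_years: int


def next_best_action(customer, usage_threshold=0.9, min_tenure_years=3):
    """Pick an action from the share of the data limit already used.

    The thresholds are illustrative assumptions, not values from the slides.
    """
    usage_ratio = customer.used_gb / customer.limit_gb
    if usage_ratio >= usage_threshold and customer.tenure_years >= min_tenure_years:
        return "offer Breitband-Unlimited"
    if usage_ratio >= usage_threshold:
        return "inform about remaining volume"
    return "no offer"


# A long-standing customer close to a 10 GB limit, as in the call example.
caller = Customer(name="Burghardt", used_gb=9.5, limit_gb=10.0, tenure_years=8)
print(next_best_action(caller))
```

Keeping the decision in one small function makes it easy to call from a call-center screen or a web front end at the point of interaction.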
Richmond Police Department
Public sector: Police, Crime & Defense
Curbing crime with predictive analytics
Background & challenge
Faced with rising crime rates, the Richmond Police Department needed an efficient and cost-effective way to analyze crime data, identify threats to public safety and make intelligent staffing decisions.
Solution
The police therefore chose IBM SPSS predictive analytics software to implement a tool that integrates data from various sources into a data warehouse and detects hidden relationships in the data. This makes it possible, for example, to generate automatic crime forecasts.
Benefits
• Analysis of very large data collections and prediction of crime patterns
• The intelligence needed to curb crime
• Ability to allocate resources efficiently
• Reduction in violent crime of 32% from 2006 to 2007 and a further 40% from 2007 to 2008
Solution components
• IBM SPSS Statistics
• IBM SPSS Modeler

Use of predictive analytics across the entire product lifecycle
Lifecycle stages: development; early production series; production; logistics; marketing; sales; after sales; warranty
Use cases:
• Diagnosis of test data
• Forecasting service intervals
• Identification of related problems, causes and measures
• Analysis of dealer invoices and measures derived from it
• Avoidance of repeat repairs
• Automated evaluation of texts
• Product recommendations at the point of sale
• Automated surveys and direct analysis of the responses
• Production optimization and monitoring
• Determining customer needs through segmentation
• Use of telemetry data for early fault identification
• Lean Six Sigma
• Predictive maintenance and inventory management

Daimler AG: car manufacturer increases productivity in cylinder head production
Results
• 25 percent increase in productivity in Daimler's cylinder head production thanks to the insights gained with IBM SPSS.
• 50 percent shorter ramp-up phase of the manufacturing process until target values are reached.
• When thresholds are exceeded, the analyses enable fast localization of defect sources and targeted process interventions, and thus avoid scrap before it is produced.
Link to the reference (English): http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=AB&infotype=PM&appname=SWGE_YT_YV_WWEN&htmlfid=YTC03659WWEN&attachment=YTC03659WWEN.PDF
Link to the reference (German): http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=AB&infotype=PM&appname=SWGE_YT_YV_DEDE&htmlfid=YTC03659DEDE&attachment=YTC03659DEDE.PDF

Israel Electric Corporation
Israel Electric Corporation increased the efficiency of maintenance schedules, costs and resources, resulting in fewer outages and higher customer satisfaction.
The company
Israel Electric Corporation (IEC) is the primary electricity provider in Israel and is responsible for building, maintaining and operating the country’s power infrastructure.
Business need
IEC generates 95 percent of Israel’s electricity. To meet peak demand, its turbines need to run at full capacity, so it is vital to keep them online and running efficiently.
“Using IBM’s analytical tools has brought us significant savings, both by reducing the time taken to understand faults and by cutting the dollars spent on turbine failures and downtime.” (Dr. Moshe Shavit, CTO for Gas Turbines at IEC)
Solution
• Sophisticated analysis of machine behavior: The IEC team used IBM SPSS Modeler to perform cluster analyses of the data from each of the turbines and create a model of their “normal” behavior during start-up, steady-state and shut-down. With the baselines for each individual unit established, the team was able to compare their performance and begin identifying common problems.
• Moving towards preventive maintenance: Better root-cause analysis of past component failures enables IEC to move from a break-fix maintenance model to a more preventive approach.
• Improving safety: The turbines have an alarm built in by the manufacturer which is triggered 30 minutes before a major failure. With IBM SPSS Modeler, IEC can predict such an event 30 hours before it happens.
• Enhancing performance and fuel efficiency: Neural network techniques calculate expected values for each turbine (workload, fuel consumption and other conditions) and compare them with the actual values on a daily basis. If a large variance is detected, the control engineers are alerted immediately.
Key benefits
• Reduced costs by up to 20 percent by avoiding the need to restart turbines after an outage.
• Saved approximately USD 75,000 in fuel costs per turbine by identifying inefficient fuel usage.
• Provides early warning of certain types of failure up to 30 hours before they occur, instead of 30 minutes.
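The daily comparison of expected and actual turbine values described for IEC can be sketched as a simple deviation check. This is a minimal illustration under stated assumptions: the expected values are hard-coded here rather than produced by a neural network, the 10 percent tolerance is invented, and the function name `flag_deviations` and the turbine IDs are hypothetical. It does not use the IBM SPSS Modeler API.

```python
def flag_deviations(expected, actual, tolerance=0.10):
    """Return (turbine_id, relative_deviation) for every turbine whose
    actual value differs from the expected value by more than `tolerance`.

    `tolerance` is a relative threshold; 0.10 is an assumed example value.
    """
    alerts = []
    for turbine_id, expected_value in expected.items():
        actual_value = actual[turbine_id]
        relative_deviation = abs(actual_value - expected_value) / abs(expected_value)
        if relative_deviation > tolerance:
            alerts.append((turbine_id, round(relative_deviation, 3)))
    return alerts


# Hypothetical daily fuel consumption figures (arbitrary units).
expected_fuel = {"T1": 100.0, "T2": 120.0, "T3": 95.0}
actual_fuel = {"T1": 103.0, "T2": 150.0, "T3": 94.0}

alerts = flag_deviations(expected_fuel, actual_fuel)
print(alerts)  # [('T2', 0.25)]
```

In a setup like the one on the slide, the `expected` dictionary would be filled by a trained model per turbine, and the alert list would be forwarded to the control engineers.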
IBM SPSS Predictive Analytics product portfolio and architecture

IBM SPSS Predictive Analytics
• Capture, Predict, Act
• Data: transactions, demographics, interactions, opinions
• Capabilities: predictions; optimization and implementation in processes; real-time analytics; predictive modeling; data mining; text analytics; social network analysis; statistical analysis
• Products: Data Collection, Social Media Analytics, Statistics, Modeler, Analytic Server, Decision Management, Collaboration and Deployment Services
• Predictive Customer Analytics: acquire, grow, retain
• Predictive Operational Analytics: manage, maintain, maximize
• Predictive Threat & Fraud Analytics: monitor, detect, control

IBM SPSS Data Collection
• Survey technology for collecting the opinions, attitudes and satisfaction of customers, employees and suppliers
• Complements internally collected data to give a more complete view of the customer
• Provides a more accurate view of opinions and attitudes

IBM SPSS Social Media Analytics
• Comprehensive monitoring, analysis and reporting platform for social media insights
• Built-in big data capabilities
• Leading sentiment analysis and segmentation (geographics, demographics, influencers)
• Relationships (affinities, associations, causal chains)
• Impact analysis (share of voice, reach, sentiment)
• Exploration and discovery capabilities (topics, actors, sentiment)

IBM SPSS Statistics
• Data management and advanced statistics for analysts
• Collection, exploration, analysis, interpretation and presentation of data
• Offers deeper insights into samples and a large number of procedures for forecasting and analysis
• Huge user base from universities
• Increases confidence in results and decisions

IBM SPSS Modeler
• Complete workbench for data and text mining
• High degree of interactivity and user-friendliness
• Large number of algorithms for exploration and prediction
• Enables the discovery of new patterns and trends for further use in business processes
• Brings repeatability to decision processes

IBM SPSS Analytic Server
• Delivers fast time to solution for predictive analytics of big data
• Visual, easy-to-use interface abstracts analysts & line of business users from complexities of big data systems
• Big data predictive analytics on Hadoop

IBM SPSS Decision Management
• Business applications based on SPSS Modeler
• Fast productivity and automation
• GUI geared to the decision process
• Tailor-made applications for business users

IBM SPSS Predictive Analytics enhances other IBM technology
Diagram labels: Predictive Customer Analytics (acquire, grow, retain); Predictive Operational Analytics (manage, maintain, maximize); Predictive Threat & Fraud Analytics (monitor, detect, control); Data Collection; Social Media Analytics; Statistics; Modeler; Analytic Server; Decision Management; Collaboration and Deployment Services; IBM Research; etc.

IBM SPSS Modeler
Enables discovery of key insights, patterns & trends in data to optimize decisions
Data mining workbench
• Easy to use / visual
• Comprehensive set of algorithms
• Structured & unstructured data
• Supports the data mining process (CRISP-DM)
• Outstanding performance & scalability via SQL pushback, in-database processing & Hadoop MapReduce processing via Analytic Server
• Reproducible process delivering high productivity, quick time-to-solution & high ROI
Brings repeatability to ongoing decision making

In-database support with SPSS Modeler Server
• All the features of IBM SPSS Modeler
• Large volumes of data
• High performance
• Administration and security options
In-database via:
• SQL pushback
• In-database algorithms
• Scoring adapters
• SQL scoring

In-database scoring with SPSS Modeler
• Extension to current in-database capabilities, allowing more SPSS models to be scored in-database
• Improves the efficiency of scoring models by minimizing data movement and leveraging database capabilities
Supported for the following platforms
• Teradata (13 and above)
• PureData for Analytics (6.0 and above)
• DB2 for z/OS (DB2 Accessories Suite)
• DB2 (Linux, Unix, Windows)

Helper applications in SPSS Modeler
Modeler Server supports integration with data mining and modeling tools that are available from database vendors, including:
• IBM PureData for Analytics (Netezza)
• IBM DB2 InfoSphere Warehouse
• Oracle Data Miner
• Microsoft Analysis Services

SPSS Modeler and PureData for Analytics (Netezza)
Modeler supports integration with IBM PureData for Analytics, providing the ability to run data mining algorithms directly in the IBM PureData for Analytics environment from the Modeler user interface.
The following algorithms from PureData for Analytics are supported within Modeler Bayes Net Decision Trees Divisive Clustering Generalized Linear K-Means KNN Linear Regression Naive Bayes PCA Regression Tree Time Series 2 Step cluster 53 © 2014 IBM Corporation Big Data & Analytics Real-Time Analytics on Streaming Data Real Time Decisions Streaming Enhancement - Support for Forecasting (Time Series) Environment Monitoring ICU Monitoring Powerful Analytics Algo Trading Telco Churn Prediction Smart Grid Cyber Security Government / Law Enforcement Millions of Events per Second Microsecond Latency Traditional / Non-traditional Data Sources 54 © 2014 IBM Corporation Big Data & Analytics IBM SPSS Analytic Server Architecture Relational databases IBM SPSS Modeler IBM SPSS Modeler SQL & UDF Server Desktop Big data requests Analytics IBM SPSS Analytic Server 55 55 IBM Infosphere Biginsights © 2014 IBM Corporation InfoSphere BigInsights – Hadoop fit für den Einsatz im Unternehmen Live - Demonstration 56 © 2013 IBM Corporation Big Data & Analytics Live Demo – BigInsights, step 1 • Select file OR directory in the file browser • Choose the suitable reader (in this example JSON Object Reader) • Review the data in a table and create a „Master Workbook“ 57 © 2014 IBM Corporation Big Data & Analytics Live Demo – BigInsights, step 2 • Add and delete columns (in this example add „Datum“) • Implement analytics using function (fx) and add sheets • Create result sheet and add chart (if desired) 58 © 2014 IBM Corporation Big Data & Analytics Live Demo – BigInsights, step 3 • Choose a suitable chart type and add chart • Fill parameters in chart wizard to customize chart design • Review chart within BigSheets 59 © 2014 IBM Corporation Big Data & Analytics Live Demo – BigInsights, step 4 • Create or choose a dashboard • Add chart or content (Add Widget button) • Customize dashboard layout (size and position of the content) 60 © 2014 IBM Corporation Big Data & Analytics Live Demo – 
BigInsights, step 5 • • • • 61 Select the proper application (in the example Distributed File Copy_MR) Fill in the required parameters Select the BigSheets worksbook to update (Advanced Settings) Run the application © 2014 IBM Corporation Big Data & Analytics Live Demo – BigInsights, step 6 • View the status of the running application • View the automated start and status of all depended BigSheets workbooks • View the yellow warning triangle in the dashboard 62 © 2014 IBM Corporation Big Data & Analytics Live Demo – BigInsights, step 7 • View options of dashboard charts (links, settings, etc.) • Select a dashboard chart and klick ‚link to the corresponding workbook‘ • The workbook will be automatically opened 63 © 2014 IBM Corporation Big Data & Analytics Live Demo – BigInsights, step 8 • Select workbook and workflow icons • Explore content and navigation of workbook diagramm • Explore content and navigation of workflow diagramm 64 © 2014 IBM Corporation Big Data & Analytics Live Demo – BigInsights, step 9 • Push the „Create Table“ button within the BigSheets workbook • Enter target schema and table name • Confirm table creation 65 © 2014 IBM Corporation Big Data & Analytics Live Demo – BigInsights, step 10 • Open the Big SQL client (select „run Big SQL queries“ option in the BigInsights welcome tab ) • Enter SQL query • Run the SQL query and review results 66 © 2014 IBM Corporation Big Data & Analytics IBM BigInsights - Information IBM BigInsights Quick Start Edition Information and software download • Big Data University Information and courses about Big Data, BigInsights and more IBM Big Data YouTube channel including BigInsights Quick Start Tutorials • developerWorks technical resource and professional network for IT practitioners 67 © 2014 IBM Corporation IBM SPSS Predictive Analytics live Demonstration anhand von Beispielen aus den Bereichen Kundendatenanalyse und Analyse von Produktions-/ Fertigungsdaten in der Industrie 68 © 2013 IBM Corporation IBM SPSS 
Analytic Server Architecture
[Diagram: IBM SPSS Modeler Desktop and IBM SPSS Modeler Server; SQL & UDF to relational databases; big data requests to IBM SPSS Analytic Server; analytics on IBM InfoSphere BigInsights]

High-End Analytics with IBM SPSS Modeler
• Visual programming of analytical streams
• High degree of interactivity and user-friendliness
• Scalability through a client/server architecture
• Seamless interoperation with all common database systems
• Follows the CRISP-DM model for data mining

IBM SPSS Modeler
[Screenshot of the user interface with labels: menu bar, toolbar, streams/outputs/model manager, stream, canvas, project pane, palette, nodes, status]

Data Access, Preparation & Reporting (Overview)
• Data access: ODBC databases, flat files, …
• Data manipulation and preparation, including:
  – Data selection and transformation
  – Handling of missing or extreme values
  – Pre-processing, cleansing, queries
  – Setting types/roles
  – RFM analysis
  – Transformations
  – Feature selection (pre-selection for modeling)
• Model outputs are processed further in the same way as transformations
• Export of case-level information and scores as well as aggregated information

Interactive Graphical Ad Hoc Analyses for the Targeted Exploration of Discovered Relationships
• Exploratory graphics
• First insight into the data structure
• Also serve as interactive data preparation tools
• Discovery of relationships
• Gaining insight
• Support for further preparation
• Visualization of model results
[Figure: example charts (histogram, web graph, plot) with labels Banking, Home Insurance, Car Insurance, Exit Page, Current Account, Bonds, Homepage, Personal Loan, Savings, Credit Card]

Powerful Modeling Algorithms
• Classification and prediction: neural networks, C5.0, C&RT, CHAID, QUEST, regression (logistic, OLS, Cox), GZLM, time series, Decision List, discriminant analysis, SLRM, SVM, Bayesian networks; bagging and boosting of models possible
• Clustering:
Kohonen networks, K-Means, TwoStep, anomaly detection
• Association rules: Apriori, CARMA, sequence analysis
• Text mining
• Data reduction: factor analysis, feature selection
• Meta-modeling: automatic model selection (binary and numeric targets, clusters, time series models), comparison/combination of the results of several models
• In-database modeling

Live Demonstration: IBM SPSS Modeler Premium

IBM SPSS Modeler Premium
• Decision tree scoring rules and predictor importance
• Model evaluation shows that the model built on data combined with text mining concepts gives the best results (green line)

IBM SPSS Modeler: Decision Tree Methods Identify Characteristic Failure Patterns in Production
• Overall data: 32% of parts not OK
• If Öffnungszeit (opening time) > 1292, then 75.9% scrap
• If Öffnungszeit > 1292 and Kühlkreis22Durchfl.Max (maximum flow, cooling circuit 22) is between 0.9 and 1.1 and TempMax <= 386, then 90.3% not OK

IBM SPSS Analytic Server
Analytic Server delivers fast time to solution for big data analytics
• Delivers integrated support for predictive analytics on unstructured/semi-structured data
• Data-centric architecture ensures scalability & performance (analytics occur near the data)
• Leverages visual, easy-to-use interfaces that shield analysts from the complexities of big data systems; no coding is required
Shields analysts from the complexities of distributed big data systems

IBM SPSS Modeler with IBM SPSS Analytic Server
• Visual, easy-to-use interface shields analysts & line-of-business users from the complexities of big data systems
  – IBM SPSS Modeler
• Data mining and text analytics workbench to build predictive models without programming or coding

IBM SPSS Analytic Server
Access, import and export data directly from/to
Hadoop

Use Case for SPSS Modeler + SPSS Analytic Server: Leveraging Big Data to Ensure Quality Electric Service
• Background: A large electric utility has instrumented its network with a system of 10 million smart meters
• Business need: Ensure quality of service by balancing traditionally generated power with customer-generated power
• Users:
  – IBM SPSS Modeler clients working in a Hadoop environment
  – "Downstream" consumers who use the information to help them better manage their energy consumption
• Without IBM SPSS Analytic Server: No realistic way to analyze the massive data set derived from 10 million meters generating 9 billion records and 100 fields across all data sources
• With IBM SPSS Analytic Server:
  – "Looking ahead" over the next 48 hours to predict likely system imbalances and address them proactively
  – Taking into account weather conditions, on which customer-generated power is highly dependent

Use Case for SPSS Modeler + SPSS Analytic Server (cont'd): Leveraging Big Data to Ensure Quality Electric Service
• Background: Approximately 9 billion records and 100 fields across all data sources
• Without IBM SPSS Analytic Server: An aggregation and subsequent sampling of meter readings, leading to less accurate forecasts
• With IBM SPSS Analytic Server: Models built on individual meter readings
This Modeler stream (i.e. this predictive model) combines customer data (e.g.
rate plan), meter data (actual usage), and weather data to produce predicted individual usage

Architecture – SPSS Modeler, Analytic Server & BigInsights
[Diagram: Modeler Client, Modeler Server, relational database, IBM SPSS Analytic Server and IBM InfoSphere BigInsights, with connections labeled Stream File, SQL / UDF, Big Data Request, Hadoop Job and Analytics]
• Modeler Server utilizes Analytic Server for big data
• Analysts define analysis in a familiar & accessible workbench to conduct analysis, modeling & scoring over high volumes of varied data
• Federation of heterogeneous data sources to use legacy & external data in model building & scoring
• Transformations, sampling & write-back of output to big data systems

Introduction to and Outlook on New IBM Technologies and Trends: IBM Watson™ Analytics

Expectations from technology have never been higher
• Our work and personal lives have blurred
• It's an "always-on" world
• A do-it-yourself mentality now prevails

Leveraging analytics still faces many obstacles
• 38% have a limited understanding of how to use analytics
• 34% cannot find time to analyze data
• 24% find it difficult to get data
• The desire to make data-driven decisions is prevalent
• Making decisions rapidly is no longer a goal; it's an imperative
• Access to required data sources is critical while maintaining governed standards
Source: Analytics: The New Path to Value, a joint MIT Sloan Management Review and IBM Institute for Business Value study.
Copyright © Massachusetts Institute of Technology

Even a simple analytics project has multiple steps and people
• Steps: data access, data preparation, reporting, analysis, collaboration, validation
• People: business analysts, IT, business users, data scientists and statisticians

And it's rarely a straightforward process
[Diagram: the same steps and people as on the previous slide]

IBM Watson Analytics
• Put analytics in the hands of a broad range of users
• Make data access and refinement easier
• Deliver through the cloud for agility and speed
• Understand your business: automated intelligence accelerates your ability to answer questions
• Tell a story: visualizations support your decisions and communicate results
• Think ahead: predictive analytics reveals insights and opportunities
• Get better data: embedded information services provide data access and refinement
• Mobile-ready
• Secure

IBM Watson Analytics: Self-service analytics for business users and experts alike
• Business users
• Business analysts
• Data scientists
• IT

IBM Watson Analytics: Empowering the business for success
Examples:
• Marketing: campaign planning and ROI
• Sales: customer retention
• Finance: prioritizing accounts receivable
• IT: helpdesk case analysis
• Operations: warranty analysis
• HR: employee retention

Video: Watson Analytics
http://www.youtube.com/watch?v=IV7mVOI5Gug

IBM Watson Analytics
• Quick start, intuitive interface
• Natural language dialogue
• Data discovery
• Mobile-ready
• Cloud-based agility

IBM Watson Analytics
• Data access and refinement
• Intelligent automation
• Integrated social business
• Report and dashboard creation
• Visual storytelling
• Guided analytic discovery
• Unified analytics experience

IBM Watson Analytics
• Single Analytics Experience
• Fully Automated Intelligence
• Natural Language Dialogue
• Guided Analytic Discovery
Visit WatsonAnalytics.com and get started for free
http://www.ibm.com/analytics/watsonanalytics/

Legal Disclaimer
• © IBM Corporation 2014. All Rights Reserved.
• The information contained in this publication is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this publication, it is provided AS IS without warranty of any kind, express or implied. In addition, this information is based on IBM's current product plans and strategy, which are subject to change by IBM without notice. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this publication or any other materials. Nothing contained in this publication is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.
• References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in this presentation may change at any time at IBM's sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.
Further Information and Contact Details
• The presentations and further information on "Predictive Analytics meets Big Data" are available at www.ibm.com/events/spssbd
• General information on IBM SPSS solutions and Big Data is available at www.ibm.com/de/spss and www.ibm.com/software/products/de/category/bigdata
• The IBM SPSS and Big Data team can be reached at 0049-89-4504 2022 (SPSS) or 0049-7032-1549 116 (Big Data, Ms. Marta Musial).