PRO-OPT Components
Transcription
PRO-OPT Components
Big Data meets Big Data Wie die Integration von Big-Data-Lösungen über Unternehmensgrenzen gelingt Torsten Lenhart Fraunhofer IESE © Fraunhofer IESE Architekturen 2016 Hildesheim, 23.06.2016 The Fraunhofer-Gesellschaft at a Glance The Fraunhofer-Gesellschaft undertakes applied research of direct utility to private and public enterprise and of wide benefit to society. €2.1 billion Major infrastructure capital expenditure and defense research Almost 30% is contributed by the German federal and Länder Governments. €1.8 billion Contract Research 67 institutes and research units Finance volume 24,000 staff More than 70% is derived from contracts with industry and from publicly financed research projects. 2015 © Fraunhofer IESE Fraunhofer IESE The institute for software and systems engineering methods Founded in 1996, headquartered in Kaiserslautern Over 155 full-time equivalents (FTEs) Our solutions can be scaled flexibly and are suitable for companies of any size Our most important business areas: © Fraunhofer IESE Automotive and Transportation Systems Information Systems Automation and Plant Engineering Energy Management Health Care E-Government Our Competencies – for Your Benefit for innov ativ e s y s tem s S OFTWARE-ENABLED INNOVATIONS © Fraunhofer IESE Our Competencies – for Your Benefit ES /CPS S m art Ecos y s tem s IS /Mobile S OFTWARE-ENABLED INNOVATIONS © Fraunhofer IESE Motivation Data ABC AG Data BSP GmbH CTR SE MTR Inc Data © Fraunhofer IESE Data Motivation Car Manufaturer (OEM) ABC AG BSP GmbH © Fraunhofer IESE … CTR SE 1st Tier Suppliers … MTR Inc … 2nd Tier Suppliers … … … 3rd Tier Suppliers The research project PRO-OPT PRO-OPT aims at identifying valuable data and making it available for creating additional benefit for all members of a Smart Ecosystem. Project Lead, Technology and Evaluation Partner for Automotive Diagnostics Technical Project Lead, Research Partner for Access Restriction, SW Architecture & Data Quality Technology and Evaluation Partner for Production Systems Partners Data Supplier and Evaluation Partner Evaluation Partner and Data Supplier for Automotive Diagnostics Research Partner for Data Mining and Integration of System Components Project Duration: 01.01.2015 - 31.12.2017 © Fraunhofer IESE Multiplicator Visualization Main Challenge 1: Build a Technology Bridge How can we bridge the isolated data islands of the partners in the ecosystem so that analysis can be performed across these islands while the original owner of the data keeps control of it? PRO-OPT @ CeBIT Hannover, 14.03. – 18.03.2016 Main Challenge 2: Data Usage Control Data is the DNA of a company There are common, but also conflicting interests in the ecosystem Data access is often perceived as a binary decision → Fine granular access policies with additional control and protection mechanisms are a key success factor Data © Fraunhofer IESE Main Challenge 3: Substantiation of Benefits High-Level: Everyone agrees Concrete use cases are sometimes difficult to define But they are needed to justify investments & compromises Without knowing the concrete benefits, I don‘t allow access to my data A research project is well suited to resolve this deadlock © Fraunhofer IESE Without access to the data, we cannot substantiate the benefits S tep 0: Gathering an holis tic ov erv iew PRO-OPT Big Data Landscape © Fraunhofer IESE S tep 1: Identify ing potential pillars for the technology brigde Observation Rise of multi-purpose data processing engines: © Fraunhofer IESE Spark vs Flink – High-Level Comparison Flink Origin TU Berlin University of California, Berkeley Execution Model Directed Acyclic Graphs (DAGs) Directed Acyclic Graphs (DAGs) Streaming Support Native Micro-Batches Latest Stable Release 1.0.3 (11.05.2016) 1.6.1 (09.03.2016) Spreading Contributors ~ 200 ~ 900 Sprachen Scala, Java Scala, Python, (Java) Project Homepage http://flink.apache.org http://spark.apache.org Backing Company http://data-artisans.com https://databricks.com © Fraunhofer IESE S tep 2: Defining the ov erall architecture High-Level PRO-OPT Platform Architecture PRO-OPT Platform PRO-OPT Frontend PRO-OPT API PRO-OPT Bridge Company 1 Company k PRO-OPT Connector Catalog PEPs PDP Spark Adapter Company n PRO-OPT Connector Catalog PEPs PDP Flink Adapter PRO-OPT Connector Catalog PEPs Spark Adapter Flink Cluster © Fraunhofer IESE Cluster PDP Cluster PRO-OPT Components PRO-OPT Frontend UI for general analysis and reporting features Also used for catalog and platform management PRO-OPT API Interface for defining PRO-OPT programs Inspired by the Spark and Flink API – but detailed structure is work in progress PRO-OPT Bridge Acts as some kind of pre-compiler and dispatcher Conveys the PRO-OPT programs into one or multiple Spark and/or Flink programs Passes the Spark and/or Flink programs together with the identity of the originator to the respective PRO-OPT connectors Collects the results from the connectors and applies some additional processing to generate the final result © Fraunhofer IESE PRO-OPT Components Catalogue Contains a list & description of all data sources of this ecosystem member Private: Policy for each data source with fine granular usage rules: Who is allowed to use the data? How often can it be used? Wich parts can be used? What parts can be actually returned? Do additional measures have to be applied (e.g. pseudonymisation)? ... Catalogue Datenquellen Datenquellen Data Sources Descriptor Public Private © Fraunhofer IESE Metadata Policies PRO-OPT Components PRO-OPT Connector Receives Spark and/or Flink programs (dependent on the cluster that is installed at this particular ecosystem member) Performs a pre-processing step (e.g. replacing data source ids with actual addresses) Enforces the data usage control rules defined through the repective policies by applying Policy Enforcement Points (PEP) and Policy Decision Points (PDP) Data usage control is based on the Fraunhofer IESE IND2UCE Framework Integrated Distributed Data Usage Control Enforcement © Fraunhofer IESE Sample Scenario: Warranty Claim Process warranty_claims_oem.csv: Problem Analysis vin;itemid;description;… XYZ000051T2123456;I2726373;…;… XYZ000068K3526889;I9102889;…;… Warranty Claim XYZ000052T2671171;I2727384;…;… … ? Cassandra table items: OEM (Car Manufacturer) type Supplier I2726373 Compensation or Rejection desc P2728-C … part1 … … I2724415 type desc P2231-A … part1 part2 part3 … … … I2726329 type desc P2728-C … part1 part2 … … … © Fraunhofer IESE OEM Catalog { OEM Catalog "id": 126, DataSource inspection_logs "name":"configs", "description":"Information on car orders", DataSource configs "type":"structured", "columns":{ DataSource quantities "vin":{ … "description":"Vehicle Identification Number", "type":"String" }, "dealer":{ "description":"Id of the dealer", "type":"integer" }, "enginetype":{ "description":"Type of the engine (diesel, petrol, etc)", "type":"String" }, […] }, "restrictions":{ […] } } © Fraunhofer IESE OEM Catalog { OEM Catalog "id": 126, DataSource inspection_logs […] "restrictions":{ DataSource configs "limitrows": { "max": 100 DataSource quantities }, … "frequency": { "maxperday": 1, "maxpermonth": 10 }, „pseudonymize": { "column" : "dealer" } } } © Fraunhofer IESE Sample Scenario: Warranty Claim Process warranty_claims_oem.csv: Problem Analysis vin;itemid;description;… XYZ000051T2123456;I2726373;…;… XYZ000068K3526889;I9102889;…;… Warranty Claim XYZ000052T2671171;I2727384;…;… … Cassandra table items: Zulieferer Data source configs: OEM OEM VIN Dealer Enginetype … (Car Manufacturer) XYZ000051T2123456 122 Diesel … (Automobilhersteller) XYZ000051T2123457 XYZ000051T2123458 … 256 122 Petrol Perrol … … Compensation or Rejection I2726373 type desc P2728-C … part1 … … I2724415 type desc P2231-A … part1 part2 part3 … … … I2726329 type desc P2728-C … part1 part2 … … … © Fraunhofer IESE Sample Program Creates or invokes PRO-OPT Program PRO-OPT Read warranty_claims_oem.csv → ProOptDataSet<String> input Map input → ProOptDataSet<WarrantyClaim> claims Extract List<String> vins from claims Read 126 → →DataFrame DataFrameordersDF ordersDF Readsource sourcewith fromidpath Read source with fromidpath 89 → DataFrame itemsDF Filter ordersDF bydealer vins →→ ordersDF Anonymize column ordersDF Filter itemsDF by itemIds → itemsDF Collect ordersDF → Row[] orders Collect itemsDF → Row[] items © Fraunhofer IESE PRO-OPT Limit orders → orders Analyze result → ECU for petrol engines used for Diesel engine Join orders, items, claims → ProOptDataSet<WarrantyClaim,Order,Item> result Supplier OEM CSV Extract List<String> itemIds from claims Status & Outlook Implementation UI, API & Bridge (ongoing) Connector (ongoing) Spark Adaptor (ongoing) Flink Adaptor (starts soon) Extension of Catalog & Policy Language Other Topics: Data Quality Crowd Sourcing … Not in scope for now: privacy issues © Fraunhofer IESE Contact Tors ten Lenhart [email protected] Fraunhofer IES E Fraunhofer-Platz 1 67663 Kaiserslautern Germany Phone +49 631/6800-0 www.iese.fraunhofer.de © Fraunhofer IESE Image Credits S lide Des cription Author Link/Inform ation Licens e 1 Data World Map Geralt https://pixabay.com/de/bin%C3%A4r-eins-null-kontinente-erde-1414315/ CC0 1.0 7 Car DSA GmbH © DSA GmbH – used with kind approval of DSA GmbH Proprietary 9 Golden Gate Unsplash https://pixabay.com/de/golden-gate-br%C3%BCcke-san-francisco-388917/ CC0 1.0 10 Remote control JJuni https://pixabay.com/en/remote-control-one-trillion-kinds-1143461/ CC0 1.0 10 Crown OpenClipartVectors https://pixabay.com/de/krone-golden-gelb-kaiser-zubeh%C3%B6r-576226/ CC0 1.0 11 Chicken silhouette ClkerFreeVectorImages https://pixabay.com/de/hen-huhn-gefl%C3%BCgel-bauernhof-tier-311285/ CC0 1.0 12 Earth from space Skeeze https://pixabay.com/en/panorama-earth-canada-landscape-1241289/ CC0 1.0 14 Under the bridge Unsplash https://pixabay.com/en/bridge-river-under-cityscape-1149241/ CC0 1.0 17 Blueprint Wokandapix https://pixabay.com/en/blueprint-ruler-architecture-964630/ CC0 1.0 22 Sheet icon Paomedia http://www.iconarchive.com/show/small-n-flat-icons-by-paomedia.html PD 22 Factory IconsMind https://www.iconsmind.com/ Custom 22 ECU DSA GmbH © DSA GmbH – used with kind approval of DSA GmbH Proprietary 22 Test bed DSA GmbH © DSA GmbH – used with kind approval of DSA GmbH Proprietary 26 Bulb IconLeak http://iconleak.com http://www.iconarchive.com/show/or-icons-by-iconleak/light-bulb-icon.html Custom 27 Fernrohr Hans https://pixabay.com/de/fernrohr-durchblick-aussicht-blick-122960/ CC0 1.0 © Fraunhofer IESE