- MarkLogic
Transcription
- MarkLogic
JOE PASQUA SVP PRODUCT STRATEGY, MARKLOGIC possible SLIDE: 2 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Schema-agnostic Database Hierarchical Data Model MarkLogic Server is a document-centric database Supports any-structured data via hierarchical data model Stores compressed binary trees Document Trade Cashflows Title Metadata Author First Last Section SLIDE: 4 Payment Party Identifier Net Payment Date Section Section Section Section Party Reference Trade ID Payment Payer Party Receiver Amount Party © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. MarkLogic is Schema-Agnostic JSON and XML are self-describing: <SAR> near airport </title> <title> Suspicious vehicle vehicle… <date>2012-11-12Z</date> <type> observation/surveillance </type> <threat> <type> suspicious activity </type> <category> suspicious vehicle </category> </threat> <location> <lat>37.497075</lat> <long>-122.363319</long> </location> with license plate ABC 123 was observed parked behind the airport sign… <description> A blue van van… <triple> <subject> IRIID</subject> <predicate> isa</predicate> <object>license-plate </object></triple> <triple> <subject> IRIID</subject> <predicate> value </predicate> <object>ABC 123</object></triple> </description> </SAR> SLIDE: 5 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. MarkLogic is Schema-Agnostic <SAR> <description> <title> Suspicious vehicle… <type> <location> <date> 2012-11-12Z <threat> <lat> 37.497075 observation/surveillance <long> -122.363319 <type> <category> suspicious activity suspicious vehicle SLIDE: 6 <triple> A blue van… <triple> <subject> IRIID <predicate> isa <object> <predicate> ABC 123 <subject> value IRIID <object> license-plate © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Scale Out Architecture MarkLogic Architecture Java XQuery REST ODBC Interfaces Evaluator XSLT | XPath | XQuery | SQL | SPARQL Cache Expanded Tree Cache | Module Cache Evaluation Layer Broadcaster / Aggregator Transaction Controller Multiversion Concurrency Control Cache Transaction Journal Compressed Tree Cache | List Cache Indexes Data Layer Values | Structure | Text | Scalar | Metadata | Security | Geospatial | Reverse | Triple Compressed Storage XML | JSON | Binary | Text SLIDE: 8 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Cluster Topologies RDBMS XA SLIDE: 9 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. ACID Transactions MVCC /articles/doc1.xml /articles/doc1.xml Document Document Title First Section Title Author Last Section Section Section First Section Section 628 ∞ 423 ∞ C Creation Timestamp D Deletion Timestamp SLIDE: 11 Metadata Section 628 ∞ Author Last Section Metadata Year Section Section Section Section ∞ MVCC Benefits: ACID transactions Zero-latency search indexing High throughput Lock-free reads Point-in-time query Fast database rollback Enables serial writes © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Search & Query Universal Index Term Which vetted SAR reports contain the phrase blue van? Term List “blue” 123, 125, 129, 130, 152, 344, … “van” 123, 125, 126, 129, 130, 152, … “observed” 125, 152, 516, 522, 765, 890, … “blue van” 123, 125, 129, 130, 152, 486, … STEM “observe” 125, 152, 516, 522, 765, 890, … <SAR> … <SAR>/<location> <threat>/<category> Document References 125, 516, 890, … MarkLogic indexes… Words … Phrases … Stemmed words and phrases Structure <type>suspicious activity</type> … <date>2012-11-12Z</date> … Collection:Vetted … Words and phrases in the context of structure Role:Analyst + Action:Read … … … Values … … Collections … … Security Permissions SLIDE: 13 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Which vetted SAR reports containing the phrase blue van were submitted before 2013? Range Index Term Term List “blue” 123, 125, 129, 130, 152, 344, … “van” 123, 125, 126, 129, 130, 152, … “observed” 125, 152, 516, 522, 765, 890, … “blue van” 123, 125, 129, 130, 152, 486, … STEM “observe” 125, 152, 516, 522, 765, 890, … <SAR> … <SAR>/<location> … <threat>/<category> … <type>suspicious activity</type> … <date>2012-11-12Z</date> … Collection:Vetted … Role:Analyst + Action:Read … … … … … … … SLIDE: 14 Document References 125, 516, 890, … Range indexes map document IDs to values, and vice-versa in a compact in-memory representation. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Triple IndexIndex Geospatial Term Term List “blue” 123, 125, 129, 130, 152, 344, … “van” 123, 125, 126, 129, 130, 152, … “observed” 125, 152, 516, 522, 765, 890, … “blue van” 123, 125, 129, 130, 152, 486, … STEM “observe” 125, 152, 516, 522, 765, 890, … <SAR> … <SAR>/<location> … <threat>/<category> … <type>suspicious activity</type> … <date>2012-11-12Z</date> … Collection:Vetted … Role:Analyst + Action:Read … … … … … … … SLIDE: 15 SAR reports blue vanbefore from 2013 Which vetted SARs about aabout blue avan from before 2013 refer refer to a location the airport? with this location to partialnear plate ABC? Document References 125, 516, 890, … The Geospatial index is like a 2D range index, with built-in query support for point, box, circle, and complex polygons. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Which vetted SARs about a blue van from before 2013 with this location refer to partial plate ABC? Triple Index Term List 123, 125, 129, 130, 152, 344, … 123, 125, 126, 129, 130, 152, … Document References 125, 516, 890, … 125, 152, 516, 522, 765, 890, … 123, 125, 129, 130, 152, 486, … 125, 152, 516, 522, 765, 890, … … … The Triple index is an index of “facts” expressed as Semantic triples. It can efficiently query and join billions of “linked data” triples. … > … … … … … … … SLIDE: 16 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Alerting Alerting: What and Why Alerting gives you an immediate notification when some condition is met; e.g. Tell me any time a home is sold to someone who lives outside the country You may have huge numbers of these alerts – each one is a query It is impractical to keep running them repeatedly to get real time alerting MarkLogic makes it fast with Reverse Indexes SLIDE: 18 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Reverse Indexes We can save a query like any other document As a result, the query itself gets indexed Any time a new document is added or an old document is updated, we can find all possible matching queries in the index We use a “unified expression tree” to finish the job extremely efficiently Once again, indexes are at the heart of MarkLogic SLIDE: 19 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Tiered Storage Tiered Storage Here’s how you enable tiered storage… Define data tiers based on a range index Have content balanced into forests by tier Move an entire tier to different storage Query one tier… …or the other tier… …or both at once! All with no downtime, and 100% consistency! On your own storage, HDFS or even S3 SLIDE: 21 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. MarkLogic 7: Feature-rich Data Platform Powerful POWERFUL Deliver more value, build more powerful applications Deliver more value, build more powerful applications Semantics: RDF & SPARQL Flexible Indexes Triple Index JSON Storage Full Text Search AGILE TRUSTED Prepare for and respond quickly to change Enterprise-ready and secure for mission-critical apps ACID Transactions XA Distributed Transactions Common Criteria Security Certification Role-based Security & LDAP Support Configuration Management Monitoring & Management Automated Failover Backup/ Restore SQL Support Point-intime Recovery Replication BI Integration Journal Archiving Database Rollback SchemaAgnostic Elastic Tiered Storage Cloud Ready Alerting & Event Processing mlcp Content Pump Programmatic Controls & Metering Scalable Geospatial Query HDFS & Amazon S3 Storage Application Builder REST & Java APIs In-database MapReduce Hadoop Connector Analytic Functions Visualization Widgets Information Studio SLIDE: 22 © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.