- MarkLogic

Transcription

- MarkLogic
JOE PASQUA
SVP PRODUCT STRATEGY, MARKLOGIC
possible
SLIDE: 2
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Schema-agnostic
Database
Hierarchical Data Model

MarkLogic Server is a document-centric database

Supports any-structured data via hierarchical data model

Stores compressed binary trees
Document
Trade Cashflows
Title
Metadata
Author
First
Last
Section
SLIDE: 4
Payment
Party
Identifier Net Payment Date
Section
Section
Section
Section
Party
Reference
Trade
ID
Payment
Payer
Party Receiver Amount
Party
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
MarkLogic is Schema-Agnostic
JSON and XML are self-describing:
<SAR>
near airport </title>
<title> Suspicious vehicle
vehicle…
<date>2012-11-12Z</date>
<type> observation/surveillance </type>
<threat>
<type> suspicious activity </type>
<category> suspicious vehicle </category>
</threat>
<location>
<lat>37.497075</lat>
<long>-122.363319</long>
</location>
with license plate ABC 123 was observed parked behind the airport sign…
<description> A blue van
van…
<triple> <subject> IRIID</subject> <predicate> isa</predicate> <object>license-plate </object></triple>
<triple> <subject> IRIID</subject> <predicate> value </predicate> <object>ABC 123</object></triple>
</description>
</SAR>
SLIDE: 5
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
MarkLogic is Schema-Agnostic
<SAR>
<description>
<title>
Suspicious vehicle…
<type>
<location>
<date>
2012-11-12Z
<threat>
<lat>
37.497075
observation/surveillance
<long>
-122.363319
<type>
<category>
suspicious activity
suspicious vehicle
SLIDE: 6
<triple>
A blue van…
<triple>
<subject>
IRIID
<predicate>
isa
<object>
<predicate>
ABC 123
<subject> value
IRIID
<object>
license-plate
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Scale Out Architecture
MarkLogic Architecture
Java
XQuery
REST
ODBC
Interfaces
Evaluator
XSLT | XPath | XQuery | SQL | SPARQL
Cache
Expanded Tree Cache | Module Cache
Evaluation Layer
Broadcaster / Aggregator
Transaction Controller
Multiversion Concurrency Control
Cache
Transaction Journal
Compressed Tree Cache | List Cache
Indexes
Data Layer
Values | Structure | Text | Scalar | Metadata | Security | Geospatial | Reverse | Triple
Compressed Storage
XML | JSON | Binary | Text
SLIDE: 8
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Cluster Topologies
RDBMS
XA
SLIDE: 9
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
ACID Transactions
MVCC
/articles/doc1.xml
/articles/doc1.xml
Document
Document
Title
First
Section
Title
Author
Last
Section
Section
Section
First
Section
Section
628
∞
423
∞
C
Creation Timestamp
D
Deletion Timestamp
SLIDE: 11
Metadata
Section
628
∞
Author
Last
Section
Metadata
Year
Section
Section
Section
Section
∞
MVCC Benefits:

ACID transactions

Zero-latency search indexing

High throughput

Lock-free reads

Point-in-time query

Fast database rollback

Enables serial writes
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Search & Query
Universal Index
Term
Which vetted SAR reports contain
the phrase blue van?
Term List
“blue”
123, 125, 129, 130, 152, 344, …
“van”
123, 125, 126, 129, 130, 152, …
“observed”
125, 152, 516, 522, 765, 890, …
“blue van”
123, 125, 129, 130, 152, 486, …
STEM “observe”
125, 152, 516, 522, 765, 890, …
<SAR>
…
<SAR>/<location>
<threat>/<category>
Document
References
125, 516, 890, …
MarkLogic indexes…

Words
…

Phrases
…

Stemmed words and phrases

Structure
<type>suspicious activity</type> …
<date>2012-11-12Z</date>
…
Collection:Vetted
…

Words and phrases in the context of structure
Role:Analyst + Action:Read
…
…
…

Values
…
…

Collections
…
…

Security Permissions
SLIDE: 13
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Which vetted SAR reports containing the phrase blue
van were submitted before 2013?
Range Index
Term
Term List
“blue”
123, 125, 129, 130, 152, 344, …
“van”
123, 125, 126, 129, 130, 152, …
“observed”
125, 152, 516, 522, 765, 890, …
“blue van”
123, 125, 129, 130, 152, 486, …
STEM “observe”
125, 152, 516, 522, 765, 890, …
<SAR>
…
<SAR>/<location>
…
<threat>/<category>
…
<type>suspicious activity</type> …
<date>2012-11-12Z</date>
…
Collection:Vetted
…
Role:Analyst + Action:Read
…
…
…
…
…
…
…
SLIDE: 14
Document
References
125, 516, 890, …
Range indexes map
document IDs to values,
and vice-versa in a compact
in-memory representation.
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Triple
IndexIndex
Geospatial
Term
Term List
“blue”
123, 125, 129, 130, 152, 344, …
“van”
123, 125, 126, 129, 130, 152, …
“observed”
125, 152, 516, 522, 765, 890, …
“blue van”
123, 125, 129, 130, 152, 486, …
STEM “observe”
125, 152, 516, 522, 765, 890, …
<SAR>
…
<SAR>/<location>
…
<threat>/<category>
…
<type>suspicious activity</type> …
<date>2012-11-12Z</date>
…
Collection:Vetted
…
Role:Analyst + Action:Read
…
…
…
…
…
…
…
SLIDE: 15
SAR reports
blue
vanbefore
from 2013
Which vetted SARs
about aabout
blue avan
from
before
2013
refer refer
to a location
the
airport?
with
this
location
to partialnear
plate
ABC?
Document
References
125, 516, 890, …
The Geospatial index is like a
2D range index, with built-in
query support for point, box,
circle, and complex polygons.
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Which vetted SARs about a blue van from before 2013
with this location refer to partial plate ABC?
Triple Index
Term List
123, 125, 129, 130, 152, 344, …
123, 125, 126, 129, 130, 152, …
Document
References
125, 516, 890, …
125, 152, 516, 522, 765, 890, …
123, 125, 129, 130, 152, 486, …
125, 152, 516, 522, 765, 890, …
…
…
The Triple index is an index of
“facts” expressed as Semantic
triples. It can efficiently query
and join billions of “linked
data” triples.
…
> …
…
…
…
…
…
…
SLIDE: 16
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Alerting
Alerting: What and Why

Alerting gives you an immediate notification when some condition is met; e.g.

Tell me any time a home is sold to someone who lives outside the country

You may have huge numbers of these alerts – each one is a query

It is impractical to keep running them repeatedly to get real time alerting

MarkLogic makes it fast with Reverse Indexes
SLIDE: 18
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Reverse Indexes

We can save a query like any other document

As a result, the query itself gets indexed

Any time a new document is added or an old document is updated, we can find
all possible matching queries in the index

We use a “unified expression tree” to finish the job extremely efficiently

Once again, indexes are at the heart of MarkLogic
SLIDE: 19
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Tiered Storage
Tiered Storage
Here’s how you enable tiered storage…

Define data tiers based on a range index

Have content balanced into forests by tier

Move an entire tier to different storage

Query one tier…
…or the other tier…
…or both at once!
All with no downtime, and 100% consistency!
On your own storage, HDFS or even S3
SLIDE: 21
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
MarkLogic 7: Feature-rich Data Platform
Powerful
POWERFUL
Deliver more value, build more powerful applications
Deliver more value, build more powerful applications
Semantics:
RDF &
SPARQL
Flexible
Indexes
Triple
Index
JSON
Storage
Full Text
Search
AGILE
TRUSTED
Prepare for and respond quickly to change
Enterprise-ready and secure for mission-critical apps
ACID
Transactions
XA
Distributed
Transactions
Common
Criteria
Security
Certification
Role-based
Security &
LDAP
Support
Configuration
Management
Monitoring
&
Management
Automated
Failover
Backup/
Restore
SQL
Support
Point-intime
Recovery
Replication
BI
Integration
Journal
Archiving
Database
Rollback
SchemaAgnostic
Elastic
Tiered
Storage
Cloud
Ready
Alerting
& Event
Processing
mlcp
Content
Pump
Programmatic
Controls &
Metering
Scalable
Geospatial
Query
HDFS &
Amazon S3
Storage
Application
Builder
REST &
Java APIs
In-database
MapReduce
Hadoop
Connector
Analytic
Functions
Visualization
Widgets
Information
Studio
SLIDE: 22
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.