Reactive Application Design
For High-Volume Multidimensional Temporal Data Series
@pidster
Stuart Williams == ‘Pid’
• Lead Engineer, RTI
• SpringSource / VMware / Pivotal
• A little OSS: Apache, Eclipse
Mumbling Isn’t a Sign of
Laziness — It’s a Clever
Data-Compression Trick
Source: http://nautil.us/blog/mumbling-isnt-a-sign-of-lazinessits-a-cleverdata_compression-trick
Practical
• We built a real-time* application (RTI)
– Using Spring (yes! I know!)
– Share some lessons
– Do some demos
– Reveal a secret go-faster switch!
* for some definition of real-time
Built with…
• Spring IO Platform
– Boot, Data, Integration, Reactor, AMQP, SpEL, Shell (and a little Groovy)
• GemFire, RabbitMQ
• C24
Questions…
• Do you know your system load & input rates?
– No.
– Yes!
• Up to 1k/s?
• Up to 5k/s?
• Up to 10k/s?
• Up to 100k/s?
• Up to 1M/s?
• Above 1M/s?
Questions…
• Heard of Spring Integration?
– Tried it?
– Used it in production?
• Heard of Spring Reactor?
– Tried it?
– Used it in production?
Some discussion about…
DESIGN
Design Goals & Challenges
• High throughput
– versus
• Low latency
• Enable low-impact analysis on live streams
– We don’t know in advance what this will be…
– User accessible API for analytics
Many whiteboards later…
Show me the…
BIG PICTURE!
The Big Picture
[Diagram. Tiers: Ingestion & Filtering; Analytics & Distribution; End-user / Consumers. Labels: Firehose, AMQP, Ingester, Ingest Grid, Distribution, Analytics Stream, Analytics Queries, Metrics, Databases, End User Applications, HTTP (twice), Queries (Unstructured, Structured), Reactive expressions, "Query, Adapt, React" feedback loop.]
Input Data Rates
RTI
• 100k/s baseline
• ~120k/s daily peak
• >1M/s annual peak
Twitter*
• 6k/s average
• 9k/s daily peak
• 30k/s large events
OK, so Twitter’s internal fan-out & timeline access rates & storage problems are vastly different!
(see also Redis…)
* Source: @catehstn
twitter.com/catehstn/status/494918021358813184
Load Characteristics
• Low numbers of inbound connections
• High rates, micro-bursts
• Occasional peaks of nearly 2x, rare peaks of 10x
• Variable payload size (200B – 300KB)
• Internal fan-outs multiply event rates
More statistics…
• 100k/s order of magnitude
– 8,640,000,000 (per day)
– An Integer-based counter will ‘roll over’ in ~6 hours at a sustained 100k/s
• 400Mbps of raw data
– 10Gbps NICs required to support traffic peaks
– Logging! Verbose errors can fill a disk quickly
– Queues backing up == #fail
• Upcoming 10x existing rates!
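The per-day volume and the counter rollover concern can be checked with simple arithmetic. The following is a minimal sketch, assuming a constant sustained rate (real traffic is bursty) and a signed 32-bit counter starting at zero; the class and method names are hypothetical and not taken from RTI.

```java
import java.time.Duration;

/** Back-of-envelope arithmetic for sustained event rates (illustrative only). */
final class RateMath {
    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400

    /** Events seen in one day at a constant rate; long arithmetic avoids int overflow. */
    static long eventsPerDay(long eventsPerSecond) {
        return eventsPerSecond * SECONDS_PER_DAY;
    }

    /** Whole seconds until a signed 32-bit counter starting at zero reaches Integer.MAX_VALUE. */
    static Duration timeToIntRollover(long eventsPerSecond) {
        return Duration.ofSeconds(Integer.MAX_VALUE / eventsPerSecond);
    }

    public static void main(String[] args) {
        long rate = 100_000L; // the order of magnitude quoted on the slide
        System.out.println("events/day: " + eventsPerDay(rate));
        System.out.println("int counter rollover after: " + timeToIntRollover(rate));
    }
}
```

Counting with a long (or java.util.concurrent.atomic.LongAdder under contention) removes the rollover concern at these rates.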
What’s all this fuss about…
REACTIVE APPLICATIONS?
Reactive Applications
www.reactivemanifesto.org
Responsive
Depends on
Scalable
Resilient
Depends on
Event (or message) driven
Reactive Streams
• Collaboration between key industry players
Reactive Streams: Specification
• Semantics
– Single document listing all rules
– Open enough to allow for various patterns
• 4 API interfaces
• TCK to verify implementation behaviour
Reactive Streams
github.com/reactive-streams
org.reactivestreams.Processor
org.reactivestreams.Publisher
org.reactivestreams.Subscriber
org.reactivestreams.Subscription
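To show how the four interfaces cooperate, here is a minimal back-pressure sketch. It is not code from the talk: it assumes JDK 9 or later and uses java.util.concurrent.Flow, whose nested interfaces mirror the four org.reactivestreams ones, so that the example needs no external dependency. BackpressureSketch and OneAtATimeSubscriber are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

final class BackpressureSketch {

    /** Subscriber that requests one item at a time, so the publisher cannot outrun it. */
    static final class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);              // signal demand for the first item
        }
        @Override public void onNext(Integer item) {
            received.add(item);
            subscription.request(1);   // ask for the next item only when ready
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    /** Publishes 1..count and returns what the subscriber received, in order. */
    static List<Integer> run(int count) {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) {
                publisher.submit(i);   // blocks while the subscriber's buffer is full
            }
        }                              // close() signals onComplete after pending items
        try {
            if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("subscriber did not complete in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

The key point is that demand flows upstream through Subscription.request(n), while data flows downstream through onNext; a slow consumer simply requests less.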
A quick look at the…
REACTIVE STREAMS API
Spring Reactor
LMAX Disruptor – a RingBuffer
reactor.bus.EventBus
reactor.core.Dispatcher
reactor.rx.Stream
A quick look at the…
REACTOR API
Spring Integration
Enterprise Integration Patterns
See http://www.enterpriseintegrationpatterns.com/ by Hohpe & Woolf
Messages
Channels
Endpoints
(pipes & filters architecture)
SI Pipeline Example
A quick look at the…
SPRING INTEGRATION JAVA DSL
Spring Integration Performance
• 3.x
– Take out all the SpEL
• 4.x
– 4.0: Improved
– 4.1 (Q4 2014): Put back all the SpEL
– 4.2 (late 2015): Rather good
Back to the…
BIG PICTURE AGAIN
The Big Picture
[Diagram: the Big Picture repeated, annotated with "Key Reactor usages".]
Reactor Usage
• UDP/TCP Servers (or clients)
• Outputs – batching
• Dispatchers & Streams
– Expression evaluation engine
Spring Integration + Reactor
• Batching
– Adaptive sizing
• Routing
– With … batching
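The slides do not say how the adaptive sizing works, so the following is only a sketch of one possible policy, written in plain Java rather than with Spring Integration or Reactor APIs: grow the batch size while a backlog remains, and shrink it when a batch comes back underfilled. AdaptiveBatcher, its bounds and the doubling/halving rule are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Single-threaded batcher whose batch size adapts to the backlog (hypothetical policy). */
final class AdaptiveBatcher<T> {
    private final Queue<T> pending = new ArrayDeque<>();
    private final int minSize;
    private final int maxSize;
    private int batchSize;

    AdaptiveBatcher(int minSize, int maxSize) {
        if (minSize < 1 || maxSize < minSize) {
            throw new IllegalArgumentException("require 1 <= minSize <= maxSize");
        }
        this.minSize = minSize;
        this.maxSize = maxSize;
        this.batchSize = minSize;
    }

    void offer(T item) {
        pending.add(item);
    }

    int currentBatchSize() {
        return batchSize;
    }

    /** Drains up to the current batch size, then adapts the size for the next call. */
    List<T> nextBatch() {
        List<T> batch = new ArrayList<>(batchSize);
        while (batch.size() < batchSize && !pending.isEmpty()) {
            batch.add(pending.poll());
        }
        if (!pending.isEmpty()) {
            // Still backlogged: double the batch size, capped at maxSize.
            batchSize = (batchSize <= maxSize / 2) ? batchSize * 2 : maxSize;
        } else if (batch.size() < batchSize) {
            // Underfilled batch: halve the batch size, floored at minSize.
            batchSize = Math.max(minSize, batchSize / 2);
        }
        return batch;
    }
}
```

Larger batches favour throughput during bursts, while smaller batches keep latency low when traffic is light, which matches the throughput-versus-latency tension named in the design goals.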
An example or two…
SPRING INTEGRATION + REACTOR
But what about the…
TEMPORAL DATA SERIES?
Temporal Data
Expressions
RingBuffer
Old Data
New Data
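The idea of a ring buffer holding a sliding window of temporal data, with new data overwriting old data and expressions evaluated over what remains, can be sketched as follows. This is a hypothetical single-threaded illustration, not Reactor's RingBuffer or the LMAX Disruptor; TemporalRingBuffer and its methods are invented names.

```java
import java.util.function.DoubleBinaryOperator;

/** Fixed-capacity window over the most recent samples; new data overwrites the oldest. */
final class TemporalRingBuffer {
    private final long[] timestamps;
    private final double[] values;
    private long written; // total number of samples ever added

    TemporalRingBuffer(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.timestamps = new long[capacity];
        this.values = new double[capacity];
    }

    /** Stores a sample in the next slot, wrapping around when the buffer is full. */
    void add(long timestampMillis, double value) {
        int slot = (int) (written % values.length);
        timestamps[slot] = timestampMillis;
        values[slot] = value;
        written++;
    }

    /** Number of samples currently retained. */
    int size() {
        return (int) Math.min(written, values.length);
    }

    /** Timestamp of the oldest retained sample. */
    long oldestTimestamp() {
        int n = size();
        if (n == 0) {
            throw new IllegalStateException("buffer is empty");
        }
        return timestamps[(int) ((written - n) % values.length)];
    }

    /** Folds an expression over the retained values, oldest first. */
    double reduce(double identity, DoubleBinaryOperator expression) {
        double acc = identity;
        for (long i = written - size(); i < written; i++) {
            acc = expression.applyAsDouble(acc, values[(int) (i % values.length)]);
        }
        return acc;
    }
}
```

For example, `buffer.reduce(0.0, Double::sum)` sums the current window, and `buffer.reduce(Double.NEGATIVE_INFINITY, Math::max)` finds its maximum, without allocating per sample.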
And there was something about a…
SECRET ‘GO-FASTER’ SWITCH?
Spring Expression Language
(SpEL)
• Powerful expression language
• Supports querying and manipulating an object graph at runtime
• Similar to Unified EL
– Additional features include method invocation and string templating
SpEL is slow!

Enter SpEL Compilation
• 3 modes
– Immediate
– Mixed
– Off
• -Dspring.expression.compiler.mode=mixed
And now for a quick
SPEL DEMO
SpEL is fast!

Fin
And relax
And now for some…
QUESTIONS?
@pidster @smaldini