Reactive Application Design For High-Volume Multidimensional Temporal Data Series
@pidster

Stuart Williams == ‘Pid’
• Lead Engineer, RTI
• SpringSource / VMware / Pivotal
• A little OSS: Apache, Eclipse

Mumbling Isn’t a Sign of Laziness — It’s a Clever Data-Compression Trick
Source: http://nautil.us/blog/mumbling-isnt-a-sign-of-lazinessits-a-cleverdata_compression-trick

Practical
• We built a real-time* application (RTI)
  – Using Spring (yes! I know!)
  – Share some lessons
  – Do some demos
  – Reveal a secret go-faster switch!
* for some definition of real-time

Built with…
• Spring IO Platform
  – Boot, Data, Integration, Reactor, AMQP, SpEL, Shell (and a little Groovy)
• GemFire, RabbitMQ
• C24

Questions…
• Do you know your system load & input rates?
  – No.
  – Yes!
    • Up to 1k/s?
    • Up to 5k/s?
    • Up to 10k/s?
    • Up to 100k/s?
    • Up to 1M/s?
    • Above 1M/s?

Questions…
• Heard of Spring Integration?
  – Tried it?
  – Used it in production?
• Heard of Spring Reactor?
  – Tried it?
  – Used it in production?

Some discussion about… DESIGN

Design Goals & Challenges
• High throughput
  – versus
• Low latency
• Enable low-impact analysis on live streams
  – We don’t know in advance what this will be…
  – User accessible API for analytics

Many whiteboards later…

Show me the… BIG PICTURE!

The Big Picture
[Diagram. Stages: Ingestion & Filtering; Analytics & Distribution; End-user / Consumers. Labels: Firehose, AMQP, HTTP, Ingester, Ingest Grid, Analytics Stream, Analytics Queries, Metrics, Distribution, End User Applications, Databases, Queries (Unstructured, Structured), HTTP, Reactive expressions, "Query, Adapt, React" feedback loop]

Input Data Rates
RTI                   Twitter*
100k/s baseline       6k/s average
~120k/s daily peak    9k/s daily peak
>1M/s annual peak     30k/s large events
OK, so Twitter’s internal fan-out & timeline access rates & storage problems are vastly different! (see also Redis…)
**Source @catehstn twitter.com/catehstn/status/494918021358813184

Load Characteristics
• Low numbers of inbound connections
• High rates, micro-bursts
• Occasional peaks of nearly 2x, rare peaks of 10x
• Variable payload size (200B – 300KB)
• Internal fan-outs multiply event rates

More statistics…
• 100k/s order of magnitude
  – 8,640,000,000 (per day)
  – An Integer based counter will ‘roll over’ in ~3 days
• 400Mbps of raw data
  – 10Gbps NICs required to support traffic peaks
  – Logging! Verbose errors can fill a disk quickly
  – Queues backing up == #fail
• Upcoming 10x existing rates!

What’s all this fuss about… REACTIVE APPLICATIONS?

Reactive Applications
www.reactivemanifesto.org
• Responsive
  – Depends on Scalable, Resilient
  – Depends on Event (or message) driven

Reactive Streams
• Collaboration between key industry players

Reactive Streams: Specification
• Semantics
  – Single document listing all rules
  – Open enough to allow for various patterns
• 4 API interfaces
• TCK to verify implementation behaviour

Reactive Streams
github.com/reactive-streams
• org.reactivestreams.Processor
• org.reactivestreams.Publisher
• org.reactivestreams.Subscriber
• org.reactivestreams.Subscription

A quick look at the… REACTIVE STREAMS API

Spring Reactor
• LMAX Disruptor – a RingBuffer
• reactor.bus.EventBus
• reactor.core.Dispatcher
• reactor.rx.Stream

A quick look at the… REACTOR API

Spring Integration
• Enterprise Integration Patterns
  – See http://www.enterpriseintegrationpatterns.com/ by Hohpe & Woolf
• Messages
• Channels
• Endpoints
(pipes & filters architecture)

SI Pipeline Example

A quick look at the… SPRING INTEGRATION JAVA DSL

Spring Integration Performance
• 3.x
  – Take out all the SpEL
• 4.x
  – 4.0
    • Improved
  – 4.1 (Q4 2014)
    • Put back all the SpEL
  – 4.2 (late 2015)
    • Rather good

Back to the… BIG PICTURE AGAIN

The Big Picture
[Diagram: same labels as the earlier Big Picture diagram, plus the annotation "Key Reactor usages"]

Reactor Usage
• UDP/TCP Servers (or clients)
• Outputs
  – batching
• Dispatchers & Streams
  – Expression evaluation engine

Spring Integration + Reactor
• Batching
  – Adaptive sizing
• Routing
  – With … batching

An example or two… SPRING INTEGRATION + REACTOR

But what about the… TEMPORAL DATA SERIES?

Temporal Data Expressions
[Diagram: RingBuffer, running from Old Data to New Data]

And there was something about a… SECRET ‘GO-FASTER’ SWITCH?

Spring Expression Language (SpEL)
• Powerful expression language
• Supports querying and manipulating an object graph at runtime
• Similar to Unified EL
  – Additional features, including method invocation and string templating.

SpEL is slow!

Enter SpEL Compilation
• 3 modes
  – Immediate
  – Mixed
  – Off
• -Dspring.expression.compiler.mode=mixed

And now for a quick SPEL DEMO

SpEL is fast!

Fin
And relax

And now for some… QUESTIONS?
@pidster @smaldini
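
The four Reactive Streams interfaces named on the specification slides (Publisher, Subscriber, Subscription, Processor) can be illustrated with a minimal sketch. It assumes `java.util.concurrent.Flow`, the JDK 9+ copy of those interfaces, rather than the `org.reactivestreams` package or the Reactor classes used in the talk. The class name `OneAtATimeSubscriber` and its `run` helper are hypothetical. The subscriber asks for one item at a time through `Subscription.request`, which is how demand is signalled back to the publisher.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

// Hypothetical subscriber: never has more than one item of outstanding demand.
final class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
    final List<Integer> received = new CopyOnWriteArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);
    private Flow.Subscription subscription;

    @Override
    public void onSubscribe(Flow.Subscription s) {
        subscription = s;
        s.request(1); // initial demand: exactly one item
    }

    @Override
    public void onNext(Integer item) {
        received.add(item);
        subscription.request(1); // ask for the next item only after handling this one
    }

    @Override
    public void onError(Throwable t) {
        done.countDown();
    }

    @Override
    public void onComplete() {
        done.countDown();
    }

    // Publishes 1..count through the JDK's SubmissionPublisher and returns what arrived.
    static List<Integer> run(int count) {
        OneAtATimeSubscriber sub = new OneAtATimeSubscriber();
        try (SubmissionPublisher<Integer> pub = new SubmissionPublisher<>()) {
            pub.subscribe(sub);
            for (int i = 1; i <= count; i++) {
                pub.submit(i); // blocks when the subscriber's buffer is full
            }
        } // close() signals onComplete once buffered items have been delivered
        try {
            if (!sub.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for onComplete");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sub.received;
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // prints [1, 2, 3, 4, 5]
    }
}
```

Requesting a single item per callback is the most conservative form of demand; a real consumer would normally request in larger batches to reduce signalling overhead.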
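
The Temporal Data Expressions slide pictures a RingBuffer running from old data to new data. The slides do not show how it is built, so the following is only a minimal sketch of the idea, assuming a fixed-capacity buffer of timestamped samples in which each new sample overwrites the oldest. The class `TemporalRingBuffer` and its methods are hypothetical and are not the Reactor or LMAX Disruptor RingBuffer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fixed-capacity window over a time series.
final class TemporalRingBuffer {
    private final long[] timestamps;
    private final double[] values;
    private long written = 0; // total samples appended; a long, so it will not roll over in practice

    TemporalRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        timestamps = new long[capacity];
        values = new double[capacity];
    }

    // Stores a sample, overwriting the oldest one once the buffer is full.
    void append(long timestampMillis, double value) {
        int slot = (int) (written % timestamps.length);
        timestamps[slot] = timestampMillis;
        values[slot] = value;
        written++;
    }

    // Number of samples currently retained.
    int size() {
        return (int) Math.min(written, timestamps.length);
    }

    // Mean of retained samples whose timestamp is >= sinceMillis; NaN if none match.
    double meanSince(long sinceMillis) {
        double sum = 0;
        int n = 0;
        for (int i = 0; i < size(); i++) {
            if (timestamps[i] >= sinceMillis) {
                sum += values[i];
                n++;
            }
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    // Retained values ordered from oldest to newest.
    List<Double> oldestToNewest() {
        int n = size();
        List<Double> out = new ArrayList<>(n);
        for (long i = written - n; i < written; i++) {
            out.add(values[(int) (i % values.length)]);
        }
        return out;
    }

    public static void main(String[] args) {
        TemporalRingBuffer buf = new TemporalRingBuffer(3);
        for (int i = 1; i <= 5; i++) {
            buf.append(i * 1000L, i);
        }
        System.out.println(buf.oldestToNewest()); // prints [3.0, 4.0, 5.0]
        System.out.println(buf.meanSince(4000L)); // prints 4.5
    }
}
```

Because the buffer never grows, memory use stays constant however high the input rate is; the trade-off is that an expression can only look back as far as the capacity allows.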