Telecom Control and Data Plane Convergence
Choosing the right multicore software architecture for high-performance control and data plane applications
Magnus Karlsson, Systems Architect

Telecom Market Conditions
- IP-based services driving exponential data traffic growth
- High focus on rich user experience and service platforms
- Declining Average Revenue Per User (ARPU)
- Focus on network OPEX and CAPEX cost
[Figure: traffic volume, expected traffic volume, expected revenue and desired network cost/bit plotted over time, from voice-dominated to data-dominated networks]

Trends in Telecom
Telecom is going “ALL IP”
- Massive IP packet processing applications
- Power consumption is a critical design factor
- QoS is becoming increasingly important – telecom reliability in the datacom world
Industry response:
- Application specific processors, multicore and HW acceleration engines
- CPU and DSP use cases converging
- Integration of control plane and data plane into the same multicore processors

Fundamental Differences between Control Plane and Data Plane Applications
Control Plane Characteristics:
- CPU bound processing
- Operations and Maintenance functions
- Typically terminates IP traffic
Data Plane Characteristics:
- IO bound processing
- Highly optimized to use as few CPU cycles as possible
- Typically does not terminate IP traffic

Multicore in Networking Applications Examples

Application Type => Software Architecture
Parallel, symmetric processing (“run-to-completion”)
[Diagram: ingress, parallel cores each running stages A, B and C, egress]
- All tasks in a flow can be handled by a thread
- I/O bound processing – low CPU budget
- Load balancing through hardware support
- Scaling depends on advanced HW support
- Popular for data plane
Parallel, asymmetric processing (functional pipelining)
[Diagram: ingress, cores dedicated to stage A, stage B and stage C]
- Different cores work on different stages
- Complex protocols - CPU bound processing
- Load balancing through run-time rebalancing
- Scaling capability depends on OS support for state migration/sharing
- Popular for control plane and O&M

Another View of Multi-core Use Cases
[Figure: use cases plotted by cycles/byte against level of parallelism (number of cores). Control plane, CPU-bound: SMP domain, Linux or RTOS. Data plane, IO-bound: AMP domain, Linux, RTOS or “bare-metal”. Use cases shown: termination, control signaling, transcoding, deep packet inspection, intrusion prevention, IP forwarding, IP packet processing, other]

Multiple Use Cases Demand Multiple OS Solutions
One “size” doesn’t fit all – i.e. both Linux and RTOS

Requirements on Multicore Operating Systems
- Future for data plane applications: bare-metal or AMP
- Direction for control plane apps: SMP
Challenge: Find an operating environment for multicore processors that satisfies both data plane and control plane applications, despite their fundamental differences.
Solution: Use a modular and flexible system that combines the best characteristics of SMP, AMP and bare-metal.

Incumbent Configuration for Integrated Control and Data Plane
SMP Operating System + Bare-Metal
[Diagram: control plane application on the shared OS resources of an SMP OS (Core 0, Core 1); data plane apps on executive environments (Core 2 … Core n)]
Advantages:
• Control plane application can use high-level SMP RTOS or Linux
• Can fully utilize processor vendor’s Executive Environment, if any
• Raw bare-metal performance for data plane processing
Disadvantages:
• Bare-metal cores become silos, hence only suitable for run-to-completion applications
• Poor debugging, profiling and run-time management capabilities on bare-metal cores
• No platform services such as IPC, file systems and networking stacks available on bare-metal cores

Introducing “XMP” – A Better Way
XMP provides both SMP and AMP characteristics
[Diagram: XMP hybrid AMP/SMP multiprocessing: common shared OS resources, one scheduler per core (Core 1, Core 2 … Core n), kernel event backplane]
SMP Characteristics:
• Easy to use
• Simple configuration
• Load balancing/process migration
AMP Characteristics:
• Deterministic
• Very good scalability
• Suitable for IO intensive applications

XMP – Combining the Best of AMP and SMP in One RTOS

Enea OSE Multicore Edition – an XMP Solution
Linear scalability of performance on multicore devices
- Asymmetric kernel design that has a scheduler instance on each core
- Avoids use of global or shared locks in kernel calls or interrupt processing
Maintain single core performance on each core
- Enhanced driver execution model that allows HW vendor bare-metal SDKs (“Executive Environments”) to run inside an OSE process without additional overhead
Management and debug
- Shared OS services as in an SMP OS
- Seamless runtime debugging, CPU and memory profiling on all cores
- User defined load-balancing based on open API
- Booting, loading, starting and stopping applications
- Fault management

OSE 5 / Multicore Edition
Fully featured RTOS for distributed and fault-tolerant systems
- Highly scalable RTOS
- Designed for distributed systems
- Support for memory protection and dynamic software updates
- Comprehensive IP networking support
- Optima tools integrated with CodeWarrior
Multicore Edition: Hybrid SMP/AMP Microkernel
- SMP ease-of-use/configuration
- AMP scalability and performance
- Ability to run bare-metal applications on a core

Reaching Bare-Metal Performance
- Supervisor threads + polling busy loop to achieve bare-metal performance
- The rest of the OS functionality can be used as needed
- No OS overhead when not in use
- Provides management, debugging and profiling of software on all cores

Bare-Metal Performance and Linear Scalability in OSE Multicore Edition
“Bare Metal” Light-Weight Executive - packet polling loop
Enea OSE Multicore Edition version 5.4.1
Two benchmark scenarios:
- Packet processing
  - Simple packet routing in a bare-metal environment
  - Simple packet routing in an OSE Multicore Edition process, with full access to all services in OSE
- Scalability
  - Instantiation of a “silo” application over many cores

Data Plane Processing Performance
[Figure: throughput (Mbyte/s) versus frame size (112, 128, 256, 512, 1025, 1280, 1518) for bare-metal and OSE ME]
Performance nearly identical to raw bare-metal speed

Scalability over Many Cores
[Figure: “Scalability OSE Multicore Edition vs Linux”: total number of transactions (normalized) versus number of cores (1 to 16), with series OSE and Ideal]

What about Linux for Control Plane and OSE Multicore Edition for Data Plane?
- Who owns the boot and configuration policies?
- How to partition shared resources like memory?
- How to share devices and services at runtime?
- How to profile and debug all parts of the system?
- How to dynamically balance the load?
Challenge: How do we create hardware abstraction for all those OSes? A classic request, but one historically placed on the OS by applications!
Solution: Heterogeneous execution environments on multicore devices need a new “OS” software layer, a so-called hypervisor!
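
The “packet polling loop” used for run-to-completion data plane processing in the slides above can be illustrated with a minimal sketch. This is not Enea code and does not use any OSE API: the in-memory queues, the `ROUTES` table and the `route` and `poll_loop` functions are hypothetical names chosen for illustration. The loop stops after a few empty polls so that the example terminates, whereas a real busy loop on a dedicated core would spin indefinitely.

```python
from collections import deque

# Hypothetical routing table: first octet of the destination -> egress port.
ROUTES = {10: 0, 192: 1}
DEFAULT_PORT = 2


def route(packet):
    """Pick an egress port from the first octet of the destination address."""
    first_octet = int(packet["dst"].split(".")[0])
    return ROUTES.get(first_octet, DEFAULT_PORT)


def poll_loop(rx_queue, tx_queues, max_idle_polls=3):
    """Run-to-completion polling loop for a single core.

    Each packet is received, routed and queued for transmission before the
    next packet is touched; nothing is handed off to another stage or core.
    The idle counter exists only so that this sketch terminates.
    """
    idle = 0
    handled = 0
    while idle < max_idle_polls:
        if not rx_queue:                  # nothing to receive: poll again
            idle += 1
            continue
        idle = 0
        packet = rx_queue.popleft()       # ingress
        port = route(packet)              # lookup
        tx_queues[port].append(packet)    # egress
        handled += 1
    return handled


# Three sample packets, one per egress port.
rx = deque({"dst": d} for d in ["10.0.0.1", "192.168.1.9", "172.16.0.5"])
tx = [deque(), deque(), deque()]
count = poll_loop(rx, tx)
```

Because the loop never blocks and never calls into a scheduler, its per-packet cost is just the receive, lookup and transmit steps, which is the property the slides attribute to bare-metal style processing. A functional pipeline would instead place each of those steps on a different core with a queue between them.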
Enea Hypervisor
Example: Linux, RTOS & EE Applications
[Diagram: multicore processor running a Linux app on Linux, an RTOS/app and an EE app on top of the Enea Hypervisor across CPU 0, CPU 1 and CPU 2, with a tools and management interface]
- Based on OSE Multicore Edition technology: a flexible, lightweight, extensible framework for execution and management
- Enables multiple operating environments to coexist on the same multicore in different configurations
- Enables boot, remote management and system configuration control
- Enables tool support for debugging and profiling of the whole system

Enea Hypervisor Features
[Diagram: multicore processor running a Linux app on Linux, an RTOS/app and bare metal on top of the Enea Hypervisor across CPU 0, CPU 1 and CPU 2]
Provides support to:
- Dynamically load and remove native applications, or guest domains
- Measure CPU load per core or individual application at runtime
- Define scheduling policy between guest domains
- Perform system wide load regulation
- Communicate between guest domains using Enea LINX
- Share services like file systems across guest domains
Optimized for co-existence between Linux and OSE
– LINX communication channel over shared memory
– Shared file system using LINX
– Shared Ethernet device(s)
– Shared console

Low Entry Alternative
A lightweight OS model for Control/Data Plane

OSEck for Multicore CPUs – AMP Model
OSEck – Enea “lightweight” kernel executive
[Diagram: Worker Core X and Worker Core Y, each with background user app / system proc, foreground user app and executive env. on OSEck, and an Ethernet data plane; Linux management core connected through LINX over shared pools and LINX over Ethernet]
- LINX shared pool connection manager
- Implements support for EE applications
- Optima support
- Bare-metal performance
- Easy migration - simple to port user applications
- Add observability on a per core level

OSEck – Two Scheduling Models
[Diagram: Pre-emptive model (CPU x) and interrupt-driven run-to-completion model (CPU y). Both show Optima Monitor, dSPEED processes and IDLE at PRI 0-31, together with LINX RLNH, Shell, Timeout Server, Timer and LINX Shmem RX. The pre-emptive model is labeled INT 0-31; the run-to-completion model is labeled INT 1-31, with a run-to-completion processing loop at INT0]

Scalable, Uniform IPC – Enea LINX
Core-to-core, device-to-device, board-to-board level message based IPC
Intra-core communication (OSE and OSEck)
- Intra-core message passing is by reference (zero-copy)
Inter-core communication (LINX)
- LINX can transport messages over shared memory, DMA, hardware queues etc.
Inter-device communication (LINX)
- Accomplished using LINX communicating over sRIO, Ethernet or PCIe
Open source for Linux, superior in performance to TIPC
Proprietary for OSE, OSEck and other OS
[Diagram: a message (Message ID, Sender, Receiver, Owner, Data) passed between Process 1, Process 2 and Process 3; OSEck for MSC8155/56 (SC3850 cores), OSE for P4080 (e500 cores: control and workers), a GPP and a PC, connected by LINX over Shared Memory / sRIO / Ethernet / …]

Enea Multicore Solutions
Target customers:
- Network Equipment Providers
- Adjacent markets
Target applications:
- Packet forwarding/processing
- LTE L1/L2 processing
- Connectivity layer processing
- Combined Control/Data plane
[Figure: use cases plotted by cycles/byte against level of parallelism: transcoding, termination, other packet processing, control signaling, IP forwarding]
Enea offering:
- OSE ME - Fully featured multicore RTOS with SMP ease-of-use and AMP performance/scalability
- Enea Hypervisor – for heterogeneous/guest OS support, like Linux + OSE
- OSEck - Compact kernel “AMP” RTOS for signal processing/packet processing on multicore CPUs/DSPs
- LINX - Inter Process Communication (IPC) framework: OS, processor, interconnect independent
- Optima – Eclipse development tools for debugging and system profiling/analysis

True Convergence of Control and Data Plane
A single implementation that supports SMP and AMP, or hybrid models, for all use cases:
• Control Plane
• Control + Data Plane (with bare metal)
• Data Plane
Technologies and features
• Heterogeneous systems - support for both Linux and RTOS
• Hypervisor/Virtualization
• Performance
• Load balancing
• Fast Path IP
• System wide IPC
• High Availability – Fault Localization
• Integration with applications environments
• Eclipse based integrated tools

Summary
No “one” processing model meets every use case for multicore in telecom.
An understanding of specific use cases is crucial in determining the best solution:
- Control plane and data plane applications require different software architectures, but can co-exist on multicore processors
- A flexible OS platform that can combine properties of bare-metal, AMP, and SMP is the best fit
Enea provides the most flexible OS framework that addresses most use cases:
- OSE Multicore Edition with its hybrid kernel technology can support both IO and CPU bound processing in one homogeneous configuration
- For control plane applications on Linux, OSE Multicore Edition can be extended with Hypervisor support to incorporate guest domains in a heterogeneous configuration
- Or, a lightweight, small footprint AMP model for special low end or entry use cases

Questions?
[email protected]