100G/400G Networking Solutions
Ethernet Technology Summit 2014

Today’s Presenters
• Michael Miller, VP, Technology Innovation & System Applications, MoSys, Inc.
• Gilles Garcia, Director, Wired Communications, Xilinx

Agenda
• Introduction
• Key bottlenecks in next-generation systems supporting 100/400G (MoSys)
• Flexible component solutions for carrier and networking equipment (Xilinx)
• Q&A

100G Networks and Systems
• Systems: routers and switches; packet optical and transport
• System components: optical modules, packet processing, traffic management, memory, interfaces, backplanes, switching
[Figure: network reach diagram. Labels: Short Reach, Long Reach, Long Distance, Data Center, Metro, Long Haul, 40 to 80km, 40 to 80km+. Ethernet and Ethernet/OTN line cards and client cards, each with optical modules and a packet processor & traffic manager.]

400G+ Networking Equipment
• Line card area and power fixed
• Interface rates must transition from 10G to 25G and 50G
• Critical constraints: interface density & memory access
• Today: 6G, 10G. In design: 25G.

400G examples (faceplate, line interface):
# Ports | Rate | MSA module
40 | 10G | SFP+
10 | 40G | QSFP+
4 | 100G | CFP2, CFP4, …
1 | 400G | CDFP, …

[Figure: line card block diagram. Labels: module, PHY, retimer / gearbox, packet processing engine, traffic mgmt & fabric interface, memory, uP, backplane drivers (~1 meter), backplane interconnect rate, Performance (Engine), Memory Access (Fuel).]

100G Modules & Line Card Density
• CFP density migration path
• 400G line card faceplate configurations: CFP2, CFP4/QSFP28

10G SerDes Dominate the Line Card
• 25G SerDes in 100G CFP
[Figure: line card block diagram. Labels: FIC, PPE, gearbox, CFP with LR4/ER4 TOSA and ROSA, SFP+ cages, CAUI, 40x10G-KR backplane.]
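The module counts in the 400G faceplate examples follow from dividing the aggregate rate by the per-module rate, and the same arithmetic shows why lane rates have to move from 10G toward 25G and 50G. A minimal sketch in Python, assuming nominal rates only (coding and FEC overhead are ignored); the function name `units_needed` is illustrative and does not come from the slides:

```python
import math


def units_needed(aggregate_gbps: float, unit_gbps: float) -> int:
    """Number of modules or SerDes lanes needed to carry an aggregate rate.

    Uses nominal rates only; real lanes run slightly faster than nominal
    (for example 25.78125 Gb/s for a "25G" lane) to cover coding overhead.
    """
    return math.ceil(aggregate_gbps / unit_gbps)


AGGREGATE_GBPS = 400

# Faceplate: modules per 400G, matching the 40 / 10 / 4 / 1 port counts.
modules = {
    name: units_needed(AGGREGATE_GBPS, rate)
    for name, rate in [
        ("SFP+ (10G)", 10),
        ("QSFP+ (40G)", 40),
        ("CFP2/CFP4 (100G)", 100),
        ("CDFP (400G)", 400),
    ]
}

# Electrical side: SerDes lanes per 400G at each nominal lane rate.
lanes = {rate: units_needed(AGGREGATE_GBPS, rate) for rate in (10, 25, 50)}

for name, count in modules.items():
    print(f"{name}: {count} module(s)")
for rate, count in lanes.items():
    print(f"{rate}G lanes: {count}")
```

At a fixed faceplate area, moving from 10G to 25G lanes cuts the lane count per 400G from 40 to 16, which is the density argument the slides make.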
[Figure labels: 400Gbps line card; 10x10G-SR CAUI; 4x28G-VSR CAUI-4; gearbox; option: 4 x 100GE CFP physical interface module; 40 x 10GE SFP+ physical interface module.]

Transition to 25G SerDes on Line Card
• 800 Gbps aggregate example
[Figure: line card block diagram. Labels: CFP2 / CFP4 / QSFP28 modules with retimers, SFP+ cages with gearbox, 4x25G, 400G PPE, FIC w/ retimer, TOSA, ROSA, 4 x 28G-VSR, 4 x 28G-SR, MR, LR, CAUI-4, 32 x 25G 100GBASE-KR4 backplane.]

Key Needs
• Performance
• Faceplate density
• Test / control features
• Power
• Reach / flexibility
– Short and long distance
– Module types
– Rates: 10, 40, 100G
– Copper cables

100G+ OIF Ecosystem Interoperability
MoSys LineSpeed™ 100G PHYs
• 100G CFP2/CPAK optical module interoperability (100GBASE-LR4)
• Long reach & backplane interoperability (CEI-25-LR)

Introducing High Performance Serial Memory, A New Breed
Examples: MoSys® Bandwidth Engine® IC; Micron, Samsung... Hybrid Memory Cube
• Multiple CEI/XSFI serial links allow concurrent transport operations
• On-chip memory controller provides abstraction layer
• Multibank™ architecture enables concurrent operations
[Figure: packet processor engine (Core 0, Core 1, … Core n-1, Core n) between packet ingress and packet egress, connected over SerDes links to a serial memory containing a memory access controller + optional offload logic and multiple memory banks.]

Memory Comparison
Attribute | Bandwidth Engine (BE) | Hybrid Memory Cube (HMC) | JEDEC DDR4
Physical Interface | Serial CEI Standard | Serial CEI Standard | JEDEC parallel IO
Protocol | GigaChip™ Interface | HMC Consortium | RAS/CAS (DRAM)
Min Access Size | 72b | 256b | 128b
Capacity | 0.5~1 Gb | 16~32 Gb | 4 Gb
Buffer Bandwidth | 400 Gbps | 1280 Gbps | 4.8 Gbps
Random Access Rate | >4.5 Bt/s | 2.6~2.9 Bt/s | 0.2 Bt/s
Signal Pins | 66 | 272 | 42
Package | BGA 19x19 | BGA 31x31 | BGA 8x12
Power | 7-11W | ~28W ? | 0.7W

System Trade-offs - Interfaces and Memory Type
Example: 100GE Line Card
Key memory attributes: size, power, access rate, bandwidth
[Figure: 100GE line card block diagram. Blocks: optics, MAC / PCS / FEC, packet processing engine, traffic manager, fabric interface, TDM / scheduler, serial IO switch. Interfaces: Interlaken, serial ILA, serial GCI (16x, 4x), serial HMC, parallel HSTL, parallel SSTL (8x). Memories: BE, HMC, DDR4, DRAM, RLDRAM, QDR, NSE. Functions: small 100GE packet buffer, large deep 100GE TM packet buffers, statistics, LUT, ACL LUT. Rate labels: 6 Bt/s, 2 Bt/s, 3 Bt/s. Numeric annotations: ~280, ~240, ~240, 16+20, 16, 12, 8. Legend: implemented in FPGA, companion devices, MoSys devices.]

Overall Interface Comparison: GCI Provides Highest Data Transfer Efficiency
Efficiency includes transaction & transport protocol:
%Transfer Efficiency = Data / (CMD + Address + Data + Transport Protocol)
[Figure: two charts of % efficiency (0% to 100%) versus payload size (B). Packet Header Processing Application: Read Data Return Efficiency, payload 0 to 50 B. Packet Buffering Applications: Read/Write Data Transfer Efficiency, payload 0 to 180 B. Legend entries: BE 50:50, HMC 50:50, ILA, BE, HMC.]

Intelligent Memory Offload Further Multiplies Efficiency
Bandwidth Engine Architecture
• Up to 4.5 billion external memory accesses/second w/ 16 SerDes lanes
• Macros support up to 6 billion internal accesses/second w/ 8 SerDes lanes
• Macros execute atomically: Statistics, Metering, Read & Set, Test & Set
• Supports: 4 x 100GE ports w/ stats + metering over 8 lanes
[Figure: packet processing engine sending instructions to the Bandwidth Engine (6B access/s), which holds f(χ) partitions. Metering labels: CIR, EIR, CBS, EBS, with outcomes ≤ CBS, ≤ EBS, > EBS. Stats table labels: +n byte, +1 packet.]

MoSys Used In 100/200G Appliance Boxes/Blades: Security, Load Balancing, SDN Controller...
• Memory access solutions: MSR820 or MSR720 Bandwidth Engine, 3.75 Bt/s @ 3ns tRC
– Statistics/Metering: 8M x 64b
– State Table: 2M x 576b
– Index Table: 8M x 72b
• Datapath solutions: LineSpeed gearbox, retimer
• Memory bandwidth solutions: MSR620 Bandwidth Engine
– 100G full duplex packet buffer
– Up to 200G possible w/ 15.6G
[Figure: appliance block diagram. Labels: optics module, GB/RT, line side: 4 x 40G or 2 x 100G, 4x25G or 2x40G, 10x10G, FPGA/ASIC, CPU, DRAM, 12.5G, 2 x interfaces.]

BE Gen 3 Architecture in 2015 Targets Data Structure Offload & Access Acceleration
• Search table primitives: exact match, hashing, IPv4/v6 LPM, indirect loads & stores
• BE core execution: 5 GA rd + 5 GA wr, ~3ns “tRC”, + atomic RMW inst.
[Figure labels, BE Gen 3 block diagram: 25G Rx SerDes, GCI-A, GCI-B, 8 scheduling domain in-order request queues.]
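The transfer-efficiency definition from the interface comparison slide can be written as a small function. This is only a sketch: the overhead byte counts below are placeholders chosen to show the shape of the curve, and they are assumptions rather than figures for GCI, HMC, or Interlaken.

```python
def transfer_efficiency(data_bytes: float, cmd_bytes: float,
                        address_bytes: float, transport_bytes: float) -> float:
    """Data / (CMD + Address + Data + Transport Protocol), as a fraction."""
    total = cmd_bytes + address_bytes + data_bytes + transport_bytes
    if total <= 0:
        raise ValueError("total transfer size must be positive")
    return data_bytes / total


# Hypothetical per-transaction overheads in bytes (illustrative only,
# not taken from any interface specification).
LOW_OVERHEAD = dict(cmd_bytes=1, address_bytes=3, transport_bytes=1)
HIGH_OVERHEAD = dict(cmd_bytes=2, address_bytes=6, transport_bytes=8)

for payload in (8, 16, 32, 64, 128):
    low = transfer_efficiency(payload, **LOW_OVERHEAD)
    high = transfer_efficiency(payload, **HIGH_OVERHEAD)
    print(f"{payload:4d} B payload: {low:6.1%} vs {high:6.1%}")
```

With a fixed per-transaction overhead, efficiency rises with payload size, and the gap between two interfaces is widest at small payloads, which is the range the packet header processing chart covers.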
[Figure labels, BE Gen 3 block diagram: result queue reservations, weighted round robin scheduler, BE 820+ 1 Gb memory, macro ops, instruction results, 8 scheduling domain reorder result buffers, GCI-A, GCI-B, 25G Tx SerDes, multi-cycle macro offload.]

100G+ Serial Memory Solutions
• MoSys Bandwidth Engine MSRx20
• Xilinx® Virtex®-7 FPGA
• Xilinx Kintex® UltraScale™ FPGA
Photos provided courtesy of Xilinx, Inc.

Flexible component solutions for carrier and networking equipment

100G to 400G+ Market Drivers
Services
• Massive bandwidths
• Rapid rollout
• In-service upgrades (no trucks)
Networks
• Lower CAPEX
• Lower OPEX
• Fast deployment
• In-service upgrades
Equipment
• High capacity
• Low price/port
• Multi-service
• Low power
• Scalable – Programmable
Silicon
• High capacity
• Multi-service
• Packet optimal
• Low power
• Scalable – Reusable
• Interface rich
[Chart: Pressured Demand for 100G, nx100G, 400G Ports. Source: Infonetics]

100G – 400G Customer Challenges
• Programmable Systems Integration: high capacity line cards (400G/500G/800G/1T) and standards still in flux
• Increased System Performance: Ethernet and OTN must keep up with the faceplate and backplane data rates
• BOM Cost Reduction: harmonizing services, fewer line cards, in-service upgrade, reduce CAPEX and OPEX
• Total Power Reduction: higher level of integration, less overhead, 2x performance / capacity, same power envelope
• Accelerated Design Productivity: market pressure to reduce ‘traditional’ 3 year development cycles to ~1 year

Xilinx Silicon & Wired SmartCORE IP Portfolio
• All programmable FPGA & SOC portfolio
• Wired IP

MAC/Interlaken SmartCORE IP
MAC / PCS
• Leading IP provider of 10GE/40GE & 100GE MAC cores
• The only vendor with 16x25Gbps transceivers for 400G production
• Driving new technologies, e.g. 400Gbps CRC / PCS prototyping & pre-standard implementations
Interlaken
• Leading IP provider of up to 150G Interlaken cores
• Active contributor to Interlaken Alliance
• Working on implementations for up to 600G Interlaken
[Figure: Xilinx FPGA with MAC / PCS SmartCORE IP or user logic, and Interlaken. Client side options: SFP+ (nx10 GE), QSFP (mx40 GE), CFP2/CFP4, QSFP28 (kx100 GE), CDFP (1x400 GE). Backplane or chip-to-chip I/F: either 2-48 x 10/12.5G or 2-24 x 25G. Memory: DDR3/4, RLD3, serial attach memory.]

Memory Subsystem
• Industry leading IO and serial density with the Virtex®-7 & UltraScale™ family
• Network-optimized controllers for DDR4, DDR3, RLD3, QDR IV
• Working on serial-attached controller designs
• Interoperability with serial attached devices: memories and search
[Figure: Xilinx FPGA with SmartCORE IP or user logic, MAC, ILKN, and a payload abstraction layer: BRAM, discrete memory, serial memory. Memory options: control/statistics: RLD3/QDR IV; payload: DDR3/4; BE (statistics, shallow buffering); HMC; TCAM; DDR3/4 (deep buffering).]

nx100GE or 400GE MAC to Interlaken Bridge
• Xilinx targeting first to market with pre-standard implementations
• Flexible interface options to support interconnect between companion ASSPs
• Effective solution for aggregation and oversubscription handling
[Figure: CFP 400 optics, or 4xCFP4, or CDFP optic, connected over 16x25G CDAUI to a Xilinx FPGA containing MAC SmartCORE, bridge logic, and Interlaken. Backplane or chip-to-chip I/F: either 2-48 x 10/12.5G or 2-24 x 25G. Memory: DDR3/4, RLD3, serial attach memory.]

SDNet: All Programmable Line Card – A Holistic Vision
Functions at 10/40/100G line rates:
• Packet parsing
• Packet editing
• Policing
• Packet manipulation
• Filtering
• Packet lookup/search
• Congestion management
• Quality of service
• Provisioning of services
Approach:
• Rapid programming of ‘what’ system requires
• No need to understand chip architecture
• In-service “hitless” updates
• Automation to optimize implementation of spec
• Code portability and scalability across line rates
• Scalable from core to edge applications: 1G, 10G, 40G, 100G
• Software control using standard SDK and API
• Vivado Design Suite for implementation
Leverage leadership Xilinx technology portfolio:
• Leadership SerDes, clocking, BRAM, high-speed memory controllers, 3DIC
• Increased performance and integration at 28nm/20nm/16nm
[Figure labels: All Programmable FPGA or SoC; “Softly” Defined Line Card.]

100Gbps/nx100Gbps SDNet Router Line Card
• Xilinx leading high-density line card evolution with new SDNet framework
• Delivered multiple generations of ultra high speed traffic manager
• Fully integrated packet processor demonstrated at 2014 Interop Conference
SmartCORE Packet Processor (parse, edit, search)
– Protocol agnostic
– In-service programmable
– Integrated search engines
SmartCORE Traffic Manager (payload manager, queue manager)
– Parameterized for best app fit
– Dynamically managed
[Figure: Xilinx FPGA technology block diagram. N x CFP2/CFP4, MAC, packet processor, traffic manager, Interlaken to backplane: 40/48 x 10/12.5G, 5/20 x 25.78125. Payload abstraction layer: BRAM, discrete memory, serial memory. Memory options: control/statistics: RLD3/QDR IV; payload: DDR3/4; BE (statistics, shallow buffering); HMC; TCAM; DDR3/4 (deep buffering).]

Current nx100G & Future OTUCn P-OTS Architecture
• Xilinx leading ultra high-speed OTN switching evolution
• nx100G Transponder & MuxMapSAR single chip Xilinx ref. designs available
• 4x100G Transponder demo with API SDK shown at 2014 OFC Conference
• Xilinx enabling 400G OTN pre-standard projects
• Migration to ‘native’ 400G can reuse existing high density client cards
Client line card (Xilinx FPGA: OTN mapper, SAR, Interlaken; 4 x CFP or CFP2/4)
– SmartCORE Mapper and MuxSAR IP available
– Easy migration to UltraScale
– Proven connectivity to CFP2, CFP4; ready for CDFP / QSFP28…
DWDM line card (Interlaken, Mux/SAR, framer; 400G coherent optics)
– SmartCORE Framer and SAR IP available for nx100G
– Plan for SmartCORE 200G & 400G Framer & SAR IP
– Easy migration to UltraScale
– Connectivity to 400G coherent optics modules

The First ASIC-class Programmable Architecture
The 400G+ Evolution
• Monolithic to 3D IC
• Planar to FinFET
• ASIC-class performance

Summary
Xilinx is enabling 4x100G & 400G applications
• Line card solutions
– 100G OTN transponder (Kintex®-7)
– 4x100G OTN transponder (Virtex-7)
– Modular router line cards for SP & Core: nx40GE, nx100GE (Virtex-7)
– 400G OTN Muxponder & Transponder single chip solution (UltraScale)
– 400GE ports single chip solution (UltraScale)
• Tools and design methodology: accelerate time-to-market
– Vivado® Design Suite
– SDNet for exact match and optimized packet processing
• SmartCORE™ IP (unmatched functionality)
– Transponder, Muxponders, MuxSARs, FECs, MACs, Interlaken
– Packet processors, embedded search, traffic management
• Customer focus with system expertise and application knowledge
• Visit xilinx.com/wired

Conclusions
• Growing demand for 100G/400G networks
• Second and third generation 100G systems being deployed
– Smaller optical modules
– Higher speed serial interfaces
– Advanced packet processing and traffic management
• First 400G solutions being deployed
– Leveraging 100G technologies and components
– Looking for high integration with power efficiency
• 400G technologies & components becoming available to enable next generation systems

Thank you. Questions?
• Michael Miller, VP, Technology Innovation & System Applications, MoSys, Inc.
• Gilles Garcia, Director, Wired Communications, Xilinx