FPGA Accelerator Virtualization in an OpenPOWER cloud
Transcription
FPGA Accelerator Virtualization in an OpenPOWER cloud
Fei Chen, Yonghua Lin, IBM China Research Lab

Trend of Acceleration Technology
Acceleration in cloud is taking off:
• FPGAs used to accelerate Bing search on 1,632 servers, with a 6*8 2D-torus design for a high-throughput network topology
• Storage >2000 PB, processing 10~100 PB/day, logs 100 TB~1 PB/day, with FPGAs used for the storage controller
• GPUs used for deep learning
Acceleration programming is becoming a hot topic: OpenCL, Sumatra (Oracle), LiMe (IBM), ...

From appliance acceleration to acceleration in the cloud:
• TB-scale problems → PB-scale problems
• Acceleration architecture for a single node → architecture for thousands of nodes
• Dedicated acceleration resources → shared acceleration resources
• Proprietary accelerator frameworks → open framework to enable accelerator sharing & integration
• Closed innovation model → open innovation model through an eco-system
Innovations required:
• Scalable acceleration fabric
• Open framework for accelerator integration and sharing
• Accelerator resource abstraction, re-configuration and scheduling in the cloud
• Modeling & advisory tool for dynamic acceleration system composition

Resources on FPGA are huge
• Programmable resources: logic cells (LCs), DSP slices (fixed/floating-point), on-chip memory blocks, clock resources
• Miscellaneous peripherals (Xilinx Virtex as an example): DDR3 controllers, PCIe Gen3 interfaces, 10G Ethernet controllers, ...
• Hard processor cores: PowerPC (Xilinx Virtex-5 FXT), ARM (Xilinx Zynq-7000), Atom (Intel + Altera E600C)
FPGA capacity trends: the Xilinx Virtex UltraScale 440, the largest-scale FPGA in the world when delivered in 2014, provides more than 4 million logic cells (LCs). Using this chip, we can build up to 250 AES crypto accelerators, or 520 ARM7 processor cores.

FPGA on Cloud – Double Win
Cloud benefits from FPGA:
• Performance
• Power consumption
FPGA benefits from the cloud:
• Lower cost: tenants need not purchase and maintain FPGAs, and they pay for accelerators only when using them
• More applications: high FPGA utilization
• Ecosystem: grows with the cloud ecosystem

Motivation for Accelerator/FPGA as a Service in the Cloud
• Enable manageability: can an FPGA (pool) be managed in the data center, with ID, location, reconfiguration, performance, etc.?
• Reduce system cost: how can cost be reduced by sharing FPGA resources among applications, VMs and containers in a dynamic, flexible, priority-controllable way?
• Reduce deployment complexity: how can FPGA/accelerator resources be orchestrated easily with VM, network and storage resources, according to the needs of the application?
• Bring high value to the cloud infrastructure: could we generate new value for IaaS?

FPGA Ecosystem in Cloud
Accelerator market place:
• Companies or individual developers can upload and sell their accelerators through the market place (e.g. on OpenPOWER)
Accelerator cloudify tool (in plan):
• The accelerator market place will "cloudify" an accelerator by integrating the service layer with the accelerator and compiling it
• All integration, compilation, test, verification and certification will be done automatically
Cloud tenants:
• Pay for the usage of an accelerator, rather than for licenses and hardware
• Get the accelerator service in a self-service way
• Use a single HEAT orchestrator to finish the workload deployment with the accelerator, together with compute, network and storage (see the sketch below)
Cloud service provider:
• Buys the "cloudified" accelerators on the market place
• Creates a service category for FPGA accelerators and sells them on the cloud as a service
(Underlying stack: OpenStack extension for the accelerator service, service logic for the accelerator service in the FPGA, POWER8/PowerKVM, FPGA cards.)
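The tenant workflow above hinges on a single HEAT template that deploys compute, network, storage and the accelerator together. The fragment below is a purely illustrative sketch of what such a template and deployment call might look like: the OS::SuperVessel::FPGAAccelerator resource type, the metadata link from the server to the accelerator, and the image, flavor and network names are assumptions for this sketch, not part of the slides or of standard Heat.

```python
# Illustrative only: a minimal Heat deployment of a VM plus an FPGA accelerator.
# "OS::SuperVessel::FPGAAccelerator" and the server->accelerator link are
# hypothetical; image, flavor and network names are placeholders.
from heatclient import client as heat_client

template = {
    "heat_template_version": "2013-05-23",
    "resources": {
        "aes_acc": {
            "type": "OS::SuperVessel::FPGAAccelerator",   # hypothetical resource type
            "properties": {"accelerator": "aes-128"},
        },
        "server": {
            "type": "OS::Nova::Server",                   # standard Heat resource type
            "properties": {
                "image": "ubuntu-ppc64le",                # placeholder image name
                "flavor": "m1.medium",
                "networks": [{"network": "private"}],
                # Hypothetical link from the VM to its virtual FPGA.
                "metadata": {"accelerator": {"get_resource": "aes_acc"}},
            },
        },
    },
}

def deploy(heat_endpoint: str, token: str) -> None:
    """Create the stack; client constructor details vary across OpenStack releases."""
    heat = heat_client.Client("1", endpoint=heat_endpoint, token=token)
    heat.stacks.create(stack_name="fpga-accel-demo",
                       template=template, parameters={})
```

In the SuperVessel flow described on the later slides, the compiled accelerator image would come from the marketplace and Glance, so a real template would reference it by name rather than embedding anything inline.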
FPGA Accelerator as a Service on SuperVessel
• Accelerator MarketPlace for developers to upload and compile accelerators for the SuperVessel POWER cloud
• Cloud users can apply for an accelerator when creating a VM, and can request clusters of different sizes
(Fig. 1: Accelerator MarketPlace for the SuperVessel cloud. Fig. 2: Cloud users can apply for an accelerator when creating a VM.)

Enabling FPGA Virtualization in the OpenStack Cloud
(Architecture diagram: an OpenStack-based cloud with an enhanced control node and FPGA compute nodes.)
• Control node: tenant-facing service logic and scheduler, implemented as an enhanced OpenStack
• KVM-based compute node: each virtual machine sees a virtual FPGA through a guest control module, guest driver and bitfile library exposed to the guest OS; the hypervisor runs the host control module, host driver, image library, utilities, APIs and the OpenStack agent
• Docker-based compute node: containerized applications use the control module/driver, image library, utilities and APIs directly on the host kernel, alongside the OpenStack agent
• Hardware: the FPGA card, with its on-card DRAM, attaches to the POWER host through CAPI

FPGA Accelerator as a Service Online on SuperVessel Cloud
Try it here: www.ptopenlab.com
Super Marketplace (Online)
• SuperVessel Cloud Service: 1. VM and container service, 2. Storage service, 3. Network service, 4. Accelerator as service (Preparing), 5. Image service (Online)
• SuperVessel Big Data and HPC Service: 1. Big Data: MapReduce (Symphony), SPARK, 2. Performance tuning service
• OpenPOWER Enablement Service: 1. X-to-P migration, 2. AutoPort Tool, 3. OpenPOWER new system test service
• Super Class Service: 1. On-line video courses, 2. Teacher course management, 3. User contribution management (Preparing)
• Super Project Team Service: 1. Project management service, 2. DevOps automation
• SuperVessel Cloud Infrastructure: Docker, storage, IBM POWER servers, OpenPOWER servers with FPGA/GPU
Thanks!

FPGA Implementation
The FPGA subsystem is designed as a computer system: its three sublayers correspond to the apps, OS and hardware layers of a conventional computer.
• User sublayer: the shared FPGA resources (accelerators A, B, C, D, ...)
• Service sublayer: job queue, job scheduler, switch, DMA engine, context controller, security controller, reconfiguration controller, ...
• Platform sublayer: DRAM, PCIe/CAPI high-bandwidth I/O, Ethernet, ICAP, ...
(Diagram: the FPGA chip holds the service logic, registers, job queue, job scheduler, reconfiguration controller, DMA engine and context controller, and connects to the host over PCIe/CAPI.)
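To make the split between the user and service sublayers concrete, here is a minimal host-side sketch of submitting one job to a shared accelerator through a job queue. It is purely illustrative: the device node, descriptor layout and write protocol are assumptions for the sketch, not the actual SuperVessel driver interface.

```python
# Illustrative host-side job submission to a shared virtual FPGA.
# Device node, descriptor layout and completion protocol are assumptions.
import os
import struct

VFPGA_DEV = "/dev/vfpga0"          # hypothetical virtual-FPGA device node

# Hypothetical 16-byte job descriptor: accelerator id, priority,
# payload length, and the offset of the payload in the card's DRAM.
JOB_DESC = struct.Struct("<IIII")

def submit_job(accel_id: int, priority: int, payload: bytes, dram_offset: int) -> None:
    """Enqueue one job: copy the payload, then push a descriptor to the job queue."""
    desc = JOB_DESC.pack(accel_id, priority, len(payload), dram_offset)
    fd = os.open(VFPGA_DEV, os.O_RDWR)
    try:
        os.pwrite(fd, payload, dram_offset)   # payload into the DMA buffer region (assumed)
        os.write(fd, desc)                    # descriptor goes to the job queue
    finally:
        os.close(fd)

# Example (assuming accelerator 2 is the AES engine on this virtual FPGA):
# submit_job(accel_id=2, priority=1, payload=b"\x00" * 4096, dram_offset=0)
```

In the design above, the service sublayer's job scheduler and context controller would arbitrate among such descriptors arriving from different VMs or containers.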
System Implementation
(Workflow diagram connecting the compiler, the control node (dashboard, Nova, Glance, scheduler) and the FPGA compute nodes, with five numbered steps: 1. accelerator source code package, 2. image_file, 3. VM request (with accelerator), 4. acc_file, 5. launch VM.)
• Control Node: Nova, Glance, Horizon, Neutron, Swift
• Compute Node: Nova Compute
• Compiler: FPGA incremental compilation environment

Evaluation
(1) Accelerator sharing evaluation
• Host: all processes run in the host environment
• One VM: all processes run in one VM
• VMs: each process runs in its own VM
• AESs: each VM uses one independent AES accelerator
(Figures: total bandwidth (MB/s), average latency (ms) and coefficient of variation (CV) versus the number of processes, from 1 to 8; the plots carry data labels of 1194 MB/s and 25 MB/s for bandwidth, and 2.3 ms, 0.21 ms and 0.22 ms for latency.)
(2) Management – bandwidth control
(Figure: per-process and total bandwidth (MB/s) for two processes over roughly 61 seconds, with annotated steps that reduce the VM bandwidth, increase Process 0's bandwidth and then reduce Process 0's bandwidth.)
(3) Management – priority control
• Process 0: 256 KB payload, 100 times per second
• Process 1: 4 MB payload, best-effort use
• Both processes run at the same priority during seconds 1~38; Process 0's priority is raised at second 38.
(Figure: bandwidth (MB/s) of both processes over time, marking where Process 1 begins and where the priority control takes effect.)
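As a small companion to the sharing evaluation in (1), the sketch below shows how the plotted metrics, total bandwidth and the coefficient of variation across processes, can be computed from per-process throughput samples; the sample numbers are placeholders, not the measured SuperVessel results.

```python
# Sketch: aggregate per-process throughput samples from a sharing run into
# the metrics plotted above. The sample values are placeholders only.
from statistics import mean, pstdev

def total_bandwidth(per_process_mb_s):
    """Total bandwidth (MB/s) delivered by the shared accelerator."""
    return sum(per_process_mb_s)

def coefficient_of_variation(per_process_mb_s):
    """CV = population std-dev / mean; lower means fairer sharing."""
    m = mean(per_process_mb_s)
    return pstdev(per_process_mb_s) / m if m else 0.0

if __name__ == "__main__":
    samples = [150.0, 148.5, 151.2, 149.0]   # hypothetical per-process MB/s
    print(f"total = {total_bandwidth(samples):.1f} MB/s, "
          f"CV = {coefficient_of_variation(samples):.1%}")
```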