Intel Open Network Platform Server (Release 1.2)
Intel® Open Network Platform Server (Release 1.2)
Benchmark Performance Test Report
Document Revision 1.0
December 2014

Revision History
Revision | Date | Comments
1.0 | December 15, 2014 | Initial document for release of Intel® Open Network Platform Server 1.2

Contents
1.0 Introduction
2.0 Ingredient Specifications
  2.1 Hardware Versions
  2.2 Software Versions
3.0 Test Cases
4.0 Test Descriptions
  4.1 Performance Metrics
  4.2 Test Methodology
    4.2.1 Layer 2 and Layer 3 Throughput Tests
    4.2.2 Latency Tests
5.0 Platform Configuration
  5.1 Linux Operating System Configuration
  5.2 BIOS Configuration
  5.3 Core Usage for Intel DPDK Accelerated vSwitch
  5.4 Core Usage for OVS
6.0 Test Results
  6.1 Host Performance with DPDK
    6.1.1 Test Case 0 - Host L3 Forwarding
    6.1.2 Test Case 1 - Host L2 Forwarding
  6.2 Virtual Switching Performance
    6.2.1 Test Case 2 - OVS with DPDK-netdev L3 Forwarding (Throughput)
    6.2.2 Test Case 3 - OVS with DPDK-netdev L3 Forwarding (Latency)
    6.2.3 Test Case 4 - OVDK vSwitch L3 Forwarding (Throughput)
  6.3 VM Throughput Performance without a vSwitch
    6.3.1 Test Case 5 - VM L2/L3 Forwarding with PCI Pass-through
    6.3.2 Test Case 6 - DPDK L3 Forwarding: SR-IOV Pass-through
  6.4 VM Throughput Performance with vSwitch
    6.4.1 Test Case 7 - Throughput Performance with One VM (OVS with DPDK-netdev, L3 Forwarding)
    6.4.2 Test Case 8 - Throughput Performance with Two VMs in Series (OVS with DPDK-netdev, L3 Forwarding)
    6.4.3 Test Case 9 - Throughput Performance with Two VMs in Parallel (OVS with DPDK-netdev, L3 Forwarding)
  6.5 Encrypt / Decrypt Performance
    6.5.1 Test Case 10 - VM-VM IPSec Tunnel Performance Using Quick Assist Technology
Appendix A DPDK L2 and L3 Forwarding Applications
  A.1 Building DPDK and L3/L2 Forwarding Applications
  A.2 Running the DPDK L3 Forwarding Application
  A.3 Running the DPDK L2 Forwarding Application
Appendix B Building DPDK, Open vSwitch, and Intel DPDK Accelerated vSwitch
  B.1 Getting DPDK
    B.1.1 Getting DPDK Git Source Repository
    B.1.2 Getting DPDK Source tar from DPDK.ORG
  B.2 Building DPDK for OVS
    B.2.1 Getting OVS Git Source Repository
    B.2.2 Getting OVS DPDK-netdev User Space vHost Patch
    B.2.3 Applying OVS DPDK-netdev User Space vHost Patch
    B.2.4 Building OVS with DPDK-netdev
  B.3 Building DPDK for Intel DPDK Accelerated vSwitch
    B.3.1 Building Intel DPDK Accelerated vSwitch
  B.4 Host Kernel Boot Configuration
    B.4.1 Kernel Boot Hugepage Configuration
    B.4.2 Kernel Boot CPU Isolation Configuration
    B.4.3 IOMMU Configuration
    B.4.4 Generating GRUB Boot File
    B.4.5 Verifying Kernel Boot Configuration
    B.4.6 Configuring System Variables (Host System Configuration)
  B.5 Setting Up Tests
    B.5.1 OVS Throughput Tests (PHY-PHY without a VM)
    B.5.2 Intel DPDK Accelerated vSwitch Throughput Tests (PHY-PHY without a VM)
    B.5.3 Intel DPDK Accelerated vSwitch with User Space vHost
    B.5.4 OVS with User Space vHost
    B.5.5 SR-IOV VM Test
    B.5.6 Affinitization and Performance Tuning
  B.6 PCI Passthrough with Intel® QuickAssist
    B.6.1 VM Installation
    B.6.2 Installing strongSwan IPSec Software
    B.6.3 Configuring strongSwan IPsec Software
Appendix C Glossary
Appendix D Definitions
  D.1 Packet Throughput
  D.2 RFC 2544
Appendix E References

1.0 Introduction

The primary audiences for this document are architects and engineers implementing the Intel® Open Network Platform Server. Software ingredients of this architecture include:
• Fedora 20*
• Data Plane Development Kit (DPDK)
• Open vSwitch (OVS)*
• OpenStack*
• OpenDaylight*

An understanding of system performance is required to develop solutions that meet the demanding requirements of the telecom industry and transform telecom networks. This document provides a guide for performance characterization using the Intel® Open Network Platform Server (Intel ONP Server). Ingredient versions, integration procedures, configuration parameters, and test methodologies all influence performance.

This document provides a suite of Intel ONP Server performance tests and includes test cases, configuration details, performance data, and lessons learned to help with optimizing system configurations. The test methodologies and data provide a baseline for performance characterization of a Network Function Virtualization (NFV) compute node using a Commercial Off-the-Shelf (COTS) platform with the Intel® Xeon® E5-2697 V3 processor. This data does not represent optimal performance but should be easy to reproduce, to help architects evaluate their specific performance requirements. The tests described are not comprehensive and do not cover any conformance aspects. Providing a baseline configuration of well-tested procedures can help to achieve optimal system performance when developing an NFV/SDN solution.

Benchmarking is not a trivial task, as it takes equipment, time, and resources.
Engineers doing NFV performance characterization need strong domain knowledge (telco, virtualization, etc.), good understanding of the hardware (compute platforms, networking technologies), familiarity with network protocols, and hands-on experience working with relevant open-source software (Linux virtualization, vSwitch, software acceleration technologies like DPDK, system optimization/tuning, etc.).

2.0 Ingredient Specifications

2.1 Hardware Versions

Table 2-1 Hardware Ingredients
Item | Description | Notes
Platform | Intel® Server Board S2600WTT | Intel® Xeon® DP-based Server (2 CPU sockets). 120 GB SSD 2.5in SATA 6 Gb/s Intel Wolfsville SSDSC2BB120G4.
Processors | Intel® Xeon® E5-2697 V3 processor (formerly code-named Haswell) | 14 Core, 2.60GHz, 145W, 35M cache, 9.6 GT/s QPI, DDR4-1600/1866/2133. Tested with DDR4-1067.
Cores | 14 physical cores per CPU (i.e. per socket) | 28 Hyper-threaded cores per CPU (i.e. per socket) for 56 total cores per platform (i.e. 2 sockets).
Memory | 8 GB DDR4 RDIMM Crucial | Server capacity = 64 GB RAM (8 x 8 GB). Tested with 32 GB memory.
NICs (Niantic) | 2x Intel® 82599 10 Gigabit Ethernet Controller | NICs are on socket zero (3 PCIe slots available on socket 0).
BIOS | BIOS Revision: GRNDSDP1.86B.0038.R01.1409040644, Release Date: 09/04/2014 | Intel® Virtualization Technology for Directed I/O (Intel® VT-d) enabled. Hyper-threading disabled.
Quick Assist Technology | Intel® QuickAssist Adapter 8950-SCCP (formerly code-named Walnut Hill) with Intel® Communications Chipset 8950 (formerly code-named Coleto Creek) | PCIe acceleration card with Intel® Communications Chipset 8950. Capabilities include RSA, SSL/IPsec, Wireless Crypto, Security Crypto, Compression.

2.2 Software Versions

Table 2-2 Software Versions
Software Component | Function | Version/Configuration
Fedora 20 x86_64 | Host OS | 3.15.6-200.fc20.x86_64
Fedora 20 x86_64 | VM | 3.15.6-200.fc20.x86_64
Qemu-kvm | Virtualization technology | 1.6.2-9.fc20.x86_64
Data Plane Development Kit (DPDK) | Network stack bypass | DPDK 1.7.1
Intel® DPDK Accelerated vSwitch (OVDK) [1] | vSwitch | v1.1.0.61
Open vSwitch (OVS) | vSwitch | Open vSwitch V 2.3.90
git commit 99213f3827bad956d74e2259d06844012ba287a4
git commit bd8e8ee7565ca7e843a43204ee24a7e1e2bf9c6f
git commit e9bbe84b6b51eb9671451504b79c7b79b7250c3b
DPDK-netdev [2] | vSwitch patch | This is currently an experimental feature of OVS that needs to be applied to the OVS version specified above. http://patchwork.openvswitch.org/patch/6280/
Intel® QuickAssist Technology Driver (QAT) | Crypto accelerator | Intel® Communications Chipset 89xx Series Software for Linux* - Intel® QuickAssist Technology Driver (L.2.2.0-30). Latest drivers posted at: https://01.org/packet-processing/intel%C2%AE-quickassist-technologydrivers-and-patches
strongSwan | IPSec stack | strongSwan v.4.5.3 http://download.strongswan.org/strongswan-4.5.3.tar.gz
PktGen | Software network packet generator | v.2.7.7

[1] Intel has stopped further investment in the OVDK (available on Github). Intel OVDK demonstrated that the Data Plane Development Kit (DPDK) accelerates performance. Intel has also contributed to the Open vSwitch project, which has adopted the DPDK, and it is therefore not desirable to have two different code bases that use the DPDK for accelerated vSwitch performance. As a result, Intel has decided to increase investment in the Open vSwitch community project with a focus on using the DPDK and advancing hardware acceleration.

[2] Keeping up with line rate at 10 Gb/s in a flow of 64-byte packets requires processing more than 14.8 million packets per second, but the current standard version of the OVS kernel data path pushes more like 1.1 million to 1.3 million. OVS latency in processing small packets becomes intolerable as well. The new, mainstream-OVS code goes under the name OVS with DPDK-netdev and has been on Github since March. It is available as an "experimental feature" in OVS version 2.3. Intel hopes to include the code officially in OVS 2.4, which OVS committers are hoping to release early next year.

3.0 Test Cases

Network throughput, latency, and jitter performance are measured for L2/L3 forwarding using the standard RFC 2544 test methodology. L3 forwarding on the host and port-to-port switching performance using the vSwitch also serve as verification tests after installing the compute node ingredients. The table below is a summary of the performance test cases (this is a subset of many possible test configurations).

Table 3-1 Test Case Summary
Test Case | Description | Configuration | Workload/Metrics

Host-Only Performance
0 | L3 Forwarding on host | DPDK | L3 Fwd (DPDK test application); no pkt modification; Throughput [1]; Latency (avg, max, min)
1 | L2 Forwarding on host | DPDK | L2 Fwd (DPDK application); pkt MAC modification; Throughput [1]; Latency (avg, max, min)

Virtual Switching Performance
2 | L3 Forwarding by vSwitch - throughput | OVS with DPDK-netdev [2] | Switch L3 Fwd; Throughput [1]; Latency (avg, max, min)
3 | L3 Forwarding by vSwitch - latency | OVS with DPDK-netdev [2] | Switch L3 Fwd; Latency distribution (percentile ranges) [3]
4 | L3 Forwarding by vSwitch | Intel® DPDK Accelerated vSwitch | Switch L3 Fwd; Throughput [1]; Latency (avg, max, min)
5 | L2 Forwarding by single VM | Pass-through | L2 Fwd in VM; pkt MAC modification; Throughput [1]; Latency (avg, max, min)
6 | L3 Forwarding by single VM | SR-IOV | L2 Fwd in VM; pkt MAC modification; Throughput [1]; Latency (avg, max, min)
7 | L3 Forwarding by single VM | OVS with DPDK-netdev [2], user space vhost | L3 table lookup in switch with port fwd (test pmd app) in VM without pkt modification; Throughput [1]; Latency (avg, max, min)
8 | L3 Forwarding by two VMs in series | OVS with DPDK-netdev [2], user space vhost | L3 table lookup in switch with port fwd (test pmd app) in VM without pkt modification; Throughput [1]; Latency (avg, max, min)
9 | L3 Forwarding by VMs in parallel | OVS with DPDK-netdev [2], user space vhost | L3 table lookup in switch with port fwd (test pmd app) in VM without pkt modification; Throughput [1]; Latency (avg, max, min)

Security
10 | Encrypt/decrypt with Intel® Quick Assist Technology | VM-VM IPSec tunnel performance | Throughput [1]

[1] Tests run for 60 seconds per load target.
[2] OVS with DPDK-netdev test cases were performed using OVS with DPDK-netdev, which is available as an "experimental feature" in OVS version 2.3. This shows the performance improvements possible today with standard OVS code when using DPDK. Refer to Section 2.2 for details of software versions with commit IDs.
[3] Tests run 1 hour per load target.

4.0 Test Descriptions

4.1 Performance Metrics

The Metro Ethernet Forum (http://metroethernetforum.org/) gives the Ethernet network performance parameters that affect service quality as:
• Frame delay (latency)
• Frame delay variation (jitter)
• Frame loss
• Availability

Furthermore, the following metrics can characterize a network-connected device (i.e. compute node):
• Throughput [1]: The maximum frame rate that can be received and transmitted by the node without any error for "N" flows.
• Latency: Introduced by the node; it is cumulative with all other end-to-end network delays.
• Jitter [2]: Introduced by the node infrastructure.
• Frame loss [3]: While overall network losses do occur, all packets should be accounted for within a node.
• Burst behavior [4]: This measures the ability of the node to buffer packets.
• Metrics [5]: Used to characterize the availability and capacity of the compute node. These include start-up performance metrics as well as fault handling and recovery metrics. Metrics measured when the network service is fully "up" and connected include:
  - Power consumption of the CPU in various power states.
  - Number of cores available/used.
  - Number of NIC interfaces supported.
  - Headroom available for processing workloads (for example, available for VNF processing).

[1] See Appendix D for the definition of packet throughput.
[2] Jitter data is not included here. While "round-trip" jitter is fairly easy to measure, ideally it should be measured from the perspective of the application that has to deal with the jitter (e.g. a VNF running on the node).
[3] All data provided is with zero packet loss conditions.
[4] Burst behavior data is not included here.
[5] Availability and capacity data is not included here.

4.2 Test Methodology

4.2.1 Layer 2 and Layer 3 Throughput Tests

These tests determine the maximum Layer 2 or Layer 3 forwarding rate without traffic loss, and the average latency for different packet sizes, for the Device Under Test (DUT). While packets do get lost in telco networks, communications equipment is generally expected NOT to lose any packets. Therefore, it is important to do a zero packet-loss test of sufficient duration. Test results are zero packet-loss unless otherwise noted.

Tests are usually performed full duplex with traffic transmitting in both directions. Tests here were conducted with a variety of flows, for example, single flow (unidirectional traffic), two flows (bidirectional traffic with one flow in each direction), and 30,000 flows (bidirectional with 15k flows in each direction).

For Layer 2 forwarding, the DUT must perform packet parsing and Layer 2 address look-ups on the ingress port and then modify the MAC header before forwarding the packet on the egress port. For Layer 3 forwarding, the DUT must perform packet parsing and Layer 3 address look-ups on the ingress port and then forward the packet on the egress port.

Test duration refers to the measurement period for any particular packet size with an offered load, and assumes the system has reached a steady state. Using RFC 2544 test methodology, this is specified as at least 60 seconds. To detect outlier data, it is desirable to run longer duration "soak" tests once the maximum zero-loss throughput is determined. This is particularly relevant for latency tests. In this document, throughput tests used 60-second load iterations, and latency tests were run for one hour (per load target).
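
The zero-loss throughput search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: RFC 2544 describes raising or lowering the offered rate between trials, and a binary search is one common way to do that, not necessarily what the test equipment used for this report does. The `send_trial` hook and the `fake_dut` stand-in are hypothetical names introduced here; a real hook would drive the traffic generator for one 60-second trial and return the number of frames lost.

```python
from typing import Callable


def zero_loss_throughput(send_trial: Callable[[float], int],
                         resolution_pct: float = 0.5) -> float:
    """Return the highest offered load (% of line rate) that showed zero frame loss.

    send_trial(load_pct) is a hypothetical hook: it runs one trial at the
    given load and returns the number of frames lost during that trial.
    """
    low, high = 0.0, 100.0
    # If the DUT forwards full line rate without loss, no search is needed.
    if send_trial(high) == 0:
        return high
    best = low
    # Invariant: loads <= low passed, load == high failed.
    while high - low > resolution_pct:
        mid = (low + high) / 2
        if send_trial(mid) == 0:
            best = low = mid   # zero loss: search higher
        else:
            high = mid         # loss seen: search lower
    return best


def fake_dut(capacity_pct: float) -> Callable[[float], int]:
    """Stand-in for a real DUT (assumption): drops frames above capacity_pct."""
    def trial(load_pct: float) -> int:
        if load_pct <= capacity_pct:
            return 0
        return int((load_pct - capacity_pct) * 1000) + 1
    return trial


if __name__ == "__main__":
    rate = zero_loss_throughput(fake_dut(78.0))
    print(f"zero-loss throughput is about {rate:.2f}% of line rate")
```

The search is repeated per packet size; a longer soak test at the resulting load then checks for outliers, as described above.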

4.2.2 Latency Tests

Minimum, maximum, and average latency numbers were collected with the throughput tests. This test aims to understand the latency distribution for different packet sizes over an extended test run, to uncover outliers. This means that latency tests should run for extended periods (at least 1 hour and ideally 24 hours). Practical considerations dictate that we pick the highest throughput that has demonstrated zero packet loss (for a particular packet size) as determined with the throughput tests above.

Note: Latency through the system depends on various parameters configured by the software, such as the ring size in the Ethernet adapters, Ethernet tail pointer update rates, platform characteristics, etc.

5.0 Platform Configuration

The configuration parameters described in this section are based on experience running various tests and may not be optimal.

5.1 Linux Operating System Configuration

Table 5-1 Linux Configuration
Parameter | Enable/Disable | Explanation
Firewall | Disable | Disable.
NetworkManager | Disable | As set by install.
irqbalance | Disable |
ssh | Enabled |
Address Space Layout Randomization (ASLR) | Disable |
IPV4 Forwarding | Disable |
Host kernel boot parameters (regenerate grub.conf) | default_hugepagesz=1G hugepagesz=1G hugepages=32 intel_iommu=off isolcpus=1,2,3,4,5,6,7,8,9..n-1 for n core systems | Isolate DPDK cores on host.
VM kernel boot parameters (regenerate grub.conf) | default_hugepagesz=1GB hugepagesz=1GB hugepages=1 isolcpus=1,2 | Isolate DPDK cores on VM.

5.2 BIOS Configuration

The configuration parameters described in this section are based on experience running the tests described and may not be optimal.

Table 5-2 BIOS Configuration
Parameter | Enable/Disable | Explanation

Power Management Settings
CPU Power and Performance Policy | Traditional | This is the equivalent of the "performance" setting.
Intel® Turbo boost | Enable |
Processor C3 | Disable | Prevents low power state.
Processor C6 | Enable | Not relevant when C3 is disabled.

Processor Configuration
Intel® Hyper-Threading Technology (HTT) | Disable |
MLC Streamer | Enable | Hardware Prefetcher: Enabled
MLC Spatial Prefetcher | Enable | Adjacent Cache Prefetch: Enabled
DCU Data Prefetcher | Enable | DCU Streamer Prefetcher: Enabled
DCU Instruction Prefetcher | Enable | DCU IP Prefetcher: Enabled

Virtualization Configuration
Intel® Virtualization Technology for Directed I/O (VT-d) | Enable | This is disabled by the kernel when iommu = off (default).

Memory Configuration
Memory RAS and Performance Configuration -> NUMA Optimized | Auto |

I/O Configuration
DDIO (Direct Cache Access) | Auto |

Other Configuration
HPET (High Precision Timer) | Disable | Requires a kernel rebuild to utilize (Fedora 20 by default sets to disabled).

5.3 Core Usage for Intel DPDK Accelerated vSwitch

Performance results in this section do not include VM tests with Intel DPDK Accelerated vSwitch (OVDK). The core usage information here is provided as a reference for doing additional tests with OVDK.

Table 5-3 Core Usage for Intel DPDK Accelerated vSwitch
Process | Core | Linux Commands

OVS_DPDK Switching Cores
OVS_DPDK | 0, 1, 2, 3 | Affinity is set in the `ovs_dpdk` command line. Tests used coremask 0x0F (i.e., cores 0, 1, 2, and 3).
vswitchd | 8 | taskset -a <pid_of_vswitchd_process> This is only relevant when setting up flows.

OS/VM Cores
QEMU* VM1 VCPU0 | 4 | taskset -p 10 <pid_of_vm1_qemu_vcpu0_process>
QEMU* VM1 VCPU1 | 5 | taskset -p 20 <pid_of_vm1_qemu_vcpu1_process>
QEMU* VM2 VCPU0 | 6 | taskset -p 40 <pid_of_vm2_qemu_vcpu0_process>
QEMU* VM2 VCPU1 | 7 | taskset -p 80 <pid_of_vm2_qemu_vcpu1_process>
Kernel | 0 + CPU socket 1 cores | All other CPUs isolated (`isolcpus` boot parameter).
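
The relationship between the core numbers and the masks in Table 5-3 can be checked with a small helper. The sketch below is illustrative only and is not part of the test setup; the function names are made up for this example. It assumes the usual convention that bit N of an affinity mask selects core N, and that the taskset masks 10, 20, 40, and 80 are read as hexadecimal.

```python
from typing import Iterable, List


def cores_to_mask(cores: Iterable[int]) -> int:
    """Build an affinity bit mask in which bit N is set for every core N listed."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def mask_to_cores(mask: int) -> List[int]:
    """List the core numbers whose bits are set in an affinity mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Coremask for the OVS_DPDK switching cores 0-3.
    print(hex(cores_to_mask([0, 1, 2, 3])))
    # Hexadecimal taskset masks used for the QEMU VCPU threads.
    for text in ("10", "20", "40", "80"):
        print(text, "->", mask_to_cores(int(text, 16)))
```

Running it shows that cores 0 to 3 give the mask 0xf and that the four taskset masks each select a single core, 4 through 7, matching the Core column of the table.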

5.4 Core Usage for OVS

Table 5-4 Core Usage for OVS
Process | Core | Linux Commands

OVS switchd Cores
ovs-vswitchd (all tasks except pmd) | 3 | OVS user space processes affinitized to core mask 0x04.
  # ./vswitchd/ovs-vswitchd --dpdk -c 0x4 -n 4 --socket-mem 4096,0 -- unix:$DB_SOCK --pidfile
ovs-vswitchd pmd task | 4 | Sets the PMD core mask to 2 for CPU core 1 affinitization (set in the OVS database and persistent).
  # ./ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=2

All Linux threads for the OVS vSwitch daemon can be listed with:
  $ top -p `pidof ovs-vswitchd` -H -d1
Using taskset, the current task thread affinity can be checked and changed to other CPU cores as needed.

6.0 Test Results

Note: In this section, all performance testing was done using the RFC 2544 test methodology (zero loss). Refer to Appendix D for more information.

6.1 Host Performance with DPDK

Figure 6-1 Host Performance with DPDK

Configuration:
1. No VMs are configured and no vSwitch is started or used.
2. DPDK-based port forwarding sample application (L2/L3fwd).

Data Path (Numbers Matching Red Circles):
1. The packet generator creates flows based on RFC 2544.
2. The DPDK L2 or L3 forwarding application forwards the traffic from the first physical port to the second.
3. The traffic flows back to the packet generator and is measured per the RFC 2544 throughput test.

6.1.1 Test Case 0 - Host L3 Forwarding

Test Configuration Notes:
• Traffic is unidirectional (1 flow) or bidirectional (2 flows).
• Packets are not modified.

Figure 6-2 Host L3 Forwarding Performance - Throughput

The average latency data in Table 6-1 implies that latency increases with packet size; however, this is because throughput is approaching the maximum (where latency increases dramatically). Therefore, these latency figures may not represent realistic conditions (i.e. at lower data rates).

Table 6-1 shows performance with bidirectional traffic. Note that performance is limited by the network card.

Table 6-1 Host L3 Forwarding Performance - Average Latency
Packet Size (Bytes) | Average Latency (µs) | Bandwidth (% of 10 GbE)
64 | 9.77 | 78
128 | 12.33 | 97
256 | 13.17 | 99
512 | 20.24 | 99
1024 | 33.02 | 99
1280 | 39.82 | 99
1518 | 46.61 | 99

6.1.2 Test Case 1 - Host L2 Forwarding

Test Configuration Notes:
• Traffic is bidirectional (2 flows).
• Packets are modified (Ethernet header).

Figure 6-3 Host L2 Forwarding Performance - Throughput

6.2 Virtual Switching Performance

Figure 6-4 Virtual Switching Performance

Configuration:
1. No VMs are configured (i.e. PHY-to-PHY).
2. Intel® DPDK Accelerated vSwitch or OVS with DPDK-netdev, port-to-port.

Data Path (Numbers Matching Red Circles):
1. The packet generator creates flows based on RFC 2544.
2. The Intel® DPDK Accelerated vSwitch or OVS network stack forwards the traffic from the first physical port to the second.
3. The traffic flows back to the packet generator.

6.2.1 Test Case 2 - OVS with DPDK-netdev L3 Forwarding (Throughput)

Test Configuration Notes:
• Traffic is bidirectional, except for the one-flow case (unidirectional). The number of flows in the chart represents total flows in both directions.
• Packets are not modified.
• These tests are "PHY-PHY" (physical port to physical port without a VM).
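
Results such as the "Bandwidth (% of 10 GbE)" column in Table 6-1 can be converted to frame rates with a short calculation. The sketch below is an illustration rather than part of the test procedure: it assumes the percentages are relative to the theoretical 10 GbE line rate, and it counts the standard 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, and inter-frame gap). For 64-byte frames it reproduces the "more than 14.8 million packets per second" figure quoted in Section 2.2.

```python
# Per-frame overhead on the wire that is not counted in the frame size:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_fps(frame_bytes: int, link_bps: float = 10e9) -> float:
    """Theoretical maximum frames per second for a given Ethernet frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


def achieved_fps(frame_bytes: int, pct_of_line_rate: float) -> float:
    """Frames per second at a percentage of 10 GbE line rate (assumed meaning)."""
    return line_rate_fps(frame_bytes) * pct_of_line_rate / 100.0


# (frame size in bytes, bandwidth in % of 10 GbE) copied from Table 6-1.
TABLE_6_1 = [(64, 78), (128, 97), (256, 99), (512, 99),
             (1024, 99), (1280, 99), (1518, 99)]

if __name__ == "__main__":
    for size, pct in TABLE_6_1:
        print(f"{size:>5} B: line rate {line_rate_fps(size) / 1e6:6.2f} Mfps, "
              f"at {pct}% about {achieved_fps(size, pct) / 1e6:6.2f} Mfps")
```

The frame rate drops quickly as frame size grows, which is why small-packet results are the most demanding for the forwarding path.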

Figure 6-5 OVS with DPDK-netdev Switching Performance - Throughput

Figure 6-6 OVS with DPDK-netdev Switching Performance - Average Latency [1]

[1] Latency is measured by the traffic generator on a per-packet basis.

6.2.2 Test Case 3 - OVS with DPDK-netdev L3 Forwarding (Latency)

Test Configuration Notes:
• Traffic is bidirectional.
• Packets are not modified.
• Tests run for 1 hour per target load.
• Only 64-byte results are shown in the chart.
• These tests are "PHY-PHY" (physical port to physical port without a VM).

Figure 6-7 Host L2 Forwarding Performance - Latency Distribution for 64B Packets

The chart in Figure 6-7 shows the latency distribution of packets for a range of target loads. For example, with a target load of 10% (i.e. 1 Gb/s) almost all packets have a latency of between 2.5 and 5.1 µs, whereas at a target load of 50% (i.e. 5 Gb/s) almost all packets have a latency of between 10.2 and 20.5 µs.

6.2.3 Test Case 4 - OVDK vSwitch L3 Forwarding (Throughput)

Test Configuration Notes:
• Traffic is bidirectional.
• Packets are not modified.
• Results for three different flow settings are shown (2, 10k, 30k).
• These tests are "PHY-PHY" (physical port to physical port without a VM).

Figure 6-8 OVDK Switching Performance - Throughput

Figure 6-9 OVDK Switching Performance - Average Latency

6.3 VM Throughput Performance without a vSwitch

Figure 6-10 VM Throughput Performance without a vSwitch

Configuration (Done Manually):
1. One VM is brought up and connected to the NIC with PCI Passthrough or SR-IOV.
2. IP addresses of the VM are configured.

Data Path (Numbers Matching Red Circles):
1. The packet generator creates flows based on RFC 2544.
2. The flows arrive at the first vPort of the VM.
3. The VM receives the flows and forwards them out through its second vPort.
4. The flows are sent back to the packet generator.

6.3.1 Test Case 5 - VM L2/L3 Forwarding with PCI Passthrough

Test Configuration Notes for L3 Forwarding:
• Traffic is bidirectional.
• Packets are not modified.

Figure 6-11 VM L3 Forwarding Performance with PCI Pass-through - Throughput

Test Configuration Notes for L2 Forwarding:
• Traffic is bidirectional.
• Packets are modified (Ethernet header).

Figure 6-12 VM L2 Forwarding Performance with PCI Pass-through - Throughput

6.3.2 Test Case 6 - DPDK L3 Forwarding: SR-IOV Passthrough

Test Configuration Notes for L3 Forwarding:
• Traffic is bidirectional.
• Packets are not modified.

Figure 6-13 VM L3 Forwarding Performance with SR-IOV - Throughput

6.4 VM Throughput Performance with vSwitch

6.4.1 Test Case 7 - Throughput Performance with One VM (OVS with DPDK-netdev, L3 Forwarding)

Figure 6-14 VM Throughput Performance with vSwitch - One VM

Configuration (Done Manually):
1. One VM is brought up and connected to the vSwitch.
2. The IP address of the VM is configured.
3. Flows are programmed to the vSwitch.

Data Path (Numbers Matching Red Circles):
1. The packet generator creates flows based on RFC 2544.
2. The vSwitch forwards the flows to the first vPort of the VM.
3. The VM receives the flows and forwards them out through its second vPort.
4. The vSwitch forwards the flows back to the packet generator.

Test Configuration Notes:
• Traffic is bidirectional.
• Packets are not modified.
Figure 6-15 VM Throughput Performance - One VM (OVS with DPDK-netdev)

6.4.2 Test Case 8 - Throughput Performance with Two VMs in Series (OVS with DPDK-netdev, L3 Forwarding)

Figure 6-16 VM Throughput Performance with vSwitch - Two VMs in Series

Configuration (Done Manually):
1. Set up two VMs and connect them to the vSwitch.
2. The IP addresses of the VMs are configured.
3. Flows are programmed to the vSwitch.

Data Path (Numbers Matching Red Circles):
1. The packet generator creates flows based on RFC 2544.
2. The vSwitch forwards the flows to the first vPort of VM #1.
3. VM #1 receives the flows and forwards them to VM #2 through its second vPort (via the vSwitch).
4. The vSwitch forwards the flows to the first vPort of VM #2.
5. VM #2 receives the flows and forwards them back to the vSwitch through its second vPort.
6. The vSwitch forwards the flows back to the packet generator.

Test Configuration Notes:
• Traffic is bidirectional.
• Packets are not modified.

Figure 6-17 VM Throughput Performance - Two VMs in Series (OVS with DPDK-netdev)

6.4.3 Test Case 9 - Throughput Performance with Two VMs in Parallel (OVS with DPDK-netdev, L3 Forwarding)

Figure 6-18 VM Throughput Performance with vSwitch - Two VMs in Parallel

Configuration (Done Manually):
1. Set up two VMs and connect them to the vSwitch.
2. The IP addresses of the VMs are configured.
3. Flows are programmed to the vSwitch.

Data Path (Numbers Matching Red Circles):
1. The packet generator creates flows based on RFC 2544.
2. The vSwitch forwards the flows to the first vPort of VM #1 and the first vPort of VM #2.
3. VM #1 receives the flows and forwards them out through its second vPort. VM #2 receives the flows and forwards them out through its second vPort.
4. The vSwitch forwards the flows back to the packet generator.
Test Configuration Notes:
• Traffic is bidirectional.
• Packets are not modified.

Figure 6-19 VM Throughput Performance - Two VMs in Parallel (OVS with DPDK-netdev)

6.5 Encrypt / Decrypt Performance

6.5.1 Test Case 10 - VM-VM IPSec Tunnel Performance Using Quick Assist Technology

An IPsec tunnel is set up between two tunnel endpoints, one on each virtual machine, and each VM uses Quick Assist technology to offload encryption and decryption.

Figure 6-20 VM-VM IPSec Tunnel Performance Using QAT

Throughput tests send UDP data through the IPsec tunnel. Encapsulation and decapsulation in the VM are handled by strongSwan (an open source IPsec-based VPN solution, https://www.strongswan.org/). Refer to Appendix B.6 to set up a VM with the QAT drivers, the netkeyshim module, and the strongSwan IPsec application.

Configuration:
• Two Quick Assist engines are required (one for each VM).
• PCI pass-through is used to attach the Quick Assist PCI accelerator to a VM.
• The virtual network interface of the VMs is virtio with vHost (i.e., as in standard Open vSwitch).
• The Netperf UDP stream test is used to generate traffic in the VM.

The chart below compares performance with and without Quick Assist. It indicates more than twice the throughput for 64-byte packets and a larger improvement for larger packets.

Figure 6-21 VM-to-VM IPsec Tunnel Performance Using Intel® Quick Assist Technology

Note: Results were obtained using an Intel Grizzly Pass Xeon® DP Server, 64 GB RAM (8x 8 GB), Intel® Xeon® Processor Series E5-2680 v2 LGA2011 2.8GHz 25MB 115W 10 cores, 240GB SSD 2.5in SATA. See the Intel® Open Network Platform for Servers Reference Architecture (Version 1.1); these results are provided here for reference. Similar results are expected for the ingredients specified in Section 2.0, “Ingredient Specifications”.
Appendix A DPDK L2 and L3 Forwarding Applications

A.1 Building DPDK and L3/L2 Forwarding Applications

The DPDK is installed with IVSHMEM support. It can also be installed without IVSHMEM (see the DPDK documentation).

1. If previously built for OVS or OVDK, DPDK needs to be uninstalled first.
# make uninstall
# make install T=x86_64-ivshmem-linuxapp-gcc

2. Set the DPDK paths. The following assumes the DPDK Git repository; change the paths for a tar install.
# export RTE_SDK=/usr/src/dpdk
# export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc

3. Build l3fwd.
# cd /usr/src/dpdk/examples/l3fwd/
# make

4. Build l2fwd.
# cd /usr/src/dpdk/examples/l2fwd/
# make

A.2 Running the DPDK L3 Forwarding Application

Build DPDK and l3fwd as described in Appendix A.1. Set up the host boot line with 1 GB and 2 MB hugepage memory and isolcpus.

1. Initialize host hugepage memory after reboot.
# mount -t hugetlbfs nodev /dev/hugepages
# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB

2. Initialize host UIO driver.
# modprobe uio
# insmod $RTE_SDK/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

3. Locate the target NICs to be used for the test (assuming 06:00.0 and 06:00.1 are the targets found), attach them to the UIO driver, and check.
# lspci -nn |grep Ethernet
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.0
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.1
# $RTE_SDK/tools/dpdk_nic_bind.py --status

4. To run the DPDK L3 forwarding application on the compute node host, use the following command line:
# cd $RTE_SDK/examples/l3fwd
# ./build/l3fwd -c 0x6 -n 4 --socket-mem 1024,0 -- -p 0x3 --config="(0,0,1),(1,0,2)"

Note: The -c option enables cores (1,2) to run the application. The --socket-mem option allocates 1 GB of hugepage memory from target NUMA node 0 and no memory from NUMA node 1. The -p option is the hexadecimal bit mask of the ports to configure. The --config (port, queue, lcore) option determines which queues from which ports are mapped to which cores.

Additionally, to run standard Linux network stack L3 forwarding without DPDK, enable the kernel forwarding parameter as shown below and start the traffic.
# echo 1 > /proc/sys/net/ipv4/ip_forward

To set up the test for verifying the devstack installation on the compute node, and to contrast DPDK host L3 forwarding with Linux kernel forwarding, refer to Section 7.1. That section also details the performance data for both test cases.

A.3 Running the DPDK L2 Forwarding Application

The L2 forwarding application modifies the packet data by changing the Ethernet address in the outgoing buffer, which might impact performance. The l3fwd program forwards packets without modifying them.

Build DPDK and l2fwd as described in Appendix A.1. Set up the host boot line with 1 GB and 2 MB hugepage memory and isolcpus.

1. Initialize host hugepage memory after reboot.
# mount -t hugetlbfs nodev /dev/hugepages
# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB

2. Initialize host UIO driver.
# modprobe uio
# insmod $RTE_SDK/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

3. Locate the target NICs to be used for the test (assuming 06:00.0 and 06:00.1 are the targets found), attach them to the UIO driver, and check.
# lspci -nn |grep Ethernet
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.0
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.1
# $RTE_SDK/tools/dpdk_nic_bind.py --status

4.
To run the DPDK L2 forwarding application on the compute node host, use the following command line:
# cd $RTE_SDK/examples/l2fwd
# ./build/l2fwd -c 0x6 -n 4 --socket-mem 1024,0 -- -p 0x3

Note: The -c option enables cores (1,2) to run the application. The --socket-mem option allocates 1 GB of hugepage memory from target NUMA node 0 and no memory from NUMA node 1. The -p option is the hexadecimal bit mask of the ports to configure. Unlike l3fwd, l2fwd takes no --config option; by default each enabled core services one port.

Appendix B Building DPDK, Open vSwitch, and Intel DPDK Accelerated vSwitch

B.1 Getting DPDK

The DPDK needs to be built differently for 1) the various test programs (e.g., the DPDK L2 and L3 test programs), 2) OVS with DPDK-netdev, and 3) Intel DPDK Accelerated vSwitch. The DPDK should be built in the same way for the host machine and the VM. There are two ways of obtaining the DPDK source, as listed in Appendix B.1.1 and Appendix B.1.2.

B.1.1 Getting DPDK Git Source Repository

The git repository was the method used to obtain the code used for the tests in this document. The DPDK git source repository was used because the latest available tar file changes often. Older tar files are more difficult to obtain, while the target git code is obtained by simply checking out the correct commit or tag.
# cd /usr/src
# git clone git://dpdk.org/dpdk
# cd /usr/src/dpdk

Check out the target DPDK version:
# git checkout -b test_v1.7.1 v1.7.1

The export directory for the DPDK git repository is:
# export RTE_SDK=/usr/src/dpdk

B.1.2 Getting DPDK Source tar from DPDK.ORG

This is an alternative way to get the DPDK source and is provided here as additional information. Get DPDK release 1.7.1 and untar it.
Download DPDK 1.7.1 at: http://www.dpdk.org/download

Move to the install location, move the DPDK tar file to the destination, and untar:
# cd /usr/src
# mv ~/dpdk-1.7.1.tar.gz .
# tar -xf dpdk-1.7.1.tar.gz
# cd /usr/src/dpdk-1.7.1

The specific DPDK version directory must be used for the code builds. The git repository DPDK directory is shown in the documentation below and would need to be changed to the DPDK 1.7.1 target directory.
# export RTE_SDK=/usr/src/dpdk-1.7.1

B.2 Building DPDK for OVS

CONFIG_RTE_BUILD_COMBINE_LIBS=y needs to be set in the config/common_linuxapp file. It can be changed with a text editor or with the following patch:

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 9047975..6478520 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -81,7 +81,7 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 #
 # Combine to one single library
 #
-CONFIG_RTE_BUILD_COMBINE_LIBS=n
+CONFIG_RTE_BUILD_COMBINE_LIBS=y
 CONFIG_RTE_LIBNAME="intel_dpdk"

 #

The patch can be applied by copying it into a file in the directory just above the DPDK directory (/usr/src). If the patch file is called patch_dpdk_single_lib, the following is an example of applying it:
# cd /usr/src/dpdk
# patch -p1 < ../patch_dpdk_single_lib
# make install T=x86_64-ivshmem-linuxapp-gcc

B.2.1 Getting OVS Git Source Repository

Make sure the DPDK is currently built for the OVS build. If not, change it as documented above and rebuild DPDK.

Get the OVS git repository:
# cd /usr/src
# git clone https://github.com/openvswitch/ovs.git
# cd /usr/src/ovs

Assume that the latest commit is the target commit. Create a working branch off the current branch for testing and for applying the user vHost patch:
# git checkout -b test_branch

Or check out the specific commit used in testing:
# git checkout -b test_branch e9bbe84b6b51eb9671451504b79c7b79b7250c3b

The user vHost interface patch needs to be applied for the User vHost tests.
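The CONFIG_RTE_BUILD_COMBINE_LIBS change described in Section B.2 can also be scripted instead of applied as a patch. The following is a minimal sketch, not a method taken from the DPDK or OVS documentation: the function name set_combine_libs and the default path /usr/src/dpdk/config/common_linuxapp are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of DPDK): set CONFIG_RTE_BUILD_COMBINE_LIBS
# to "y" (OVS build) or "n" (OVDK build) in a DPDK config file.
set_combine_libs() {
    value="$1"                                         # expected: y or n
    cfg="${2:-/usr/src/dpdk/config/common_linuxapp}"   # assumed default path
    case "$value" in
        y|n) ;;
        *) echo "usage: set_combine_libs y|n [config-file]" >&2; return 1 ;;
    esac
    # Rewrite only the line that starts with the option name; write to a
    # temporary file first so the edit does not depend on GNU sed -i.
    sed "s/^CONFIG_RTE_BUILD_COMBINE_LIBS=.*/CONFIG_RTE_BUILD_COMBINE_LIBS=$value/" \
        "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
    # Print the resulting setting so the change can be checked.
    grep '^CONFIG_RTE_BUILD_COMBINE_LIBS=' "$cfg"
}
```

Under these assumptions, running set_combine_libs y before the OVS build and set_combine_libs n before the OVDK build would switch between the two configurations; DPDK still has to be rebuilt after each change.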
B.2.2 Getting OVS DPDK-netdev User Space vHost Patch

Currently (10/31/14), not all the User vHost DPDK patches have been upstreamed into the OVS source code. The current patch can be obtained from:
Sept. 29, 2014: http://patchwork.openvswitch.org/patch/6280/
[ovs-dev,v5,1/1] netdev-dpdk: add dpdk vhost ports
ovs-dev-v5-1-1-netdev-dpdk-add-dpdk-vhost-ports.patch

B.2.3 Applying OVS DPDK-netdev User Space vHost Patch

Before applying the patch, it is desirable to move to your own git branch to prevent source tree switching issues. The following creates and switches to the new git branch test_srt, but any unused branch name can be used:
# git checkout -b test_srt
Switched to a new branch 'test_srt'

Assuming the patch has been placed in the /usr/src directory, the following applies the patch:
# cd /usr/src/ovs
# patch -p1 < ../ovs-dev-v5-1-1-netdev-dpdk-add-dpdk-vhost-ports.patch

Assuming that the patch applies correctly, the changes can be checked into the git repository on the test branch, which allows switching between different OVS commit versions. The patch can later be applied to a later point in the code repository.

B.2.4 Building OVS with DPDK-netdev

The DPDK needs to be built for OVS as described above. The following assumes DPDK is from the git repository; adjust the DPDK path if another DPDK build directory is used:
# cd /usr/src/ovs
# export DPDK_BUILD=/usr/src/dpdk/x86_64-ivshmem-linuxapp-gcc
# ./boot.sh
# ./configure --with-dpdk=$DPDK_BUILD
# make
# make eventfd

Note: The “ivshmem” option adds support for IVSHMEM and also supports other interface options (user space vHost). Using this option keeps the build consistent across all test configurations.

B.3 Building DPDK for Intel DPDK Accelerated vSwitch

The DPDK must be built as part of the DPDK Accelerated vSwitch master build, or it may not build correctly. It cannot have the CONFIG_RTE_BUILD_COMBINE_LIBS=y flag set.

B.3.1 Building Intel DPDK Accelerated vSwitch

The OVDK master build is the best way to build OVDK. Make sure DPDK is the base release code without the CONFIG_RTE_BUILD_COMBINE_LIBS=y flag set. For the git repository, the following can be used as long as changes were not checked into the repository:
# cd /usr/src/dpdk
# git diff -M -C

If there are no changes in the code, no output is generated. If the CONFIG_RTE_BUILD_COMBINE_LIBS=y flag is set, the output will be similar to the OVS DPDK patch above. The following discards the change:
# git checkout -- config/common_linuxapp

Applying the DPDK changes as a patch allows easy switching back and forth between DPDK builds. The other method is to create two branches, with and without the CONFIG_RTE_BUILD_COMBINE_LIBS=y flag set, and jump between them.

Getting the OVDK git source repository:
# cd /usr/src/
# git clone git://github.com/01org/dpdk-ovs

Check out code for testing (using more recent code is recommended):
# cd /usr/src/dpdk-ovs
# git checkout -b test_my_xx

However, for testing OVDK equivalent to the data in this document (1):
# cd /usr/src/dpdk-ovs
# git checkout -b test_v1.2.0-pre1 bd8e8ee7565ca7e843a43204ee24a7e1e2bf9c6f

(1) The commit ID shown here was used to collect the data in this document.

The OVDK master build:
# cd /usr/src/dpdk
# export RTE_SDK=$(pwd)
# cd ../dpdk-ovs
# make config && make

B.4 Host Kernel Boot Configuration

The host system needs several boot parameters added or changed for optimal operation.

B.4.1 Kernel Boot Hugepage Configuration

The hugepage system makes large memory pages available to applications, improving performance by reducing how often the processor must update Translation Lookaside Buffer (TLB) entries. The 1 GB hugepages need to be configured on the boot command line and cannot be changed during runtime operation.
However, the 2 MB page count can be adjusted during runtime operation. Configuration on the kernel boot line allows the system to support both 1 GB hugepages and 2 MB hugepages at the same time. The hugepage configuration can be added to the default configuration file /etc/default/grub by adding it to GRUB_CMDLINE_LINUX; the grub configuration file is then regenerated to get an updated configuration file for Linux boot.
# vim /etc/default/grub // edit file
. . .
GRUB_CMDLINE_LINUX_DEFAULT="... default_hugepagesz=1GB hugepagesz=1GB hugepages=16 hugepagesz=2M hugepages=2048 ..."
. . .

This sets up 16 GB of 1 GB hugepage memory and 4 GB of 2 MB hugepage memory. After boot the number of 1 GB pages cannot be changed, but the number of 2 MB pages can be changed during runtime operation. Half of each hugepage set comes from each of the NUMA nodes.

This target is set up for OVDK or OVS using two 1-GB Node 0 hugepages and two VMs using three 1-GB Node 0 hugepages each, for a total of eight 1-GB Node 0 hugepages used for testing. This requires that 16 1-GB hugepages be allocated on the kernel boot line: eight 1-GB Node 0 hugepages and eight 1-GB Node 1 hugepages.

B.4.2 Kernel Boot CPU Isolation Configuration

CPU isolation removes CPU cores from the scheduler so that a core targeted for particular processing is not assigned general processing tasks. You need to know which CPUs are available on the system. The following isolates the 9 CPU cores used in testing.
# vim /etc/default/grub // edit file
. . .
GRUB_CMDLINE_LINUX_DEFAULT="... isolcpus=1,2,3,4,5,6,7,8,9 ..."
. . .

B.4.3 IOMMU Configuration

The IOMMU is needed for PCI pass-through and SR-IOV tests and should be disabled for all other tests. The IOMMU is enabled by setting the kernel boot IOMMU parameters:
# vim /etc/default/grub // edit file
. . .
GRUB_CMDLINE_LINUX_DEFAULT="... iommu=pt intel_iommu=on ..."
. . .

B.4.4 Generating GRUB Boot File

After editing, the grub.cfg boot file needs to be regenerated to set the new kernel boot parameters:
# grub2-mkconfig -o /boot/grub2/grub.cfg

The system needs to be rebooted for the kernel boot changes to take effect.

B.4.5 Verifying Kernel Boot Configuration

Verify that the system booted with the correct kernel parameters by reviewing the kernel boot messages using dmesg:
# dmesg | grep command
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.11.10-301.fc20.x86_64 default_hugepagesz=1G hugepagesz=1G hugepages=32 root=UUID=978575a5-45f3-4675-9e4be17f3fd0a03 ro vconsole.font=latarcyrheb-sun16 rhgb quiet LANG=en_US.UTF-8

This can be used to verify that the kernel is set correctly for the system. The dmesg buffer holds only the most recent system messages. If the kernel boot command line is not present, search the /var/log/messages file.

B.4.6 Configuring System Variables (Host System Configuration)

# echo "# Disable Address Space Layout Randomization (ASLR)" > /etc/sysctl.d/aslr.conf
# echo "kernel.randomize_va_space=0" >> /etc/sysctl.d/aslr.conf
# echo "# Enable IPv4 Forwarding" > /etc/sysctl.d/ip_forward.conf
# echo "net.ipv4.ip_forward=1" >> /etc/sysctl.d/ip_forward.conf
# systemctl restart systemd-sysctl.service
# cat /proc/sys/kernel/randomize_va_space
0
# cat /proc/sys/net/ipv4/ip_forward
0

B.5 Setting Up Tests

B.5.1 OVS Throughput Tests (PHY-PHY without a VM)

Build OVS as described in the previous section, reboot the system, and check the kernel boot line for 1 GB hugepages, the isolcpus setting for the target cores (1,2,3,4,5,6,7,8,9), and IOMMU not enabled.

1. Install the OVS kernel module and check that it is present.
# modprobe openvswitch
# lsmod |grep open
openvswitch 70953 0
gre 13535 1 openvswitch
vxlan 37334 1 openvswitch
libcrc32c 12603 1 openvswitch

2. Install kernel UIO driver and DPDK UIO driver.
# modprobe uio
# insmod /usr/src/dpdk/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

3. Find target Ethernet interfaces.
# lspci -nn |grep Ethernet
03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
03:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
06:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
06:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)

4. Bind the interfaces to the UIO driver (they must not be configured; run ifconfig xxxx down if configured).
# /usr/src/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.0
# /usr/src/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.1
# /usr/src/dpdk/tools/dpdk_nic_bind.py --status

5. Mount hugepage file systems.
# mount -t hugetlbfs nodev /dev/hugepages
# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB
# cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
# echo 2048 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

6. Terminate OVS, clear any previous OVS or OVDK database, and set up for a new database.
# pkill -9 ovs
# rm -rf /usr/local/var/run/openvswitch/
# rm -rf /usr/local/etc/openvswitch/
# rm -f /tmp/conf.db
# mkdir -p /usr/local/etc/openvswitch
# mkdir -p /usr/local/var/run/openvswitch

7. Initialize new OVS database.
# cd /usr/src/ovs
# ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db ./vswitchd/vswitch.ovsschema

8. Start OVS database server.
# cd /usr/src/ovs
# ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach

9. Initialize OVS database.
# cd /usr/src/ovs
# ./utilities/ovs-vsctl --no-wait init

10. Start OVS with the DPDK portion using 1 GB of Node 0 memory.
# cd /usr/src/ovs
# ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 -- unix:/usr/local/var/run/openvswitch/db.sock --pidfile --detach

11. Create the OVS DPDK bridge and add the two physical NICs.
# cd /usr/src/ovs
# ./utilities/ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ./utilities/ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
# ./utilities/ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
# ./utilities/ovs-vsctl show

12. Set OVS port-to-port flows for a NIC 0 endpoint IP address of 1.1.1.1 and a NIC 1 endpoint IP address of 5.1.1.1 using a script.
#! /bin/sh
# Move to command directory
cd /usr/src/ovs/utilities/
# Clear current flows
./ovs-ofctl del-flows br0
# Add Flow for port 0 to port 1 and port 1 to port 0
./ovs-ofctl add-flow br0 in_port=1,dl_type=0x800,nw_src=1.1.1.1,nw_dst=5.1.1.1,idle_timeout=0,action=output:2
./ovs-ofctl add-flow br0 in_port=2,dl_type=0x800,nw_src=5.1.1.1,nw_dst=1.1.1.1,idle_timeout=0,action=output:1

# ./ovs_flow_ports.sh

13. Check the CPU load to verify that the OVS PMD is operating.
# top (1)
top - 18:29:24 up 37 min, 4 users, load average: 0.86, 0.72, 0.63
Tasks: 275 total, 1 running, 274 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

14.
Affinitize the DPDK PMD task using CPU mask 2 (CPU core 1).
# cd /usr/src/ovs/utilities
# ./ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=2

Example of checking OVS thread affinitization:
# top -p `pidof ovs-vswitchd` -H -d1
…
 PID USER PR NI    VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
1724 root 20  0 5035380 6100 4384 R 99.7  0.0 7:37.19 pmd62
1658 root 20  0 5035380 6100 4384 S 11.0  0.0 1:40.35 urcu2
1656 root 20  0 5035380 6100 4384 S  0.0  0.0 0:01.37 ovs-vswitchd
1657 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 dpdk_watchdog1
1659 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 cuse_thread3
1693 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler60
1694 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler61
1695 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler39
1696 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler38
1697 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler37
1698 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler36
1699 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler33
1700 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler34
1701 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler35
1702 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler40
1703 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler46
1704 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler47
1705 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler48
1706 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler45
1707 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler50
1708 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler49
1709 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler41
1710 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler44
1711 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler42
1712 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 handler43
1713 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.02 revalidator55
1714 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.01 revalidator51
1715 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 revalidator56
1716 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 revalidator58
1717 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 revalidator52
1718 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.01 revalidator57
1719 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.00 revalidator54
1720 root 20  0 5035380 6100 4384 S  0.0  0.0 0:00.01 revalidator53

# taskset -p 1724
pid 1724's current affinity mask: 2
# taskset -p 1658
pid 1658's current affinity mask: 4
# taskset -p 1656
pid 1656's current affinity mask: 4
# taskset -p 1657
pid 1657's current affinity mask: 4
# taskset -p 1659
pid 1659's current affinity mask: 4
# taskset -p 1693
pid 1693's current affinity mask: 4
# taskset -p 1694
pid 1694's current affinity mask: 4
# taskset -p 1695
pid 1695's current affinity mask: 4
# taskset -p 1696
pid 1696's current affinity mask: 4
# taskset -p 1697
pid 1697's current affinity mask: 4
# taskset -p 1698
pid 1698's current affinity mask: 4
# taskset -p 1699
pid 1699's current affinity mask: 4
# taskset -p 1700
pid 1700's current affinity mask: 4
# taskset -p 1701
pid 1701's current affinity mask: 4
# taskset -p 1702
pid 1702's current affinity mask: 4
# taskset -p 1703
pid 1703's current affinity mask: 4
# taskset -p 1704
pid 1704's current affinity mask: 4
# taskset -p 1705
pid 1705's current affinity mask: 4
# taskset -p 1706
pid 1706's current affinity mask: 4
# taskset -p 1707
pid 1707's current affinity mask: 4
# taskset -p 1708
pid 1708's current affinity mask: 4
# taskset -p 1709
pid 1709's current affinity mask: 4
# taskset -p 1710
pid 1710's current affinity mask: 4
# taskset -p 1711
pid 1711's current affinity mask: 4
# taskset -p 1712
pid 1712's current affinity mask: 4
# taskset -p 1713
pid 1713's current affinity mask: 4
# taskset -p 1714
pid 1714's current affinity mask: 4
# taskset -p 1715
pid 1715's current affinity mask: 4
# taskset -p 1716
pid 1716's current affinity mask: 4
# taskset -p 1717
pid 1717's current affinity mask: 4
# taskset -p 1718
pid 1718's current affinity mask: 4
# taskset -p 1719
pid 1719's current affinity mask: 4
# taskset -p 1720
pid 1720's current affinity mask: 4

Note: OVS task affinitization will change in the future.

B.5.2 Intel DPDK Accelerated vSwitch Throughput Tests (PHY-PHY without a VM)

1. Start with a clean system.
# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# pkill -9 ovs
# rm -rf /usr/local/var/run/openvswitch/
# rm -rf /usr/local/etc/openvswitch/
# mkdir -p /usr/local/var/run/openvswitch/
# mkdir -p /usr/local/etc/openvswitch/
# rm -f /tmp/conf.db
# rmmod vhost-net
# rm -rf /dev/vhost-net
# rmmod ixgbe
# rmmod uio
# rmmod igb_uio
# umount /sys/fs/cgroup/hugetlb
# umount /dev/hugepages
# umount /mnt/huge

2. Mount hugepage filesystem on host.
# mount -t hugetlbfs nodev /dev/hugepages
# mount|grep huge

3. Install the DPDK kernel modules.
# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# modprobe uio
# insmod <DPDK_INSTALL_DIR>/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

4. Bind the physical interfaces to the igb_uio driver. For example, if the PCI addresses on the setup are 08:00.0 and 08:00.1, the commands below show how to bind them to the DPDK driver. Replace the PCI addresses according to the setup.
# ./dpdk*/tools/dpdk_nic_bind.py --status
# ./dpdk*/tools/dpdk_nic_bind.py --bind=igb_uio 08:00.0
# ./dpdk*/tools/dpdk_nic_bind.py --bind=igb_uio 08:00.1

5. Create an Intel® DPDK Accelerated vSwitch database file.
# ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db <OVS_INSTALL_DIR>/openvswitch/vswitchd/vswitch.ovsschema

6. Run the ovsdb-server.
# ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options &

7.
Create the Intel® DPDK Accelerated vSwitch bridge and add physical interfaces (type: dpdkphy) to the bridge:
# ./utilities/ovs-vsctl --no-wait add-br br0 -- set Bridge br0 datapath_type=dpdk
# ./utilities/ovs-vsctl --no-wait add-port br0 port1 -- set Interface port1 type=dpdkphy ofport_request=1 option:port=0
# ./utilities/ovs-vsctl --no-wait add-port br0 port2 -- set Interface port2 type=dpdkphy ofport_request=2 option:port=1
# ./utilities/ovs-vsctl show

The output should be:
----------------------------------------------
00000000-0000-0000-0000-000000000000
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "port1"
            Interface "port1"
                type: dpdkphy
                options: {port="0"}
        Port "port2"
            Interface "port2"
                type: dpdkphy
                options: {port="1"}
----------------------------------------------

8. Start the Intel® DPDK Accelerated vSwitch for the PHY-PHY throughput test.
# ./datapath/dpdk/ovs-dpdk -c 0x0F -n 4 --proc-type primary --socket-mem 4096 -- -p 0x03 --stats_core 0 --stats_int 5

Exit the process using CTRL-C. Copy the valid address on the system from the stdout output of the above command. For example:
EAL: virtual area found at 0x7F1740000000 (size = 0x80000000)

Use it for the base-virtaddr parameter of the following command:
# ./datapath/dpdk/ovs-dpdk -c 0x0F -n 4 --proc-type primary --base-virtaddr=0x7f9a40000000 --socket-mem 4096 -- -p 0x03 --stats_core 0 --stats_int 5

9. Run the vswitchd daemon.
# ./vswitchd/ovs-vswitchd -c 0x100 --proc-type=secondary -- --pidfile=/tmp/vswitchd.pid

10. Using the ovs-ofctl utility, add flows for the bidirectional test. The example below shows how to add flows when the source and destination IPs are 1.1.1.1 and 6.6.6.2 for a bidirectional use case.
# ./utilities/ovs-ofctl del-flows br0
# ./utilities/ovs-ofctl add-flow br0 in_port=1,dl_type=0x0800,nw_src=1.1.1.1,nw_dst=6.6.6.2,idle_timeout=0,action=output:2
# ./utilities/ovs-ofctl add-flow br0 in_port=2,dl_type=0x0800,nw_src=6.6.6.2,nw_dst=1.1.1.1,idle_timeout=0,action=output:1
# ./utilities/ovs-ofctl dump-flows br0
# ./utilities/ovs-vsctl --no-wait -- set Open_vSwitch . other_config:n-handler-threads=1

B.5.3 Intel DPDK Accelerated vSwitch with User Space vHost

B.5.3.1 Host Configuration for Running a VM with User Space vHost

1. Start with a clean system.
# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# pkill -9 ovs
# rm -rf /usr/local/var/run/openvswitch/
# rm -rf /usr/local/etc/openvswitch/
# mkdir -p /usr/local/var/run/openvswitch/
# mkdir -p /usr/local/etc/openvswitch/
# rm -f /tmp/conf.db
# rmmod vhost-net
# rm -rf /dev/vhost-net
# rmmod ixgbe
# rmmod uio
# rmmod igb_uio
# umount /sys/fs/cgroup/hugetlb
# umount /dev/hugepages
# umount /mnt/huge

2. Mount hugepage filesystem on host.
# mount -t hugetlbfs nodev /dev/hugepages
# mount|grep huge

3. Install the DPDK, cuse and eventfd kernel modules.
# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# modprobe cuse
# insmod ./openvswitch/datapath/dpdk/fd_link/fd_link.ko
# modprobe uio
# insmod <DPDK_INSTALL_DIR>/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

4. Bind/Unbind to/from the igb_uio module.
# <DPDK_INSTALL_DIR>/tools/dpdk_nic_bind.py --status
Network devices using IGB_UIO driver
====================================
<none>

Network devices using kernel driver
===================================
0000:01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=p1p1 drv=ixgbe unused=igb_uio
0000:01:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=p1p2 drv=ixgbe unused=igb_uio
0000:08:00.0 'I350 Gigabit Network Connection' if=em0 drv=igb unused=igb_uio *Active*
0000:08:00.1 'I350 Gigabit Network Connection' if=enp8s0f1 drv=igb unused=igb_uio
0000:08:00.2 'I350 Gigabit Network Connection' if=enp8s0f2 drv=igb unused=igb_uio
0000:08:00.3 'I350 Gigabit Network Connection' if=enp8s0f3 drv=igb unused=igb_uio

Other network devices
=====================
<none>

To bind devices p1p1 and p1p2 (01:00.0 and 01:00.1) to the igb_uio driver:
# <DPDK_INSTALL_DIR>/tools/dpdk_nic_bind.py --bind=igb_uio 01:00.0
# <DPDK_INSTALL_DIR>/tools/dpdk_nic_bind.py --bind=igb_uio 01:00.1
# <DPDK_INSTALL_DIR>/tools/dpdk_nic_bind.py --status
Network devices using IGB_UIO driver
====================================
0000:01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv=igb_uio unused=
0000:01:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv=igb_uio unused=

Network devices using kernel driver
===================================
0000:08:00.0 'I350 Gigabit Network Connection' if=em0 drv=igb unused=igb_uio *Active*
0000:08:00.1 'I350 Gigabit Network Connection' if=enp8s0f1 drv=igb unused=igb_uio
0000:08:00.2 'I350 Gigabit Network Connection' if=enp8s0f2 drv=igb unused=igb_uio
0000:08:00.3 'I350 Gigabit Network Connection' if=enp8s0f3 drv=igb unused=igb_uio

Other network devices
=====================
<none>

5. Create an Intel® DPDK Accelerated vSwitch database file.
# ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db <OVS_INSTALL_DIR>/openvswitch/vswitchd/vswitch.ovsschema

6. Run the ovsdb-server.
# ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options &

7. Create the Intel® DPDK Accelerated vSwitch bridge and add the physical interfaces (type: dpdkphy) and vHost interfaces (type: dpdkvhost) to the bridge.
# ./utilities/ovs-vsctl --no-wait add-br br0 -- set Bridge br0 datapath_type=dpdk
# ./utilities/ovs-vsctl --no-wait add-port br0 port1 -- set Interface port1 type=dpdkphy ofport_request=1 option:port=0
# ./utilities/ovs-vsctl --no-wait add-port br0 port2 -- set Interface port2 type=dpdkphy ofport_request=2 option:port=1
# ./utilities/ovs-vsctl --no-wait add-port br0 port3 -- set Interface port3 type=dpdkvhost ofport_request=3
# ./utilities/ovs-vsctl --no-wait add-port br0 port4 -- set Interface port4 type=dpdkvhost ofport_request=4
# ./utilities/ovs-vsctl show
The output should be:
----------------------------------------------
00000000-0000-0000-0000-000000000000
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "port16"
            Interface "port1"
                type: dpdkphy
                options: {port="1"}
        Port "port17"
            Interface "port2"
                type: dpdkphy
                options: {port="2"}
        Port "port80"
            Interface "port3"
                type: dpdkvhost
        Port "port81"
            Interface "port4"
                type: dpdkvhost
----------------------------------------------

8. Start the Intel® DPDK Accelerated vSwitch for 1 VM with two user space vHost interfaces.
# ./datapath/dpdk/ovs-dpdk -c 0x0F -n 4 --proc-type primary --socket-mem 4096 -- -p 0x03 --stats_core 0 --stats_int 5
Exit the process using CTRL-C. Copy the valid address on the system from the stdout output of the above command, for example:
EAL: virtual area found at 0x7F1740000000 (size = 0x80000000)
Use it for the base-virtaddr parameter of the following command (substitute the address reported on your system):
# ./datapath/dpdk/ovs-dpdk -c 0x0F -n 4 --proc-type primary --base-virtaddr=0x7f9a40000000 --socket-mem 4096 -- -p 0x03 --stats_core 0 --stats_int 5

9. Run the vswitchd daemon.
# ./vswitchd/ovs-vswitchd -c 0x100 --proc-type=secondary -- --pidfile=/tmp/vswitchd.pid

10. Using the ovs-ofctl utility, add flow entries to switch packets from port 2 to port 4 and from port 3 to port 1, and in the reverse direction from port 4 to port 2 and from port 1 to port 3. The example below adds flows for source and destination IPs 1.1.1.1 and 6.6.6.2 in a bidirectional case.
# ./utilities/ovs-ofctl del-flows br0
# ./utilities/ovs-ofctl add-flow br0 in_port=2,dl_type=0x0800,nw_src=1.1.1.1,nw_dst=6.6.6.2,idle_timeout=0,action=output:4
# ./utilities/ovs-ofctl add-flow br0 in_port=3,dl_type=0x0800,nw_src=1.1.1.1,nw_dst=6.6.6.2,idle_timeout=0,action=output:1
# ./utilities/ovs-ofctl add-flow br0 in_port=4,dl_type=0x0800,nw_src=6.6.6.2,nw_dst=1.1.1.1,idle_timeout=0,action=output:2
# ./utilities/ovs-ofctl add-flow br0 in_port=1,dl_type=0x0800,nw_src=6.6.6.2,nw_dst=1.1.1.1,idle_timeout=0,action=output:3
# ./utilities/ovs-ofctl dump-flows br0

11. Copy the DPDK source to a shared directory to be passed to the VM.
# rm -rf /tmp/qemu_share
# mkdir -p /tmp/qemu_share
# mkdir -p /tmp/qemu_share/DPDK
# chmod 777 /tmp/qemu_share
# cp -aL <DPDK_INSTALL_DIR>/* /tmp/qemu_share/DPDK

12. Start the VM using the qemu command.
# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# taskset 0x30 ./qemu/x86_64-softmmu/qemu-system-x86_64 -cpu host -boot c -hda <VM_IMAGE_LOCATION> -m 4096 -smp 2 --enable-kvm -name 'client 1' -nographic -vnc :12 -pidfile /tmp/vm1.pid -drive file=fat:rw:/tmp/qemu_share -monitor unix:/tmp/vm1monitor,server,nowait -net none -no-reboot -mem-path /dev/hugepages -mem-prealloc -netdev type=tap,id=net1,script=no,downscript=no,ifname=port80,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port81,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
The VM can be accessed using a VNC client at the display given in the qemu startup command, which is display :12 (TCP port 5912) in this instance.

B.5.3.2 Guest Configuration to Run DPDK Sample Forwarding Application with User Space vHost

1. Install DPDK on the VM using the source from the shared drive.
# mkdir -p /mnt/vhost
# mkdir -p /root/vhost
# mount -o iocharset=utf8 /dev/sdb1 /mnt/vhost
# cp -a /mnt/vhost/* /root/vhost
# export RTE_SDK=/root/vhost/DPDK
# export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
# cd /root/vhost/DPDK
# make uninstall
# make install T=x86_64-ivshmem-linuxapp-gcc

2. Build the DPDK vHost application testpmd.
# cd /root/vhost/DPDK/app/test-pmd
# make clean
# make

3. Run the testpmd application after installing the UIO drivers.
# modprobe uio
# echo 1280 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
# insmod /root/vhost/DPDK/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko
# /root/vhost/DPDK/tools/dpdk_nic_bind.py --bind igb_uio 0000:00:03.0 0000:00:04.0
# ./testpmd -c 0x3 -n 4 --socket-mem 128 -- --burst=64 -i
At the prompt, enter the following to start the application:
testpmd> set fwd mac_retry
testpmd> start

B.5.4 OVS with User Space vHost

Build OVS as described in the previous section, reboot the system, and check that the kernel boot line has 1 GB hugepages, the isolcpus setting for the target cores (1,2,3,4,5,6,7,8,9), and the IOMMU not enabled.

1. Check DPDK for a correct build, and build OVS with DPDK and the eventfd module if not already built.

2. Check the system boot line for valid hugepage and isolcpus settings.
# dmesg|grep command
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.15.6-200.fc20.x86_64 root=UUID=27da0816-bbbd-4b28-acaf-993939cfa758 ro default_hugepagesz=1GB hugepagesz=1GB hugepages=8 hugepagesz=2M hugepages=2048 isolcpus=1,2,3,4,5,6,7,8,9 vconsole.font=latarcyrheb-sun16 rhgb quiet

3. Remove the vhost-net module and device directory.
# rmmod vhost-net
# rm /dev/vhost-net

4. Mount the hugepage file systems.
# mount -t hugetlbfs nodev /dev/hugepages
# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB
# cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
# echo 2048 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

5. Install the OVS kernel module.
# modprobe openvswitch
# lsmod |grep open
openvswitch     70953  0
gre             13535  1 openvswitch
vxlan           37334  1 openvswitch
libcrc32c       12603  1 openvswitch

6. Install the cuse and fuse kernel modules.
# modprobe cuse
# modprobe fuse

7. Install the eventfd kernel module (insmod is used because the module is loaded by file path).
# insmod /usr/src/ovs/utilities/eventfd_link/eventfd_link.ko

8. Install the UIO and DPDK UIO kernel modules.
# modprobe uio
# insmod /usr/src/dpdk/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

9. Find the target Ethernet interfaces.
# lspci -nn |grep Ethernet
03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
03:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
06:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
06:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)

10. Bind the interfaces to the UIO driver. The interfaces must not be configured; if they are, bring them down first with ifconfig xxxx down.
# /usr/src/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.0
# /usr/src/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.1
# /usr/src/dpdk/tools/dpdk_nic_bind.py --status

11. Terminate OVS, clear any previous OVS or OVDK database, and set up for a new database.
# pkill -9 ovs
# rm -rf /usr/local/var/run/openvswitch/
# rm -rf /usr/local/etc/openvswitch/
# rm -f /tmp/conf.db
# mkdir -p /usr/local/etc/openvswitch
# mkdir -p /usr/local/var/run/openvswitch

12. Initialize the new OVS database.
# cd /usr/src/ovs
# ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db ./vswitchd/vswitch.ovsschema

13. Start the OVS database server.
# cd /usr/src/ovs
# ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach

14. Initialize the OVS database.
# cd /usr/src/ovs
# ./utilities/ovs-vsctl --no-wait init

15. Start OVS with the DPDK portion using 1 GB or 2 GB of node 0 memory.
# cd /usr/src/ovs
# ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 2048,0 -- unix:/usr/local/var/run/openvswitch/db.sock --pidfile --detach

16. Set the DPDK PMD thread affinity (persistent if the OVS database is not cleared).
# cd /usr/src/ovs
# ./utilities/ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=2

17. Create the OVS DPDK bridge and add the two physical NICs and the user space vHost interfaces.
# cd /usr/src/ovs
# ./utilities/ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ./utilities/ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
# ./utilities/ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
# ./utilities/ovs-vsctl add-port br0 dpdkvhost0 -- set Interface dpdkvhost0 type=dpdkvhost
# ./utilities/ovs-vsctl add-port br0 dpdkvhost1 -- set Interface dpdkvhost1 type=dpdkvhost
ovs-vswitchd must be running, and the ports must be added in this order after the database has been cleared, so that dpdk0 becomes port 1, dpdk1 port 2, dpdkvhost0 port 3, and dpdkvhost1 port 4, as the example flow entries assume. If the database was not cleared, the earlier port ID assignments remain and are often not in the expected order. In that case, either adjust the flow commands, or terminate OVS, clear the OVS database, restart OVS, and create the bridge again in the correct order.

18. Check the OVS DPDK bridge.
[root@F20-v3 ovs]# ./utilities/ovs-vsctl show
c2669cc3-55ac-4f62-854c-b88bf5b66c10
    Bridge "br0"
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
        Port "dpdkvhost0"
            Interface "dpdkvhost0"
                type: dpdkvhost
        Port "dpdk1"
            Interface "dpdk1"
                type: dpdk
        Port "br0"
            Interface "br0"
                type: internal
        Port "dpdkvhost1"
            Interface "dpdkvhost1"
                type: dpdkvhost
Note: The UUID (c2669cc3-55ac-4f62-854c-b88bf5b66c10 in this instance) is different for each instance.

19. Apply the vhost to VM vhost flow operations.
# ./ovs_flow_vm_vhost.sh
# cat ovs_flow_vm_vhost.sh
#! /bin/sh
# Move to command directory
cd /usr/src/ovs/utilities/
# Clear current flows
./ovs-ofctl del-flows br0
# Add Flow for port 0 (16) to port 1 (17)
./ovs-ofctl add-flow br0 in_port=2,dl_type=0x800,nw_dst=5.1.1.1,idle_timeout=0,action=output:4
./ovs-ofctl add-flow br0 in_port=1,dl_type=0x800,nw_src=5.1.1.1,idle_timeout=0,action=output:3
./ovs-ofctl add-flow br0 in_port=4,dl_type=0x800,nw_src=5.1.1.1,idle_timeout=0,action=output:2
./ovs-ofctl add-flow br0 in_port=3,dl_type=0x800,nw_dst=5.1.1.1,idle_timeout=0,action=output:1
In this case, the port 2 end point has the fixed IP 5.1.1.1, while the port 1 end point has multiple IPs starting at 1.1.1.1 and incrementing up to half the number of bidirectional flows in the test. This allows a large number of flows to be set up using a small number of inexact flow commands. The nw_src and nw_dst matches can also be deleted, which results in the same flows but without any IP matching in the inexact flow commands.

B.5.4.1 VM Startup for (OVS User Space vHost) Throughput Test

The following script is a sample manual VM startup for the vHost throughput test. The VM has one standard Linux network interface for control and test interaction and two user space vHost compatible network interfaces that attach to DPDK bridge ports 3 and 4. This assumes that the Linux network bridge "br-mgt" has been created on the management network. The VM is started with 3 GB of memory backed by 1 GB hugepages. Three 1 GB pages is the minimum allocation that leaves one 1 GB hugepage available for use inside the VM.
# cat vm_vhost_start.sh
#!/bin/sh
vm=/vm/Fed20-mp.qcow2
vm_name="vhost-test"
vnc=10
n1=tap46
bra=br-mgt
dn_scrp_a=/vm/vm_ctl/br-mgt-ifdown
mac1=00:1f:33:16:64:44
if [ ! -f $vm ];
then
    echo "VM $vm not found!"
else
    echo "VM $vm started! VNC: $vnc, management network: $n1"
    tunctl -t $n1
    brctl addif $bra $n1
    ifconfig $n1 0.0.0.0 up
    taskset 0x30 qemu-system-x86_64 -cpu host -hda $vm -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=$mac1 -netdev tap,ifname=$n1,id=eth0,script=no,downscript=$dn_scrp_a -netdev type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name $vm_name -vnc :$vnc &
fi
Shutdown support script:
# cat br-mgt-ifdown
#!/bin/sh
bridge='br-mgt'
/sbin/ifconfig $1 0.0.0.0 down
brctl delif ${bridge} $1

B.5.4.2 Operating VM for (OVS User Space vHost) Throughput Test

1. The VM kernel boot line needs to have 1 GB hugepage memory configured and vCPU 1 isolated. Check for the correct boot line parameters; change them if needed and reboot.
# dmesg |grep command
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.15.6-200.fc20.x86_64 root=UUID=6635b8e3-0312-4892-92e7-ccff453ec06d ro default_hugepagesz=1GB hugepagesz=1G hugepages=1 isolcpus=1 vconsole.font=latarcyrheb-sun16 rhgb quiet LANG=en_US.UTF-8

2. Get the DPDK package (same as host).
# cd /root
# git clone git://dpdk.org/dpdk
# cd /root/dpdk

3. Check out the target DPDK version.
# git checkout -b test_v1.7.1 v1.7.1

4. CONFIG_RTE_BUILD_COMBINE_LIBS=y needs to be set in the config/common_linuxapp file. It can be changed with a text editor or with a patch as discussed earlier in this document.
# make install T=x86_64-ivshmem-linuxapp-gcc

5. Build the vHost PMD application.
# cd /root/dpdk/app/test-pmd/
# export RTE_SDK=/root/dpdk
# export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
# make
CC testpmd.o
CC parameters.o
CC cmdline.o
CC config.o
CC iofwd.o
CC macfwd.o
CC macfwd-retry.o
CC macswap.o
CC flowgen.o
CC rxonly.o
CC txonly.o
CC csumonly.o
CC icmpecho.o
CC mempool_anon.o
LD testpmd
INSTALL-APP testpmd
INSTALL-MAP testpmd.map

6. Set up the hugepage filesystem.
# mkdir -p /mnt/hugepages
# mount -t hugetlbfs hugetlbfs /mnt/hugepages

7. Load the UIO kernel modules.
# modprobe uio
# insmod /root/dpdk/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

8. Find the user space vHost network devices passed to the VM. They appear in the order listed on the QEMU startup command line: the first network device is the management network, and the second and third network devices are the user space vHost devices.
# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
00:05.0 Ethernet controller: Red Hat, Inc Virtio network device

9. Bind the user space vHost devices to the igb_uio device driver.
# /root/dpdk/tools/dpdk_nic_bind.py -b igb_uio 0000:00:04.0
# /root/dpdk/tools/dpdk_nic_bind.py -b igb_uio 0000:00:05.0

10. Check the bind status to make sure PCI devices 0000:00:04.0 and 0000:00:05.0 are bound to the igb_uio device driver.
# /root/dpdk/tools/dpdk_nic_bind.py --status
Network devices using DPDK-compatible driver
============================================
0000:00:04.0 'Virtio network device' drv=igb_uio unused=virtio_pci
0000:00:05.0 'Virtio network device' drv=igb_uio unused=virtio_pci

Network devices using kernel driver
===================================
0000:00:03.0 'Virtio network device' if= drv=virtio-pci unused=virtio_pci,igb_uio

Other network devices
=====================
<none>

11. Run the test-pmd application.
# cd /root/dpdk/app/test-pmd/
# ./testpmd -c 0x3 -n 4 --socket-mem 128 -- --burst=64 -i --txqflags=0xf00
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Support maximum 64 logical core(s) by configuration.
EAL: Detected 2 lcore(s)
EAL: unsupported IOMMU type!
EAL: VFIO support could not be initialized
EAL: Searching for IVSHMEM devices...
EAL: No IVSHMEM configuration found!
EAL: Setting up memory...
EAL: Ask a virtual area of 0x40000000 bytes
EAL: Virtual area found at 0x7fc980000000 (size = 0x40000000)
EAL: Requesting 1 pages of size 1024MB from socket 0
EAL: TSC frequency is ~2593994 KHz
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: Master core 0 is ready (tid=7f723880)
EAL: Core 1 is ready (tid=7eee5700)
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL: probe driver: 1af4:1000 rte_virtio_pmd
EAL: 0000:00:03.0 not managed by UIO driver, skipping
EAL: PCI device 0000:00:04.0 on NUMA socket -1
EAL: probe driver: 1af4:1000 rte_virtio_pmd
EAL: PCI memory mapped at 0x7fca7f72c000
EAL: PCI device 0000:00:05.0 on NUMA socket -1
EAL: probe driver: 1af4:1000 rte_virtio_pmd
EAL: PCI memory mapped at 0x7fca7f72b000
Interactive-mode selected
Configuring Port 0 (socket 0)
Port 0: 00:00:00:00:00:01
Configuring Port 1 (socket 0)
Port 1: 00:00:00:00:00:02
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Port 1 Link Up - speed 10000 Mbps - full-duplex
Done
testpmd>

12. Enter the forward and mac_retry commands to set up operation.
testpmd> set fwd mac_retry
Set mac_retry packet forwarding mode

13. Start the forwarding operation.
testpmd> start
  mac_retry packet forwarding - CRC stripping disabled - packets/burst=64
  nb forwarding cores=1 - nb forwarding ports=2
  RX queues=1 - RX desc=128 - RX free threshold=0
  RX threshold registers: pthresh=8 hthresh=8 wthresh=0
  TX queues=1 - TX desc=512 - TX free threshold=0
  TX threshold registers: pthresh=32 hthresh=0 wthresh=0
  TX RS bit threshold=0 - TXQ flags=0xf00
testpmd>

14. Back on the host, affinitize the vCPU to an available target CPU core. Review the QEMU tasks.
# ps -eLF|grep qemu
root 1658 1 1658 0 4 948308 37956 4 13:29 pts/2 00:00:01 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -netdev type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name vhost-test -vnc :10
root 1658 1 1665 1 4 948308 37956 4 13:29 pts/2 00:00:20 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -netdev type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name vhost-test -vnc :10
root 1658 1 1666 89 4 948308 37956 4 13:29 pts/2 00:18:59 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -netdev type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name vhost-test -vnc :10
root 1658 1 1668 0 4 948308 37956 4 13:29 pts/2 00:00:00 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -netdev type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device
virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name vhost-test -vnc :10

15. In this case, task 1666 is accumulating CPU run time. Change its affinity to the target core (hexadecimal mask 40 selects core 6).
# taskset -p 1666
pid 1666's current affinity mask: 30
# taskset -p 40 1666
pid 1666's current affinity mask: 30
pid 1666's new affinity mask: 40

16. Check the CPU core load to make sure the real-time task is running on the correct CPU.
# top (press 1 to show the per-core load)
top - 13:51:26 up 1:37, 3 users, load average: 2.16, 2.12, 1.93
Tasks: 272 total, 1 running, 271 sleeping, 0 stopped, 0 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 96.7 us,  3.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  4.5 us,  7.6 sy,  0.0 ni, 87.6 id,  0.0 wa,  0.3 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  :  0.0 us,  0.3 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu13 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu14 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu15 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu16 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu17 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu18 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu19 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu20 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu21 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu22 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu23 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu24 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu25 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu26 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu27 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  32813400 total, 17556504 used, 15256896 free,    18756 buffers
KiB Swap:  8191996 total,        0 used,  8191996 free,   301248 cached

  PID USER  PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1581 root  20   0 10.802g   6268   4412 S 116.3  0.0  31:02.42 ovs-vswitchd
 1658 root  20   0 3793232  37956  20400 S 100.3  0.1  20:08.86 qemu-system-x86
    1 root  20   0   47528   5080   3620 S   0.0  0.0   0:02.72 systemd
    2 root  20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root  20   0       0      0      0 S   0.0  0.0   0:00.01 ksoftirqd/0
    4 root  20   0       0      0      0 S   0.0  0.0   0:02.23 kworker/0:0
    5 root   0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H
    6 root  20   0       0      0      0 S   0.0  0.0   0:00.00 kworker/u288:0

17. Set the OVS flows.
# ./ovs_flow_vm_vhost.sh
# cat ovs_flow_vm_vhost.sh
#! /bin/sh
# Move to command directory
cd /usr/src/ovs/utilities/
# Clear current flows
./ovs-ofctl del-flows br0
# Add Flow for port 0 (16) to port 1 (17)
./ovs-ofctl add-flow br0 in_port=1,dl_type=0x800,nw_dst=5.1.1.1,idle_timeout=0,action=output:3
./ovs-ofctl add-flow br0 in_port=2,dl_type=0x800,nw_src=5.1.1.1,idle_timeout=0,action=output:4
./ovs-ofctl add-flow br0 in_port=3,dl_type=0x800,nw_src=5.1.1.1,idle_timeout=0,action=output:1
./ovs-ofctl add-flow br0 in_port=4,dl_type=0x800,nw_dst=5.1.1.1,idle_timeout=0,action=output:2

B.5.5 SR-IOV VM Test

The SR-IOV test gives Niantic 10 GbE SR-IOV Virtual Function interfaces to a VM in order to run a network throughput test of the VM over the SR-IOV network interfaces.

1. Make sure the host boot line is configured for 1 GB hugepage memory and CPU isolation, and that the Intel IOMMU is on and running in PT mode.
# dmesg |grep command
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.15.6-200.fc20.x86_64 root=UUID=27da0816-bbbd-4b28-acaf-993939cfa758 ro default_hugepagesz=1GB hugepagesz=1GB hugepages=16 hugepagesz=2M hugepages=2048 isolcpus=1,2,3,4,5,6,7,8,9 iommu=pt intel_iommu=on vconsole.font=latarcyrheb-sun16 rhgb quiet

2. Initialize the hugepage file systems.
# mount -t hugetlbfs nodev /dev/hugepages
# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB

3. Blacklist the IXGBE virtual function device driver by adding it to /etc/modprobe.d/blacklist.conf. This prevents the host from loading the driver and configuring the VF NICs.
# cat /etc/modprobe.d/blacklist.conf
. . .
# Intel ixgbe sr-iov vf (virtual driver)
blacklist ixgbevf

4. Uninstall and reinstall the IXGBE 10 GbE Niantic NIC device driver to enable SR-IOV and create the SR-IOV interfaces.
# modprobe -r ixgbe
# modprobe ixgbe max_vfs=1
# service network restart
In this case only one SR-IOV VF (Virtual Function) was created per PF (Physical Function) using max_vfs=1, but often more than one is desired. The network needs to be restarted afterward.
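
As a quick sanity check after reloading the driver, the number of VFs that actually appeared can be counted from the lspci listing. The following is a minimal sketch, not part of the original test scripts: the helper name count_pci_id is hypothetical, and it assumes the 82599 VF device ID 8086:10ed used elsewhere in this procedure. With max_vfs=1 on the two-port 82599 in this setup, a count of 2 would be expected.

```shell
#!/bin/sh
# Hypothetical helper (an illustration only, not from the original report):
# count PCI functions whose [vendor:device] ID matches $1 in
# `lspci -nn` style text read from standard input.
count_pci_id() {
    # -F treats the ID as a fixed string, so the brackets are literal;
    # -c prints the number of matching lines.
    # grep exits non-zero when the count is 0, so mask that with `|| true`.
    grep -c -F "[$1]" || true
}

# Example on a live host (8086:10ed is the 82599 VF ID used in this test):
#   lspci -nn | count_pci_id 8086:10ed
```

If the count is lower than expected, the driver reload and the max_vfs value are the first things to recheck before continuing with the pci-stub steps.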

5. Bring up and configure the Niantic PF interfaces to allow the SR-IOV interfaces to be used. If the PF is left unconfigured, the SR-IOV interfaces are not usable for the test. The Niantic 10 GbE interfaces happen to have the default labels p786p1 and p786p2 in this example.
# ifconfig p786p1 11.1.1.225/24 up
# ifconfig p786p2 12.1.1.225/24 up

6. Find the target SR-IOV interfaces to use.
# lspci -nn |grep Ethernet
03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
03:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
03:10.0 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
03:10.1 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
06:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
06:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
06:10.0 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
06:10.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
The two Niantic (device 82599) Virtual Functions, one for each Physical Function, are PCI devices 06:10.0 and 06:10.1 with PCI device type [8086:10ed].

7. Make sure the pci-stub device driver is loaded, since it is used to transfer physical devices to VMs. It was observed that pci-stub is loaded by default on Fedora 20 and not loaded by default on Ubuntu 14.04. The following checks whether pci-stub is present.
# ls /sys/bus/pci/drivers |grep pci-stub
pci-stub
If it is not present, load it:
# modprobe pci-stub

8. The Niantic (82599) Virtual Function interfaces are the target interfaces and have a type ID of 8086:10ed. Load the VF devices into pci-stub.
# echo "8086 10ed" > /sys/bus/pci/drivers/pci-stub/new_id

9. List the devices owned by the pci-stub driver.
# ls /sys/bus/pci/drivers/pci-stub
0000:06:10.0 0000:06:10.1 bind new_id remove_id uevent unbind

10. If the target Virtual Function devices are not listed, they are probably attached to an IXGBEVF driver on the host. The following might be used in that case.
# echo 0000:06:10.0 > /sys/bus/pci/devices/0000:06:10.0/driver/unbind
# echo 0000:06:10.1 > /sys/bus/pci/devices/0000:06:10.1/driver/unbind
# echo "8086 10ed" > /sys/bus/pci/drivers/pci-stub/new_id

11. Verify that the devices are present for transfer to the VM.
# ls /sys/bus/pci/drivers/pci-stub
0000:06:10.0 0000:06:10.1 bind new_id remove_id uevent unbind

B.5.5.1 SR-IOV VM Startup

The VM is started with the SR-IOV devices passed to it on the QEMU command line. It is started with 3 GB of preallocated memory backed by 1 GB hugepages from /dev/hugepages (3 GB is the minimum for one 1 GB page to be available in the VM) and 3 vCPUs of host type. The first network interface device is virtio, used for the bridged management network; the second network device is SR-IOV 06:10.0 and the third is SR-IOV 06:10.1. The following script was used to start the VM for the test:
# cat vm_sr-iov_start.sh
#!/bin/sh
vm=/vm/Fed20-mp.qcow2
vm_name="SRIOVtest"
vnc=10
n1=tap46
bra=br-mgt
dn_scrp_a=/vm/vm_ctl/br-mgt-ifdown
mac1=00:1f:33:16:64:44
if [ ! -f $vm ];
then
    echo "VM $vm not found!"
else
    echo "VM $vm started! VNC: $vnc, management network: $n1"
    tunctl -t $n1
    brctl addif $bra $n1
    ifconfig $n1 0.0.0.0 up
    taskset 0x70 qemu-system-x86_64 -cpu host -hda $vm -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=$mac1 -netdev tap,ifname=$n1,id=eth0,script=no,downscript=$dn_scrp_a -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name $vm_name -vnc :$vnc &
fi
Script for VM shutdown support:
# cat br-mgt-ifdown
#!/bin/sh
bridge='br-mgt'
/sbin/ifconfig $1 0.0.0.0 down
brctl delif ${bridge} $1
Start the VM:
# ./vm_sr-iov_start.sh
VM /vm/Fed20-mp.qcow2 started! VNC: 10, management network: tap46
Set 'tap46' persistent and owned by uid 0

B.5.5.2 Operating VM for SR-IOV Test

In the VM, the first network is configured for management network access and used for test control.

1. Verify that the kernel has boot line parameters for 1 GB hugepage support with one 1 GB page and that vCPU 1 and vCPU 2 have been isolated from the VM task scheduler.
# dmesg |grep command
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.15.6-200.fc20.x86_64 root=UUID=6635b8e3-0312-4892-92e7-ccff453ec06d ro default_hugepagesz=1GB hugepagesz=1G hugepages=1 isolcpus=1,2 vconsole.font=latarcyrheb-sun16 rhgb quiet
If the parameters are not configured, edit the grub defaults, generate a new grub file, and reboot the VM.
# vim /etc/default/grub // edit file
. . .
GRUB_CMDLINE_LINUX="default_hugepagesz=1GB hugepagesz=1G hugepages=1 isolcpus=1,2..."
. . .
# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.15.6-200.fc20.x86_64 Found initrd image: /boot/initramfs-3.15.6-200.fc20.x86_64.img Found linux image: /boot/vmlinuz-3.11.10-301.fc20.x86_64 Found initrd image: /boot/initramfs-3.11.10-301.fc20.x86_64.img Found linux image: /boot/vmlinuz-0-rescue-92606197a3dd4d3e8a2b95312bca842f Found initrd image: /boot/initramfs-0-rescue-92606197a3dd4d3e8a2b95312bca842f.img Done # reboot 2. Verify 1 GB memory page available. # cat /proc/meminfo ... HugePages_Total: 1 HugePages_Free: 1 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 1048576 DirectMap4k: 45048 DirectMap2M: 2052096 DirectMap1G: 1048576 kB kB kB kB 3. Check CPU is host type and 3 vCPUs are available. # cat /proc/cpuinfo ... processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 63 model name : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz stepping : 2 microcode : 0x1 cpu MHz : 2593.992 cache size : 4096 KB physical id : 2 siblings : 1 core id : 0 cpu cores : 1 apicid : 2 initial apicid : 2 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm xsaveopt fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid bogomips : 5187.98 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: 64 Intel® Open Network Platform Server Benchmark Performance Test Report 4. Install DPDK. # # # # # cd /root git clone git://dpdk.org/dpdk cd /root/dpdk git checkout -b test_v1.7.1 v1.7.1 make install T=x86_64-ivshmem-linuxapp-gcc 5. Build l3fwd example program. # cd examples/l3fwd # export RTE_SDK=/root/dpdk # export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc # make CC main.o LD l3fwd INSTALL-APP l3fwd INSTALL-MAP l3fwd.map 6. 
Mount the hugepage file system. # mount -t hugetlbfs nodev /dev/hugepages 7. Install UIO driver in VM. # modprobe uio # insmod /root/dpdk/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko 8. Find the SR-IOV devices. # lspci -nn 00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02) 00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] [8086:7000] 00:01.1 IDE interface [0101]: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] [8086:7010] 00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI [8086:7113] (rev 03) 00:02.0 VGA compatible controller [0300]: Cirrus Logic GD 5446 [1013:00b8] 00:03.0 Ethernet controller [0200]: Red Hat, Inc Virtio network device [1af4:1000] 00:04.0 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01) 00:05.0 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function The networks are in the order of the QEMU command line. The first network interface, 00:03.0 is the bridged control network. The second network interface 00:04.0 is the first SR-IOV physical PCI device passed on the QEMU command lined with the third network interface 00:05.0 is the second SR-IOV physical PCI device passed. 9. Bind the SR-IOV devices to the igb_uio device driver. # /root/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 00:04.0 # /root/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 00:05.0 10. Run the l3fwd example program. 
    # cd /root/dpdk/examples/l3fwd
    # ./build/l3fwd -c 6 -n 4 --socket-mem 1024 -- -p 0x3 --config="(0,0,1),(1,0,2)" &

The -c 6 and --config parameters affinitize the l3fwd task to the correct vCPUs in the VM. However, because the CPU cores on the host are isolated and the QEMU task was started on isolated cores, both vCPUs run on the same host CPU core and the real-time l3fwd task thrashes. The CPU affinitization must be set correctly on the host, or network performance will be poor.

11. Locate the vCPU QEMU thread IDs on the host.

    # ps -eLF |grep qemu
    root 1718 1 1718 0 6 982701 44368 4 17:41 pts/0 00:00:05 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10
    root 1718 1 1720 2 6 982701 44368 4 17:41 pts/0 00:00:32 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10
    root 1718 1 1721 30 6 982701 44368 4 17:41 pts/0 00:06:07 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10
    root 1718 1 1722 30 6 982701 44368 4 17:41 pts/0 00:06:06 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10
    root 1718 1 1724 0 6 982701 44368 4 17:41 pts/0 00:00:00 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10
    root 1718 1 1815 0 6 982701 44368 4 18:01 pts/0 00:00:00 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10

QEMU thread task IDs 1721 and 1722 have accumulated process times of 6:07 and 6:06 respectively, far more than the other, low-usage threads. The two 100% load DPDK threads are known to run on vCPU1 and vCPU2 respectively, so this indicates that vCPU1 is task ID 1721 and vCPU2 is task ID 1722. Since QEMU creates vCPU task threads sequentially, vCPU0 must be QEMU thread task 1720.

12. With CPU cores 8 and 9 available on this 10 CPU core processor, move vCPU1 to CPU core 8 and vCPU2 to CPU core 9 for the test.

    # taskset -p 100 1721
    pid 1721's current affinity mask: 70
    pid 1721's new affinity mask: 100
    # taskset -p 200 1722
    pid 1722's current affinity mask: 70
    pid 1722's new affinity mask: 200

13. Using top, verify that the tasks are running on the target CPU cores.

    # top (1)
    top - 18:03:21 up 35 min, 2 users, load average: 1.93, 1.89, 1.24
    Tasks: 279 total, 1 running, 278 sleeping, 0 stopped, 0 zombie
    %Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu1 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu2 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu3 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu4 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu5 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu6 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu7 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu8 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu9 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu10 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu11 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu12 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu13 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu14 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu15 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu16 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu17 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu18 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu19 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu20 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu21 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu22 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu23 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu24 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu25 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu26 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu27 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    KiB Mem: 32879352 total, 21912988 used, 10966364 free, 16680 buffers
    KiB Swap: 8191996 total, 0 used, 8191996 free, 519040 cached

    PID  USER PR NI VIRT    RES   SHR   S %CPU  %MEM TIME+    COMMAND
    1718 root 20 0  3930804 44368 20712 S 199.9 0.1  15:10.12 qemu-system-x86
    218  root 20 0  0       0     0     S 0.3   0.0  0:00.31  kworker/10:1
    1    root 20 0  47536   5288  3744  S 0.0   0.0  0:02.94  systemd
    2    root 20 0  0       0     0     S 0.0   0.0  0:00.00  kthreadd

This shows that the 100% CPU loads of the DPDK l3fwd task are on the target CPU cores 8 and 9.

14. Check for the correct vCPU loads in the VM.
    # top (1)
    top - 15:03:45 up 22 min, 3 users, load average: 2.00, 1.88, 1.24
    Tasks: 86 total, 2 running, 84 sleeping, 0 stopped, 0 zombie
    %Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu1 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    %Cpu2 : 99.7 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.3 hi, 0.0 si, 0.0 st
    KiB Mem: 3082324 total, 1465384 used, 1616940 free, 23304 buffers
    KiB Swap: 2097148 total, 0 used, 2097148 free, 298540 cached

    PID  USER PR NI VIRT    RES  SHR  S %CPU  %MEM TIME+   COMMAND
    7388 root 20 0  1681192 3144 2884 R 200.1 0.1  2:11.14 l3fwd
    7221 root 20 0  0       0    0    S 0.3   0.0  0:00.14 kworker/0:0
    1    root 20 0  47496   5052 3692 S 0.0   0.2  0:00.33 systemd
    2    root 20 0  0       0    0    S 0.0   0.0  0:00.00 kthreadd

In the VM, the top utility shows that the DPDK l3fwd task loads are on vCPU1 and vCPU2. Before the host affinitization was corrected, both vCPU loads ran on the same host core, because load scheduling is disabled on isolated CPUs, and top in the VM did not show user task loads close to 100%. The VM is now ready for SR-IOV throughput tests.

In a production system, the VM startup and affinitization would be done in a different manner to avoid disrupting any real-time processes currently running on the host or in other VMs.

B.5.6 Affinitization and Performance Tuning

To maximize network throughput, individual cores must be affinitized to particular tasks. This can be achieved by using the taskset command on the host, by passing a core mask parameter to the VM application, or both. The VM starts with two cores: vCPU0 and vCPU1. The Linux operating system and related tasks must use only vCPU0. vCPU1 is reserved to run the DPDK processes.

B.5.6.1 Affinitization Using Core Mask Parameter in the qemu and the test-pmd Startup Commands

The qemu and test-pmd startup commands offer a core mask parameter that can be set with a hex mask to ensure that the tasks use specific cores.
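The hex masks used in this appendix (for example, 0x70 for the QEMU taskset command, 100 and 200 for the vCPU threads, and -c 6 for l3fwd) are bitmaps in which bit N selects core N. The following is a minimal sketch of that arithmetic, assuming a POSIX shell. The core_mask helper name is hypothetical and is not part of taskset, QEMU, or DPDK.

```shell
#!/bin/sh
# core_mask: hypothetical helper (not from the report) that prints the hex
# bitmask with bit N set for every core number N given as an argument.
core_mask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))   # set the bit for this core
    done
    printf '%x\n' "$mask"
}

core_mask 4 5 6    # host cores 4-6: prints 70, the mask given to taskset for QEMU
core_mask 8        # host core 8: prints 100, the mask used for the vCPU1 thread
core_mask 9        # host core 9: prints 200, the mask used for the vCPU2 thread
core_mask 1 2      # vCPU1 and vCPU2: prints 6, the -c mask passed to l3fwd
```

The printed value can be used wherever a core mask is expected, such as the mask argument of taskset -p or the -c option of a DPDK application.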
Use core mask: -c 0x1 for test-pmd commands. This ensures that the DPDK task application in the VM uses vCPU1. Run the top command to confirm that vCPU1 is used at 100%. However, with the qemu command, even though the core mask is set to use two host cores for the VM's vCPU0 and vCPU1, qemu allocates the vCPU0 and vCPU1 tasks on a single host core, usually the first core of the specified core mask. Hence, the qemu task needs to be re-affinitized.

B.5.6.2 Affinitized Host Cores for VMs vCPU0 and vCPU1

Use the taskset command to pin specific processes to a core.

    # taskset -p <core_mask> <pid>

Ensure that the VM's vCPU0 and vCPU1 are assigned to two separate host cores. For example:
• Intel® DPDK Accelerated vSwitch uses cores 0, 1, 2 and 3 (-c 0x0F)
• QEMU task for VM's vCPU0 uses core 4 (-c 0x30)
• QEMU task for VM's vCPU1 uses core 5 (-c 0x30; taskset -p 20 <pid_vcpu1>)

B.6 PCI Passthrough with Intel® QuickAssist

When setting up a QAT PCI device for passthrough to a VM, make sure that VT-d is enabled in the BIOS and that "intel_iommu=on iommu=pt" is used in the grub.cfg file to boot the OS with the IOMMU enabled. The VM has access to the QAT PCI device using PCI passthrough. Make sure the host has two QAT cards, since an IPsec tunnel is set up between two VMs and each VM uses QAT to accelerate the tunnel. The VM uses the standard Open vSwitch virtio plus standard vhost IO virtualization method for networking.

1. Run the following command to verify that the host has two QAT devices provisioned in it.

    # lspci -nn |grep 043
    0c:00.0 Co-processor [0b40]: Intel Corporation Coleto Creek PCIe Endpoint [8086:0435]
    85:00.0 Co-processor [0b40]: Intel Corporation Coleto Creek PCIe Endpoint [8086:0435]

2. Use the following commands to detach PCI devices from the host.
    # echo 8086 0435 > /sys/bus/pci/drivers/pci-stub/new_id
    # echo 0000:85:00.0 > /sys/bus/pci/devices/0000\:85\:00.0/driver/unbind
    # echo 0000:0c:00.0 > /sys/bus/pci/devices/0000\:0c\:00.0/driver/unbind

Note: You may need to use the specific PCI bus ID per your system setup.

3. After detaching the acceleration complex from the host operating system, bind the appropriate bus/device/function to the pci-stub driver.

    # echo 0000:85:00.0 > /sys/bus/pci/drivers/pci-stub/bind
    # echo 0000:0c:00.0 > /sys/bus/pci/drivers/pci-stub/bind

4. Verify that the devices are bound to pci-stub.

    # lspci -vv |grep pci-stub

5. On a separate compute node that uses standard Open vSwitch for networking, add an ovs bridge called br0. The tap devices tap1, tap2, tap3 and tap4 are used as data network vNICs for the two VMs. Each of the two 10 GbE interfaces on the host is bridged to the ovs bridge br0 as follows:

    # ovs-vsctl add-br br0
    # tunctl -t tap1
    # tunctl -t tap2
    # tunctl -t tap3
    # tunctl -t tap4
    # ovs-vsctl add-port br0 tap1
    # ovs-vsctl add-port br0 tap2
    # ovs-vsctl add-port br0 tap3
    # ovs-vsctl add-port br0 tap4
    # ovs-vsctl add-port br0 p786p1
    # ovs-vsctl add-port br0 p786p2

6. Bring up the tap devices and ports added to the bridge.

    # ip link set br0 up
    # ip link set tap1 up
    # ip link set tap2 up
    # ip link set tap3 up
    # ip link set tap4 up
    # ip link set p786p1 up
    # ip link set p786p2 up

7. Disable Linux kernel forwarding on the host.

    # echo 0 > /proc/sys/net/ipv4/ip_forward

8. The two VMs can now be started with QAT devices as PCI passthrough.
The following qemu commands pass 85:00.0 and 0c:00.0 to the VMs:

    # qemu-system-x86_64 -cpu host -enable-kvm -hda <VM1_image_path> -m 8192 -smp 4 -net nic,model=virtio,netdev=eth0,macaddr=00:00:00:00:00:01 -netdev tap,ifname=tap1,id=eth0,vhost=on,script=no,downscript=no -net nic,model=virtio,netdev=eth1,macaddr=00:00:00:00:00:02 -netdev tap,ifname=tap2,id=eth1,vhost=on,script=no,downscript=no -vnc :15 -name vm1 -device pci-assign,host=85:00.0 &
    # qemu-system-x86_64 -cpu host -enable-kvm -hda <VM2_image_path> -m 8192 -smp 4 -net nic,model=virtio,netdev=eth0,macaddr=00:00:00:00:00:03 -netdev tap,ifname=tap3,id=eth0,vhost=on,script=no,downscript=no -net nic,model=virtio,netdev=eth1,macaddr=00:00:00:00:00:04 -netdev tap,ifname=tap4,id=eth1,vhost=on,script=no,downscript=no -vnc :16 -name vm2 -device pci-assign,host=0c:00.0 &

Refer to Appendix B.6.1 to set up a VM with the QAT drivers, the netkeyshim module, and the strongSwan IPsec application.

B.6.1 VM Installation

B.6.1.1 Creating VM Image and VM Configuration

Refer to Appendix B.4.6.

B.6.1.2 Verifying Pass-through

Once the guest starts, run the following command within the guest:

    # lspci -nn

Pass-through PCI devices should appear with the same description as the host originally showed. For example, if the following was shown on the host:

    0c:00.0 Co-processor [0b40]: Intel Corporation Coleto Creek PCIe Endpoint [8086:0435]

It should show up on the guest as:

    Ethernet controller [0200]: Intel Corporation Device [8086:0435]

B.6.1.3 Installing Intel® Communications Chipset Software in KVM Guest

The instructions in this solutions guide assume that you have super user privileges. The QAT build directory used in this section is /QAT.

    # su
    # mkdir /QAT
    # cd /QAT

1. Transfer the tarball to the /QAT directory using any preferred method, for example a USB memory stick, CD-ROM, or network transfer, and unpack it.

    # tar -zxof <QAT_tarball_name>

2. Launch the script using the following command:

    # ./installer.sh

3. Choose option 2 to build and install the acceleration software. Choose option 6 to build the sample LKCF code.

4. Start/Stop acceleration software.

    # service qat_service start/stop

5. Set up the environment to install the Linux kernel crypto framework driver.

    # export ICP_ROOT=/QAT
    # export KERNEL_SOURCE_ROOT=/usr/src/kernels/`uname -r`

6. Unpack the Linux kernel crypto driver.

    # mkdir -p $ICP_ROOT/quickassist/shims/netkey
    # cd $ICP_ROOT/quickassist/shims/netkey
    # tar xzof <path_to>/icp_qat_netkey.L.<version>.tar.gz

7. Build the Linux kernel crypto driver.

    # cd $ICP_ROOT/quickassist/shims/netkey/icp_netkey
    # make

8. Install the Linux kernel crypto driver.

    # cd $ICP_ROOT/quickassist/shims/netkey/icp_netkey
    # insmod ./icp_qat_netkey.ko

9. Verify that the module has been installed.

    # lsmod | grep icp

The following is the expected output:

    icp_qat_netkey 21868 0
    icp_qa_al 1551435 2 icp_qat_netkey

B.6.2 Installing strongSwan IPSec Software

Install the strongSwan IPsec software on both VM1 and VM2.

1. Download the original strongswan-4.5.3 software package from the following link:
http://download.strongswan.org/strongswan-4.5.3.tar.gz

2. Navigate to the shims directory.

    # cd $ICP_ROOT/quickassist/shims

3. Extract the source files from the strongSwan software package to the shims directory.

    # tar xzof <path_to>/strongswan-4.5.3.tar.gz

4. Navigate to the strongswan-4.5.3 directory.

    # cd $ICP_ROOT/quickassist/shims/strongswan-4.5.3

5. Configure strongSwan using the following command:

    # ./configure --prefix=/usr --sysconfdir=/etc

6. Build strongSwan using the following command:

    # make

7. Install strongSwan using the following command:

    # make install

Repeat the installation of the strongSwan IPsec software on the other VM.

B.6.3 Configuring strongSwan IPsec Software

1. Add the following line to the configuration file /etc/strongswan.conf on the VM1 and VM2 platforms after the charon { line:

    load = curl aes des sha1 sha2 md5 pem pkcs1 gmp random x509 revocation hmac xcbc stroke kernel-netlink socket-raw updown

After adding the line, the section looks like:

    # strongswan.conf - strongSwan configuration file
    charon {
        load = curl aes des sha1 sha2 md5 pem pkcs1 gmp random x509 revocation hmac xcbc stroke kernel-netlink socket-raw updown
        # number of worker threads in charon
        threads = 16

Note: When the /etc/strongswan.conf text files are created, the text that starts with load and ends with updown must be on a single line, even if it appears wrapped across two lines in this documentation.

2. Update the strongSwan configuration files on the VM1 platform:

a. Edit /etc/ipsec.conf.

    # ipsec.conf - strongSwan IPsec configuration file
    # basic configuration
    config setup
        plutodebug=none
        crlcheckinterval=180
        strictcrlpolicy=no
        nat_traversal=no
        charonstart=no
        plutostart=yes
    conn %default
        ikelifetime=60m
        keylife=1m
        rekeymargin=3m
        keyingtries=1
        keyexchange=ikev1
        ike=aes128-sha-modp2048!
        esp=aes128-sha1!
    conn host-host
        left=192.168.99.2
        leftfirewall=no
        right=192.168.99.3
        auto=start
        authby=secret

b. Edit /etc/ipsec.secrets (this file might not exist and might need to be created).

    # /etc/ipsec.secrets - strongSwan IPsec secrets file
    192.168.99.2 192.168.99.3 : PSK "shared key"

3. Update the strongSwan configuration files on the VM2 platform:

a. Edit /etc/ipsec.conf.

    # /etc/ipsec.conf - strongSwan IPsec configuration file
    config setup
        crlcheckinterval=180
        strictcrlpolicy=no
        plutostart=yes
        plutodebug=none
        charonstart=no
        nat_traversal=no
    conn %default
        ikelifetime=60m
        keylife=1m
        rekeymargin=3m
        keyingtries=1
        keyexchange=ikev1
        ike=aes128-sha1-modp2048!
        esp=aes128-sha1!
    conn host-host
        left=192.168.99.3
        leftfirewall=no
        right=192.168.99.2
        auto=start
        authby=secret

b. Edit /etc/ipsec.secrets (this file might not exist and might need to be created).

    # /etc/ipsec.secrets - strongSwan IPsec secrets file
    192.168.99.3 192.168.99.2 : PSK "shared key"

B.6.3.1 Starting strongSwan IPsec Software on the VM

An IPsec tunnel is set up between the two VMs as shown in Figure 6-20 on page 32. The tunnel is set up for network interfaces on the subnet 192.168.99.0/24. The data networks for VM1 and VM2 are 1.1.1.0/24 and 6.6.6.0/24 respectively.

1. VM1 configuration settings:

a. Network configuration:

    # cat /etc/sysconfig/network-scripts/ifcfg-eth1
    TYPE=Ethernet
    NAME=eth1
    BOOTPROTO=none
    ONBOOT=yes
    NETMASK=255.255.255.0
    IPADDR=1.1.1.2
    HWADDR=00:00:00:00:00:01
    # cat /etc/sysconfig/network-scripts/ifcfg-eth2
    TYPE=Ethernet
    NAME=eth2
    BOOTPROTO=none
    ONBOOT=yes
    NETMASK=255.255.255.0
    IPADDR=192.168.99.2
    HWADDR=00:00:00:00:00:02

b. Add a route to the routing table for the VM2 network.

    # ip route add 6.6.6.0/24 dev eth2
    # route -n

c. Enable Linux kernel forwarding and stop the firewall daemon.

    # echo 1 > /proc/sys/net/ipv4/ip_forward
    # service firewalld stop

d. Check the QAT service status and insert the netkeyshim module.

    # export ICP_ROOT=/qat/QAT1.6
    # service qat_service status
    # insmod $ICP_ROOT/quickassist/shims/netkey/icp_netkey/icp_qat_netkey.ko

e. Start the IPsec process.

    # ipsec start

2. VM2 configuration settings:

a. Network configuration:

    # cat /etc/sysconfig/network-scripts/ifcfg-eth1
    TYPE=Ethernet
    NAME=eth1
    BOOTPROTO=none
    ONBOOT=yes
    NETMASK=255.255.255.0
    IPADDR=6.6.6.6
    HWADDR=00:00:00:00:00:03
    # cat /etc/sysconfig/network-scripts/ifcfg-eth2
    TYPE=Ethernet
    NAME=eth2
    BOOTPROTO=none
    ONBOOT=yes
    NETMASK=255.255.255.0
    IPADDR=192.168.99.3
    HWADDR=00:00:00:00:00:04

b. Add a route to the routing table for the VM1 network.

    # ip route add 1.1.1.0/24 dev eth2
    # route -n

c. Enable Linux kernel forwarding and stop the firewall daemon.

    # echo 1 > /proc/sys/net/ipv4/ip_forward
    # service firewalld stop

d. Check the QAT service status and insert the netkeyshim module.

    # export ICP_ROOT=/qat/QAT1.6
    # service qat_service status
    # insmod $ICP_ROOT/quickassist/shims/netkey/icp_netkey/icp_qat_netkey.ko

e. Start the IPsec process.

    # ipsec start

3. Enable the IPsec tunnel "host-host" on both VM1 and VM2:

    # ipsec up host-host

4. To test the VM-VM IPsec tunnel performance, use Netperf.

Appendix C Glossary

    Acronym   Description
    COTS      Commercial Off-The-Shelf
    DPI       Deep Packet Inspection
    IOMMU     Input/Output Memory Management Unit
    Kpps      Kilo packets per second
    KVM       Kernel-based Virtual Machine
    Mpps      Million packets per second
    NIC       Network Interface Card
    pps       Packets per second
    QAT       QuickAssist Technology
    RA        Reference Architecture
    RSS       Receive Side Scaling
    SP        Service Provider
    SR-IOV    Single Root I/O Virtualization

Appendix D Definitions

D.1 Packet Throughput

There is a difference between an Ethernet frame, an IP packet, and a UDP datagram. In the seven-layer OSI model of computer networking, packet refers to a data unit at layer 3 (network layer). The correct term for a data unit at layer 2 (data link layer) is a frame, and at layer 4 (transport layer) is a segment or datagram.

Important concepts related to 10GbE performance are frame rate and throughput. The MAC bit rate of 10GbE, defined in the IEEE standard 802.3ae, is 10 billion bits per second. Frame rate is based on the bit rate and frame format definitions. Throughput, defined in IETF RFC 1242, is the highest rate at which the system under test can forward the offered load, without loss.
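Because frame rate follows directly from the bit rate and the frame format, the theoretical maximum can be computed for any frame size. The following is a minimal sketch of that calculation, assuming a shell with 64-bit integer arithmetic, an 8-byte preamble, and a 12-byte minimum inter-frame gap. The max_frame_rate function name is hypothetical, and the result is truncated to a whole number of frames.

```shell
#!/bin/sh
# max_frame_rate: hypothetical helper that prints the theoretical maximum
# number of frames per second, in one direction, on a 10GbE link for a
# given Ethernet frame size in bytes. The result is truncated, not rounded.
max_frame_rate() {
    frame_bytes=$1
    bit_rate=10000000000      # 10GbE MAC bit rate in bits per second
    preamble_bytes=8          # preamble sent before every frame
    ifg_bytes=12              # minimum inter-frame gap
    bits_per_frame=$(( (preamble_bytes + frame_bytes + ifg_bytes) * 8 ))
    echo $(( bit_rate / bits_per_frame ))
}

max_frame_rate 64      # smallest Ethernet frame: prints 14880952
max_frame_rate 1518    # largest standard Ethernet frame
```

Doubling the printed value gives the combined rate for traffic flowing in both directions at once.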
The frame rate for 10GbE is determined by a formula that divides the 10 billion bits per second by the preamble + frame length + inter-frame gap. The maximum frame rate is calculated using the minimum values of the following parameters, as described in the IEEE 802.3ae standard:
• Preamble: 8 bytes * 8 = 64 bits
• Frame length: 64 bytes (minimum) * 8 = 512 bits
• Inter-frame gap: 12 bytes (minimum) * 8 = 96 bits

Therefore,

    Maximum Frame Rate (64-byte packets)
        = MAC Transmit Bit Rate / (Preamble + Frame Length + Inter-frame Gap)
        = 10,000,000,000 / (64 + 512 + 96)
        = 10,000,000,000 / 672
        = 14,880,952.38 frames per second (fps)

Table D-1 Maximum Throughput

    IP Packet Size   Theoretical Line Rate (fps)   Theoretical Line Rate (fps)
    (Bytes)          (Full duplex)                 (Half duplex)
    64               29,761,904                    14,880,952
    128              16,891,891                    8,445,946
    256              9,057,971                     4,528,986
    512              4,699,248                     2,349,624
    1024             2,394,636                     1,197,318
    1280             1,923,076                     961,538
    1518             1,625,487                     812,744

D.2 RFC 2544

RFC 2544 is an Internet Engineering Task Force (IETF) RFC that outlines a benchmarking methodology for network interconnect devices. The methodology results in performance metrics such as latency, frame loss percentage, and maximum data throughput. In this document, network "throughput" (measured in millions of frames per second) is based on RFC 2544, unless otherwise noted. Frame size refers to Ethernet frames ranging from the smallest frames of 64 bytes to the largest frames of 1518 bytes.

Types of tests are:
• Throughput test defines the maximum number of frames per second that can be transmitted without any error. Test time during which frames are transmitted must be at least 60 seconds.
• Latency test measures the time required for a frame to travel from the originating device through the network to the destination device.
• Frame loss test measures the network's response in overload conditions, a critical indicator of the network's ability to support real-time applications in which a large amount of frame loss will rapidly degrade service quality.
• Burst test assesses the buffering capability of a switch. It measures the maximum number of frames received at full line rate before a frame is lost. In carrier Ethernet networks, this measurement validates the excess information rate (EIR) as defined in many SLAs.
• System recovery test characterizes the speed of recovery from an overload condition.
• Reset test characterizes the speed of recovery from a device or software reset.

Although not included in the defined RFC 2544 standard, another crucial measurement in Ethernet networking is packet jitter.

Appendix E References

Each entry gives the document name followed by its source.

Internet Protocol version 4
    http://www.ietf.org/rfc/rfc791.txt
Internet Protocol version 6
    http://www.faqs.org/rfc/rfc2460.txt
Intel® 82599 10 Gigabit Ethernet Controller Datasheet
    http://www.intel.com/content/www/us/en/ethernet-controllers/8259910-gbe-controller-datasheet.html
Intel DDIO
    https://www-ssl.intel.com/content/www/us/en/io/direct-data-i-o.html?
Bandwidth Sharing Fairness
    http://www.intel.com/content/www/us/en/network-adapters/10-gigabitnetwork-adapters/10-gbe-ethernet-flexible-port-partitioning-brief.html
Design Considerations for efficient network applications with Intel® multi-core processor-based systems on Linux
    http://download.intel.com/design/intarch/papers/324176.pdf
OpenFlow with Intel 82599
    http://ftp.sunet.se/pub/Linux/distributions/bifrost/seminars/workshop2011-03-31/Openflow_1103031.pdf
Wu, W., DeMar, P. & Crawford, M. (2012). A Transport-Friendly NIC for Multicore/Multiprocessor Systems. IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 4, April 2012.
    http://lss.fnal.gov/archive/2010/pub/fermilab-pub-10-327-cd.pdf
Why does Flow Director Cause Packet Reordering?
    http://arxiv.org/ftp/arxiv/papers/1106/1106.0443.pdf
IA packet processing
    http://www.intel.com/p/en_US/embedded/hwsw/technology/packetprocessing
High Performance Packet Processing on Cloud Platforms using Linux* with Intel® Architecture
    http://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
Packet Processing Performance of Virtualized Platforms with Linux* and Intel® Architecture
    http://networkbuilders.intel.com/docs/network_builders_RA_NFV.pdf
DPDK
    http://www.intel.com/go/dpdk
Intel® DPDK Accelerated vSwitch
    https://01.org/packet-processing
RFC 1242 (Benchmarking Terminology for Network Interconnection Devices)
    http://www.ietf.org/rfc/rfc1242.txt
RFC 2544 (Benchmarking Methodology for Network Interconnect Devices)
    http://www.ietf.org/rfc/rfc2544.txt
RFC 6349 (Framework for TCP Throughput Testing)
    http://www.faqs.org/rfc/rfc6349.txt
RFC 3393 (IP Packet Delay Variation Metric for IP Performance Metrics)
    https://www.ietf.org/rfc/rfc3393.txt
MEF end-end measurement metrics
    http://metroethernetforum.org/Assets/White_Papers/Metro-EthernetServices.pdf
DPDK Sample Application User Guide
    http://dpdk.org/doc/intel/dpdk-sample-apps-1.7.0.pdf
Benchmarking Methodology for Virtualization Network Performance
    https://tools.ietf.org/html/draft-liu-bmwg-virtual-network-benchmark-00
Benchmarking Virtual Network Functions and Their Infrastructure
    https://tools.ietf.org/html/draft-morton-bmwg-virtual-net-01

LEGAL

By using this document, in addition to any agreements you have with Intel, you accept the terms set forth below. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein.
You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Intel technologies may require enabled hardware, specific software, or services activation. Check with your system manufacturer or retailer.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.

All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice.

Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.

No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses.

Intel does not control or audit third-party web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights.

© 2014 Intel Corporation. All rights reserved. Intel, the Intel logo, Core, Xeon and others are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.