The Magazine for Developers of Open Communication, Industrial, and Rugged Systems

I/O

Network I/O virtualization (IOV): Key addition to PCIe in multicore designs

By Rolf Neugebauer and Nabil Damouny, Netronome Systems, Inc.

Rolf and Nabil outline a new class of network I/O virtualization architecture and its role as a key ingredient to enable virtualized network infrastructure and appliances on commodity x86 hardware.

Network infrastructure equipment and network appliances are increasingly built around commodity multicore CPUs – specifically x86-based architectures. As a result, I/O communications are becoming increasingly dependent on standard system interconnects, such as PCI Express (PCIe). While an 8-lane PCIe v2 interconnect can easily support well over 10 Gbps of network I/O traffic, bottlenecks in the host system typically require some level of preprocessing. For example, network traffic may be classified into flows. These flows can be load-balanced among the CPU cores with each flow pinned to a specific core. While reducing the I/O overhead, these techniques do not ease the packet and payload processing load on the general-purpose, multi-core CPUs and their memory subsystems. As a result, there is a need to offload some further network processing to specialized processors – for example, to perform application-specific filtering or to apply security rules.
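The flow pinning described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular device: the `FlowKey` fields, the `core_for_flow` name, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib
from typing import NamedTuple


class FlowKey(NamedTuple):
    """5-tuple identifying a flow (field names are illustrative)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


def core_for_flow(flow: FlowKey, num_cores: int) -> int:
    """Return the index of the core that a flow is pinned to."""
    if num_cores <= 0:
        raise ValueError("num_cores must be positive")
    key = "|".join(str(part) for part in flow).encode("ascii")
    # CRC32 is a stand-in for whatever hash a real device computes.
    return zlib.crc32(key) % num_cores


flow = FlowKey("10.0.0.1", "10.0.0.2", 40000, 443, 6)
# Every packet of the same flow lands on the same core.
print(core_for_flow(flow, 4) == core_for_flow(flow, 4))
```

Because the mapping depends only on the flow key, packets of one flow never migrate between cores, which keeps per-flow state local to a single core.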

Furthermore, virtualization is becoming increasingly important in network infrastructure equipment and appliances. The reasons are twofold: (1) to allow the migration of legacy applications and legacy operating systems to multicore CPUs; and (2) to provide strong isolation between different applications – for example to consolidate multiple network appliances or applications as virtual network appliances into a single physical system.

These trends dictate the introduction of an intelligent I/O virtualization co-processor (IOV-P). This co-processor allows network processing tasks, such as packet classification and filtering up to Layer 7, to be offloaded from general-purpose multicore CPUs to specialized network processors, freeing up cycles on these cores for more elaborate processing. While specialized network processors have been around for a while, the commoditization of infrastructure equipment and appliances requires a flexible and intelligent interconnect between these specialized processors and the commodity multicore CPU host system, offering filtering and load-balancing of network traffic to individual x86 cores, and supporting a variety of virtualization options.

High-performance networking designs require multicore processors

It is well documented that a single x86 core cannot handle 10 Gbps of network traffic. As a general rule of thumb, a single x86 core can typically handle 1-2 Gbps while still performing some meaningful processing on the network packets. Modern 10 Gbps network cards, therefore, deliver network packets along multiple queues to the host system with different queues being handled by different cores (Figure 1). In this figure, multiple applications consume network data with the applications being pinned to the cores where the packets are received.
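As a rough worked example of this rule of thumb, the number of cores needed to absorb a 10 Gbps link follows from a simple division. The helper name `cores_needed` is an assumption for illustration, and the figures are the article's estimates rather than measurements.

```python
import math


def cores_needed(link_gbps: float, per_core_gbps: float) -> int:
    """Smallest number of cores whose combined rate covers the link."""
    if per_core_gbps <= 0:
        raise ValueError("per_core_gbps must be positive")
    return math.ceil(link_gbps / per_core_gbps)


print(cores_needed(10, 2))  # 5 cores at 2 Gbps per core
print(cores_needed(10, 1))  # 10 cores at 1 Gbps per core
```

Under these assumptions, a single 10 Gbps port needs somewhere between 5 and 10 cores, which is why packets must be spread across multiple queues.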

Figure 1: An I/O device distributes network flows to different applications in a multicore system.


While most 10 Gbps network cards offer some simple form of filtering and load-balancing to multiple queues, network appliances and network infrastructure equipment typically require more intelligent filtering and load-balancing based on information up to Layer 7. Such intelligence, for example, is needed to support multiple different applications or to “cut-through” traffic that no longer needs to be inspected. This level of intelligent filtering requires a flexible and programmable network processor.
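One way to picture the cut-through decision is a flow table that the host application updates once it has finished inspecting a flow. The sketch below is hypothetical: the class name `CutThroughTable`, its methods, and the "host"/"forward" labels are assumptions, and a real network processor would hold this state in hardware tables.

```python
class CutThroughTable:
    """Tracks which flows no longer need inspection by the host."""

    def __init__(self) -> None:
        self._inspected = set()

    def mark_inspected(self, flow_id: str) -> None:
        """Called by the host application once a flow is cleared."""
        self._inspected.add(flow_id)

    def next_hop(self, flow_id: str) -> str:
        """Cleared flows bypass the host; all others are sent to it."""
        return "forward" if flow_id in self._inspected else "host"


table = CutThroughTable()
print(table.next_hop("flow-a"))  # host
table.mark_inspected("flow-a")
print(table.next_hop("flow-a"))  # forward
```

After a flow is marked, its remaining packets stay on the device and never consume host CPU cycles or PCIe bandwidth.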

Virtualization support is also becoming increasingly important for network infrastructure equipment and network appliances, necessitated by the need to support legacy operating systems (OSs) on multicore systems or to provide strong isolation between different applications. To enable virtualization in these environments, intelligent network devices have to support the same features required to enable efficient network processing on multicore systems as previously outlined.

Figure 2 shows a system configuration where multiple guest Virtual Machines (VMs) share an I/O device. Some of these VMs may run legacy, uni-processor OSs (UP), while others may run modern multiprocessor-capable OSs (SMP). In order to efficiently support these types of configurations at 10 Gbps and beyond, the I/O device needs to determine, based on filtering and classification, to which VM incoming packets are delivered. Packets need to be delivered directly to the core(s) on which the target VM is running.
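The delivery decision can be sketched as a two-step lookup: first pick the target VM, then pick one of the cores that VM runs on. In this hypothetical example the VM is chosen by destination MAC address and the core by a CRC32 hash of a flow identifier; both choices, along with the `GuestVM` and `deliver` names, are assumptions for illustration only.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class GuestVM:
    name: str
    cores: Tuple[int, ...]  # host cores the VM is running on


def deliver(dst_mac: str, flow_id: bytes,
            vm_by_mac: Dict[str, GuestVM]) -> Optional[Tuple[str, int]]:
    """Return (VM name, core) for a packet, or None if no VM matches."""
    vm = vm_by_mac.get(dst_mac)
    if vm is None:
        return None
    # A UP guest has one core; an SMP guest spreads flows over several.
    core = vm.cores[zlib.crc32(flow_id) % len(vm.cores)]
    return vm.name, core


vms = {
    "02:00:00:00:00:01": GuestVM("legacy-up", (0,)),
    "02:00:00:00:00:02": GuestVM("modern-smp", (1, 2, 3)),
}
print(deliver("02:00:00:00:00:01", b"flow-a", vms))  # ('legacy-up', 0)
```

A uni-processor guest always receives its packets on its single core, while an SMP guest sees its flows distributed across the cores it occupies.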

Figure 2: I/O virtualization. Several legacy uni-processor (UP) OSs and a modern SMP-capable OS share a single I/O device.


The basic techniques to enable virtualization of shared I/O devices are the same as those required to efficiently deliver network data to different applications. However, supporting virtualization requires additional mechanisms to manage queues on shared I/O devices.

Virtualized multicore processors require network I/O virtualization

In any given system, there are a limited number of I/O devices, typically far fewer than the number of VMs the system may be hosting. As all VMs require access to I/O, a Virtual Machine Monitor (VMM) or hypervisor needs to mediate access to these shared I/O devices. The PCI Special Interest Group (PCI-SIG) I/O Virtualization (IOV) working group is developing extensions to PCIe. The first IOV specification, Single Root IOV (SR-IOV), targets systems with a single PCIe root complex and enables one physical PCIe device to be divided into multiple Virtual Functions (VFs). Each VF can then be used by a VM, allowing one physical device to be shared by many VMs and their guest OSs.

Some hypervisors extend the techniques used to run device drivers inside the management VM to allow the direct assignment of PCI devices to Guest VMs. Such hypervisors include Xen and VMware ESX Server. Such an approach eliminates the overhead and added latencies of other I/O virtualization approaches, but requires dedicated I/O hardware for the VM.

SR-IOV: Key addition to PCIe

Recognizing the need to provide a device-centric approach to I/O virtualization, the PCI-SIG introduced the SR-IOV standard in September 2007. The SR-IOV standard builds on top of a wide range of existing PCI standards, including PCIe, Address Translation Services (ATS), Alternative Routing-ID Interpretation (ARI), and Function Level Reset (FLR). From the host perspective, SR-IOV on its own is primarily an extension to the PCI configuration space, defining access to lightweight VFs.

When implementing SR-IOV, a physical PCI device may contain several device functions, referred to as Physical Functions (PFs). PFs are standard PCIe devices having their own full PCI configuration space and set of resources, but each has an additional SR-IOV Extended Capability as part of its configuration space.

Every PF can support a number of VFs, which are enumerated and configured through this extended capability. The SR-IOV standard imposes certain restrictions on the configuration and enumeration of VFs. In particular, all VFs have to be of the same type. While this is completely adequate for standard network cards, intelligent network processors typically require more flexibility. Different applications will benefit from having access to network traffic via different, optimized interfaces, for example PCAP for packet capture to user space, and sockets for normal network traffic. Furthermore, acceleration support for encrypted network traffic, like IPSec or SSL, requires different interfaces again. The restrictions on configurations of VFs imposed by the SR-IOV standard do not support the flexibility intelligent I/O processors demand. What’s needed is a more flexible I/O virtualization solution.
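The same-type restriction can be illustrated with a toy model of a PF and its VFs. This is only a sketch of the relationship described above, not of the SR-IOV register layout; the class names, field names, and the "nic" type label are assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class VirtualFunction:
    index: int
    vf_type: str


@dataclass
class PhysicalFunction:
    vf_type: str    # the single type shared by every VF of this PF
    total_vfs: int  # upper bound on VFs this PF can expose
    vfs: List[VirtualFunction] = field(default_factory=list)

    def enable_vfs(self, num_vfs: int) -> List[VirtualFunction]:
        """Enable num_vfs VFs; there is no way to pick a type per VF."""
        if not 0 <= num_vfs <= self.total_vfs:
            raise ValueError("num_vfs out of range")
        self.vfs = [VirtualFunction(i, self.vf_type) for i in range(num_vfs)]
        return self.vfs


pf = PhysicalFunction(vf_type="nic", total_vfs=8)
print({vf.vf_type for vf in pf.enable_vfs(4)})  # {'nic'}
```

In this model, exposing a packet-capture interface next to a standard NIC interface would require a second PF, since the caller has no per-VF choice of type.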

A flexible IOV solution

SR-IOV relies on a number of existing PCI standards and essentially adds only a device enumeration and resource discovery mechanism in hardware. Its restriction on VF types is therefore a limitation that can easily be alleviated with a more flexible software solution. This flexible solution, adopted by Netronome in its processors, retains all the advantages of SR-IOV, including low-overhead access to I/O devices.

Netronome has built the IOV-P function into its recently introduced Network Flow Processor (NFP), the NFP-32xx family. With the Netronome IOV solution, device enumeration is delegated to a driver running on the host. This driver is specific to the NFP and capable of managing endpoints on the NFP, including the creation and removal of endpoints of arbitrary types. To the host OS (or hypervisor), the host driver acts as a PCI bus driver and enumerates NFP endpoints as standard PCI devices. It essentially implements a virtual PCI bus. All configuration space accesses for devices on this virtual PCI bus are passed to the host driver, which either emulates them or translates them to accesses on the NFP. Because the Netronome host driver is not restricted by the SR-IOV device enumeration limitations, it can enumerate arbitrary types of functions on its virtual PCI bus, allowing a single hardware device to support VFs of different types.
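The idea of a virtual PCI bus can be sketched as follows. This is a hypothetical model, not Netronome's driver: the `VirtualPciBus` class, its methods, the example vendor and device IDs, and the choice of which registers are emulated are all assumptions. The only PCI facts relied on are that the vendor ID sits at configuration offset 0x00, the device ID at 0x02, and that reads from an absent device return all ones.

```python
from typing import Callable, Dict, Tuple


class VirtualPciBus:
    """Host-side bus that exposes device endpoints as PCI functions."""

    def __init__(self, device_read: Callable[[int, int], int]) -> None:
        self._device_read = device_read  # forwards a read to the device
        self._endpoints: Dict[int, Tuple[str, int, int]] = {}
        self._next_slot = 0

    def create_endpoint(self, kind: str, vendor_id: int,
                        device_id: int) -> int:
        """Add an endpoint of any type and return its slot number."""
        slot = self._next_slot
        self._next_slot += 1
        self._endpoints[slot] = (kind, vendor_id, device_id)
        return slot

    def remove_endpoint(self, slot: int) -> None:
        self._endpoints.pop(slot, None)

    def config_read16(self, slot: int, offset: int) -> int:
        endpoint = self._endpoints.get(slot)
        if endpoint is None:
            return 0xFFFF  # no device present: all ones
        _, vendor_id, device_id = endpoint
        if offset == 0x00:
            return vendor_id  # emulated by the host driver
        if offset == 0x02:
            return device_id  # emulated by the host driver
        # Everything else is translated into an access on the device.
        return self._device_read(slot, offset) & 0xFFFF


bus = VirtualPciBus(device_read=lambda slot, offset: 0)
nic = bus.create_endpoint("nic", 0x1234, 0x0001)    # IDs are made up
pcap = bus.create_endpoint("pcap", 0x1234, 0x0002)  # a different type
print(hex(bus.config_read16(pcap, 0x02)))  # 0x2
```

Because enumeration happens in software, endpoints of different kinds can sit side by side on the same bus and can be created or removed at run time.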

Application of IOV in network infrastructure equipment

With multiple VMs running on a single physical service blade or appliance, VMs gain opportunities to cost-effectively share a pool of I/O resources, such as intelligent Network Interface Cards (NICs). In the single-application/single-service blade model each application has access to the entire service blade’s bandwidth. In the virtualized application model, however, multiple applications share network bandwidth.

As more applications are consolidated on one service blade, bandwidth requirements per service blade and service blade utilization increase significantly. The result is that an intelligent NIC is needed to offload network processing for the host, keeping the host CPU from becoming the bottleneck that limits application consolidation. This trend requires low overhead delivery of network data directly to the Guest VM.

Next-generation networking equipment requirements for low latency and low overhead delivery of network I/O directly to VMs can only be provided by hardware-based IOV solutions, such as SR-IOV-based NICs or Netronome’s IOV solution. Software-based I/O virtualization imposes too high an overhead to handle the expected network I/O demand, and IOV solutions based on multiqueue NICs are unsuitable for latency-sensitive applications, such as Deep Packet Inspection (DPI).

Furthermore, the proliferation of encrypted network traffic (IPSec and SSL) provides the opportunity to offload some or all of its required processing to an intelligent NIC, freeing up host CPU cycles for application processing.

Control plane and application layer functions in infrastructure equipment are built around commodity multicore virtualized CPUs. These too will need an underlying I/O subsystem that is also virtualized. Developers will use such IOV subsystems to implement intelligent service blades, network appliances, intelligent trunk cards, and intelligent line cards in the core network infrastructure. Such cards usually serve the various line cards in the system and will be best implemented based on the IOV-P. These cards can be an intelligent virtualized NIC directly supporting multiple 10 GbE interfaces, service blades supporting multiple services, or trunk cards supporting nested tunnels. Service blades or trunk cards may directly support multiple 10 GbE interfaces or have high-bandwidth connections to internal backplanes or switch fabrics.

Figure 3 depicts a virtualized multicore system with IOV capability running multiple applications. Such applications run on multiple instances of the same OS or on different OSs. These, in turn, run either in a single VM on a single core or in multiple VMs across multiple cores. By classifying network I/O traffic into flows, applying security rules, and pinning flows to a specific VM on a specific core on the host, and/or by load-balancing various flows into various VMs, the IOV-P enables the overall system to achieve full network performance at 10 Gbps and beyond.
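The three steps named above (apply security rules, load-balance new flows, keep existing flows pinned) can be combined in a small sketch. Everything here is an assumption for illustration: the `FlowDispatcher` class, the port-based block list standing in for security rules, and round-robin as the balancing policy.

```python
from itertools import cycle
from typing import Dict, Iterable, Optional, Tuple

# (src_ip, dst_ip, src_port, dst_port, protocol)
Flow = Tuple[str, str, int, int, int]


class FlowDispatcher:
    """Filters flows, then balances and pins them to VMs."""

    def __init__(self, vms: Iterable[str],
                 blocked_ports: Iterable[int]) -> None:
        self._next_vm = cycle(list(vms))
        self._blocked = set(blocked_ports)
        self._pinned: Dict[Flow, str] = {}

    def dispatch(self, flow: Flow) -> Optional[str]:
        """Return the VM for this flow, or None if it is dropped."""
        if flow[3] in self._blocked:  # stand-in security rule
            return None
        if flow not in self._pinned:  # new flow: take the next VM in turn
            self._pinned[flow] = next(self._next_vm)
        return self._pinned[flow]     # known flow: stays where it is


dispatcher = FlowDispatcher(["vm0", "vm1"], blocked_ports=[23])
web = ("10.0.0.1", "10.0.0.9", 40000, 443, 6)
dns = ("10.0.0.2", "10.0.0.9", 40001, 53, 17)
print(dispatcher.dispatch(web), dispatcher.dispatch(dns))  # vm0 vm1
print(dispatcher.dispatch(web))  # vm0
```

Dropped flows never reach the host, and once a flow is assigned it stays with the same VM for its lifetime.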

Figure 3: Service blade/intelligent NIC architecture for infrastructure equipment. Integrating the IOV-P function in the data plane allows the virtualized equipment/appliance to support multiple different virtual functions. Netronome's NFP-3200 processor integrates the IOV-P function.


Conclusion

In summary, intelligent service blades with IOV capability are a key ingredient of the virtualized system, as they greatly reduce utilization of the host CPU for network processing, allowing the system to support a larger number of applications while saving power. Adding IOV capability to the intelligent NIC ensures that each application can be configured with its own virtual NIC, allowing a number of applications to share a single 10 GbE physical NIC. At the same time, the IOV-P concept allows a single physical NIC to provide many different “intelligent” functions to the VM and even create and refine these functions at run time.

Network infrastructure equipment and network appliances are increasingly built around commodity multicore CPUs, specifically x86-based architectures. This CPU subsystem is being virtualized for efficient use of CPUs and lower Total Cost of Ownership (TCO). This trend is projected to accelerate.

In order to garner the full benefit of virtualization, the I/O subsystem also needs to be virtualized. The PCI-SIG introduced the SR-IOV standard for this purpose. Although the speed with which vendors adopt SR-IOV may vary, Netronome intends to lead the pack by building flexible SR-IOV into its newly introduced family of network flow processors, targeting high-performance infrastructure equipment and network appliance applications.

Nabil Damouny is the Senior Director of Business Development at Netronome. He has a BSEE from Illinois Institute of Technology (IIT) and an MSECE from the University of California Santa Barbara (UCSB). He holds three patents in computer architecture and remote networking.

Rolf Neugebauer is a Staff Software Engineer at Netronome, where he works on virtualization support for Netronome's line of Network Flow Processors. Prior to joining Netronome, Rolf worked at Microsoft and Intel Research. At Intel he was one of the initial researchers developing the Xen hypervisor in collaboration with academics at Cambridge University. Rolf holds a PhD and an MSc from the University of Glasgow.

Netronome Systems, Inc.
www.netronome.com


©MMX CompactPCI AdvancedTCA & MicroTCA Systems. An OpenSystems Media, LLC publication.
Last updated: 07/29/10 09:51 America/Phoenix