10Gbit Ethernet: Killing Another Bottleneck?
by Johan De Gelas on March 8, 2010 12:00 PM EST- Posted in
- IT Computing
Delving Deeper
Let us take a closer look at the Neterion and Intel 10G chips configuration on VMware’s vSphere/ESX platform. First, we checked what the S2IO driver of Neterion did when ESX was booting.
If you look closely, you can see that eight Rx queues are recognized, but only one Tx queue. Compare this to the Intel ixgbe driver:
Eight Tx and Rx queues are recognized, one for each VM. This is also confirmed when we start up the VMs. Each VM gets its own Rx and Tx queue. The Xframe-E has eight transmit and eight receive paths, but it seems that for some reason the driver is not able to use the full potential of the card on ESX 4.0.
Conclusion
The goal of this short test was to discover the possibilities of 10 Gigabit Ethernet in a virtualized server. If you have suggestion for more real world testing, let us know.
CX4 is still the only affordable option that comes with reasonable power consumption. Our one-year-old dual-port CX4 card consumes only 6.5W; a similar 10GBase-T solution would probably need twice as much. The latest 10GBase-T (4W instead of >10W per port) advancements are very promising, as we might see power efficient 10G cards with CAT-6 UTP cables this year.
The Neterion Xframe-E could not fulfill the promise of near 10Gbit speeds at low CPU utilization, but our test can only give a limited indication. It is rather weird, as the card we tested was announced as one of the first to support NetQueue in ESX 3.5. We can only guess that driver support for ESX 4.0 is not optimal (yet). The Xframe X3100 is Neterion’s most advanced product and the spec sheet emphasizes its VMware NetQueue support. Neterion ships mostly to OEMs, so it is hard to get an idea of the pricing. When you spec your HP, Dell or IBM server for ESX 4.0 virtualization purposes, it is probably a good idea to check if the 10G Ethernet card is not an older Neterion card.
At a price of about $450-$550, the Supermicro AOC-STG-I2 dual-port with the Intel 82598EB chip is a very attractive solution. Typically, a quad-port gigabit Ethernet solution will cost you half as much, but it delivers only half the bandwidth at twice the CPU load in a virtualized environment.
49 Comments
View All Comments
fredsky - Tuesday, March 9, 2010 - link
we do use 10GbE at work, and i passed a long time finding the right solutiom- CX4 is outdated, huge cable, short length power hungry
- XFP is also outdated and fiber only
- SFP + is THE thing to get. very long power, and can used with copper twinax AS WELL as fiber. you can get a 7m twinax cable for 150$.
and the BEST card available are Myricom very powerfull for a decent price.
DanLikesTech - Tuesday, March 29, 2011 - link
CX4 is old? outdated? I just connected two VM host servers using CX4 at 20Gb (40Gb aggregate bandwidth)And it cost me $150. $50 for each card and $50 for the cable.
DanLikesTech - Tuesday, March 29, 2011 - link
And not to mention the low latency of InfiniBand compared to 10GbE.http://www.clustermonkey.net/content/view/222/1/
thehevy - Tuesday, March 9, 2010 - link
Great post. Here is a link to a white paper that I wrote to provide some best practice guidance when using 10G and VMware vShpere 4.Simplify VMware vSphere* 4 Networking with Intel® Ethernet 10 Gigabit Server Adapters white paper -- http://download.intel.com/support/network/sb/10gbe...">http://download.intel.com/support/network/sb/10gbe...
More white papers and details on Intel Ethernet products can be found at www.intel.com/go/ethernet
Brian Johnson, Product Marketing Engineer, 10GbE Silicon, LAN Access Division
Intel Corporation
Linkedin: www.linkedin.com/in/thehevy
twitter: http://twitter.com/thehevy">http://twitter.com/thehevy
emusln - Tuesday, March 9, 2010 - link
Be aware that VMDq is not SR-IOV. Yes, VMDq and NetQueue are methods for splitting the data stream across different interrupts and cpus, but they still go through the hypervisor and vSwitch from the one PCI device/function. With SR-IOV, the VM is directly connected to a virtual PCI function hosted on the SR-IOV capable device. The hypervisor is needed to set up the connection, then gets out of the way. This allows the NIC device, with a little help from an iommu, to DMA directly into the VM's memory, rather than jumping through hypervisor buffers. Intel supports this in their 82599 follow-on to the 82598 that you tested.megakilo - Tuesday, March 9, 2010 - link
Johan,Regarding the 10Gb performance on native Linux, I have tested Intel 10Gb (the 82598 chipset) on RHEL 5.4 with iperf/netperf. It runs at 9.x Gb/s with a single port NIC and about 16Gb/s with a dual-port NIC. I just have a little doubt about the Ixia IxChariot benchmark since I'm not familiar about it.
-Steven
megakilo - Tuesday, March 9, 2010 - link
BTW, in order to reach 9+ Gb/s, the iperf/netperf have to run multiple threads (about 2-4 threads) and use a large TCP window size (I used 512KB).JohanAnandtech - Tuesday, March 9, 2010 - link
Thanks. Good feedback! We'll try this out ourselves.sht - Wednesday, March 10, 2010 - link
I was surprised by the poor native Linux results as well. I got > 9 Gbit/s with Broadcom NetXtreme using nuttcp as well. I don't recall whether multiple threads were required to achieve those numbers. I don't think they were, but perhaps using a newer kernel helped, the Linux networking stack has improved substantially since 2.6.18.themelon - Tuesday, March 9, 2010 - link
Did I miss where you mention this or did you completely leave it out of the article?Intel has had VMDq in Gig-E for at least 3-4 years in the 82575/82576 chips. Basically, anything using the igb driver instead of the e1000g driver.