May 10, 2021

Selecting a supported network interface card for VMware vSAN?

Network Interface Cards (NICs) are an often-overlooked component. Many people have mistakenly assumed all NICs are the same, or are simple commodities. It is time for a 2021 round-up of how to find the best NICs out there. For vSAN clusters there are two supported pools of cards to select from:

The vSAN VCG - It is worth noting that this is not the ONLY list of network cards supported by vSAN, but is currently the subset that has been qualified for use with VMware vSAN over RDMA. Specifically, RoCEv2 is what is being supported and tested at this time. While this list is currently small (a selection of Mellanox ConnectX-4 Lx series 25Gbps NICs), there should be more entries here in the future that span up to 100Gbps. I would strongly suggest starting and ending your search for vSAN network interface cards on this list.

 

The vSphere VCG - This is a broader list of cards that are supported for vSphere and non-RDMA vSAN connections. Do note that the certification process here may or may not include a description of a firmware version. Even if firmware versions are mentioned, that may not be the newest firmware or even the vendor-recommended firmware. VMware and the partner will support any certified driver when used with a firmware version equal to or higher than the level displayed in the Firmware Version column. VMware's official stance is to use the newest supported firmware and check with your vendor. This KB provides links to vendor-specific pages to determine recommended firmware.
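
If you want to compare what a host is actually running against the VCG entry, the driver and firmware versions can be pulled with esxcli. Below is a minimal sketch, assuming SSH access to the host, the paramiko library, and the usual output format of `esxcli network nic get`; the hostname, credentials, and vmnic names are placeholders.

```python
# Minimal sketch: pull NIC driver/firmware versions from an ESXi host over SSH
# so they can be compared against the VCG listing. Assumes paramiko, SSH enabled
# on the host, and the usual "esxcli network nic get" output format.
import paramiko

HOST = "esxi01.lab.local"       # placeholder hostname
USER, PASSWORD = "root", "***"  # placeholder credentials
NICS = ["vmnic0", "vmnic1"]     # uplinks carrying vSAN traffic

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(HOST, username=USER, password=PASSWORD)

for nic in NICS:
    _, stdout, _ = ssh.exec_command(f"esxcli network nic get -n {nic}")
    details = {}
    for line in stdout.read().decode().splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            details[key.strip()] = value.strip()
    # "Driver", "Version" (driver version), and "Firmware Version" are the
    # fields to check against the VCG entry and the vendor's recommendation.
    print(nic, details.get("Driver"), details.get("Version"),
          details.get("Firmware Version"))

ssh.close()
```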

Which path should you choose?

vSAN VCG Benefits

  • RDMA Support for best performance and lowest overhead
  • Extensive vSAN ReadyLabs testing

 

vSphere VCG Benefits

  • Lower-performing onboard NICs can be useful in edge use cases
  • Brownfield support for existing hardware
  • Useful when the server OEM has not yet validated a vSAN VCG network interface card

 

What about cost?

As of the time of this writing, vSAN VCG certified cards can be acquired at very reasonable costs.


 

What are some other offloads and features to look for when trying to find a good interface card?

Offload Features

LRO/LSO – Large Receive Offload and Large Send Offload allow large buffers to be segmented by the NIC on transmit and coalesced back into larger buffers on receive, rather than the CPU handling every packet individually. Note: TCP Segmentation Offload (TSO) is a very common form of LSO and you will often see the terms used interchangeably. This provides improvements in CPU overhead. The VMware Performance Team has a great blog showcasing what this looks like for virtual machines. LRO can reduce CPU overhead on 64KB workloads by as much as 90%.
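
To make the CPU-overhead point concrete, here is a small illustrative sketch (not how the vmkernel actually implements it) comparing how many descriptors the host must build for a 64KB transmit with and without segmentation offload; the MSS and payload size are just typical example values.

```python
# Illustrative sketch only: why handing the NIC one large 64KB buffer (LSO/TSO)
# is cheaper for the CPU than building every MSS-sized packet itself.
MSS = 1460           # typical TCP payload per packet with a 1500-byte MTU
PAYLOAD = 64 * 1024  # a common vSAN I/O size, e.g. resync traffic

# Without offload: the host stack segments the buffer itself.
packets_built_by_cpu = -(-PAYLOAD // MSS)   # ceiling division -> 45 packets

# With LSO/TSO: the host hands the NIC one large buffer and the NIC's ASIC
# performs the segmentation. LRO does the mirror image on receive, coalescing
# many small packets into one large buffer per interrupt.
buffers_built_by_cpu = 1

print(f"Per-64KB-write descriptors without offload: {packets_built_by_cpu}")
print(f"Per-64KB-write descriptors with LSO/TSO:    {buffers_built_by_cpu}")
```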

Receive Side Scaling (RSS) – Helps distribute receive processing across multiple CPU cores. At higher throughput, a single CPU thread may not be able to saturate larger network interfaces; in sample testing, a 40Gbps NIC could only drive 15Gbps when limited to a single core. RSS is also critical for VxLAN/NSX performance. Note that RSSv2 is supported by a limited subset of cards (it appears to allow balancing at a more granular level).
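
Conceptually, RSS hashes each flow's 5-tuple and uses the result to steer that flow to one of several receive queues, each serviced by a different core. The sketch below is a simplification; real NICs use a keyed Toeplitz hash and an indirection table rather than a CRC, and the queue count, addresses, and ports are only example values.

```python
# Simplified illustration of Receive Side Scaling: hash each flow's 5-tuple to
# pick a receive queue, so different flows land on different CPU cores.
# Real hardware uses a keyed Toeplitz hash plus an indirection table;
# zlib.crc32 here is just a stand-in to show the idea.
import zlib

NUM_QUEUES = 4  # e.g. one receive queue per core dedicated to networking

def rss_queue(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
              proto: str = "tcp") -> int:
    flow_key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    return zlib.crc32(flow_key) % NUM_QUEUES

# Two different node-to-node flows can land on different queues/cores, so a
# single core no longer caps the throughput of a 25/40/100GbE uplink.
print(rss_queue("192.168.10.11", "192.168.10.12", 52311, 2233))
print(rss_queue("192.168.10.11", "192.168.10.13", 52312, 2233))
```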

Geneve/VxLAN encapsulation support – For customers using NSX, hardware support for overlay encapsulation again helps increase performance and should be added to the shopping list when selecting a NIC.
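
Overlay encapsulation also adds outer headers that have to fit inside the physical MTU, which is part of why hardware offload and jumbo frames matter here. A rough back-of-the-envelope sketch, using the commonly cited ~50 bytes of VXLAN overhead (Geneve adds a similar base header plus variable-length options):

```python
# Back-of-the-envelope for overlay header overhead. VXLAN adds roughly 50
# bytes of outer headers; Geneve adds a similar base amount plus options,
# which is why a larger underlay MTU (1600+, or 9000) is typically
# recommended for NSX transport networks.
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8

overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER  # 50 bytes

for underlay_mtu in (1500, 1600, 9000):
    inner_frame = underlay_mtu - overhead
    print(f"Underlay MTU {underlay_mtu}: room for a {inner_frame}-byte inner frame")
```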

RDMA over Converged Ethernet (RoCEv1/RoCEv2) – With vSAN over RDMA now on the VCG and iSER (iSCSI Extensions for RDMA) support shipping today, and hopefully additional support coming for other traffic classes in the future, RDMA significantly lowers latency and CPU usage while increasing throughput. Note that RoCE avoids the use of TCP, while iWARP (a competing standard) runs RDMA over TCP. RoCE requires not only NICs that support it, but also physical switches that support it. Mellanox is a popular vendor for both NICs (such as the ConnectX-4 series) and switches, as they have a long history of working with RDMA.
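
If you are evaluating RoCE-capable cards, it is worth confirming what the host actually detects. The sketch below assumes it is run from the ESXi shell of a recent release where the esxcli rdma namespace is available; it simply wraps the CLI call.

```python
# Minimal sketch: list RDMA-capable adapters as ESXi sees them. Assumes this
# runs in the ESXi shell (esxcli on the PATH) on a recent release that
# includes the "esxcli rdma" namespace.
import subprocess

result = subprocess.run(
    ["esxcli", "rdma", "device", "list"],
    stdout=subprocess.PIPE, universal_newlines=True, check=True,
)
print(result.stdout)
# The output pairs each vmrdma device with its paired uplink and protocol
# (e.g. RoCE v2), which is what end-to-end RDMA support requires.
```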

NSX-T Virtual Distributed Switch (N-VDS) – NICs that support this will feature “N-VDS Enhanced Data Path” support in the vSphere VCG. This includes the ability for traffic to flow over the NUMA-aware Enhanced Data Path. While this is normally something reserved for NFV workloads, the gains in throughput are massive. This blog is a good starting point.

[Figure: NSX Enhanced Data Path – “Turn this path up to 11” (https://blogs.vmware.com/networkvirtualization/files/2018/06/enhanced.datapath.png)]

Misc. CNA/Storage Options

iSCSI HBAs have come in and out of vogue over the years. In general, I have some concerns about the quality of the QA vendors are doing for a feature that sees so little usage now that the software initiator makes up the overwhelming majority of the market. FCoE has been removed/deprecated from some Intel NICs and, in general, is falling out of favor.

An FCoE software initiator now exists, but in general the fad of FCoE (which was always a bridge technology) seems to be slowly going away. For those still using FCoE CNAs, be sure to verify that vVols secondary LUN support is available.

NVMe over Fabrics (NVMe-oF) – Support for NVMe over an Ethernet fabric is slowly gaining interest among customers looking for the lowest latency possible. While it is possible to run it over Fibre Channel, 100Gbps RDMA Ethernet is also a promising option.

Other Considerations

Supported Maximums – Not a huge issue for most, but some NICs have caveats on the maximum supported number that can be placed in a host. This is often tied to driver memory allocations. This information can be found in the vSphere Configuration Maximums as well as in the notes on the VCG entry for a NIC.

 

Long term lifecycle support – Different vendors commit to different end-of-service-life timelines for their devices. Beyond this, it may be worth checking that they intend to provide ongoing support for issues found. A good question to ask an OEM would be “If we have an issue, do you have the hardware to recreate it, and where physically is that lab?”. If you are repeatedly being asked to run async drivers (that are not tested/validated by VMware), this may be a sign that the vendor doesn’t have adequate engineering behind the card, or may have largely shifted responsibility to an ODM.

 

Management APIs for CIM Provider and OCSD/OCBB Support – These can allow for better out-of-band monitoring of the NIC. If there are no good ways to interrogate health and pull logs from the card, recreating issues can be painful.

Wake on LAN – Really, you should be waking servers using the out-of-band management, but occasionally there is a use for this.

So what do these features mean for an HCI Architect?

Lower host CPU usage means more CPU available for processing storage and running virtual machines, enabling increased virtualization consolidation.

Higher throughput per core (as a result of LRO/TSO) means that higher performance can be achieved while reducing unnecessary CPU usage. This allows faster resync operations (commonly 64KB I/Os), as well as higher overall throughput. LRO/TSO/RSS help prevent single-threaded networking processes from becoming bottlenecks.

Lower Packets Per Second (PPS) – By consolidating packets with TSO, fewer packets must traverse the physical switches. Many switch ASICs have limits on how many PPS they can process and will be forced to delay packets, negatively impacting performance.
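
To put rough numbers on the PPS point, here is a quick back-of-the-envelope calculation of the packet rate needed to sustain a given throughput at different packet sizes (header overhead ignored; the 25Gbps figure is just an example link speed):

```python
# Rough packets-per-second comparison for the same throughput at different
# effective packet sizes. Header overhead is ignored to keep the math simple.
def pps(throughput_gbps: float, packet_bytes: int) -> float:
    return (throughput_gbps * 1e9 / 8) / packet_bytes

for size in (1500, 9000):
    print(f"25 Gbps at {size}-byte packets: {pps(25, size):,.0f} pps")
# Larger packets on the wire mean millions fewer packets per second that the
# switch ASIC (and the host) must handle for the same throughput.
```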

Caveat Emptor – Some NICs have had a troubled history with these features, and may require driver/firmware updates to become stable. Some vendors may label a feature as an offload but in reality still process it on the CPU. Some features may only be supported in specific driver versions, or might even be quietly deprecated and scrubbed from the datasheets. Note that the subtle humor of firmware release note writers should not be underestimated: “may lead to connectivity issues” may translate to “causes the host to crash, and summons a plague of locusts to infest the datacenter”. As always, trust but verify.

 
