Should I use 10Gbps for vSAN, VMware Cloud Foundation, and vSphere? What about SFP56 and SFP112?
The year is almost 2024, and you are getting ready to deploy a new vSphere cluster. You will need to go over several resource requirements:
CPU - You are likely looking at more cores, discussing AMD vs. Intel, and weighing the benefits of newer generations' extensions or PCIe lanes
RAM - For core datacenter hosts, memory in the terabytes (plural!) is now a real consideration
Local storage - NVMe for vSAN ESA, no Tri-Mode Controllers in vSAN ReadyNodes, M.2 for boot devices.
Networking - "We only have a Cisco FEX or ancient S4080 Trident+ switch to connect to… I’ll take a pair of 1Gbps for management, and a 2 port X710…"
For some reason, networking is commonly the area where I see undersized investments made. This blog will offer some suggestions on how to pick a networking interconnect for your bill of materials, and how to handle objections. If there are objections I'm missing (or you want to argue with me about my chart below!) feel free to reach out @Lost_Signal on Twitter, and I will add them to this list.
What port configuration should I put on my hosts?
A favorite website of mine is Logical Increments. It's a website that helps you work backwards from an outcome and a budget to the components that should go in a PC. Taking a similar model, here is a chart of 2023 networking options and how they score.
| Configuration | Tier | Notes |
|---|---|---|
| 2 x 10Gbps | Low End Legacy | This should not be seriously considered for production, and only for low-memory, low-I/O labs. |
| 4 x 10Gbps | | Only if the hosts are being replaced next year. You may use LAG/LACP to bond multiple connections together, but remember 1+1 does not equal 2. |
| 2 x 25Gbps | | The minimum configuration. You could later expand to 4 x 25Gbps, but keep in mind the cost delta from 4 x 25Gbps to 2 x 100Gbps is low. |
| 2 x 50Gbps | Mid Tier + | For deployments with new switches, strongly take a look at SFP56 50Gbps this year. It future-proofs you at a reasonable cost. |
| 2 x 100Gbps | | Once you get closer to 32 hosts, vSAN hosts with over 150TB, and RAM counts in the terabytes, the cost economics of 100Gbps just make a lot of sense. |
| 4 x 100Gbps | | For I/O- and memory-rich applications where minimizing the impact of vMotion and similar traffic matters most, going big has its advantages. vSAN/vMotion may need multiple VMkernel ports. |
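The "1+1 does not equal 2" caveat on LAG/LACP in the chart above deserves a concrete illustration. Link aggregation hashes each flow (typically by its address/port tuple; the exact hash is vendor-specific) onto exactly one member link, so a single large stream, such as a vMotion, never goes faster than one link. Here is a toy model of that behavior; the hash function is purely illustrative, not any vendor's actual algorithm:

```python
import hashlib

def lacp_link(src_ip, dst_ip, src_port, dst_port, num_links=2):
    """Toy model of an LACP-style hash: a flow's tuple picks one
    member link, so a single flow can never use more than one link."""
    key = f"{src_ip}-{dst_ip}-{src_port}-{dst_port}".encode()
    return int(hashlib.md5(key).hexdigest(), 16) % num_links

# One big vMotion stream between two hosts always lands on the same
# member link -- it is capped at that single link's 10Gbps:
flow = ("10.0.0.1", "10.0.0.2", 51234, 8000)
assert all(lacp_link(*flow) == lacp_link(*flow) for _ in range(100))

# Many small flows do spread across members, which is why aggregate
# throughput improves even though per-flow throughput does not:
links_used = {lacp_link("10.0.0.1", "10.0.0.2", p, 8000)
              for p in range(1024, 2048)}
print(links_used)
```

This is why a 4 x 10Gbps LAG is not equivalent to a 25Gbps link for storage or vMotion: the fat flows that matter most are still stuck at 10Gbps.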
Networking Upgrade Objection Handling
Now, I hear a lot of objections to networking upgrades. It often seems to be a chicken-and-egg problem: servers being procured with older NICs, and networking teams not upgrading switches because older NICs are still being bought. There are a number of other common objections to moving past 10Gbps networking:
"I can't buy 25Gbps NICs, I have 10Gbps switches!"
So, fun fact about the SFP+ interface commonly used by 10Gbps networks: it is FULLY backwards compatible with the SFP28 ports used by 25Gbps switches! Why not go ahead and get your cable plant "25Gbps ready" too if you can, and not end up stuck in a chicken/egg loop of "We have to buy 10Gbps NICs because the switches are 10Gbps; we have to buy 10Gbps switches because of the 10Gbps NICs!"
What about 40Gbps?
40Gbps QSFP is really just four bonded 10Gbps lanes, and it has been displaced by 25/100Gbps for the channel-economics reasons covered later in this post. It is not where new investment should go.
“2 x 10Gbps is good enough for storage based on what SolarWinds/other SNMP-based monitoring is showing”
Be very careful with the design and sizing of networks. Systems that by default poll on 5-minute intervals may be missing microbursts and other situations where the network is the bottleneck and is impacting application performance. Storage traffic very often involves short microbursts, and these tend to get “rounded out” by long, spaced-out polling intervals. Make sure to look for buffer overruns in counter stats. There are ways to poll more aggressively (vSAN Network Diagnostic mode, vSAN I/O Insight), or to look at switch counters/syslog events for buffer-full conditions. The operations person saying “Look, 20% load” is often missing that there was 100% load for 1 minute of that 5 minutes followed by nothing (or, even more fun, many bursts of 100% lasting 5 seconds each). Also be aware of other symptoms of buffer-full conditions (retransmits, out-of-order packets caused by TCP incast).
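The arithmetic behind that “Look, 20% load” trap is worth spelling out. A quick sketch, using made-up one-second utilization samples, shows how a 5-minute average completely hides a one-minute saturation event:

```python
# One-second samples of link utilization (percent) over a 5-minute
# window: the link is fully saturated for 60 seconds, idle the rest.
samples = [100] * 60 + [0] * 240

# What a 5-minute SNMP poller reports: the average over the window.
five_min_average = sum(samples) / len(samples)
print(five_min_average)  # 20.0 -- "only 20% load, we're fine!"

# What applications actually experienced during that minute.
peak = max(samples)
print(peak)  # 100 -- a full minute at line rate, latency through the roof
```

The same link looks comfortably idle to the monitoring dashboard and completely saturated to the VMs, at the same time. Shorter polling intervals, or switch buffer counters, close that gap.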
“My blade chassis only supports 10Gbps or 40Gbps, or would require extensive re-capitalization 2 years from end of life to improve I/O options”
Blade chassis tend to have "VERY UPGRADABLE BACKPLANES(TM)” until… well… they don’t. Unless you deploy rail cars worth of blades every 6 months, the best remaining argument for them is really “It’s less cabling.” While I appreciated this argument in a world where we would run 8 x 1Gbps network connections to a host plus 2 x FC cables, it is generally much more cost effective to purchase rack mount servers and simply run 2 x 100Gbps Ethernet than to use blades for virtualization hosts.
“25/50/100Gbps NICs, vendor-branded optics, and switches are more expensive”
NIC cost concerns - I rarely hear this, but we are talking ~$300-400 for a 25Gbps NIC and ~$900 for a 100Gbps NIC. This is a rounding error on BOM costs for most dense VM hosts. Vendors often mark up optics a lot (how else could they discount them 83%?), but it’s worth noting you have options:
Direct Attach Copper (DAC) cables - These are copper cables. At shorter distances they are passive (and so rather dumb/simple devices); at longer distances they are active and may need vendor coding. They come in regular form and also as "breakout" cables, where a single 100Gbps switch port can drive 2 x 50Gbps or 4 x 25Gbps ports.
AOC (Active Optical Cable) - These cables have the optics essentially fused to the fiber on each end. They are amazing, both because they are cheaper than buying two transceivers plus multimode fiber cables, and because you can’t accidentally get dust on the end of the transceiver. They also come in standard and breakout versions.
SFP56 and SFP112
SFP56 is a new option that allows lower-cost 50Gbps connections (with unit economics similar to the 25Gbps and 10Gbps generations that came before), and SFP112 will likewise lower the cost of 100Gbps ports compared to existing QSFP28-based 100Gbps ports. While it is still "early" for these technologies, they should be strongly considered going forward.
Do make sure you are not buying access switches for data center usage: Catalyst, or anything marketed as “campus switching,” is generally not adequate from a buffer/processing standpoint for storage and vMotion use. That said, there are plenty of reasonably priced options in the 25/50/100Gbps space. Some things to think about:
- Do I really need 48 ports top of rack, or are 32-port 100Gbps switches a better fit?
- How much traffic will cross racks? If this deployment will span multiple racks, do I have enough spine bandwidth?
Be mindful of the rise of 50Gbps, cheap 100Gbps, and 200Gbps. Under the hood, the reason SFP28 (25Gbps) and QSFP28 (100Gbps) displaced SFP+ (10Gbps) and QSFP+ (40Gbps) comes down to how channel sizes worked. A single channel, or lambda (light frequency), in the SFP+ days was 10Gbps, and if you bound 4 of these into a quad-lambda optic you got a 40Gbps QSFP. As improvements were made, we eventually moved on to 25Gbps channels, and that’s where SFP28 comes from (and why 100Gbps QSFP displaced 40Gbps). We are about to see another transformation like this as SFP56 and SFP112 come out and unlock three things:
- 50Gbps single channel port (SFP56) that will be similarly priced to where 25Gbps was.
- 100Gbps ports (SFP112) built on fewer, faster lanes than QSFP28's four, significantly lowering the cost of manufacturing and optics per port.
- 200Gbps QSFP connections for spine switches.
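The channel math described above can be reduced to one line: port speed is roughly the number of lanes times the per-lane rate. A quick sketch (nominal Ethernet rates; the module names map to the generations discussed in this post):

```python
# Nominal per-lane Ethernet rate by module generation, in Gbps.
lane_rate = {"SFP+": 10, "SFP28": 25, "SFP56": 50}

def port_speed(module, lanes=1):
    """Port speed is lanes (channels/lambdas) x per-lane rate."""
    return lane_rate[module] * lanes

print(port_speed("SFP+", 4))   # 40  -> QSFP+ 40Gbps: four bonded 10G lanes
print(port_speed("SFP28", 4))  # 100 -> QSFP28 100Gbps: four 25G lanes
print(port_speed("SFP56"))     # 50  -> SFP56 50Gbps on a single lane
print(port_speed("SFP56", 4))  # 200 -> 200Gbps QSFP for spine switches
```

Each generation's single-lane module tends to land at roughly the price point of the previous generation's, which is why the quad-lane module one generation back (40Gbps then, 100Gbps soon) keeps getting displaced.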
We are on the cusp of this as I write, so please keep an eye out: any recommendation for 25Gbps may suddenly shift to 50Gbps, or to simply cheaper 100Gbps ports.
Chassis vs. top of rack Leaf/Spine?
Be mindful that chassis switching and pulling everything end-of-row may be useful in some situations, but more often than not a modular leaf/spine architecture is going to be more flexible to expand, and much lower cost.
What switch features do I need for vMotion/vSAN?
Storage and vMotion especially don’t need 40,000,000 niche sub-features, but there are a few things that might help. RDMA becomes a serious consideration as we move into 100Gbps. Thankfully, it is pretty standard on new hosts.