VMware vSAN HCI Mesh Tech Note
Why HCI Mesh
The scale-out architecture of VMware vSAN allows a cluster to grow non-disruptively in two ways: adding hosts expands both capacity and performance (scale-out), while adding disks to a host expands capacity only (scale-up). As application workloads grow organically, performance can be right-sized at each expansion interval, and over time the ratio of storage to compute can be right-sized as well. Despite this, inorganic scale events can prove challenging to any architecture. Examples include:
- An application refactor requires significant storage being added for logs.
- A new line of business for analytics may consume excessive compute, potentially stranding storage assets.
- M&A may result in new business units bringing unforeseen storage requirements to a cluster.
- A storage purchase need is imminent, while compute is on a staggered refresh cycle.
- The deployment of network virtualization, powered by NSX, has enabled the migration and consolidation of legacy stranded cluster resources.
Historically, these scaling events could cause an existing cluster to run out of storage or compute and potentially strand the less-demanded resource. While vMotion enables “shared nothing” migration between clusters, storage and compute were still forced to move together between clusters.
While vSAN can export iSCSI or NFS, the native vSAN protocol was chosen to export storage to another cluster for a number of reasons:
1. SPBM management is preserved end to end.
2. Compute and IO overhead stays lower because the native vSAN RDT protocol is used end to end.
3. The vSAN performance service can allow for end to end monitoring of IO.
4. There is no need to manage islands of storage within LUNs or NFS exports, and no need for datastore clustering or VAAI to try to work around issues that would come from adding another layer of abstraction.
5. Storage is still managed and maintained as a cluster resource.
Deploy and manage HCI Mesh
Configure a compute cluster to VMware vSAN HCI Mesh
Before configuring an HCI mesh compute cluster the following steps should be taken:
1. Disable HA on the cluster
2. Configure vSAN VMkernel ports that can communicate with the remote vSAN cluster's VMkernel ports.
Clusters that will only consume storage from remote vSAN clusters need to be initialized as a vSAN compute cluster. First create a regular cluster with HA and vSAN disabled, and then go to Cluster --> Configure --> vSAN and enable vSAN. When prompted for the configuration type, select "vSAN HCI Mesh Compute cluster". You will see additional cluster options (Stretched cluster, 2-node, custom fault domains if hosts are already in the cluster).
Configure vSAN cluster to vSAN cluster HCI Mesh
After selecting a remote vSAN cluster managed from the same vCenter server, a set of compatibility checks will automatically run to verify that the remote cluster may be mounted.
Finally, for clusters mounting a remote datastore, the "Datastore with APD" response should be changed.
This setting can be found within vCenter Server by browsing to: Cluster --> Configure --> vSphere Availability --> Edit
Note, either aggressive or conservative can be used. For most customers, conservative will be preferred.
Deploy VMware vSAN HCI Mesh Compute Cluster
In vSAN 7 U2, traditional vSphere clusters can mount a remote vSAN datastore. HCI Mesh compute clusters can consume storage resources provided by a remote vSAN cluster in the same way that multiple vSphere clusters can connect to a traditional storage array. HCI Mesh compute clusters use native vSAN protocols for maximum efficiency and give the customer the ability to easily meet a broad variety of use cases. Most importantly, HCI Mesh compute clusters do not need any vSAN licensing. One of the most interesting capabilities of HCI Mesh is its integration with storage policies. When defining a storage policy, an administrator can specify the data services they are interested in (such as Deduplication and Compression, or Data-at-rest Encryption), and the storage policy wizard will filter the available datastores down to those that meet the criteria, assuming multiple remote datastores are already mounted at the time of VM provisioning and policy selection. This makes it easy to understand what type of storage is available to multiple vSAN clusters connected by HCI Mesh.
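The policy-driven filtering described above can be pictured as a simple subset test. The following is a minimal sketch, not the storage policy wizard's actual logic: the `Datastore` class, the service names, and the datastore names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    """Hypothetical model of a mounted datastore and its cluster's data services."""
    name: str
    services: frozenset


def compatible_datastores(datastores, required_services):
    """Return the datastores whose backing cluster offers every required service."""
    required = frozenset(required_services)
    return [ds for ds in datastores if required <= ds.services]


# Illustrative inventory: one local and two remote-mounted datastores.
mounted = [
    Datastore("local-vsan", frozenset()),
    Datastore("remote-a", frozenset({"dedup_compression"})),
    Datastore("remote-b", frozenset({"dedup_compression", "encryption"})),
]

matches = compatible_datastores(mounted, {"encryption"})
print([ds.name for ds in matches])
```

A policy that requires no data services matches every mounted datastore, which mirrors the wizard narrowing the list only as requirements are added.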
Migrating Storage to HCI Mesh
Once a mesh relationship has been established, a simple Storage vMotion is all that is required to migrate a virtual machine's storage to a remote vSAN cluster's datastore. Note that you can change the storage policy (for instance, changing from RAID 1 to RAID 5) during this migration.
At this time, splitting the VMDKs of a given VM across multiple datastores is not supported. To migrate, right-click the virtual machine and select "migrate" followed by "storage only". After you select the storage policy and a compatible cluster, Storage vMotion will migrate the storage non-disruptively.
Monitor HCI Mesh
VMware vSAN HCI Mesh includes a number of default health checks that ensure a solution will be supportable at setup. In addition, the vSAN performance service, when run on both clusters, provides end-to-end visibility of the IO path across the clusters providing compute and storage to the virtual machine. A new "Remote VM" tab appears on clusters consuming remote storage to enable this performance visibility. It provides metrics about clusters from the perspective of remote vSAN VM consumption.
When viewing capacity usage from the vSAN capacity monitoring dashboard, a tooltip will appear with a quick link to any remote-mounted datastores.
HCI Mesh Design Considerations
HCI Mesh Limits
Client cluster: Can mount up to a maximum of 5 remote vSAN datastores
Server cluster: Can only serve its datastore to a maximum of 5 client clusters
Connections per datastore: In vSAN 7 Update 1, the maximum number of hosts connected to a datastore was 64. In 7 Update 2 this was increased to 128.
Mesh cluster and hosts count totals: The number of clusters and hosts participating in the overall HCI Mesh (Any cluster connected in some form or another to the overall mesh) is limited to the total available clusters and hosts within a single datacenter object in a single vCenter.
Storage Policy Support: Whether a policy is supported depends on the cluster storing the data, not the client cluster (e.g. a VM using FTT=2 via RAID-6 must be using capacity from a cluster that has 6 or more hosts). vSAN 7 Update 2 adds support for including cluster data services or cluster type in the filtering.
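The limits above lend themselves to a quick planning check. The sketch below is illustrative only and is not a VMware tool: the function names and input shapes are hypothetical, each server cluster is assumed to serve a single datastore, the server cluster's own hosts are assumed to count toward the per-datastore host limit, and only the RAID-6 host minimum is taken from this note (the other minimums are assumptions for vSAN 7).

```python
MAX_REMOTE_DATASTORES_PER_CLIENT = 5
MAX_CLIENT_CLUSTERS_PER_SERVER = 5
MAX_HOSTS_PER_DATASTORE = {"7U1": 64, "7U2": 128}

# Minimum server-cluster host count per (RAID level, FTT). Only the RAID-6
# entry is stated in this note; the rest are assumed vSAN 7 minimums.
MIN_HOSTS_FOR_POLICY = {
    ("RAID-1", 1): 3,
    ("RAID-5", 1): 4,
    ("RAID-1", 2): 5,
    ("RAID-6", 2): 6,
}


def check_mesh(host_counts, mounts, release="7U2"):
    """Return a list of limit violations for a planned mesh.

    host_counts: cluster name -> number of hosts
    mounts: iterable of (client_cluster, server_cluster) pairs
    """
    host_limit = MAX_HOSTS_PER_DATASTORE[release]
    servers_of, clients_of = {}, {}
    for client, server in mounts:
        servers_of.setdefault(client, set()).add(server)
        clients_of.setdefault(server, set()).add(client)

    problems = []
    for client in sorted(servers_of):
        mounted = len(servers_of[client])
        if mounted > MAX_REMOTE_DATASTORES_PER_CLIENT:
            problems.append(f"{client}: mounts {mounted} remote datastores")
    for server in sorted(clients_of):
        clients = clients_of[server]
        if len(clients) > MAX_CLIENT_CLUSTERS_PER_SERVER:
            problems.append(f"{server}: serves {len(clients)} client clusters")
        # Assumption: local hosts of the server cluster count toward the limit.
        connected = host_counts[server] + sum(host_counts[c] for c in clients)
        if connected > host_limit:
            problems.append(f"{server}: {connected} hosts connected, limit {host_limit}")
    return problems


def policy_supported(server_hosts, raid, ftt):
    """True if the cluster storing the data has enough hosts for the policy."""
    return server_hosts >= MIN_HOSTS_FOR_POLICY[(raid, ftt)]


host_counts = {"storage": 16, "compute-1": 32, "compute-2": 32}
mounts = [("compute-1", "storage"), ("compute-2", "storage")]
print(check_mesh(host_counts, mounts, release="7U2"))
print(check_mesh(host_counts, mounts, release="7U1"))
print(policy_supported(host_counts["storage"], "RAID-6", 2))
```

In this example the 80 connected hosts fit within the 7 Update 2 limit but would exceed the 7 Update 1 limit, and the policy check is evaluated against the server cluster only, matching the rule that the client cluster's size is irrelevant.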
Networking Design Considerations
The cross-cluster traffic associated with HCI Mesh uses the same protocol stack (RDT over TCP/IP) as a traditional vSAN cluster. Connections are made directly from the host running the virtual machine to the hosts supplying the backing storage.
Since vSAN and HA communication share the same VMkernel port, HA is dependent on any links that provide communication between clusters. The same principles of HA apply, with the recognition that compute may be provided by one cluster while storage is provided by another. In the event of a cross-cluster communication issue, an APD will be declared 60 seconds after the isolation event, and HA will attempt VM restarts after the delay determined by its settings (e.g. 180 seconds).
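As a rough illustration of the timing described above, the sketch below lays out the sequence of events after a cross-cluster isolation. It assumes the HA restart delay is counted from the moment the APD is declared, so the two intervals add up; that additive reading is an assumption, and the function name and defaults are hypothetical.

```python
def apd_timeline(ha_restart_delay_s=180, apd_declared_after_s=60):
    """Return (seconds since isolation, event) pairs for a cross-cluster outage.

    Assumes the HA restart delay starts once the APD has been declared.
    """
    if ha_restart_delay_s < 0 or apd_declared_after_s < 0:
        raise ValueError("delays must be non-negative")
    return [
        (0, "cross-cluster isolation begins"),
        (apd_declared_after_s, "APD declared"),
        (apd_declared_after_s + ha_restart_delay_s, "HA attempts VM restart"),
    ]


for seconds, event in apd_timeline():
    print(f"t+{seconds:>3}s  {event}")
```

Under these assumptions, the example delay of 180 seconds puts the first restart attempt roughly four minutes after connectivity is lost.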
In an HCI Mesh architecture, since VMs running in one cluster may be using storage resources in another cluster, the network must perform well enough not to hinder the workloads. Latency between the clusters adds to the effective latency seen by VMs using resources in another cluster. The recommendations are as follows:
- Network topology that ensures the highest level of availability (redundant NICs, switching across clusters, etc.).
- Network performance sufficient to make it unlikely that the network becomes the bottleneck. 25Gbps end-to-end using storage class gear is recommended.
- Sub-millisecond latency between meshed clusters is the recommended target. The data path may be inherently more complex as it passes across east-west cluster boundaries, which may be reflected in different network topologies. Datastore mounting prechecks warn the administrator if latency is too high (the alert triggers at 5,000 µs, i.e. 5 ms, or greater), but will not prevent the mounting of the datastore.
- Use of vSphere Distributed Switches (vDS) is needed to allow for proper bandwidth sharing via NIOC
- L2 and L3 are supported. Configuration of routing for vSAN VMkernel port traffic will be necessary.
To more easily support layer 3 configurations, vSAN 7U1 supports overriding of the default gateway for a VMkernel port from within the UI.
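The two latency figures in the recommendations above (a sub-millisecond target and a precheck alert at 5,000 µs) can be combined into a simple classification of a measured value. This is a minimal sketch rather than the actual precheck; the function name and the status labels are hypothetical.

```python
RECOMMENDED_BELOW_US = 1_000   # sub-millisecond recommendation
PRECHECK_ALERT_US = 5_000      # precheck alert triggers at this value or greater


def classify_latency(latency_us):
    """Classify a measured cross-cluster latency given in microseconds.

    No status blocks the mount: the precheck only warns.
    """
    if latency_us < 0:
        raise ValueError("latency must be non-negative")
    if latency_us >= PRECHECK_ALERT_US:
        return "precheck-warning"
    if latency_us >= RECOMMENDED_BELOW_US:
        return "above-recommendation"
    return "ok"


for sample_us in (400, 1_200, 5_000):
    print(sample_us, classify_latency(sample_us))
```

The middle band is worth tracking separately, because a link can pass the mounting precheck while still missing the sub-millisecond recommendation.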
HCI Mesh Requirements
The following use cases are not currently supported:
- Remote provisioning workflows for File Services, iSCSI, or CNS based block volume workloads (they can exist locally, but not be served remotely).
- Air-gapped vSAN networks, or clusters using multiple vSAN VMkernel ports. LACP is supported as an alternative means of aggregating throughput.
- Objects of a VM spanning multiple datastores (e.g. one VMDK in one datastore, and another VMDK for the same VM in another datastore).
- RDMA between clusters.
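Two of the unsupported configurations listed above can be checked mechanically from inventory data. The sketch below is illustrative only: the `HostNetwork` and `VmPlacement` classes are hypothetical stand-ins for whatever inventory export is available, and the remaining items (remote File Services, iSCSI or CNS provisioning, and RDMA) are not covered.

```python
from dataclasses import dataclass


@dataclass
class HostNetwork:
    """Hypothetical per-host summary: how many VMkernel ports are tagged for vSAN."""
    name: str
    vsan_vmk_ports: int


@dataclass
class VmPlacement:
    """Hypothetical per-VM summary: the datastore holding each VMDK."""
    name: str
    disk_datastores: list


def unsupported_findings(hosts, vms):
    """Return human-readable findings for configurations HCI Mesh does not support."""
    findings = []
    for host in hosts:
        if host.vsan_vmk_ports > 1:
            findings.append(f"{host.name}: multiple vSAN VMkernel ports")
    for vm in vms:
        if len(set(vm.disk_datastores)) > 1:
            findings.append(f"{vm.name}: VMDKs span multiple datastores")
    return findings


hosts = [HostNetwork("esx-01", 1), HostNetwork("esx-02", 2)]
vms = [
    VmPlacement("app-01", ["remote-a", "remote-a"]),
    VmPlacement("db-01", ["local-vsan", "remote-a"]),
]
for finding in unsupported_findings(hosts, vms):
    print(finding)
```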