Migrate a WSFC Cluster on RDMs to vVols
Microsoft Windows Server Failover Clustering (WSFC) on VMware vSphere 6.x requires SCSI-3 Persistent Reservations (SCSI-3 PRs) to share disks between nodes in the cluster. One of the most common, though not necessarily popular, ways to achieve this is with Raw Device Mappings (RDMs). RDMs allow SCSI commands, including SCSI-3 PRs, to be passed directly to a LUN, which lets WSFC interact directly with the array for the shared disks. With the release of vSphere 6.7, Virtual Volumes (vVols) added support for SCSI-3 PRs and for running WSFC. The question then arises: how can you migrate from RDMs to vVols? In this article, I will go over the steps and show you how easy it is.
Before migrating, first and foremost, make sure you have a full backup of your WSFC. Next, it's critically important to capture all the details regarding your shared disks in the cluster.
In Figures 1 and 2, you can see these disks are physical LUNs. Disk 2 is on SCSI controller 1, channel 0 (1:0), and disk 3 is on the same controller but channel 1 (1:1). These details are crucial to ensure the cluster resumes without any issues after the migration.
Reviewing the second node in the WSFC, we can see that each disk maps to the same RDM and that the SCSI controller configurations are the same as well.
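One way to keep these details usable later is to record them as plain data and compare the nodes before shutting anything down. The following is a minimal Python sketch, not a VMware tool: the LUN identifiers and the layout_mismatches helper are hypothetical, and the values are assumed to have been copied by hand from each VM's settings.

```python
# Hypothetical record of each node's shared disks, keyed by
# (SCSI controller, channel). The LUN identifiers are placeholders.
node1 = {(1, 0): "lun-A", (1, 1): "lun-B"}
node2 = {(1, 0): "lun-A", (1, 1): "lun-B"}


def layout_mismatches(reference, other):
    """Return the slots whose backing differs between two nodes."""
    slots = sorted(set(reference) | set(other))
    return [slot for slot in slots if reference.get(slot) != other.get(slot)]


if __name__ == "__main__":
    # An empty list means both nodes see the same disk in every slot.
    print(layout_mismatches(node1, node2))
```

Any slot the helper returns is worth rechecking in the VM settings before you continue, since the same record is what you will rely on when re-attaching the disks.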
In the following figures, you can see the status of the cluster before migration.
Now that you have gathered the vital information regarding the cluster and each node's disk configurations, you can shut down the WSFC.
Once the cluster has been shut down, power off all the nodes in the WSFC. Because of the shared disks, there is no live migration option. After powering off all the nodes, go to each secondary node (that is, each node accessing the primary node's disks) and remove the shared disks from its configuration. It is very important that you do not select "Delete files from datastore"; you are removing the shared disks from the secondary nodes only. Leave the shared disks (RDMs) attached to the primary node.
Once you have removed all the shared disks from all secondary nodes, you may then initiate the storage migration of the primary node. Because the pRDMs are still attached to the primary node, when you do a storage migration, the data from the disk is copied to the new destination datastore.
When you do the storage migration, you are changing only the storage.
For the destination, select your vVols datastore with an array that supports SCSI-3 PR with vVols. With vVols you do not need to use the "Configure per disk" option.
All vVols disks are automatically thin provisioned, which eliminates the need for per-disk configuration.
Click Finish to start the migration of the primary node and its disks. Once the migration has completed, you can go into the VM's properties and verify the shared disks have been migrated. In Figures 14 and 15, you can see the shared disks are now on the vVols datastore and are no longer RDMs. Also notice that the shared disks, now on the vVols datastore, are on the same SCSI controller channels where the RDMs previously were.
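The same check can be written down as a small comparison of the slots recorded before the migration against what the VM's properties show afterward. This is only an illustrative Python sketch: the "rdm" and "vvol" labels and the migration_problems helper are assumptions, filled in by hand rather than read from vSphere.

```python
# Hypothetical snapshots of the primary node's shared disks, keyed by
# (SCSI controller, channel), noted before and after the storage migration.
before = {(1, 0): "rdm", (1, 1): "rdm"}
after = {(1, 0): "vvol", (1, 1): "vvol"}


def migration_problems(before, after):
    """List every slot that moved, vanished, or is not backed by vVols."""
    problems = []
    for slot in sorted(before):
        if slot not in after:
            problems.append(f"{slot}: disk missing after migration")
        elif after[slot] != "vvol":
            problems.append(f"{slot}: still backed by {after[slot]}")
    for slot in sorted(set(after) - set(before)):
        problems.append(f"{slot}: unexpected disk")
    return problems


if __name__ == "__main__":
    # No output lines means every disk kept its slot and is on vVols.
    for line in migration_problems(before, after):
        print(line)
```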
Now that you have migrated the primary node with the shared disks, you can proceed with migrating all remaining secondary nodes in the WSFC. Follow the same process, making sure to select the same destination datastore as for the primary node.
Once you have completed all the secondary node migrations, go into each secondary VM's configuration and re-attach all shared disks. To do this, click "ADD NEW DEVICE" and select "Existing Hard Disk". When going through this process, make sure you add the disks in the same order and with the same configuration they had before the migration. This is a critical step; failing to use the exact same configuration may result in the secondary nodes not coming back online.
When you add an existing disk, the datastore explorer will open allowing you to navigate to the datastore and primary node folder. There you will select the appropriate shared disk to be added back to the secondary node.
A key step is to select the correct SCSI controller and channel for each shared disk. This is the information you captured prior to the migration.
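To avoid mistakes at this step, it can help to write out the disk-to-slot pairing before opening the VM settings. The sketch below is a hypothetical Python helper, not part of vSphere: the file names are made up, and it assumes each secondary node used the same controller and channel as the primary node, as was the case in this cluster.

```python
# Slots recorded on the secondary node before the migration, in the
# original order, as (SCSI controller, channel).
recorded_slots = [(1, 0), (1, 1)]

# Hypothetical file names of the migrated shared disks in the primary
# node's folder, keyed by the slot each one occupies on the primary node.
primary_disks = {(1, 0): "node1/node1_1.vmdk", (1, 1): "node1/node1_2.vmdk"}


def reattach_plan(recorded_slots, primary_disks):
    """Pair each recorded slot with the disk file to add there."""
    missing = [slot for slot in recorded_slots if slot not in primary_disks]
    if missing:
        raise ValueError(f"no migrated disk recorded for slots {missing}")
    return [(primary_disks[slot], slot) for slot in recorded_slots]


if __name__ == "__main__":
    for path, (controller, channel) in reattach_plan(recorded_slots, primary_disks):
        print(f"add {path} as SCSI({controller}:{channel})")
```

The helper stops with an error if a recorded slot has no matching disk, which is a cue to recheck the primary node before touching the secondary nodes.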
Once you have added all the shared disks back to all secondary nodes, you may then power on the primary node in the WSFC. After the primary node has been powered on you may also power on your secondary nodes in the cluster. With the cluster powered on, go into the primary node and verify the WSFC and all disks are back online.
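Since the order of operations matters, the sequence described so far can be summarized as a simple checklist generator. This is a rough Python sketch for planning purposes only: the node names and the migration_steps helper are hypothetical, and nothing here talks to vSphere.

```python
def migration_steps(primary, secondaries):
    """Return the migration steps in the order this article describes."""
    steps = [
        "Back up the WSFC and record each shared disk's controller and channel",
        "Shut down the cluster and power off all nodes",
    ]
    for node in secondaries:
        steps.append(f"{node}: remove shared disks (do NOT delete files from datastore)")
    steps.append(f"{primary}: storage-migrate to the vVols datastore with RDMs attached")
    steps.append(f"{primary}: verify shared disks are on vVols at the recorded slots")
    for node in secondaries:
        steps.append(f"{node}: storage-migrate to the same vVols datastore")
    for node in secondaries:
        steps.append(f"{node}: re-attach shared disks at the recorded slots")
    steps.append(f"{primary}: power on")
    for node in secondaries:
        steps.append(f"{node}: power on")
    steps.append("Verify the WSFC and all disks are back online")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(migration_steps("node1", ["node2"]), start=1):
        print(f"{number}. {step}")
```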
You can also verify the disks in Disk Management.
You can also do a test failover of a role or node to verify operation.
Navigating to the secondary node, you can see the active role for the file share has been moved and is running on the secondary node.
Once you have verified everything and the WSFC is back online, remember that your RDMs still exist in your environment; they are just no longer being used. Once you are comfortable with the migration, you can reclaim these LUNs and their space.
Migrating a WSFC from RDMs to vVols is a fairly straightforward process. The biggest benefit is that you no longer have to manage LUNs and RDMs! You can vMotion nodes within your vSphere cluster without worrying whether the RDM is accessible on all vSphere hosts. Putting hosts in Maintenance Mode is also supported as long as the vVols datastore has been mounted to all hosts in the cluster and, of course, you have more hosts than WSFC nodes.
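The two Maintenance Mode conditions above are easy to express as a quick sanity check. The following Python sketch is purely illustrative: the host names and the maintenance_mode_ok helper are hypothetical, and the host lists are assumed to be typed in by hand rather than queried from vCenter.

```python
def maintenance_mode_ok(all_hosts, hosts_with_datastore, wsfc_node_count):
    """Check the two conditions for putting a host in Maintenance Mode:
    the vVols datastore is mounted on every host in the cluster, and
    there are more hosts than WSFC nodes."""
    mounted_everywhere = set(all_hosts) <= set(hosts_with_datastore)
    enough_hosts = len(set(all_hosts)) > wsfc_node_count
    return mounted_everywhere and enough_hosts


if __name__ == "__main__":
    hosts = ["esx01", "esx02", "esx03"]  # hypothetical host names
    print(maintenance_mode_ok(hosts, hosts, wsfc_node_count=2))
```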
For more detail on the specific supported features with Microsoft WSFC, please see KB 2147661 Microsoft Windows Server Failover Clustering (WSFC) with shared disks on VMware vSphere 6.x: Guidelines for supported configurations.
Below is a video of the process showing all the steps to migrate from RDMs to vVols.