Intelligent Cluster Aware Shutdown and Start-up Workflows

September 21, 2021

The day-to-day activities at the data center include occasionally shutting down an entire vSAN cluster. For example, the power supply network might need maintenance, or the servers might need to be moved to another location, and administrators must power off and restart all hosts in a short period of time. vSAN 7 Update 3 introduces a built-in guided workflow for simple power down and restart. This new feature helps administrators perform a graceful shutdown and restart of an entire cluster while monitoring for possible issues during the entire process.

Cluster Shutdown

Once the shutdown option is selected, the vSAN administrator is presented with detailed pre-checks that confirm the cluster is safe to shut down immediately. Instead of following a specific set of manual steps that are prone to human error, the admin is guided through the new automated workflow. After the pre-checks complete, the cluster shutdown powers off the system VMs in the cluster, disables HA on the same cluster, makes sure that no changes to vSAN objects are in progress, and powers off the hosts. The cluster VMs are not shut down by the cluster shutdown procedure because we want to keep this an administrator-controlled activity. This way admins can coordinate the process with the IT teams and the application owners.

Shutting down an HCI Mesh server cluster requires that all VMs consuming storage on that cluster are powered down before the cluster shutdown proceeds. One of the many pre-checks in place helps ensure that this has been done before the workflow continues.

All these steps are needed to ensure the consistency of the data and management planes before each host within the cluster is powered off.
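The ordering described above can be illustrated with a small, purely hypothetical model. This is a sketch of the sequence only, not the vSAN implementation and not a VMware API: the `Cluster` class, its fields, and the `pre_check` rules are invented for illustration, and treating powered-on workload VMs as a pre-check failure is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for a cluster; not a VMware API object."""
    hosts: List[str]
    workload_vms_on: int = 0          # VMs the administrator must power off
    system_vms_on: bool = True
    ha_enabled: bool = True
    object_changes_in_progress: bool = False
    hosts_on: bool = True
    log: List[str] = field(default_factory=list)


def pre_check(cluster: Cluster) -> List[str]:
    """Return a list of problems; an empty list means it is safe to proceed."""
    problems = []
    # Assumption: workload VMs still running count as a pre-check failure.
    if cluster.workload_vms_on > 0:
        problems.append("workload VMs are still powered on")
    return problems


def shutdown_cluster(cluster: Cluster) -> List[str]:
    """Run the steps in the order the article describes and return the step log."""
    problems = pre_check(cluster)
    if problems:
        raise RuntimeError("pre-check failed: " + "; ".join(problems))

    cluster.system_vms_on = False
    cluster.log.append("power off system VMs")

    cluster.ha_enabled = False
    cluster.log.append("disable HA")

    if cluster.object_changes_in_progress:
        raise RuntimeError("vSAN object changes are still in progress")
    cluster.log.append("confirm no vSAN object changes in progress")

    cluster.hosts_on = False
    cluster.log.append("power off hosts")
    return cluster.log
```

The point of the sketch is that hosts are powered off only after every earlier step has succeeded, so a failed check stops the sequence while the hosts are still up.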

Note that while some system VMs, such as vCLS, will be shut down, others may not be shut down automatically by vSAN. For example, the cluster shutdown will not power off the File Services VMs, the Pod VMs, or the NSX management VMs. The cluster shutdown feature is not available for hosts with lockdown mode enabled.

Cluster start-up

A similarly simple restart procedure is also available in vSAN 7 Update 3, so administrators can power on the entire cluster with minimal effort. Another set of pre-checks is performed, again to enable a safe and efficient power-up. For a powered-down cluster, the same menu location shows a “Restart Cluster” option. The administrator must manually power up the hosts before the “Restart Cluster” workflow can complete.
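As a rough illustration of the last point, the check below reports which hosts still need to be powered on by hand before a restart could finish. It is a hypothetical sketch: the function names and the dictionary of power states are invented here and do not correspond to any vSAN or vCenter API.

```python
from typing import Dict, List


def hosts_still_off(power_state: Dict[str, bool]) -> List[str]:
    """Return the names of hosts that are not yet powered on, sorted by name."""
    return sorted(host for host, is_on in power_state.items() if not is_on)


def can_finish_restart(power_state: Dict[str, bool]) -> bool:
    """The restart can only complete once every host has been powered on."""
    return not hosts_still_off(power_state)


if __name__ == "__main__":
    # Hypothetical host names and states, for illustration only.
    state = {"esx-01": True, "esx-02": False, "esx-03": True}
    print(hosts_still_off(state))
    print(can_finish_restart(state))
```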

Orchestration host

For both processes, shutdown and restart, an “orchestration host” is elected within the cluster. The host selection is arbitrary unless vCenter Server is running on the vSAN cluster. In that case, the orchestration host is the host that vCenter Server resides on.

    • Each host within the cluster is aware (through stored metadata) of which host is the orchestration host. Knowing which host this is matters especially for the cluster power-up process.
    • If vCenter Server runs on the cluster being shut down, the orchestration host is delegated the basic management tasks and responsibilities of shutdown or power-up whenever vCenter Server has already been shut down.
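The selection rule above can be sketched in a few lines. This is a hypothetical illustration rather than the actual election mechanism: the function names are invented, a random choice stands in for the "arbitrary" selection, and a plain dictionary stands in for the metadata stored on each host.

```python
import random
from typing import Dict, List, Optional


def pick_orchestration_host(hosts: List[str],
                            vcenter_host: Optional[str] = None) -> str:
    """Pick the host running vCenter Server if there is one, else any host."""
    if not hosts:
        raise ValueError("cluster has no hosts")
    if vcenter_host is not None:
        if vcenter_host not in hosts:
            raise ValueError("vCenter host is not part of this cluster")
        return vcenter_host
    # Stand-in for the arbitrary selection; the real criteria are not
    # described in the article.
    return random.choice(hosts)


def record_choice(hosts: List[str], chosen: str) -> Dict[str, str]:
    """Model every host storing which host is the orchestration host."""
    return {host: chosen for host in hosts}
```

For example, `pick_orchestration_host(["esx-01", "esx-02"], vcenter_host="esx-02")` returns `"esx-02"`, and `record_choice` then maps every host to that same name, so any host that comes up first knows where the orchestration role lives.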

Both workflows also work for a vSAN stretched cluster and a 2-node cluster.

Simplified operations are an essential attribute of all VMware products. vSAN saves valuable time and effort by implementing workflows that deliver predictable and efficient results with a few mouse clicks. For more details on vSAN operations, see the vSAN Operations Guide.

 
