June 25, 2024

vSphere Live Patch

With vSphere 8.0 Update 3 we can address critical bugs in the virtual machine execution environment (vmx) without the need to reboot or evacuate the entire host.

vSphere 8 has redefined what lifecycle and maintenance mean for vSphere infrastructure. vSphere Lifecycle Manager streamlines cluster image management, provides full-stack driver and firmware lifecycle management, and expedites total cluster remediation. Certificate management is non-disruptive, and vCenter updates can be performed with a fraction of the downtime needed in the past.

vSphere 8 Update 3 continues this trend of reducing and eliminating downtime with the introduction of vSphere Live Patch.

vSphere Live Patch allows vSphere clusters to be patched without migrating workloads off the target hosts and without the hosts needing to enter full maintenance mode. The patch is applied live while workloads continue to run.

Requirements

vSphere Live Patch is an opt-in feature that must be enabled before you run remediation tasks.

  • vCenter must be version 8.0 Update 3 or later.
  • ESXi hosts must be version 8.0 Update 3 or later.
  • The Enforce Live Patch setting must be enabled either in the global vSphere Lifecycle Manager remediation settings or in the cluster's remediation settings.
  • DRS must be enabled on the vSphere cluster and in fully automated mode.
    • For vGPU-enabled VMs, enable Passthrough VM DRS Automation.
  • The current build of the vSphere cluster must be eligible for a live patch (more on that below).
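
To make the checklist concrete, here is a minimal Python sketch that evaluates the requirements above against a description of a cluster. It is not a VMware API: the ClusterState fields, the tuple encoding of versions, and the unmet_requirements helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical encoding: 8.0 Update 3 as (major, minor, update).
MIN_VERSION = (8, 0, 3)


@dataclass
class ClusterState:
    """Hypothetical snapshot of the settings named in the requirements list."""
    vcenter_version: tuple
    host_versions: list
    enforce_live_patch: bool
    drs_enabled: bool
    drs_fully_automated: bool
    has_vgpu_vms: bool = False
    passthrough_vm_drs_automation: bool = False
    build_eligible: bool = False


def unmet_requirements(c: ClusterState) -> list:
    """Return a human-readable list of requirements the cluster does not meet."""
    problems = []
    if c.vcenter_version < MIN_VERSION:
        problems.append("vCenter must be 8.0 Update 3 or later")
    if any(v < MIN_VERSION for v in c.host_versions):
        problems.append("all ESXi hosts must be 8.0 Update 3 or later")
    if not c.enforce_live_patch:
        problems.append("Enforce Live Patch is not enabled")
    if not (c.drs_enabled and c.drs_fully_automated):
        problems.append("DRS must be enabled in fully automated mode")
    if c.has_vgpu_vms and not c.passthrough_vm_drs_automation:
        problems.append("Passthrough VM DRS Automation is required for vGPU VMs")
    if not c.build_eligible:
        problems.append("current build is not eligible for a live patch")
    return problems
```

An empty list means every item in the checklist is satisfied; anything else tells you what to fix before remediation.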

 

How Live Patching Works


  1. The ESXi host enters partial maintenance mode, a special state that each host enters automatically. Existing VMs continue to run, but new VMs cannot be created on the host and VMs cannot be migrated to or from it.
  2. A new revision of the target patch components is mounted in parallel to the current version.
  3. The files and processes in the newly mounted revision are patched.
  4. Virtual machines undergo a fast-suspend-resume to consume the patched revision.
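
The key idea in step 1 is that partial maintenance mode narrows what a host will accept without stopping what it already runs. A small Python sketch of that rule follows; the HostState enum and the operation names are hypothetical and exist only to illustrate the behavior described above.

```python
from enum import Enum, auto


class HostState(Enum):
    NORMAL = auto()
    PARTIAL_MAINTENANCE = auto()


# Hypothetical operation names; the sets mirror the description in step 1.
ALLOWED = {
    HostState.NORMAL: {"run_vm", "create_vm", "migrate_in", "migrate_out"},
    # Existing VMs keep running; creation and migration are blocked.
    HostState.PARTIAL_MAINTENANCE: {"run_vm"},
}


def is_allowed(state: HostState, operation: str) -> bool:
    """Return True if the host accepts the operation in the given state."""
    return operation in ALLOWED[state]
```

For example, is_allowed(HostState.PARTIAL_MAINTENANCE, "create_vm") is False, while "run_vm" stays allowed in both states.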

 

Not All Patches Are Equal

vSphere Live Patch is initially available for a specific type of patch. Patches for the virtual machine execution component of ESXi are the first target for vSphere Live Patch.

Patches that change other areas of ESXi, for example VMkernel patches, are not initially supported for vSphere Live Patch. They follow the existing patching workflow, which requires maintenance mode and VM evacuation.

vSphere Live Patches can only be deployed on top of compatible ESXi versions. Each Live Patch states which previous build it is compatible with. vSphere Lifecycle Manager notes the eligible version(s) when you define the cluster image. You can also see the eligible version in the vSphere Lifecycle Manager image depot.


For example, patch 8.0 U3a 23658840 can only be applied live to the compatible version 8.0 U3 23653650. Systems not running the compatible version can still be patched, but they use the existing patching workflow, which requires maintenance mode and VM evacuation.


Note: The build numbers used in this article are examples only and not indicative of released builds.
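
The eligibility rule can be sketched as a simple lookup. The Python below is illustrative only: the LIVE_PATCH_BASES table and the patch_workflow function are hypothetical, and the build strings are the example values from this article, not released builds.

```python
# Hypothetical metadata: each live patch lists the base build(s) it applies to.
# Build numbers are the article's examples, not released builds.
LIVE_PATCH_BASES = {
    "8.0 U3a 23658840": {"8.0 U3 23653650"},
}


def patch_workflow(current_build: str, target_patch: str) -> str:
    """Pick the workflow a host would use to reach target_patch.

    Returns "live-patch" when the running build is a listed compatible base,
    otherwise "maintenance-mode" (the existing workflow with VM evacuation).
    """
    if current_build in LIVE_PATCH_BASES.get(target_patch, set()):
        return "live-patch"
    return "maintenance-mode"
```

A host on 8.0 U3 23653650 targeting 8.0 U3a 23658840 gets "live-patch"; any other starting build falls back to "maintenance-mode".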


Cluster image compliance will report the eligibility for Live Patch.


 

Fastest Suspend and Resume in the West

As mentioned earlier, patches for the virtual machine execution component of ESXi are the first target for vSphere Live Patch. This means that while virtual machines do not need to be evacuated from the host, they do need to perform what is called a fast-suspend-resume (FSR) to consume the patched virtual machine execution environment.

A virtual machine FSR is a non-disruptive operation and is already used when virtual hardware devices are added to or removed from powered-on virtual machines.

Some virtual machines are not compatible with FSR. VMs configured with vSphere Fault Tolerance, VMs using DirectPath I/O, and vSphere Pods cannot use FSR and need to be manually remediated. Manual remediation can be done either by migrating the virtual machine or by power cycling it.
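
As a rough illustration of that rule, the following Python sketch splits a list of VMs into those that can take the patch through FSR and those that need manual remediation, along with the reason. The VM dataclass, its field names, and both helper functions are hypothetical and do not correspond to any vSphere API.

```python
from dataclasses import dataclass


@dataclass
class VM:
    """Hypothetical VM record carrying only the attributes that block FSR."""
    name: str
    fault_tolerance: bool = False
    directpath_io: bool = False
    vsphere_pod: bool = False


def fsr_blockers(vm: VM) -> list:
    """Return the reasons, if any, why this VM cannot use fast-suspend-resume."""
    checks = [
        (vm.fault_tolerance, "vSphere Fault Tolerance"),
        (vm.directpath_io, "DirectPath I/O"),
        (vm.vsphere_pod, "vSphere Pod"),
    ]
    return [reason for flagged, reason in checks if flagged]


def split_by_fsr(vms: list) -> tuple:
    """Return (names patched via FSR, {name: reasons} needing manual remediation)."""
    fsr, manual = [], {}
    for vm in vms:
        reasons = fsr_blockers(vm)
        if reasons:
            manual[vm.name] = reasons
        else:
            fsr.append(vm.name)
    return fsr, manual
```

VMs that land in the second group are the ones to migrate or power cycle by hand.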

The vSphere Lifecycle Manager compliance scan will report virtual machines that are incompatible with FSR and the reason why.


Note: After a cluster has been successfully remediated, any hosts running VMs that do not support FSR will continue to report being out of compliance. The VMs must be manually migrated using vMotion or power cycled. Only then will the cluster report full compliance.
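
The compliance behavior in the note can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it presumes the live patch itself was applied successfully, and the dictionary keys "fsr_capable" and "manually_remediated" are hypothetical flags, not fields reported by vSphere Lifecycle Manager.

```python
def host_compliant(vms: list) -> bool:
    """A host is compliant once every VM on it runs the patched revision.

    Assumes the patch was applied to the host. Each VM is a dict with the
    hypothetical keys "fsr_capable" and "manually_remediated" (migrated
    with vMotion or power cycled).
    """
    return all(vm["fsr_capable"] or vm["manually_remediated"] for vm in vms)


def cluster_compliant(hosts: dict) -> bool:
    """The cluster reports full compliance only when every host is compliant."""
    return all(host_compliant(vms) for vms in hosts.values())
```

A single VM that cannot use FSR and has not yet been migrated or power cycled is enough to keep its host, and therefore the cluster, out of compliance.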

 
