What's New in vSphere Update 3 for vSphere IaaS control plane?
Before we dive into the new capabilities, let's start with the name: what is vSphere IaaS control plane?
We’ve embedded a declarative API into vSphere to enable a modern IaaS experience for consumers.
vSphere IaaS control plane builds on the work we've done in vSphere with Tanzu and extends it with a new set of IaaS capabilities.
Now let's look at what is new in vSphere Update 3.
Independent TKG Service
We’re introducing an independent TKG Service that is decoupled from vCenter and implemented as a core Supervisor Service. This will allow us to deliver asynchronous releases of the service and ship new Kubernetes versions faster than ever before.
Administrators will be able to upgrade TKG Service, and so unlock newer versions of Kubernetes, without having to upgrade Supervisor or vCenter. Consumers can then use those versions directly.
Local Consumption Interface (LCI)
Cloud Consumption Interface (CCI) in Aria Automation provides a UI for users to create VMs, TKG clusters, Load Balancers and Persistent Volume Claims. This interface is now available in vCenter for each individual Namespace. The UI supports complex specifications for VMs and TKG clusters, and generates the YAML for users who wish to interact with the API directly. The Virtual Machine UI supports creating Secrets and ConfigMaps to hold the cloud config for VM instance configuration. The Cluster wizard supports complex cluster types, which might include a mix of GPU and non-GPU worker nodes.
Autoscaling for Kubernetes clusters
We’re introducing autoscaling for Kubernetes clusters using the Cluster Autoscaler. This will allow Kubernetes clusters to reflect demand, scale down nodes that are underutilized, and scale them back up when demand increases. Worker nodes will be automatically added when there are not enough resources to satisfy pod scheduling.
Cluster Autoscaler can be installed as a standard package using kubectl or the Tanzu CLI. The package version must match the minor version of Kubernetes. For example, to install the package on a Kubernetes cluster running v1.26.5, you will have to install Cluster Autoscaler package version v1.26.2.
The minimum required version for Cluster Autoscaler is v1.25.
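The version-matching rule above can be sketched as a small check. This is only an illustration, not part of any VMware tooling: the function names are hypothetical, the optional build suffix handling is an assumption, and treating v1.25 as a floor on the Kubernetes minor version is our reading of the requirement.

```python
import re

# Assumption: the "minimum required version v1.25" applies to the minor version.
MIN_SUPPORTED_MINOR = (1, 25)


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'v1.26.5' (optionally followed by a '+' or '-' build suffix)."""
    match = re.match(r"^v?(\d+)\.(\d+)\.(\d+)(?:[+-].*)?$", version.strip())
    if match is None:
        raise ValueError(f"not a vMAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def autoscaler_package_matches(cluster_version: str, package_version: str) -> bool:
    """Return True when the package and cluster share major.minor and meet the floor."""
    cluster = parse_version(cluster_version)
    package = parse_version(package_version)
    if cluster[:2] < MIN_SUPPORTED_MINOR:
        return False  # cluster is older than the minimum supported minor version
    return cluster[:2] == package[:2]  # patch versions are allowed to differ


if __name__ == "__main__":
    print(autoscaler_package_matches("v1.26.5", "v1.26.2"))  # True: same minor
    print(autoscaler_package_matches("v1.26.5", "v1.27.2"))  # False: minor differs
    print(autoscaler_package_matches("v1.24.9", "v1.24.0"))  # False: below v1.25
```

Only the major and minor components are compared, which is why v1.26.2 is acceptable on a v1.26.5 cluster.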
vSAN Stretched cluster support
One of the most requested configuration options has been the ability to deploy Supervisor on a vSAN Stretched cluster spanning two physical locations or sites. With this release, we have addressed the underlying requirements to support this implementation. However, there are some things to keep in mind.
The main consideration is the availability of etcd, the main database that stores all the data of a Kubernetes cluster. Etcd uses a quorum-based mechanism, which means that more than half of its replicas must be available at any time. With an odd number of CP VMs and only two sites, one site always holds the majority, and losing that site loses quorum, so there is no benefit in spreading a given set of CP VMs across the two sites. Instead, for an Active/Active deployment, we recommend that:
- The 3 SV CP VMs should be placed in the same site. That site can be either of the two, since both sites are active.
- All the CP VMs of any given TKG Service cluster should be placed in the same site, which can be either of the two sites.
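The quorum arithmetic behind these recommendations can be sketched in a few lines. This is a minimal illustration that assumes a simple majority quorum (more than half of the replicas) and uses hypothetical function names. It compares keeping all 3 control plane replicas in one site with splitting them 2/1 across the two sites.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that is more than half of the total."""
    return replicas // 2 + 1


def survives_site_loss(site_a: int, site_b: int) -> dict[str, bool]:
    """For a given placement, report whether quorum survives losing each site."""
    needed = quorum(site_a + site_b)
    return {
        "lose_site_a": site_b >= needed,  # only site B replicas remain
        "lose_site_b": site_a >= needed,  # only site A replicas remain
    }


if __name__ == "__main__":
    # (replicas in site A, replicas in site B)
    for placement in [(3, 0), (2, 1)]:
        print(placement, survives_site_loss(*placement))
    # (3, 0) {'lose_site_a': False, 'lose_site_b': True}
    # (2, 1) {'lose_site_a': False, 'lose_site_b': True}
```

Both placements tolerate the loss of only one specific site, so splitting the replicas adds cross-site traffic without improving availability.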
To control placement, the admin will need to configure VM-host affinity rules for each set of related VMs so that those VMs are tied to a particular site.
Admins will also have to create a vSAN Stretched Cluster Policy with a specific configuration, such as Site disaster tolerance set to Dual site mirroring and Force provisioning enabled. Because the vSAN Stretched Cluster Policy is a requirement for content libraries and for the Supervisor itself, the vSAN Stretched cluster needs to be enabled first, and then the Supervisor.
Due to the complexity of this architecture, we strongly recommend following the best practices documentation that we have created for this use case.
Automated Supervisor certificate rotation
Another new feature in this release is automated rotation of Supervisor certificates. By default, Supervisor certificates are valid for a year after initial deployment. Until now, customers could replace the certificates only by following a number of manual steps. To simplify the experience, we have automated the process: certificates are replaced as they approach their expiry date, without any user intervention.
An alarm is raised, and shown in the vCenter UI, only if this auto-renewal process fails. In the default configuration, the process attempts to renew the certificate again after 12 hours. Once the replacement succeeds, the alarm clears.
Users can also decide to replace the certificate manually.
VM Class Expanded Configuration
The VM Class user interface in vCenter has been updated to support a dramatically expanded set of hardware (HW) configuration options. Moving HW configuration into the VM Class simplifies the Virtual Machine specification file: users can refer to a single VM Class as a template for the complete specification rather than configuring individual settings directly. Admins now have granular control over the HW configuration available to self-service users, and HW configuration aligns more directly with public cloud consumption models.
Cluster Backup and Restore
Velero is now a core Supervisor Service enabled on the Supervisor. This service coordinates backup and restore of both the Supervisor cluster and any TKG clusters you have deployed. Velero must be installed on individual TKG clusters through the CLI, and backup and restore of those clusters are also done through that CLI. Backup and restore of the Supervisor is handled through the vCenter UI.
VM Service – VM Backup and Restore
The VM Service now includes the capability to back up and restore virtual machines through any VMware Advanced Data Protection software. This method requires no changes to your backup tool and no special configuration. Backups can be done at the VM level or for an entire namespace. For the namespace, you would simply reference the Resource Pool that backs the namespace in vCenter.
When a VM is deployed using the VM Service, Custom Resource objects are created in the Supervisor that define the VM's state as well as that of the supporting objects that facilitate bootstrapping the VM. That metadata must be restored as part of the backup/restore process. VMs deployed with the VM Service now store, in the Extra-Config fields, all of the information needed to recreate the Supervisor metadata. This registration process is automatic and happens on restore. If auto-registration fails, for example because a storage policy or VM Class is missing, an API is available to manually start registration after the issue is fixed.
Release Notes
For a full list of new enhancements and features, visit the VMware vSphere IaaS Control Plane 8.0 Release Notes.