Upgrading On-Disk and Object Formats in vSAN
Software updates are one of the most common administrative events found in the enterprise. Small or large, manual or automated, it is the ubiquitous software update that helps introduce new capabilities and resolve existing problems. Data storage can be complex, especially as new capabilities are introduced that require different things from the stored data. Updating the underlying data structure or associated metadata is a way for the software to implement new capabilities with data already stored.
Enterprise storage systems like vSAN occasionally require these types of modifications. This post describes how vSAN accommodates these changes, what you need to know about them, and why they matter. Let us look at the two types of updates that can occur after you perform an in-place upgrade of a vSAN cluster: the "on-disk format" version and the "vSAN object format."
Figure 1. vSAN on-disk and object format upgrades in relationship to vSphere/vSAN.
The on-disk format refers to a thin underlying layer that helps vSAN store data and metadata. This substrate has played a key part in VMware's ability to introduce new capabilities such as Deduplication & Compression, Data-at-Rest Encryption, as well as changing logical address boundaries to support larger storage devices.
Each new release of vSAN includes a new on-disk format version, as described in KB 2148493. When an in-place upgrade of vSAN is completed, Skyline Health for vSAN compares the on-disk format version used on the storage devices against the version of vSAN installed. If an upgrade is available for the on-disk format, it alerts the administrator and provides an easy way to upgrade the format by clicking the "Upgrade On-Disk Format" button. Figure 2 shows the button grayed out because there is nothing to upgrade. This upgrade is sometimes known as a "disk format change" or DFC, but a disk format change can also occur in non-upgrade scenarios, such as enabling or turning off a data service like vSAN Data-at-Rest Encryption.
Figure 2. The "disk format version" health check in Skyline Health.
When an on-disk format upgrade occurs, it will proceed to update every storage device and disk group in a cluster until it is complete. In most cases, an on-disk format upgrade is simply an update of metadata. It can be done in place, with no data movement, and completed across an entire cluster in a matter of seconds. In rare cases, it may require a rolling evacuation of data from one disk group to another to complete the upgrade. If that is required, Skyline Health will run a series of pre-checks (introduced in vSAN 6.7 U3) to determine whether the on-disk format upgrade can succeed, without actually moving any data.
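The comparison behind the "disk format version" health check can be pictured with a small sketch. This is only an illustration of the idea, not how Skyline Health is implemented: the `Disk` class, the function names, and the version numbers are all hypothetical, and the numbers do not correspond to the real versions listed in KB 2148493.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    host: str
    name: str
    format_version: int  # illustrative number, not a real vSAN format version


def disks_needing_upgrade(disks, target_version):
    """Return the disks whose on-disk format is older than the target version."""
    return [d for d in disks if d.format_version < target_version]


def upgrade_available(disks, target_version):
    """True when at least one disk is behind the target on-disk format."""
    return bool(disks_needing_upgrade(disks, target_version))


# Hypothetical inventory: one disk was left on an older format.
inventory = [
    Disk("host-a", "disk-1", 14),
    Disk("host-a", "disk-2", 15),
    Disk("host-b", "disk-1", 15),
]

pending = disks_needing_upgrade(inventory, target_version=15)
for d in pending:
    print(f"{d.host}/{d.name}: format {d.format_version} is behind target 15")
```

When the list of pending disks is empty, there is nothing to upgrade, which corresponds to the grayed-out button in Figure 2.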
The object format refers to the hierarchical data structure of an object and its components, as described in the post "vSAN Objects and Components Revisited." These types of updates haven't always existed, but have recently allowed vSAN to reduce the amount of free capacity required for transient activities, improve efficiency with stripe width settings, and introduce new resilience capabilities in 2-node and stretched clusters. Unlike the on-disk format upgrade, the vSAN object format change does not occur with every release.
After an in-place upgrade of vSAN has been completed and an on-disk format upgrade has been performed, vSAN recognizes objects that are eligible for an update to a new object format, and will present the eligible objects in the "vSAN object format health" check. This health check also provides an easy way to remediate the issue by simply clicking on "Change Object Format" in the health check, which is grayed out in Figure 3 as there are no objects that need an update.
Figure 3. The "vSAN object format health" health check in Skyline Health.
When an object format upgrade occurs, it will proceed to change the object format of all objects relevant to the type of upgrade being performed. Depending on the type of update that it is performing, it may change the object format on a small subset of objects, or all the objects in the vSAN cluster.
In both types of updates, objects will remain available. But the tasks may create resynchronization traffic and use additional space temporarily as a result of moving data elsewhere in the cluster, or rebuilding the object components of an object to a new data structure. The amount of resynchronization traffic it generates is highly dependent on the nature of the update, and the amount of data that is stored that is subject to the update. Below are a few examples.
- An environment that upgraded from vSAN 7 or older to vSAN 7 U1 or newer and has a significant number of objects larger than 255GB may experience more resynchronization traffic than another cluster performing the very same upgrade that does not have objects larger than 255GB. This is a result of an object format change.
- An environment that upgraded from vSAN 7 U2 or older to vSAN 7 U3 and uses stretched clusters or 2-node clusters will also see an object format change. We introduced new capabilities to improve the uptime of these topologies, and they require an object format change. In this case, it is a relatively small metadata update that is nearly instant.
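To get a rough feel for how much a cluster might be affected by the first example above, you could tally the objects that exceed the 255GB boundary. The sketch below is a planning aid under stated assumptions, not a vSAN API: the function name and the sample sizes are hypothetical, and you would need to obtain the object sizes from your own inventory. It does not predict the actual volume of resynchronization traffic.

```python
THRESHOLD_GB = 255  # size boundary named in the vSAN 7 U1 example above


def summarize_large_objects(object_sizes_gb, threshold_gb=THRESHOLD_GB):
    """Count objects strictly larger than the threshold and total their size."""
    large = [size for size in object_sizes_gb if size > threshold_gb]
    total = len(object_sizes_gb)
    return {
        "count": len(large),
        "total_gb": sum(large),
        "share": len(large) / total if total else 0.0,
    }


# Hypothetical object sizes in GB for one cluster.
sizes = [40, 100, 255, 300, 512, 1024]
summary = summarize_large_objects(sizes)
print(summary)
```

A cluster where this count is high would be a reasonable candidate for scheduling the object format change during a quieter period.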
Object format upgrades can introduce more resynchronization traffic than an on-disk upgrade because vSAN can perform many object upgrades in parallel. Whether it is an on-disk format upgrade that requires data movement or an object format upgrade that requires a restructuring of object data, vSAN will use the Adaptive Resync capability to maintain sufficient levels of performance for the VMs during these resynchronizations. You can monitor the progress of resynchronizations by highlighting the cluster and clicking on "Monitor" > "vSAN" > "Resyncing Objects." Performance monitoring of resynchronizations can be found by highlighting the cluster, clicking on "Monitor" > "vSAN" > "Performance" > "Backend" and viewing the performance data for the period desired.
Recommendation: Don't be concerned if you see a lot of resynchronization traffic for events like this if it is during off hours. When front-end VM traffic is light, Adaptive Resync will identify this and allow resynchronization data to use more resources to complete more quickly.
Is it Necessary to Perform these Updates?
The updates of the on-disk format and object format are necessary to realize the full benefits of enhancements that were coded into the release of the product. For example, if a customer updated to vSAN 7 U1 or newer to take advantage of the reduced level of free capacity requirements, but chose to not update the object format, the effective benefit would not be available to the cluster.
But VMware understands that customers want the flexibility to introduce these updates in a manner that best suits their environment. Therefore, we offer these updates as a post-upgrade task that can be performed at a time that is best for your organization. This could be immediately, or perhaps during a period when workload activity is reduced across the cluster. While this provides flexibility for our customers, the updates should not be postponed indefinitely.
As new versions of vSAN introduce exciting new capabilities within the product, changes to the on-disk format and object format are inevitable. They help unlock the potential of those capabilities, and both can be completed with a simple click of a button in Skyline Health for vSAN. To ensure that your environment is running optimally, make sure you are running the latest on-disk format and object format.