November 30, 2022

Stripe Width Storage Policy Rule in the vSAN ESA

Should the "Number of disk stripes per object" storage policy rule be used for VMs running in the vSAN Express Storage Architecture?  Read on to learn the answer!

Storage policy rules give administrators a powerful way to define and prescribe outcomes for VMs in a vSAN environment.  How resilient do you want a VM to be?  Do you want the capacity allocated to a VM to be guaranteed?  These are the types of outcomes you can prescribe easily with vSAN storage policies.

The vSAN Express Storage Architecture (ESA) introduces new capabilities that may change how certain policy rules and data placement schemes impact an environment.  For example, space-efficient RAID-5/6 erasure coding no longer carries the performance penalty that it does in the vSAN Original Storage Architecture (OSA).

The "Number of Disk Stripes per Object" storage policy rule still exists when using the vSAN Express Storage Architecture (ESA), but the policy rule does not have the same effect that it did with the vSAN Original Storage Architecture (OSA).  Let's look at this more closely to understand why, and how you can simplify your storage policy management by leaving it at its default value.

Background

As noted in the post "Stripe Width Improvements in vSAN 7 U1," the "Number of disk stripes per object" storage policy rule attempts to improve performance by distributing the data contained in a single object (such as a VMDK) across more capacity devices.  Commonly known as "stripe width," this storage policy rule tells vSAN to split an object into chunks of data (known in vSAN as "components") placed across more capacity devices, which can drive higher levels of I/O parallelism if, and only if, the capacity devices or disk groups are a significant source of contention.

When using the vSAN OSA, increasing the "number of disk stripes per object" might benefit a given workload under very specific conditions in which the capacity devices in the OSA were a significant source of contention.  The setting typically provided the most benefit on objects using a RAID-1 mirror.  This is because a mirroring data placement scheme (in comparison to an erasure code) would concentrate more object data on fewer hosts, and potentially fewer storage devices.  Our guidance for the OSA has been to keep the default number of disk stripes per object set to 1, but possibly experiment with the stripe width setting as one potential mitigation step in troubleshooting vSAN performance.

Stripe Width Policy Rule in the vSAN ESA

The "Number of disk stripes per object" storage policy rule has limited relevance in vSAN's new architecture because of how the ESA is optimized for NVMe-based storage devices.  The heavily revised Log Structured Object Manager (LSOM) in the vSAN ESA gets the most credit for delivering near device-level performance to the upper layers of the vSAN stack.  The LSOM sits at the lowest layer of vSAN and interfaces with the ESXi storage stack and its devices.  Its all-new block engine can process large quantities of I/O in parallel to exploit the full capabilities of NVMe devices, and it is paired with a fast transactional metadata store (known as a key-value store) so that metadata can be written and referenced quickly.  This is especially important as newer high-density storage devices come to market.  The combination of high-performance NVMe devices and an LSOM redesigned to exploit their full power is what allows the vSAN ESA to do away with the limitations and complexities of disk groups.

Figure 1.  A view of the Log Structured Object Manager in the vSAN ESA.

As a result, the vSAN ESA can exploit the full capabilities of these NVMe devices without the need to split the data into smaller chunks and disperse them across more devices, which was sometimes needed in the OSA.

Recommendation:  When using the vSAN ESA, leave the "Number of Disk Stripes per Object" set to the default value of 1.
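If you want to audit an environment against this recommendation, the check itself is simple.  The following is a minimal sketch, not a vSAN or vCenter API: the record layout (`vm`, `cluster_architecture`, `stripe_width`) and the function name are hypothetical, and it assumes you have already exported the effective stripe width for each VM by some other means.

```python
DEFAULT_STRIPE_WIDTH = 1  # default value of "Number of disk stripes per object"


def esa_stripe_width_findings(assignments):
    """Return the VMs on ESA clusters whose stripe width is not the default.

    `assignments` is a hypothetical list of dicts, one per VM, with the keys
    "vm", "cluster_architecture" ("ESA" or "OSA"), and "stripe_width".
    """
    return [
        a["vm"]
        for a in assignments
        if a["cluster_architecture"] == "ESA"
        and a.get("stripe_width", DEFAULT_STRIPE_WIDTH) != DEFAULT_STRIPE_WIDTH
    ]


# Hypothetical sample data for illustration only.
assignments = [
    {"vm": "app01", "cluster_architecture": "ESA", "stripe_width": 1},
    {"vm": "db01", "cluster_architecture": "ESA", "stripe_width": 4},
    {"vm": "legacy01", "cluster_architecture": "OSA", "stripe_width": 2},
]

print(esa_stripe_width_findings(assignments))
```

VMs on OSA clusters are deliberately left out of the findings, since a non-default stripe width may still be an intentional choice there.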

The Behavior of the Stripe Width Policy Rule in the ESA

While we recommend leaving the stripe width policy rule at its default value of 1, you may be curious as to what happens if the value is increased on a VM running in a cluster using the ESA.  The stripe width setting in a vSAN cluster using the ESA behaves in much the same way as it does in a cluster using the OSA.  How it splits the components will depend on whether the components are for the capacity leg or performance leg of an object.  The capacity leg and performance leg of an object are new constructs in the object data structure for the ESA, and they help it deliver high levels of performance while using space-efficient erasure coding.

As shown in Figure 2, increasing the value to 2 will split the performance leg components on each of the hosts providing the 3-way mirror into 2 components per mirror.  However, when using a stripe width setting of 2, the object components that comprise the RAID-6 erasure code will not be changed.  If the stripe width setting were raised high enough (to 12), it would eventually increase the number of components in the RAID-6 stripe.  This aligns with what is described in the post "Stripe Width Improvements in vSAN 7 U1," where the stripe width values affect an erasure code differently than a mirror.

Figure 2.  A view of a stripe width of "2" on an object using a RAID-6 storage policy in the vSAN ESA.
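The arithmetic behind Figure 2 can be sketched in a few lines.  This is a simplified model rather than vSAN's actual placement logic: it covers only stripe widths of 1 and 2 for a RAID-6 object, it assumes the RAID-6 capacity leg consists of 6 components (4 data plus 2 parity), and the names used are hypothetical.

```python
from dataclasses import dataclass

MIRROR_COPIES = 3      # 3-way mirror used by the performance leg (Figure 2)
RAID6_COMPONENTS = 6   # assumption: 4 data + 2 parity components in the capacity leg


@dataclass(frozen=True)
class ComponentCounts:
    performance_leg: int
    capacity_leg: int


def esa_raid6_components(stripe_width: int) -> ComponentCounts:
    """Component counts for a RAID-6 object at a stripe width of 1 or 2."""
    if stripe_width not in (1, 2):
        # Higher values eventually change the RAID-6 leg as well; the exact
        # thresholds are not modeled in this sketch.
        raise ValueError("this sketch only models stripe widths 1 and 2")
    return ComponentCounts(
        # Each mirror copy is split into `stripe_width` components.
        performance_leg=MIRROR_COPIES * stripe_width,
        # The erasure-coded leg is unchanged at these stripe widths.
        capacity_leg=RAID6_COMPONENTS,
    )


for width in (1, 2):
    print(width, esa_raid6_components(width))
```

Going from a stripe width of 1 to 2 doubles the performance leg components in this model while the capacity leg stays the same, which is the extra component count without a matching performance benefit described here.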

Given that the new data structure and Log Structured Object Manager allow vSAN to deliver near device-level performance of NVMe devices, increasing the stripe width value does little more than create more data components and complicate placement decisions for vSAN.  Increasing the stripe width is not recommended as a mitigation step when troubleshooting vSAN performance in the ESA.

Why is it still there?

If it is no longer relevant for the ESA, then why does the policy rule still exist?  Storage policies and the rules that make up a storage policy are a construct of vCenter Server.  A given vCenter Server may be responsible for many vSAN clusters, some of which may be running the OSA, while others use the ESA.  Keeping these policy rules available across all cluster types helps maintain compatibility across different cluster types and conditions.

Summary

The "Number of Disk Stripes per Object" storage policy rule can occasionally help with performance issues when using the OSA, but can generally be disregarded in the ESA.  Considering that RAID-5/6 erasure coding can be used without any performance compromise, that RAID-5 erasure coding can be used on as few as 3 hosts, and that the ESA introduces all-new high-performance compression, the Express Storage Architecture in vSAN 8 makes storage policy considerations easier.

@vmpete

 
