Improving RAID-5/6 in vSAN 7 U3 using Heuristics
Each new release of vSAN demonstrates how VMware is committed to the continual improvement in efficiency and performance of vSAN - all through a simple upgrade of the hypervisor. The engineering teams working on vSAN are always looking for ways to reduce the effort it takes for vSAN to process and store data. Examples of these efforts can be found in the specific performance and efficiency-related enhancements introduced with vSAN 7 U1 and vSAN 7 U2.
But the work hasn't stopped there. vSAN 7 U3 introduces the use of innovative heuristics to identify when data can be written in a more efficient way for workloads using RAID-5/6 erasure coding. Reducing the effort needed to write data improves the efficiency of a storage system. This post takes a look at this subtle but important enhancement introduced in vSAN 7 U3.
vSAN Standard Method for Writing Data using Erasure Coding
To understand the improvements made to RAID-5/6 erasure coding in vSAN 7 U3, we need to review how data is written using RAID-5/6 erasure coding in vSAN. For a primer on data placement schemes in vSAN, see the post: RAID-5/6 Erasure Coding Enhancements in vSAN 7 U2. The illustrations below show data being written to an object using a RAID-5 erasure code, where data with parity is spread across a minimum of 4 hosts. The enhancement also works with a RAID-6 erasure code, where data with dual parity is spread across a minimum of 6 hosts.
When data is updated in an object using a RAID-5/6 erasure code, the entire stripe of data is not always updated.
- Small write payload using RAID-5: In these conditions, as few as one data fragment and one parity fragment may be written. This is known as a partial stripe write, and under RAID-5 will consist of two reads and two writes for every write operation from the guest VM.
- Small write payload using RAID-6: In these conditions, as few as one data fragment and two parity fragments may be written. This partial stripe write under RAID-6 will consist of three reads and three writes for every write operation from the guest VM.
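The two reads and two writes of a RAID-5 partial stripe write can be illustrated with a minimal sketch. This is not vSAN code: the `Raid5Stripe` class, its counters, and the use of plain XOR parity over three data fragments are assumptions made only to show the read-modify-write pattern. RAID-6 adds a second parity fragment, which is why it needs a third read and a third write.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


class Raid5Stripe:
    """Hypothetical in-memory stripe: N data fragments plus one XOR parity."""

    def __init__(self, fragments):
        self.data = list(fragments)
        self.parity = reduce(xor_bytes, self.data)
        self.reads = 0   # fragment reads issued so far
        self.writes = 0  # fragment writes issued so far

    def partial_write(self, index, new_data):
        """Update one data fragment using read-modify-write."""
        # Two reads: the old data fragment and the old parity fragment.
        old_data = self.data[index]
        old_parity = self.parity
        self.reads += 2
        # New parity = old parity XOR old data XOR new data.
        self.parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
        # Two writes: the new data fragment and the new parity fragment.
        self.data[index] = new_data
        self.writes += 2


stripe = Raid5Stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
stripe.partial_write(1, b"\xaa\xbb")
print(stripe.reads, stripe.writes)  # prints "2 2"
```

After the update, the stored parity still equals the XOR of all data fragments, even though the untouched fragments were never read.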
For a larger sequence of writes, the discrete data fragments and parity fragments will be updated individually, as shown in Figure 1.
Figure 1. vSAN’s standard method for writing a larger amount of data to an object using RAID-5.
If large amounts of data must be written using RAID-5/6 erasure coding, then updating each fragment individually can be more taxing than necessary. Under these conditions, vSAN 7 U3 will use heuristics to look for opportunities to write the data more efficiently.
vSAN's Optimized Method for Writing Data using Erasure Coding
When using RAID-5/6 erasure coding in vSAN 7 U3, vSAN will evaluate the characteristics of the incoming writes in the queues. If they meet certain conditions, vSAN will dynamically adjust itself to write the data with less effort, reducing I/O amplification, network round-trips, serialization, and ultimately, latency as seen by the guest VMs.
When certain I/O conditions are met, vSAN will perform what is known as a "strided write." In effect, this writes the data in a manner similar to a full-stripe write, as shown in Figure 2. It helps reduce or eliminate the need for read operations and discrete parity calculations against the individual data fragments to be updated. The heuristics in place determine whether the standard method or the optimized method will be used. No adjustments by the administrator are necessary.
Figure 2. vSAN’s optimized method for writing a larger amount of data to an object using RAID-5.
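To get a rough sense of the savings, the back-of-the-envelope sketch below compares the two methods by counting fragment-level I/Os. It assumes a 3+1 layout for RAID-5 and a 4+2 layout for RAID-6, assumes that every data fragment in the stripe is updated, and uses the per-write read and write counts quoted earlier for partial stripe writes. The function names are illustrative, and the numbers ignore the read-modify-write optimizations from vSAN 7 U2, so treat them as an upper-bound illustration and not as measured behavior.

```python
def partial_stripe_ios(fragments_updated: int, parity_count: int):
    """Reads and writes when each data fragment is updated on its own.

    Every update reads the old data plus each old parity fragment,
    then writes the new data plus each new parity fragment.
    """
    per_update = 1 + parity_count
    return fragments_updated * per_update, fragments_updated * per_update


def full_stripe_ios(data_fragments: int, parity_count: int):
    """Reads and writes when the whole stripe is written at once.

    Parity is computed from the new data alone, so nothing is read back.
    """
    return 0, data_fragments + parity_count


# RAID-5, assumed 3 data + 1 parity
print(partial_stripe_ios(3, 1))  # (6, 6)
print(full_stripe_ios(3, 1))     # (0, 4)

# RAID-6, assumed 4 data + 2 parity
print(partial_stripe_ios(4, 2))  # (12, 12)
print(full_stripe_ios(4, 2))     # (0, 6)
```

Under these assumptions, RAID-5 drops from 12 fragment I/Os to 4, and RAID-6 drops from 24 to 6.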
When vSAN detects that conditions are sufficient for a strided write, the new capability pairs well with the erasure coding optimizations introduced in vSAN 7 U2. The combination of the improved read-modify-write parity calculations included in that previous release and the new strided write capability in vSAN 7 U3 can potentially reduce I/O amplification, computational effort, and serialization dramatically compared to erasure coding in vSAN 7 U1 and earlier.
Since the amplification of I/O, network, and computational effort is inherently larger with a RAID-6 erasure code than a RAID-5 erasure code, the potential benefit can be greater with RAID-6 than with RAID-5. The strided write capability will be used only when specific characteristics of the workload occur. If vSAN does not detect the I/O patterns needed for a strided write, the standard method of writing data described above will be used.
When will it Help?
This capability is opportunistic, based on the workload patterns in the environment. It is most likely (but not guaranteed) to occur on workloads that have large sequential writes or large bursts of writes. This can occur with database applications that may perform a "hold and dump" activity from the guest VM. Streaming-write workloads may also see some benefit, as they tend to issue writes with larger I/O sizes, as may guest operating systems that perform in-guest coalescing of writes.
There are some cases where strided writes will not be performed. These include stretched clusters and 2-node clusters where a secondary level of resilience using erasure coding is used. VMs that have very little write activity will likely see no benefit from this capability because there isn't enough incoming I/O to perform the strided write across the full stripe of data fragments with parity fragments.
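The idea of having "enough incoming I/O" to cover a full stripe can be pictured with a small, purely hypothetical check. The actual heuristics in vSAN 7 U3 are not described in this post. The function name, the `(offset, length)` representation of queued writes, and the fragment size used here are all assumptions for illustration only.

```python
def fully_covered_stripes(writes, fragment_size, data_fragments):
    """Return indices of stripes whose data range is fully covered by writes.

    writes: iterable of (offset, length) pairs in bytes (hypothetical queue).
    A stripe spans fragment_size * data_fragments bytes of object data.
    """
    stripe_size = fragment_size * data_fragments

    # Merge overlapping or adjacent write ranges into contiguous intervals.
    intervals = sorted((off, off + length) for off, length in writes if length > 0)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    # A stripe qualifies only if one contiguous interval spans all of it.
    covered = []
    for start, end in merged:
        first = -(-start // stripe_size)  # first stripe starting at or after start
        last = end // stripe_size         # first stripe not fully inside interval
        covered.extend(range(first, last))
    return covered


# Three adjacent fragment-sized writes cover stripe 0 of an assumed 3+1 layout.
print(fully_covered_stripes([(0, 4), (4, 4), (8, 4)], 4, 3))  # [0]
# A single small write covers no full stripe.
print(fully_covered_stripes([(0, 4)], 4, 3))                  # []
```

In this toy model, a VM with only occasional small writes never produces a fully covered stripe, which matches the intuition that such workloads see no benefit.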
vSAN 7 U3 now uses heuristics to dynamically adjust how it writes data when using RAID-5/6 erasure coding and does so automatically. It gives you the ability to do more with what you already have. Just upgrade your cluster and vSAN will do the rest.