The IOPS limits storage policy rule found in vSAN is a simple and flexible way to limit the amount of resources that a VMDK can use at any given point in time. While it is easy to enable, there are specific considerations in how performance metrics will be rendered when IOPS limit rules are being enforced.
Understanding how IOPS Limits are Enforced
The rationale behind capping one or more VMDKs within a VM with an artificial IOPS limit is simple. Since the busiest VMs aren't always the most important VMs, IOPS limits can be used to curtail a "noisy neighbor" VM from consuming a disproportionate amount of resources. Limiting IOPS of a VM consuming more than its fair share of resources can free up these resources for other VMs in the cluster and help ensure more predictable performance across the cluster.
Measuring and throttling I/O using only the IOPS metric has its challenges. I/O sizes can vary dramatically, typically ranging from 4KB to 1MB. This means that a single I/O could be 256 times the size of another, with the larger I/O taking much more effort to process. To account for this, vSAN uses a weighted measurement of I/O when enforcing IOPS limits. When an IOPS limit rule is applied to an object, the vSAN I/O scheduler "normalizes" the size of each I/O in 32KB increments. An I/O of up to 32KB counts as one I/O, an I/O larger than 32KB and up to 64KB counts as two I/Os, and so on. This provides a better-weighted representation of the various I/O sizes in the data stream, and is the same normalization increment used when imposing limits on VMs running on non-vSAN storage (SIOC v1). Note that vSAN uses its own scheduler for all I/O processing and control, and thus does not use SIOC for any I/O control.
For vSAN-powered VMs, normalized IOPS can be viewed adjacent to vSCSI IOPS at the VMDK level, as shown in Figure 1. When workloads use large I/O sizes, the normalized IOPS metric may be significantly higher than the IOPS observed at the vSCSI layer.
Figure 1. Viewing Normalized IOPS versus vSCSI IOPS on a VMDK
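The 32KB normalization described above can be illustrated with a few lines of arithmetic. This is a minimal sketch rather than vSAN's actual scheduler code: the function names are hypothetical, and it assumes that sizes round up to the next 32KB increment, so that an I/O of exactly 32KB counts as one.

```python
import math

NORMALIZATION_UNIT_KB = 32  # normalization increment named in the text


def normalized_io_count(io_size_kb: float) -> int:
    """Return how many normalized I/Os one I/O of the given size counts as."""
    if io_size_kb <= 0:
        raise ValueError("I/O size must be positive")
    # Assumption: sizes round up to the next 32KB increment,
    # so an I/O of exactly 32KB counts as one normalized I/O.
    return math.ceil(io_size_kb / NORMALIZATION_UNIT_KB)


def normalized_iops(vscsi_iops: float, io_size_kb: float) -> float:
    """Estimate normalized IOPS for a stream of uniformly sized I/Os."""
    return vscsi_iops * normalized_io_count(io_size_kb)


if __name__ == "__main__":
    for size_kb in (4, 32, 64, 256, 1024):
        count = normalized_io_count(size_kb)
        print(f"{size_kb:>5} KB -> {count} normalized I/O(s)")
```

Under these assumptions, a workload issuing 100 vSCSI IOPS at 256KB would register as 800 normalized IOPS, which is why a limit can be reached much sooner than the vSCSI IOPS figure alone suggests.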
This normalization measurement occurs just as I/Os enter the top layer of the vSAN storage stack from the vSCSI layer. Because of this, I/Os coming from or going to the vSAN caching layer, the capacity tier, or the client cache on the host are all accounted for in the same way. Enforcement of IOPS limits only applies to I/O from guest VM activity. Traffic resulting from resynchronizations and cloning is not subject to the IOPS limit rule. Reads and writes are counted equally, which is why they are combined into a single "Normalized IOPS" metric as shown in Figure 1.
When IOPS limits are applied to an object using a storage policy rule, there is no change in behavior as long as demand stays below the limit defined in the policy. When the rate of I/Os exceeds the defined threshold, vSAN enforces the rule by delaying I/Os so that the rate does not exceed the established threshold. Under these circumstances, the time to wait for completion (latency) of an I/O will be longer.
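To make the effect of delaying I/Os concrete, the toy model below paces a steady stream of I/Os so that no more than the configured number are released per second. It is only a sketch under stated assumptions, not vSAN's scheduler: the function name, the fixed 1 ms service time, and the evenly spaced arrivals are all hypothetical.

```python
from typing import List, Optional


def simulate_throttle(arrival_iops: float,
                      limit_iops: Optional[float],
                      duration_s: float = 1.0,
                      service_ms: float = 1.0) -> List[float]:
    """Return the latency in ms of each I/O in a paced stream.

    I/Os arrive evenly spaced at arrival_iops. With a limit set, each I/O
    is released no sooner than 1/limit_iops seconds after the previous one.
    """
    interval = 1.0 / arrival_iops
    min_gap = 1.0 / limit_iops if limit_iops else 0.0
    next_slot = 0.0  # earliest time the limiter will release the next I/O
    latencies = []
    for i in range(int(arrival_iops * duration_s)):
        arrival = i * interval
        start = max(arrival, next_slot)  # delayed only if the slot is not yet free
        next_slot = start + min_gap
        latencies.append((start - arrival) * 1000.0 + service_ms)
    return latencies


if __name__ == "__main__":
    for demand in (100, 400):
        lat = simulate_throttle(arrival_iops=demand, limit_iops=200)
        print(f"demand={demand} IOPS, limit=200: "
              f"first={lat[0]:.1f} ms, last={lat[-1]:.1f} ms")
```

In this model, demand below the limit sees only the service time, while demand above the limit accumulates wait time with every I/O. Real workloads usually slow down once their outstanding I/Os are delayed, so latency settles at a higher level instead of growing without bound.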
Viewing Enforced IOPS Limits using the vSAN Performance Service
When a VM performs an activity that exceeds an applied IOPS limit policy rule, any period in which the IOPS limit is being enforced will show up as increased latency on the guest VMDK. This is expected behavior. Figure 2 demonstrates the change in IOPS and the associated latency under three conditions: (1) no IOPS limit rule, (2) an IOPS limit of 200 enforced, and (3) an IOPS limit of 400 enforced.
Figure 2. Observing enforced IOPS limits on a single VMDK, and the associated vSCSI latency
Note that in Figure 2, the amount of latency introduced reflects the degree to which IOPS needed to be suppressed to achieve the limit. Suppressing the workload less results in lower latency shown in the metric. For this workload, capping IOPS at 200 introduced two to three times the latency of capping IOPS at 400.
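One way to reason about why a tighter limit produces more latency is Little's Law, which relates the average number of outstanding I/Os to throughput and latency. The sketch below is an illustration, not a description of how vSAN computes its metrics. It assumes a workload that keeps a constant number of I/Os outstanding, and the function name and the outstanding I/O count of 8 are hypothetical.

```python
def expected_latency_ms(outstanding_ios: int, iops: float) -> float:
    """Little's Law: average latency = outstanding I/Os / throughput."""
    if iops <= 0:
        raise ValueError("IOPS must be positive")
    return outstanding_ios * 1000.0 / iops


if __name__ == "__main__":
    OUTSTANDING_IOS = 8  # hypothetical queue depth held constant by the workload
    for limit in (400, 200):
        latency = expected_latency_ms(OUTSTANDING_IOS, limit)
        print(f"limit={limit} IOPS -> about {latency:.0f} ms average latency")
```

Under this simplified model, halving the achieved IOPS doubles the average latency, which is roughly in line with the lower end of the difference observed in Figure 2. The model does not account for anything beyond that proportional relationship.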
Latency introduced by IOPS limits will also show up elsewhere. Observed latencies will increase at the VM level, the host level, the cluster level, and even in applications like vR Ops. This is important to consider, especially if the primary motivation for using IOPS limits was to reduce latency for other VMs. When rendering latency, the vSAN Performance Service does not distinguish between latency caused by contention in the storage stack and latency caused by enforcement of IOPS limits. This is consistent with other forms of limit-based flow control mechanisms.
IOPS limits applied to some VMs can affect VMs that do not use the storage policy rule. Figure 3 shows a VM with no IOPS limits applied, yet the overall I/O was reduced during the same period as the VM shown in Figure 2. How does this happen? In this case, the VM shown in Figure 3 is copying files to and from the VM shown in Figure 2. Since it is interacting with a VM using IOPS limits, it is being constrained by that VM. Note here that unlike the VM shown in Figure 2, the VM shown in Figure 3 does not have any significant increase in vSCSI latency. This is because the reduction in I/O is being forced by the other VM in this interaction, and not by a policy applied to this VM.
Figure 3. A VM not using an IOPS limit rule being affected by a VM using an IOPS limit rule
It is easy to see how IOPS limits could have secondary impacts on multi-tiered applications, or on systems that regularly interact with each other. Unfortunately, this reduction in performance could easily go undetected, as latency would not be the leading indicator of a performance issue.
Are you seeing higher than expected latencies in your vSAN cluster? If so, check your IOPS limit storage policy rules. While IOPS limit rules in storage policies provide an easy way to cap resource usage, the way throttled activity is presented can mislead a user who is viewing performance metrics and is unaware that IOPS limits are in use.