The introduction of the Express Storage Architecture (ESA) in vSAN 8 brings with it extraordinary new levels of performance and resource efficiency. The ESA can process and store data faster than ever before, all while using fewer resources to do so. But the question does occasionally come up, "How much capacity will be used for overheads in the vSAN ESA?"
This post will help explain how much capacity overhead to expect when using the ESA by using simple examples and rounded percentages. All official sizing exercises should use the vSAN ReadyNode sizer to give more precise and thorough guidance for your specific environment.
What are Storage Capacity Overheads?
Storage systems of all types use additional capacity to accommodate the processing and storing of data. Even your laptop has file system overheads for metadata and uses free space for garbage collection and other administrative tasks.
With highly scalable and resilient enterprise storage systems, the term "capacity overhead" generally refers to the capacity used for administrative purposes by a storage system. It typically consists of the following:
- Metadata, filesystems, and other information that is used to help process, store, and retrieve data.
- The capacity needed to ensure the primary data stored is resilient against failures.
Sometimes these overheads are global and deducted from the total capacity of the storage system to provide a total usable free capacity for the consumption of data. Other types of overheads are deducted from the total capacity as new data is written.
It is not unusual for storage arrays to mask capacity overheads to some degree. Overheads are deducted from the raw capacity provided by all the resources in an array, and the result is presented as a single value representing the total capacity available for consumption. Since vSAN gives administrators the flexibility to present data using different topologies and to store data in different ways, vSAN presents capacity in its raw form and then provides the ability to see many of these overheads in vCenter Server. See the post "Demystifying Capacity Reporting in vSAN" for more information.
Capacity Overheads for vSAN ESA
When using the ESA in vSAN 8, capacity overheads will consist of the following categories. Some of these percentages are rounded, with some variables affecting the actual percentages slightly.
- Object replica data. Sometimes referred to as a type of metadata, this will be a percentage of the capacity the object is using, based on the type of data placement scheme used. Data using a storage policy of FTT=1 with RAID-5 on clusters of 6 or more hosts will use an additional 25% of the object data written to maintain availability in the event of a single failure against the object. If the same data uses a storage policy of FTT=2 with RAID-6, it will use an additional 50% of the object data written to maintain availability in the event of two failures against the object. For more information, see Figure 5 of a recent post describing erasure coding capacity overheads with the vSAN ESA.
- vSAN LFS overheads for an object. This overhead covers the structures that the vSAN log-structured filesystem (LFS) uses to process and store data for the given object, which help deliver the high performance associated with the ESA. This will consume approximately an additional 13% of the object and replica data written.
- Global metadata. This consists of metadata that helps vSAN ESA store a large amount of data with a very space-efficient and scalable metadata structure. It will typically consume approximately 10% of the total raw capacity of the cluster.
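The per-object arithmetic described in the list above can be sketched in a few lines of Python. This is only an illustration using the rounded percentages from this post; the function name, dictionary keys, and policy labels are hypothetical, and global metadata is deliberately left out because it is deducted from raw cluster capacity rather than per object.

```python
# Rounded percentages from the text above; actual values vary slightly.
REPLICA_OVERHEAD = {
    "RAID-5 (FTT=1, 6+ hosts)": 0.25,
    "RAID-6 (FTT=2)": 0.50,
}
LFS_OVERHEAD = 0.13  # applied to object data plus replica data


def object_footprint_tb(written_tb: float, policy: str) -> dict:
    """Estimate capacity consumed by one object, excluding global metadata."""
    replica_tb = written_tb * REPLICA_OVERHEAD[policy]
    lfs_tb = (written_tb + replica_tb) * LFS_OVERHEAD
    return {
        "data_tb": written_tb,
        "replica_tb": replica_tb,
        "lfs_tb": lfs_tb,
        "total_tb": written_tb + replica_tb + lfs_tb,
    }


if __name__ == "__main__":
    # Hypothetical object with 4TB of data written.
    for policy in REPLICA_OVERHEAD:
        result = object_footprint_tb(4.0, policy)
        print(policy, {k: round(v, 2) for k, v in result.items()})
```

With 4TB written, this works out to roughly 5.65TB consumed under RAID-5 and roughly 6.78TB under RAID-6. Treat these as ballpark figures only; official sizing should still go through the vSAN ReadyNode Sizer.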
Thin provisioning and data compression will factor into the calculations for the overhead of object data, as illustrated in Figure 1.
- Thin provisioning. If an 8TB VMDK is provisioned, but only 4TB is written or used, the vSAN LFS and replica data overheads apply only to the capacity used, not the capacity provisioned.
- Compression. If compression is enabled, the vSAN LFS and replica data overhead percentages apply to the capacity used AFTER compression.
Figure 1. Object capacity overheads in vSAN ESA (excluding global metadata overhead)
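As a rough sketch of how thin provisioning and compression change the math, the Python below applies the overhead percentages only to the capacity used after compression. The 2:1 compression ratio is purely an assumption for illustration (actual ratios depend on the data), and the function and parameter names are hypothetical.

```python
LFS_OVERHEAD = 0.13  # rounded, applied to object data plus replica data


def stored_footprint_tb(
    provisioned_tb: float,
    written_tb: float,
    compression_ratio: float,
    replica_overhead: float,
) -> float:
    """Estimate per-object capacity consumed, excluding global metadata.

    provisioned_tb is only used as a sanity check: overheads are based on
    what is written, not on what is provisioned.
    """
    if written_tb > provisioned_tb:
        raise ValueError("written capacity cannot exceed provisioned capacity")
    compressed_tb = written_tb / compression_ratio  # capacity used AFTER compression
    replica_tb = compressed_tb * replica_overhead
    lfs_tb = (compressed_tb + replica_tb) * LFS_OVERHEAD
    return compressed_tb + replica_tb + lfs_tb


if __name__ == "__main__":
    # 8TB VMDK provisioned, 4TB written, RAID-5 at 25% replica overhead.
    no_compression = stored_footprint_tb(8.0, 4.0, 1.0, 0.25)
    with_compression = stored_footprint_tb(8.0, 4.0, 2.0, 0.25)  # assumed 2:1 ratio
    print(round(no_compression, 3), round(with_compression, 3))
```

Under these assumptions, the 8TB VMDK with 4TB written consumes roughly 5.65TB without compression and roughly 2.83TB at the assumed 2:1 ratio, since both the replica data and the LFS overhead shrink along with the compressed data.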
When storing data that is protected in the same way (e.g. FTT=2 using RAID-6), the capacity overhead for the vSAN ESA is similar to the vSAN OSA. While the percentages of overhead are very similar, the new data and metadata structures in the ESA result in a higher-performance solution that is more scalable to meet the demands of higher-density storage devices coming into the market. Since the ESA can deliver RAID-5/6 space efficiency at the performance of RAID-1, an ESA cluster will typically result in less object replica data for most customers.
Let's run through four examples demonstrating the capacity overheads in the Express Storage Architecture. The numbers and percentages used are rounded for simplicity.
Figure 2. Simple examples demonstrating capacity overheads when using the vSAN ESA.
These examples help illustrate how effective compression can be, as it will reduce the footprint of data and metadata, except for the 10% allocated for global metadata. As noted, these examples do not show the amount of capacity needed for global metadata, which is unaffected by how much data is stored on the system. With a global metadata overhead of approximately 10%, a vSAN ESA cluster with a total raw capacity of 200TB would use about 20TB for global metadata.
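The global metadata figure is a simple fraction of raw capacity, independent of how much data is written. A minimal Python sketch, using the rounded 10% from this post and hypothetical function names, might look like this:

```python
GLOBAL_METADATA_FRACTION = 0.10  # rounded; based on total raw cluster capacity


def global_metadata_tb(raw_capacity_tb: float) -> float:
    """Approximate capacity consumed by global metadata."""
    return raw_capacity_tb * GLOBAL_METADATA_FRACTION


def raw_after_global_metadata_tb(raw_capacity_tb: float) -> float:
    """Raw capacity left for object data, replica data, and LFS overheads."""
    return raw_capacity_tb - global_metadata_tb(raw_capacity_tb)


if __name__ == "__main__":
    raw_tb = 200.0  # the example cluster from the text above
    print(round(global_metadata_tb(raw_tb), 1))
    print(round(raw_after_global_metadata_tb(raw_tb), 1))
```

For the 200TB example, this leaves about 180TB of raw capacity for object data and its per-object overheads. The sketch does not account for the free space that should be reserved for host failures and transient activities.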
Where do the Capacity Overheads Live?
The primary data, replica data, object metadata, and global metadata stored by the ESA in vSAN 8 are distributed across the devices claimed by vSAN, with each device storing its data independently of the others. This results in a small failure domain, where the failure of one device does not impact other devices, and it improves the utilization of device resources.
Figure 3. The different types of data on a storage device that is claimed by vSAN ESA.
Metadata will live in different locations of the data structure for an object. For example, some frequently accessed metadata will live in the performance leg of the respective objects created, while other metadata less frequently accessed may live on the capacity leg of an object. vSAN's unique design also allows for portions of the respective B-trees to live in the part of the data structure where it makes the most sense.
Figure 4. The performance and capacity legs of an object in vSAN ESA, and where the metadata resides.
One visible difference in the data structure for an object in the vSAN ESA is the use of components for a “performance leg” and a “capacity leg” as described in the post: RAID-5/6 with the Performance of RAID-1 using the vSAN ESA. While the components on the performance leg have a theoretical maximum size of 255GB, this is largely just an implementation detail. The percentages for per-object metadata and global metadata noted above mean that these components will remain relatively small. No capacity sizing considerations are necessary for these performance leg components.
Where to see Capacity Overheads
The results of these capacity overheads in the vSAN ESA will show primarily in the cluster capacity overview and the usage breakdown views in vCenter Server. They will show up under “Replica usage” and “System Usage” but will not necessarily be categorized in the manner described in this post. Provisioned space, used space, and the added object replica data will show in the “VMs” view, as described in the post: Demystifying Capacity Reporting in vSAN.
How to Size a Cluster for Capacity Overheads
The vSAN ReadyNode Sizer is being updated to accommodate the unique design aspects of the ESA. As with the OSA, the vSAN ReadyNode Sizer can also help you ensure that you allow for sufficient free space in the event of host failures and transient activities. For more information, see the post: "Understanding 'Reserved Capacity' Concepts in vSAN."
As storage devices become denser, the complexity of storing vast amounts of data goes beyond the data itself. The storage and management of the associated metadata is the real challenge. The vSAN ESA was designed to store extraordinary amounts of data but do so in a way that keeps metadata fast, efficient, and scalable.