Storage Management in VMware Cloud on AWS

August 02, 2021

VMware Cloud on AWS allows customers to run business-critical apps on a familiar, feature-rich VMware SDDC that integrates vSphere, NSX, and vSAN and is delivered as a cloud service on AWS. It is a managed cloud service, meaning the customer defines the application requirements and VMware takes care of the rest. This post takes a closer look at the storage service, which is powered by VMware vSAN.

How does it work?

Customer Responsibility

VMware is responsible for and in control of the vSAN cluster. Customers declare the type and scale of the desired cluster. Once provisioned, each node in the cluster contributes capacity to the vSAN datastore. The administrator then controls, via storage policy, how this capacity is consumed. VMware sets an appropriate default policy but still allows custom policies. As a result, the level of availability is ultimately under the customer's control.

By default, the vSAN Default Datastore policy is configured to ensure the resilience of production workloads. Customers who want to control availability and the overall footprint of the SDDC can use vSAN Storage Policy-Based Management, which is available within VMC. Custom policies can be defined to match the needs of the underlying application, for example by using erasure coding to reduce capacity consumption, or by choosing from a range of other options. In practice, we see customers establish a commonsense baseline and then layer in specific policies for individual workloads or data classifications.
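To make the capacity trade-off between mirroring and erasure coding concrete, here is a minimal sketch. It assumes the commonly cited raw-capacity multipliers for vSAN policy layouts (2x and 3x for RAID-1 mirroring, 1.33x for RAID-5, 1.5x for RAID-6) and ignores witness components, swap objects, and deduplication or compression savings. The function and table names are illustrative only and are not part of any VMware API.

```python
from fractions import Fraction

# Assumed raw-capacity multipliers per (layout, failures to tolerate).
# These are the commonly cited vSAN figures, not values taken from this post.
CAPACITY_MULTIPLIER = {
    ("RAID-1", 1): Fraction(2),     # two full copies of the data
    ("RAID-1", 2): Fraction(3),     # three full copies of the data
    ("RAID-5", 1): Fraction(4, 3),  # 3 data + 1 parity segments
    ("RAID-6", 2): Fraction(3, 2),  # 4 data + 2 parity segments
}


def raw_capacity_gb(usable_gb: int, layout: str, ftt: int) -> float:
    """Estimate the raw datastore capacity consumed by `usable_gb` of VM data."""
    try:
        multiplier = CAPACITY_MULTIPLIER[(layout, ftt)]
    except KeyError:
        raise ValueError(f"unsupported combination: {layout} with FTT={ftt}")
    return float(usable_gb * multiplier)


if __name__ == "__main__":
    for (layout, ftt) in CAPACITY_MULTIPLIER:
        consumed = raw_capacity_gb(300, layout, ftt)
        print(f"{layout} FTT={ftt}: 300 GB of data consumes about {consumed:.0f} GB")
```

Under these assumptions, a policy that tolerates two failures with RAID-6 consumes less raw capacity than a RAID-1 policy that tolerates only one, which is why erasure coding is a common choice when footprint matters more than write performance.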

This flexible system empowers customers to shape how a VMware Cloud on AWS cluster is consumed without needing to get into the specifics of day-to-day vSphere/vSAN management. That said, there are still a few guidelines to adhere to within the service.

Minimum Policy configuration

The VMware Cloud on AWS Service Definition and SLA outline the expectations for both VMware and customers. For any workload to qualify for SLA credits, the configured storage policy must meet the minimum required failures to tolerate. This minimum is based on vSAN's ability to survive a failure within the AWS cloud. Customers may opt out of the SLA and accept the risk, protecting against permanent data loss by other means.

i3.metal

The AWS i3.metal EC2 hosts use local NVMe media. The VMware Cloud service uses an automated remediation process that replaces any problematic host at the first sign of trouble. This proactive system enables customers to run clusters at capacity without a spare or maintenance host, relying on the cloud to add hosts when they are needed. However, this network-intensive replacement process can take some time on larger clusters. To protect against a potential double-failure scenario, the service requires clusters with more than 6 hosts in a single AZ to use a storage policy capable of surviving 2 failures. Customers are responsible for changing the policy configuration in such scenarios.

Elastic vSAN (r5.metal)

Elastic vSAN uses a different kind of EC2 Nitro instance. These instances are diskless and use Elastic Block Store (EBS) volumes as local storage. The service has been optimized to use the additional resiliency of EBS to reduce rebuild times. This is done by moving the EBS volumes themselves instead of the data they contain. This combination allows VMware to protect all Elastic vSAN instances with a policy of 1 failure to tolerate. Customers may still choose a higher availability policy to protect against availability gaps caused by unplanned host replacement, but the SLA does not require it.
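The minimum policy rules described for i3.metal and Elastic vSAN can be sketched as a small check. This is an illustration only: the threshold follows the wording of this post (more than 6 hosts in a single AZ), the minimum of 1 failure to tolerate for smaller i3.metal clusters is an assumption, stretched clusters are not covered, and the function names are hypothetical. Always confirm the current numbers against the published SLA.

```python
# Host count above which a single-AZ i3.metal cluster needs FTT=2,
# as worded in this post. Check the current SLA before relying on it.
I3_LARGE_CLUSTER_THRESHOLD = 6


def minimum_ftt(instance_type: str, hosts: int) -> int:
    """Return the minimum failures to tolerate for a single-AZ cluster."""
    if hosts < 1:
        raise ValueError("hosts must be a positive integer")
    if instance_type == "r5.metal":
        # Elastic vSAN: EBS-backed storage, FTT=1 regardless of cluster size.
        return 1
    if instance_type == "i3.metal":
        # Assumption: clusters at or below the threshold need FTT=1.
        return 2 if hosts > I3_LARGE_CLUSTER_THRESHOLD else 1
    raise ValueError(f"unknown instance type: {instance_type}")


def meets_sla_minimum(instance_type: str, hosts: int, policy_ftt: int) -> bool:
    """True if a storage policy's FTT satisfies the minimum for this cluster."""
    return policy_ftt >= minimum_ftt(instance_type, hosts)


if __name__ == "__main__":
    print(meets_sla_minimum("i3.metal", 4, policy_ftt=1))
    print(meets_sla_minimum("i3.metal", 8, policy_ftt=1))
    print(meets_sla_minimum("r5.metal", 8, policy_ftt=1))
```

A check like this is most useful when a cluster grows: a policy that satisfied the minimum at its original size may need to be changed by the customer once hosts are added.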

Free Space

In addition to the baseline availability requirements, the service also requires a minimum level of free space to be maintained at all times. For VMware to guarantee operational availability, the cluster must have sufficient free capacity to rebuild into in the event of an unplanned failure or a large-scale policy change. The Service Definition specifies that every VMware Cloud on AWS cluster must maintain 25% free space at all times. To ensure availability, the service automatically adds a node via Elastic DRS when free space drops to 20%.
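The two thresholds can be expressed as a simple capacity check. The sketch below assumes that free space is measured as a fraction of total datastore capacity. The function name and status labels are hypothetical, and only the 25% and 20% figures come from the text above.

```python
REQUIRED_FREE_FRACTION = 0.25   # minimum free space per the Service Definition
SCALE_OUT_FREE_FRACTION = 0.20  # point at which Elastic DRS adds a node


def free_space_status(capacity_tb: float, used_tb: float) -> str:
    """Classify a cluster's free space against the two thresholds."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    if not 0 <= used_tb <= capacity_tb:
        raise ValueError("used_tb must be between 0 and capacity_tb")
    free_fraction = (capacity_tb - used_tb) / capacity_tb
    if free_fraction <= SCALE_OUT_FREE_FRACTION:
        return "scale-out"      # the service would add a host automatically
    if free_fraction < REQUIRED_FREE_FRACTION:
        return "below-minimum"  # customer should free or add capacity
    return "ok"


if __name__ == "__main__":
    for used in (70, 78, 85):
        print(f"{used} TB used of 100 TB: {free_space_status(100, used)}")
```

The gap between the two thresholds gives customers a window to reclaim or add capacity on their own terms before the service adds a host automatically.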

Summary

The VMware Cloud on AWS service enables customers to focus on value-generating work. They can trust VMware to maintain the cluster while retaining ownership of and control over the data itself. Thanks to vSAN and Storage Policy-Based Management, this combination empowers customers without restricting capabilities or increasing complexity.

Availability

To view the latest status of features for VMware Cloud on AWS, visit https://cloud.vmware.com/vmc-aws/roadmap.

Pete Flecha

Storage & Availability Technical Marketing