Running vSAN in Industrial Environments
Data centers get all the attention these days, and for good reason. Large on-premises environments are more powerful and sophisticated than ever. With dedicated cooling aisles, meticulously cabled systems, and almost surgical cleanliness, a modern data center can be a sight to behold. Add powerful solutions like VMware Cloud Foundation, and organizations are well positioned to build and grow a private or hybrid cloud.
Often overlooked are the thousands of smaller industrial manufacturing and processing facilities that run vSphere to power their tooling software, inventory control, and ancillary back-office solutions. Whether extremely sophisticated and robust or antiquated and fragile, industrial environments typically need to support a variety of proprietary software that is not a good fit for running in the cloud.
At first glance, the needs of industrial environments appear simple, but to those who work in these facilities, uptime is critical. Unfortunately, the servers often run in spaces that are woefully inadequate for them. It can be startling to witness. Many so-called server rooms are nothing more than the hosts and switchgear placed in a corner with a sticker stating, "Do not turn off." They are often subject to extreme levels of heat, dust, or other aerosols. The servers and supporting switchgear sometimes share the same power circuits as the manufacturing equipment and often have little to no power protection.
Practical Guidance for vSAN Powered Clusters in Smaller Industrial Environments
A proper sizing and design exercise should always be performed for any environment, large or small. The recommendations below offer some general guidance to help improve the simplicity, resilience, and manageability of a vSAN cluster in these types of manufacturing environments.
- Go with vSAN ReadyNodes or VxRail. These options will guarantee compatibility with the VMware Compatibility Guide (VCG). While a "build your own" (BYO) approach is supported, we often see the use of unsupported hardware as the underlying cause of a technical support case raised by a customer. Simplify the experience through a system already pre-approved.
- Choose a rack-mounted 2U form factor. A 2U rack-mounted server has proven to be the most flexible and economical form factor for most environments, especially hyperconverged ones. A 2U server makes it easy to add storage devices later, so capacity can scale up economically.
- Configure your server's out-of-band management interfaces. Whether they use IPMI, iDRAC, iLO, or some other approach, this takes a lot of guesswork out of remote management tasks.
- Use all-flash. Fewer moving parts improve reliability and performance. Sometimes these manufacturing environments run software that does not demand high levels of performance, so SATA flash devices at the capacity tier could be a very economical option. Sticking with NVMe devices for the caching/buffering tier is recommended. NVMe offers very high performance, and unlike SAS or SATA flash, each NVMe device contains its own dedicated, embedded storage controller, which makes for a more robust design.
- Use a minimum of four hosts. In smaller industrial environments that paired hosts with a storage array, it was not uncommon to see just two hosts in a traditional vSphere cluster, because the compute demands were minimal. But vSAN relies on hosts for the resilience of storage. While three is the minimum for a vSAN cluster, a three-host cluster that loses a host has nowhere to rebuild the affected data. Four hosts allow you to maintain the prescribed levels of resilience in the event of a failure. You may be able to go with single-processor hosts to reduce hardware and software costs. And remember, with vSAN, there is no storage array to purchase. When paired with a vSAN ROBO license, which is based on total VM count, the type of configuration described above can offer tremendous value.
- If the cluster size is no greater than 4 hosts, stick with storage policies using RAID-1 mirroring. Mirroring that tolerates one failure needs only three hosts, so a four-host cluster keeps a spare host on which to rebuild. This gives your data the ability to regain its prescribed level of resilience should a host fail.
- Install vSphere/vSAN on persistent flash devices such as an SSD, M.2, U.2, or BOSS module. Using Micro SD cards or USB sticks for the hypervisor was popular at one time, but that trend is fading fast due to the questionable quality of those devices and the inability to store persistent host logs on them.
- Specify the hosts with at least two disk groups. Two or more disk groups on a host allow the host to continue providing storage capacity in the event of a disk group failure. This is especially important in smaller clusters.
- Use appropriate enterprise-class switchgear. vSAN relies on switchgear that has the processing, backplane, and buffering capabilities necessary for handling high packet-per-second rates. Unfortunately, much of the value-priced 10Gb switchgear lacks those traits. Ensure you are running a sufficient class of switchgear for your environment.
- Use two switches for redundancy. A single switch is a single point of failure for connected environments. Ensure you have two switches, and they are connected with some variation of a LAG. Resist using stacking modules. These are easy to implement and have good cross-switch throughput, but often create a single failure domain for the two switches, which is antithetical to the goal.
- Don't neglect updates. Many times these environments adopt a 'set it and forget it' approach. This approach may work for an old PBX, but is not an effective strategy for hosts powering business-critical workloads using a hypervisor. Remember to update vCenter, the host's firmware, drivers, and hypervisor. The vSphere Lifecycle Manager (vLCM) introduced in vSphere 7 is VMware's all-new approach to host lifecycle management of the hypervisor and the supporting hardware. Use it if possible, and also be sure to sign up for the vSAN VCG Notification Service.
- Use an appropriate and sufficient UPS. Appropriately sized UPS units are neither exciting nor affordable. But the alternative may inflict damage on the equipment, compromise the data, and potentially your career. Also, make sure to allocate funds for battery replacements. UPS units won't provide any help if the batteries are not regularly maintained and replaced.
- Don't forget the backup strategy. Protecting the data, and the software that powers the data, is paramount for any environment. Don't let the simplicity of a small environment distract from the requirements of data protection. And no, backing up data on vSAN to a backup target on the same vSAN cluster doesn't count as a backup.
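The host-count and RAID-1 recommendations in the list above come down to simple arithmetic, which the following minimal Python sketch tries to illustrate. It assumes the commonly documented minimum host counts for the disk-group-based vSAN architecture discussed here: RAID-1 mirroring tolerating n failures needs 2n + 1 hosts, RAID-5 needs 4, and RAID-6 needs 6. The function names `min_hosts` and `can_rebuild_after_host_failure` are hypothetical and exist only for this illustration; they are not part of any VMware tool or API, and a real design should be checked against current VMware documentation.

```python
# Illustrative only: assumed minimum host counts for erasure-coding policies
# in the disk-group-based vSAN architecture (not taken from a VMware API).
ERASURE_CODING_MIN_HOSTS = {"RAID-5": 4, "RAID-6": 6}


def min_hosts(raid: str, ftt: int = 1) -> int:
    """Return the assumed minimum number of hosts a storage policy needs."""
    if raid == "RAID-1":
        if ftt < 1:
            raise ValueError("failures to tolerate must be at least 1")
        # Mirroring: ftt + 1 copies of the data plus ftt witness components.
        return 2 * ftt + 1
    if raid in ERASURE_CODING_MIN_HOSTS:
        return ERASURE_CODING_MIN_HOSTS[raid]
    raise ValueError(f"unknown RAID level: {raid}")


def can_rebuild_after_host_failure(cluster_hosts: int, raid: str, ftt: int = 1) -> bool:
    """True if the hosts left after one host failure still meet the policy minimum."""
    return cluster_hosts - 1 >= min_hosts(raid, ftt)


if __name__ == "__main__":
    # Compare the cluster sizes and policies discussed in the list.
    for hosts, raid in [(3, "RAID-1"), (4, "RAID-1"), (4, "RAID-5")]:
        ok = can_rebuild_after_host_failure(hosts, raid)
        print(f"{hosts} hosts, {raid}: can rebuild after a host failure = {ok}")
```

Under these assumptions, a three-host cluster using RAID-1 cannot rebuild after losing a host, a four-host cluster using RAID-1 can, and a four-host cluster using RAID-5 cannot. That is the reasoning behind pairing four hosts with mirroring in small clusters.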
While these are good starting guidelines for running vSAN in small industrial environments, they should not override the specific requirements of an environment. I once provided consultation for a maritime manufacturing facility that used extremely powerful just-in-time cost modeling and inventory forecasting software. That software was how they built their advantage over their competitors, and they needed those processes to run as fast as possible. In their case, performance mattered, and the recommendations reflected that additional requirement.
Manufacturing facilities and other industrial environments may present unique challenges to the design and operation of a vSphere or vSAN-powered environment. With proper planning and consideration of the conditions in those manufacturing or processing facilities, vSphere and vSAN are a perfect fit for those environments.