April 03, 2024

VMware Private AI Foundation with NVIDIA Server Guidance

At the recent NVIDIA GTC 2024 conference, the initial availability of VMware Private AI Foundation with NVIDIA was announced, powering the era of AI in your data center. VMware Private AI Foundation with NVIDIA allows our customers to run AI workloads on-premises on VMware Cloud Foundation (VCF), using NVIDIA GPUs and the NVIDIA software ecosystem.

This joint platform not only fosters more secure AI workloads, but also adds flexibility and operational efficiency while maximizing performance. In addition, VCF adds a layer of automation that makes deploying Deep Learning VMs a breeze for the data scientist. More on that procedure here.

Private AI Foundation Architecture

While VMware and NVIDIA have you covered for your software needs, identifying the best hardware to run Private AI workloads on is also a key ingredient for a successful AI implementation. We have partnered with Dell, Fujitsu, HPE, and Lenovo, among other server vendors, to identify a comprehensive list of supported platforms optimized to run NVIDIA GPUs with VMware Cloud Foundation. While some AI workloads may run on older NVIDIA A100 GPUs, we currently recommend NVIDIA L40S and H100 GPUs for modern AI workloads to achieve optimal performance and utilization.

The servers listed below are certified specifically for VMware Private AI Foundation with NVIDIA. The certification process combines the GPU partner's certification of the hardware platform with VMDirectPath I/O certification for general-purpose GPU support with VMware. Additional vendors and GPUs will be added later, so make sure to check back.

 

Dell Technologies

 

Server Model | NVIDIA L40S | NVIDIA H100 | Max number of GPUs supported
PowerEdge R750 Rack Server | | | 2
PowerEdge R760 Rack Server | | | 2
VxRail VP-760 | | | 2
PowerEdge R760xa Rack Server | | | 4
PowerEdge R7625 Rack Server | | | 2

 

 

Fujitsu

Server Model | NVIDIA L40S | NVIDIA H100 | Max number of GPUs supported
PRIMERGY RX2540 M7 | | | 2

 

 

HPE

Server Model | NVIDIA L40S | NVIDIA H100 | Max number of GPUs supported
ProLiant DL380 Gen11 | | | 3
ProLiant DL380a Gen11 | | | 4
ProLiant DL385 Gen11 | | | 4

 

 

Lenovo

Server Model | NVIDIA L40S | NVIDIA H100 | Max number of GPUs supported
ThinkAgile VX650 V3 | | | 3
ThinkSystem SR650 V3 | | | 3
ThinkSystem SR655 V3 | | | 3
ThinkSystem SR665 V3 | | | 3
ThinkSystem SR670 V2 | | | 4
ThinkSystem SR675 V3 | | | 8 (PCIe), 4 (SXM)

 

 

Supermicro

Server Model | NVIDIA L40S | NVIDIA H100 | Max number of GPUs supported
SYS-221H-TNR | | | 3
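
The per-server GPU limits in the tables above can be turned into a quick sizing check. The following is a minimal Python sketch, not a VMware or NVIDIA tool: the dictionary simply copies the "Max number of GPUs supported" values listed above, and the helper names `hosts_needed` and `models_with_at_least` are hypothetical. It only divides a GPU count by the per-server limit and ignores every other sizing factor (power, cooling, cluster design, which GPU model a given server is certified for).

```python
import math

# "Max number of GPUs supported" per server model, copied from the tables above.
# Assumption: for the ThinkSystem SR675 V3 the PCIe figure (8) is used;
# the SXM configuration is listed with a maximum of 4.
MAX_GPUS_PER_SERVER = {
    "PowerEdge R750 Rack Server": 2,
    "PowerEdge R760 Rack Server": 2,
    "VxRail VP-760": 2,
    "PowerEdge R760xa Rack Server": 4,
    "PowerEdge R7625 Rack Server": 2,
    "PRIMERGY RX2540 M7": 2,
    "ProLiant DL380 Gen11": 3,
    "ProLiant DL380a Gen11": 4,
    "ProLiant DL385 Gen11": 4,
    "ThinkAgile VX650 V3": 3,
    "ThinkSystem SR650 V3": 3,
    "ThinkSystem SR655 V3": 3,
    "ThinkSystem SR665 V3": 3,
    "ThinkSystem SR670 V2": 4,
    "ThinkSystem SR675 V3": 8,
    "SYS-221H-TNR": 3,
}


def hosts_needed(server_model: str, total_gpus: int) -> int:
    """Return the minimum number of hosts of one model needed to hold total_gpus."""
    if total_gpus < 0:
        raise ValueError("total_gpus must be non-negative")
    per_host = MAX_GPUS_PER_SERVER[server_model]
    # Round up: a partially filled host still counts as a whole host.
    return math.ceil(total_gpus / per_host)


def models_with_at_least(min_gpus: int) -> list[str]:
    """Return the listed server models that support at least min_gpus GPUs, sorted by name."""
    return sorted(m for m, n in MAX_GPUS_PER_SERVER.items() if n >= min_gpus)


if __name__ == "__main__":
    print(hosts_needed("PowerEdge R760xa Rack Server", 10))
    print(models_with_at_least(4))
```

For example, `hosts_needed("PowerEdge R760xa Rack Server", 10)` rounds 10 / 4 up to 3 hosts. Always confirm the final configuration against the VMware Compatibility Guide and the NVIDIA Qualified System Catalog.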

 

 

Useful Links:

VMware Private AI Foundation with NVIDIA Technical Overview

VMware Compatibility Guide

NVIDIA Qualified System Catalog

 

 

 
