Reference Architecture - Healthcare - Philips Image Management System
This section covers the business case, solution overview, and key results.
As healthcare organizations consolidate and digitize, remote experts gain a unique opportunity to collaborate with their peers on the same data and to provide a fast yet thorough analysis of critical cases. Peer review and knowledge sharing should not be limited by the cost or the response time of the underlying infrastructure or technology. With VMware vSAN™, collaboration becomes a core competency of the digital transformation.
Digital cataloging, coupled with high-resolution digital imaging, provides healthcare professionals with a fast and convenient method to index large numbers of samples for later retrieval and analysis. This produces large volumes of data (usually large block streaming writes), which must be stored and accessed quickly. Typically, much of this data is collected from dispersed endpoints and then processed by a central information management system. This translates into IT requirements for large-capacity, high-performance, efficient storage with on-demand scaling and failure protection.
The IntelliSite Image Management System (IMS) integrates with existing image analysis software, and connections to Laboratory Information Systems (LIS) and various Hospital Information Systems are being developed. Philips IMS is part of the Philips IntelliSite Pathology Solution, an automated digital pathology image creation, management, and review system comprising an ultra-fast scanner, an image management system, and a display, including advanced features to manage the scanning, storage, presentation, and sharing of information.
VMware vSAN uses industry-standard servers to create a low-cost and high-performance HCI solution. This enables hospital IT departments already familiar with VMware vSphere to natively adopt vSAN for their mission-critical workloads such as picture archiving and communication systems (PACS), electronic health records (EHRs), clinical, or auxiliary applications. VMware NSX®, a leader in network virtualization, allows security to be enforced per VM on a per-policy basis. Security and simplified manageability allow fast integration and easy maintenance.
VMware and Philips have worked together to validate a solution, powered by VMware vSAN, that provides a storage platform for healthcare digital imaging and indexing and satisfies the performance, scalability, and failure protection requirements of an economical yet powerful ecosystem.
The validated solution showcases the use of VMware vSAN as a platform for hosting large datasets in a healthcare environment. We demonstrate the ease of storage management, flexibility, and the performance capabilities of vSAN. Moreover, we validate the networking configurations and throughput, as well as the CPU requirements to provide guidance on system sizing.
This technical reference architecture:
- Provides the solution architecture for hosting healthcare-related datasets in a vSAN cluster.
- Validates predictable performance and scalability with vSAN.
- Demonstrates the impact of changing parameters to achieve the optimal performance.
- Identifies the steps required to ensure resiliency and availability against various failures.
- Provides best practices and general guidance.
This section provides the purpose, scope, and audience of this document.
This reference architecture showcases a solution for running healthcare-related image storage, cataloging, and retrieval.
The reference architecture covers the following scenarios:
- Survey of IO profile, including read and write block size and randomness
- Network throughput and required configuration
- vSAN performance and scalability validation
- Configuration optimizations
- Resiliency against hardware failures
This reference architecture is intended for system architects, designers, and administrators of vSAN-based systems in the context of healthcare image storage, cataloging, and retrieval.
This section provides an overview of the technologies used in this reference architecture:
• VMware vSphere 6.5
• VMware vSAN 6.6
• NSX-V 6.3.4
VMware vSphere 6.5
VMware vSphere 6.5 is the next-generation infrastructure for next-generation applications. It provides a powerful, flexible, and secure foundation for business agility that accelerates the digital transformation to cloud computing and promotes success in the digital economy.
vSphere 6.5 supports both existing and next-generation applications through its:
- Simplified customer experience for automation and management at scale
- Comprehensive built-in security for protecting data, infrastructure, and access
- Universal application platform for running any application anywhere
With vSphere 6.5, customers can run, manage, connect, and secure their applications in a common operating environment, across clouds and devices.
VMware vSAN 6.6
VMware vSAN enables next-generation HCI solutions with low cost and high performance, which converges traditional IT infrastructure silos onto industry-standard servers and virtualizes physical infrastructure. The natively integrated VMware infrastructure combines radically simple VMware vSAN storage management, the market-leading VMware vSphere® Hypervisor, and the VMware vCenter Server® unified management solution all through the broadest and deepest set of HCI deployment options.
The rich feature set of VMware vSAN 6.6 is utilized to achieve the best possible output from the application layer.
NSX for vSphere (NSX-v) 6.3.4
VMware NSX, the network virtualization platform, is a key product in the SDDC architecture. With VMware NSX, virtualization now delivers for networking what it has already delivered for compute and storage. In much the same way that server virtualization programmatically creates, snapshots, deletes, and restores software-based virtual machines (VMs), VMware NSX network virtualization programmatically creates, snapshots, deletes, and restores software-based virtual networks. NSX-v is specific to vSphere hypervisor environments.
This section introduces the resources and configurations for the solution including an architecture diagram, hardware and software resources and other relevant configurations.
Figure 1. Performance Test Environment
Figure 2. Physical Network Overview
Figure 3. Physical Network Integration and Setup
Figure 4. Storage per Host
- 4x HP DL380 G10 series
- Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz, total amount of RAM: 256 GB, disabled Intel Hyper-threading
NIC     Driver Name  VID:DID    SVID:SDID  Driver Version  Firmware Version (HCL link)
------  -----------  ---------  ---------  --------------  ---------------------------
vmnic5  nmlx5_core   15b3:1015  1590:00d3  188.8.131.52    14.21.2800
vmnic6  nmlx5_core   15b3:1015  1590:00d4  184.108.40.206  14.21.2800
vmnic7  nmlx5_core   15b3:1015  1590:00d4  220.127.116.11  14.21.2800
vmnic9  nmlx5_core   15b3:1015  1590:00d4  18.104.22.168   14.21.2800
All vmnic uplinks are utilized and spread over two Distributed Virtual Switches (see Figure 3). DVS-1 has two vmnics (vmnic5 and vmnic6) in an active-active configuration, carrying VM, management, NSX-V, and data lake synchronization traffic. DVS-2 has two vmnics (vmnic7 and vmnic9) configured with LACP for vSAN traffic.
Table 1. Software Resources
VMware NSX for vSphere (NSX-V)
In this section, we present the test methodologies and results used to validate this solution.
The solution validates the use of vSAN as a storage platform for medical imaging (storage and retrieval) with multiple concurrent streams, including the following test scenarios:
- Test 1: 10 parallel scanner streams inbound with RAID 5 erasure coding
- Test 2: 10 parallel scanner streams inbound and 13 parallel reads outbound with RAID 5 erasure coding
- Test 3: 10 parallel scanner streams inbound and 13 parallel reads outbound with RAID 1 (FTT=1)
The single-VM application has no additional capability to separate inbound and outbound traffic (including I/O reads and writes). The ratio of host CPU cores to vCores is an important factor in optimizing performance, because overloading a specific layer lowers the IO throughput. As a reference, an image can vary from 1.25 GB to 6 GB compressed. The scanning rate is 30 MBps per stream uncompressed and 500 MBps per image scanner.
Highly concurrent inbound data streams require a high-performance caching tier to sustain the high-throughput writes.
The application reaches block sizes of 550KB on writes and 128KB on reads.
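As a rough illustration of what these figures imply for the storage layer, the following sketch estimates the aggregate inbound throughput of 10 streams at 30 MBps each and the write IOPS needed to carry that throughput at a 550 KB block size. It is a back-of-the-envelope calculation, not a measured result: the function names are hypothetical, and decimal units (1 MB = 1000 KB) are an assumption.

```python
# Back-of-the-envelope ingest estimate (illustrative only).
# Assumption: decimal units, 1 MB = 1000 KB.

def aggregate_throughput_mbps(streams: int, per_stream_mbps: float) -> float:
    """Total inbound throughput in MB/s for a number of parallel streams."""
    return streams * per_stream_mbps


def required_iops(throughput_mbps: float, block_size_kb: float) -> float:
    """IO operations per second needed to carry a throughput at a block size."""
    return throughput_mbps * 1000 / block_size_kb


inbound = aggregate_throughput_mbps(streams=10, per_stream_mbps=30)
write_iops = required_iops(inbound, block_size_kb=550)
print(f"{inbound:.0f} MB/s inbound, about {write_iops:.0f} write IOPS at 550 KB")
```

With large blocks, the IOPS figure stays modest even at high throughput, which is why throughput and latency, rather than raw IOPS, are the more useful sizing metrics for this kind of workload.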
Note: Large block sizes, together with FTT demands, require high-throughput network endpoints. Increasing FTT or RAID level implies more write acknowledgments across hosts, which lowers the maximum achievable storage performance.
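The trade-off in the note above can be sketched with a simplified model. It assumes the commonly documented vSAN FTT=1 layouts (RAID 1 as two full replicas, RAID 5 as three data components plus one parity component) and the textbook cost of a partial-stripe RAID 5 write (two reads plus two writes). The class and field names are hypothetical, the witness component is ignored, and full-stripe writes of large blocks may avoid part of the RAID 5 cost, so treat the numbers as an upper-bound illustration rather than vSAN internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionLayout:
    """Simplified model of an FTT=1 layout (illustrative, not a vSAN API)."""
    name: str
    data_components: int
    redundancy_components: int
    reads_per_small_write: int   # backend reads caused by one partial-stripe write
    writes_per_small_write: int  # backend writes caused by one partial-stripe write

    @property
    def capacity_multiplier(self) -> float:
        """Raw capacity consumed per unit of usable capacity."""
        total = self.data_components + self.redundancy_components
        return total / self.data_components

    @property
    def backend_ios_per_small_write(self) -> int:
        """Backend IOs issued for a single guest write."""
        return self.reads_per_small_write + self.writes_per_small_write


RAID1 = ProtectionLayout("RAID 1 (FTT=1)", 1, 1, 0, 2)
RAID5 = ProtectionLayout("RAID 5 (FTT=1, 3+1)", 3, 1, 2, 2)

for layout in (RAID1, RAID5):
    print(f"{layout.name}: {layout.capacity_multiplier:.2f}x capacity, "
          f"{layout.backend_ios_per_small_write} backend IOs per small write")
```

In this model RAID 5 saves capacity (1.33x instead of 2x) but issues more backend IOs per guest write, each of which crosses the vSAN network and has to be acknowledged.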
The setup consists of a single VM running the main application, with a number of parallel stream test loads (generated outside the vSAN environment) to simulate inbound traffic, and parallel reads to simulate outbound traffic.
Execute the three tests separately without de-staging. The deduplication and compression feature is not enabled because the sent streams are already compressed (and cannot be compressed further by vSAN).
A micro benchmarking tool was used to determine the different values that are important from the application layer perspective.
The number of streams and the IO direction determine the number of vCores that are required. In our tests, we show the importance of having sufficient cores available on the hypervisor to achieve the best performance in the application. We show that 10 inbound streams require 10 parallel vCores. With 13 parallel reads, an additional 13 cores would be required to maintain a 1:1 relationship between executing tasks and available vCore CPU resources; without it, the % wait state increases.
In this example, we only had 18 cores available on the single physical CPU.
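The core budget described above can be expressed as a small calculation. This is a minimal sketch under the 1:1 stream-to-core assumption stated in the text; the function name is hypothetical, and it does not account for cores consumed by ESXi or vSAN itself.

```python
def core_budget(inbound_streams: int, outbound_reads: int, physical_cores: int):
    """Compare the cores needed for a 1:1 stream-to-core mapping with the
    physical cores available. Returns (required, shortfall, ratio)."""
    required = inbound_streams + outbound_reads
    shortfall = max(0, required - physical_cores)
    ratio = required / physical_cores
    return required, shortfall, ratio


# Values from the test setup: 10 inbound streams, 13 outbound reads, 18 cores.
required, shortfall, ratio = core_budget(10, 13, 18)
print(f"required={required}, shortfall={shortfall}, oversubscription={ratio:.2f}:1")
```

A shortfall greater than zero means that some stream threads must wait for a core, which is consistent with the higher wait state and latency observed when reads are added to the inbound streams.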
Note: The given application requires CPU, memory, and storage resources to process tasks. The task initiated by the application directly or indirectly utilizes vCores/memory and IO.
Figure 5. Testing Overview
Ten Parallel Scanner Streams inbound with RAID 5 Erasure Coding
This test simulated the maximum inbound traffic flow to verify the application behavior with parallel streams. Multiple streams in parallel induce multithreaded execution in the Java application. The full inbound traffic results in direct IO to the vSAN datastore.
The focus point in this test was to understand the maximum utilization of the vCores for 10 parallel inbound streams with vSAN RAID 5 erasure coding.
Ten parallel streams were sent to the VM application layer to verify the network throughput on the inbound vmnic and to verify the IO behavior and IO latency on vSAN.
VM Application workload:
Figure 6. Test Results of 10 Parallel Scanner Streams inbound with RAID 5 Erasure Coding
The test results showed that 10 parallel incoming streams utilized the same number of vCores. Although the stream data was inbound, the application modified the data after the write, which introduced reads. The IO randomness level was around 8%, which indicated a highly sequential write flow.
Across reads and writes combined, the randomness level did not go lower than 30%. A large IO block size equates to high backend utilization, which means that the VM was limited by the vmnic throughput; this translated into a VM latency of around 59 ms. The lower number of VM reads and the smaller read block size indicated data manipulation in the application.
Ten Parallel Scanner Streams inbound and 13 Parallel Reads outbound with RAID 5 Erasure Coding
This test simulated the parallelism of 10 inbound and 13 outbound traffic streams to determine the behavior of the Java application's executed threads. The number of threads was directly proportional to the number of cores assigned to the VM and the available cores on the ESXi host. Inbound traffic resulted in direct IO to the vSAN datastore, and RAID 5 erasure coding introduced a lower vCore throughput.
The focus point of this test was to understand the maximum utilization of the vCore for 10 parallel inbound streams and 13 parallel outgoing streams, with vSAN RAID 5 erasure coding.
This test used the maximum inbound and outbound streams required for real-world execution.
VM Application workload:
Figure 7. Test Results of 10 Parallel Scanner Streams inbound and 13 Parallel Reads outbound with RAID 5 Erasure Coding
The additional 13 parallel reading streams resulted in even higher latency due to the overutilization of the available cores on the CPU.
The read percentage was slightly higher due to the increase in outstanding IO caused by the overloaded cores (as each queue exhibits first-in-first-out behavior).
Note: RAID 5 relies heavily on the CPU; time to execute is impacted as ESXi needs additional cycles for the read-modify-write operation.
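To make the read-modify-write cost concrete, the following sketch shows the generic XOR parity update used by RAID 5 style schemes: updating one data block requires reading the old data and the old parity, computing the new parity, and writing both back. This is the textbook technique on toy two-byte blocks, not vSAN's actual implementation, and all names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks: list) -> bytes:
    """Parity of a full stripe: XOR of all data blocks."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = xor_bytes(parity, block)
    return parity


def read_modify_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting one block: P' = P xor D_old xor D_new.
    Needs two reads (old data, old parity) before the two writes."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# Toy stripe with three data blocks (3+1 layout).
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = full_parity(stripe)

# Overwrite the second block and update parity without touching the others.
new_block = b"\xff\x00"
updated_parity = read_modify_write(stripe[1], parity, new_block)
stripe[1] = new_block

# The incremental update matches a full recomputation of the stripe parity.
print(updated_parity == full_parity(stripe))
```

The extra reads and XOR computations are the additional cycles referred to in the note; RAID 1 avoids them by writing full replicas instead.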
Ten Parallel Scanner Streams inbound and 13 Parallel Reads outbound with RAID 1 (FTT=1)
This test was very similar to the previous one but used RAID 1 instead of RAID 5 erasure coding. The erasure coding overhead was therefore excluded, which allowed maximum vCore utilization within the VM.
The focus point of this test was to understand the maximum utilization of the vCore for 10 parallel inbound streams and 13 parallel outgoing streams with vSAN RAID 1 (FTT=1).
This test used the maximum inbound and outbound streams required for the real-world execution.
VM Application workload:
Figure 8. Test Results of 10 Parallel Scanner Streams inbound and 13 Parallel Reads outbound with RAID 1 (FTT=1)
With RAID 1, the application utilized 16+ vCPU cores. In-guest write IO latency dropped from around 200 ms to around 54 ms, which allowed higher throughput and reduced the outstanding IO (OIO) value by more than 50%.
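The relationship between latency, throughput, and outstanding IO can be illustrated with Little's law (outstanding IO = IOPS x latency). The sketch below uses the two latency figures reported above; the IOPS value is a hypothetical placeholder held constant to isolate the effect of latency, so the resulting OIO numbers are illustrative and not measured results.

```python
def outstanding_io(iops: float, latency_ms: float) -> float:
    """Little's law: average IOs in flight = arrival rate x time in system."""
    return iops * latency_ms / 1000.0


def relative_change(before: float, after: float) -> float:
    """Fractional change from before to after (negative means a reduction)."""
    return (after - before) / before


# Latencies reported in the text for RAID 5 and RAID 1.
LATENCY_RAID5_MS = 200
LATENCY_RAID1_MS = 54
# Hypothetical IOPS, held constant for illustration only.
HYPOTHETICAL_IOPS = 500

latency_change = relative_change(LATENCY_RAID5_MS, LATENCY_RAID1_MS)
oio_raid5 = outstanding_io(HYPOTHETICAL_IOPS, LATENCY_RAID5_MS)
oio_raid1 = outstanding_io(HYPOTHETICAL_IOPS, LATENCY_RAID1_MS)

print(f"latency change: {latency_change:.0%}")
print(f"OIO at constant IOPS: {oio_raid5:.0f} -> {oio_raid1:.0f}")
```

Because throughput also increased in the RAID 1 test, the actual OIO reduction is smaller than the latency reduction alone would suggest, which fits the reported reduction of more than 50%.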
This test demonstrated that changing the FTT method (from RAID 5 to RAID 1) could reduce the overhead on the ESXi physical core to achieve better application and IO throughput.
Graphical Test Result Comparison
Figure 9. Graphical Comparison
This section provides the recommended best practices for this solution.
VM vCPU Demand
VM vCPU demand depends directly on the application's capabilities and its use of multiple threads at the same time. vSAN has the highest thread priority on ESXi, and therefore application performance decreases when a more CPU-intensive failure tolerance method (for example, RAID 5 instead of RAID 1) is chosen. In the scenarios of this reference architecture, we showed the effects of erasure coding on the physical CPU by maximizing the vCPU of the single VM.
A well-thought-out application design is the key to best performance. Inbound parallel stream threads fully utilized all CPU cores: 10 streams equated to 10 CPU cores. Adding outgoing read streams generates additional threads.
The application goal is to have 10 inbound write streams with a maximum of 13 outgoing read streams, which would ideally require 24 vCPU cores to be balanced.
In today’s healthcare IT infrastructure ecosystem, understanding a VM and application workload in conjunction with best-practice VM sizing is critical to achieving operational excellence. vSAN offers a robust and performant storage solution for the Philips IntelliSite Pathology Solution. vSAN demonstrates high performance with low latency for large block IO while maintaining low CPU demands for vSAN itself.
For additional information, see the following white papers:
For additional information, see the following product documentation:
About the Author and Contributors
Andreas Scherr, Senior Solutions Architect in the Storage and Availability, Product Enablement team, wrote the original version of this paper. Catherine Xu, technical writer in the Product Enablement team, edited this paper to ensure that the contents conform to the VMware writing style.
Contributors to this document include (solution engineers or other reviewers who provided comments on the paper):
- Dharmesh Bhatt, Senior Solutions Architect, Storage and Availability, Product Enablement, VMware
- Christian Rauber, Senior Solutions Architect, Storage and Availability, Product Enablement, VMware
- Aart Kenens, System Engineer, Digital Pathology Solutions, Philips