Cloning virtual machines is an area where VAAI can provide many advantages. Flash storage arrays provide excellent IO performance. We wanted to see what difference VAAI makes in virtual machine cloning operations for “All Flash Arrays”.
The following components were used for testing VAAI performance on an all Flash storage array:
- Dell R910 server with 40 cores and 256 GB RAM
- Pure FA-400 Flash Array with two shelves containing 44 Flash drives of 238 GB each, for 8.2 TB usable capacity
- CentOS Linux virtual machine with 4 vCPU, 8 GB RAM, 16 GB OS/boot disk and 500 GB data disk, all on the Pure Storage array
- Software iSCSI on dedicated 10 Gbps ports
Test Virtual Machine:
The virtual machine used for testing was a generic CentOS Linux system with a second virtual data disk of 500 GB capacity. To exercise the cloning process properly, we filled this data disk with random data. Random data ensures that what is being copied is not repetitive in any way and cannot easily be compressed or de-duplicated.
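The effect of random data on data reduction can be checked on a small scale. The sketch below is an illustration and not part of the original test: it compresses 1 MiB read from /dev/urandom and 1 MiB read from /dev/zero with gzip and prints the resulting sizes. The function name compressed_size is made up for this example, and gzip only approximates what an array's compression and de-duplication would do.

```shell
#!/bin/sh
# Illustration only: compare how well gzip compresses random versus zero-filled data.

# compressed_size SOURCE: print the gzip-compressed size, in bytes,
# of the first 1 MiB read from SOURCE.
compressed_size() {
    head -c 1048576 "$1" | gzip -c | wc -c | tr -d ' '
}

echo "random: $(compressed_size /dev/urandom) bytes after gzip"
echo "zeros:  $(compressed_size /dev/zero) bytes after gzip"
```

The random sample should stay at roughly its original size, while the zero-filled sample should shrink to a tiny fraction of it.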
Preparing the Data Disk:
The following dd command was used on Linux to create a large 460 GB file of random data:
dd if=/dev/urandom of=/thinprov/500gb_file bs=1M count=460000
The disk space used on the data disk is shown below; it contains only the random data file generated with the dd command.
[root@linux01 thinprov]# df
Filesystem                      1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00  10220744   2710700   6982480  28% /
/dev/sda1                          101086     20195     75672  22% /boot
tmpfs                             4087224         0   4087224   0% /dev/shm
/dev/sdb1                       516054864 469853428  19987376  96% /thinprov
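Because df reports sizes in 1K blocks, the figures are easier to read once converted. The following is a minimal sketch, with a helper name (kib_to_gib) invented for this example, that converts the used-space figure for /thinprov from the output above into GiB.

```shell
#!/bin/sh
# kib_to_gib KIB: convert a df "1K-blocks" figure to GiB, with one decimal place.
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.1f\n", kib / (1024 * 1024) }'
}

# Used space on /thinprov, taken from the df output above.
echo "used on /thinprov: $(kib_to_gib 469853428) GiB"
```

On a live system, df -h gives a similar human-readable view directly.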
Tuning for VAAI and best performance:
VAAI can be enabled or disabled using the following settings (1 enables, 0 disables):
esxcli system settings advanced set --int-value 1 -o /DataMover/HardwareAcceleratedMove
esxcli system settings advanced set --int-value 1 -o /DataMover/HardwareAcceleratedInit
esxcli system settings advanced set --int-value 1 -o /VMFS3/HardwareAcceleratedLocking
esxcli system settings advanced set --int-value 1 -o /VMFS3/EnableBlockDelete
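Since the test switches all four settings off for one phase and on for the next, it can help to generate the commands from a single value. The sketch below is one possible way to do that, not the procedure used in the study: the function name vaai_commands is hypothetical, and it only prints the commands so they can be reviewed before being run on the ESXi host.

```shell
#!/bin/sh
# vaai_commands VALUE: print the esxcli commands that set the four
# options listed above to VALUE (1 enables, 0 disables).
# Nothing is executed here; the output is meant to be reviewed first.
vaai_commands() {
    value="$1"
    for option in /DataMover/HardwareAcceleratedMove \
                  /DataMover/HardwareAcceleratedInit \
                  /VMFS3/HardwareAcceleratedLocking \
                  /VMFS3/EnableBlockDelete; do
        echo "esxcli system settings advanced set --int-value $value -o $option"
    done
}

vaai_commands 0   # commands that disable VAAI
```

On the host, the printed lines could be pasted into a shell or piped to sh. The current value of an option can be checked with esxcli system settings advanced list -o followed by the option path.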
Adjust Maximum HW Transfer size for better copy performance:
esxcli system settings advanced set --int-value 16384 --option /DataMover/MaxHWTransferSize
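MaxHWTransferSize is expressed in KB, so 16384 corresponds to 16 MB per offloaded copy request, compared with the default of 4096 (4 MB). A rough way to see why the larger value helps is to count how many requests a 500 GB disk needs at each size. The sketch below assumes, as a simplification, that every request is filled to the maximum size; the function name xcopy_requests is made up for this example.

```shell
#!/bin/sh
# xcopy_requests DISK_GIB TRANSFER_KIB: estimate how many copy-offload
# requests are needed to move DISK_GIB of data when each request carries
# TRANSFER_KIB. Simplification: every request is assumed to be full.
xcopy_requests() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

echo "4096 KB transfers:  $(xcopy_requests 500 4096) requests"
echo "16384 KB transfers: $(xcopy_requests 500 16384) requests"
```

Under this assumption, quadrupling the transfer size cuts the number of requests the host has to issue to a quarter.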
For larger I/O sizes, experiments show that setting the Round Robin IOPS limit to 1 has a positive effect on latency:
esxcli storage nmp psp roundrobin deviceconfig set -d <device> -I 1 -t iops
On ESXi 5.5, DSNRO (Disk.SchedNumReqOutstanding) can be set on a per-LUN basis:
esxcli storage core device set -d <device> -O 256
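The Round Robin and DSNRO commands above are per device, so a host with several LUNs needs them repeated for each one. The following sketch prints both commands for every device ID passed to it. The function name device_tuning_commands and the device IDs are placeholders for illustration; real IDs would come from the host, and nothing is executed by the script itself.

```shell
#!/bin/sh
# device_tuning_commands DEVICE...: for each device ID given, print the
# Round Robin (IOPS limit 1) and DSNRO (256) commands shown above.
# Nothing is executed here; the output is meant to be reviewed first.
device_tuning_commands() {
    for device in "$@"; do
        echo "esxcli storage nmp psp roundrobin deviceconfig set -d $device -I 1 -t iops"
        echo "esxcli storage core device set -d $device -O 256"
    done
}

# naa.example0001 and naa.example0002 are placeholder IDs, not real devices.
device_tuning_commands naa.example0001 naa.example0002
```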
Set Disk.SchedQuantum to its maximum (64):
esxcli system settings advanced set --int-value 64 -o /Disk/SchedQuantum
Phase 1: Cloning with VAAI disabled:
For the first phase of the study VAAI was turned off and the settings validated. The cloning process was initiated for the Linux virtual machine and some of the key metrics were observed and captured at the storage array and in vCenter performance charts.
The cloning process was carefully monitored and the time for the cloning operation was observed to be 63 minutes.
The blue area of the chart, between 2:06 and 3:09 PM, represents the cloning operation. During this period latency spikes above 2 ms, IOPS reach about 5000, and bandwidth utilization is around 420 MBPS.
Phase 2: Cloning with VAAI Enabled:
For the second phase of the study VAAI was turned on and the settings validated. The cloning process was initiated for the Linux virtual machine and some of the key metrics were observed and captured at the storage array and in vCenter performance charts.
The cloning process was carefully monitored and the time for the cloning operation was observed to be 19 minutes.
The blue area of the chart, between 3:54 and 4:13 PM, represents the cloning operation. During this period there is only a minimal spike in latency (0.5 ms) and IOPS (3000), with bandwidth utilization around 10 MBPS.
The performance chart for network usage does not correlate with the 10 MBPS average utilization observed during the cloning operation. Network utilization at the vSphere host level shows no increase during the operation, unlike in the non-VAAI case. This shows that the copy activity takes place within the storage array, with no impact on the vSphere host.
Effect of VAAI on the cloning operation:
The observations highlight the large impact that VAAI has on a big copy operation such as a VM clone. A clone of a VM with 500 GB of random data benefits significantly from the use of VAAI-compliant storage, as shown in the following table.

Metric                  VAAI disabled   VAAI enabled
Clone time              63 minutes      19 minutes
Latency                 >2 ms           0.5 ms
IOPS                    5000            3000
Bandwidth utilization   420 MBPS        10 MBPS
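The relative improvement can be worked out from the figures reported in Phases 1 and 2. The sketch below uses a helper named percent_reduction, invented for this example, to compute how much lower each metric is with VAAI enabled. Latency is left out because the Phase 1 value is only given as a lower bound (>2 ms).

```shell
#!/bin/sh
# percent_reduction BEFORE AFTER: percentage by which AFTER is lower than BEFORE.
percent_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Inputs are the Phase 1 (VAAI disabled) and Phase 2 (VAAI enabled) values.
echo "clone time: $(percent_reduction 63 19)% lower with VAAI"
echo "IOPS:       $(percent_reduction 5000 3000)% lower with VAAI"
echo "bandwidth:  $(percent_reduction 420 10)% lower with VAAI"
```

With these inputs, the clone time drops by roughly 70 percent, which is a speed-up of a little over three times (63 / 19).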
VAAI-capable arrays, such as the Pure Storage array used in this study, dramatically improve write-intensive operations such as cloning by reducing the duration of the impact, latency, IOPS and bandwidth consumed. This study shows that even all-flash arrays, which have fast disks capable of very high IOPS, can benefit significantly from VAAI for cloning.