The first blog in this series, which can be found here, focused on how to scale up CPU and memory by hot adding them on the fly to a VM running an Oracle workload.
This is the second blog in the on-demand scale-up series. It focuses on how to Hot Add / Hot Remove vmdk(s) to / from a VM running an Oracle workload without any application downtime.
This blog does not cover the actual steps to Hot Add / Hot Remove vmdk(s) / rdm(s) to / from the Oracle Database VM.
It covers only the OS and Oracle database steps to be taken once non-clustered vmdk(s) are added to or removed from a VM while the database is running.
Non-Clustered and Clustered vmdk(s)
VMFS is a clustered file system that, by default, prevents multiple virtual machines from opening and writing to the same virtual disk (.vmdk file). This keeps more than one virtual machine from inadvertently accessing the same .vmdk file. The multi-writer option allows VMFS-backed disks to be shared by multiple virtual machines.
As we are all aware, Oracle RAC requires shared disks that can be accessed by all nodes of the RAC cluster. KB 1034165 provides more details on how to set the multi-writer option to allow VMs to share vmdks. The requirements for a shared disk with the multi-writer flag set in a RAC environment are that the disk
- has to be Eager Zero Thick provisioned
- need not be set to Independent persistent
KB 2121181 provides more details on how to set the multi-writer option to allow VM’s to share vmdk’s on a VMware vSAN environment.
Starting VMware vSAN 6.7 P01 (ESXi 6.7 Patch Release ESXi670-201912001), the requirement to have the shared disks as Eager Zero Thick (EZT) has been removed, so the RAC shared disks can now be thin provisioned.
This also applies to other clustered applications running on vSAN that use multi-writer disks for clustering purposes, e.g. Oracle RAC, Red Hat Clustering, Veritas InfoScale, etc.
More information on running Oracle RAC on vSAN 6.7 P01 with thin provisioned vmdk’s can be found here.
Supported and Unsupported Actions or Features with Multi-Writer Flag ( KB 1034165 & KB 2121181 )
A current restriction of shared vmdks using the multi-writer attribute is that Storage vMotion is disallowed, as per KB 1034165 & KB 2121181.
Use Cases for Hot Add / Hot Removal of vmdk(s) / rdm(s)
Some of the use cases would include
- Scaling up / down VM storage on the fly without application downtime – Hot Add / Hot Removal of both non-clustered & clustered vmdk(s) / rdm(s) can be done online to an Oracle workload VM without any application downtime or powering off the VM
- Relocating clustered vmdk(s) using multi-writer attribute from one datastore to another datastore on the same storage array or from one storage array to a different storage array
Oracle Automatic Storage Management (ASM) and ASM Online Rebalancing
Oracle ASM is Oracle’s recommended storage management solution that provides an alternative to conventional volume managers, file systems, and raw devices. Oracle ASM is a volume manager and a file system for Oracle Database files that supports single-instance Oracle Database and Oracle Real Application Clusters (Oracle RAC) configurations. Oracle ASM uses disk groups to store data files.
More information on Oracle ASM can be found here.
You can add or drop ASM disks to / from an ASM diskgroup online without downtime. After you add new disks, the new disks gradually begin to accommodate their share of the workload as rebalancing progresses. Oracle ASM automatically rebalances disk groups when their configuration changes, including changes to file group. However, you might want to do a manual rebalance operation to control the speed of what would otherwise be an automatic rebalance operation.
More information on Oracle ASM rebalancing can be found here.
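As a rough illustration of a manual rebalance, the sketch below only builds the SQL statement that sets the rebalance power for a disk group. The helper name rebalance_sql is made up for this example, and the 0 to 1024 range check is an assumption that depends on the ASM release and compatibility settings in use.

```shell
#!/bin/sh
# Print an "alter diskgroup ... rebalance power N" statement.
# rebalance_sql is a hypothetical helper name, not an Oracle utility.
rebalance_sql() {
    dg=$1
    power=$2
    # Reject anything that is not a plain non-negative integer.
    case $power in
        ''|*[!0-9]*) echo "power must be an integer" >&2; return 1 ;;
    esac
    # Assumption: 0-1024 is the valid power range on this ASM release.
    if [ "$power" -gt 1024 ]; then
        echo "power must be between 0 and 1024" >&2
        return 1
    fi
    printf 'alter diskgroup %s rebalance power %s;\n' "$dg" "$power"
}

rebalance_sql DATA_DG 8
```

The printed statement could then be pasted into a sqlplus session connected as sysasm. A lower power value trades rebalance speed for less I/O impact on the running database.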
Linux udev
Udev is the mechanism used to create and name /dev device nodes corresponding to the devices that are present in the system. Udev uses matching information provided by sysfs with rules provided by the user to dynamically add the required device nodes. Udev rules are used to preserve the device names across reboots in Linux.
More information on Linux custom udev rules in RHEL7 can be found here.
The SCSI device’s unique ID needs to be extracted and assigned to a symbolic name which will persist across reboots. The Linux command scsi_id can be used to get the SCSI device’s unique ID.
The key-value pair disk.EnableUUID = "TRUE" needs to be added to the .vmx file of the VM so that the VM presents the SCSI ID of the device.
[root@sb_ol76_ora19c ~]# /usr/lib/udev/scsi_id -gud /dev/sdc
36000c298e708af163b3f024b2b9421df
[root@sb_ol76_ora19c ~]#
More information on this can be found in the RedHat article “How to check ‘disk.EnableUUID’ parameter from VM in vSphere“.
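To show how the SCSI ID ties into a persistent name, here is a minimal sketch that prints one udev rule line for a given ID. The helper name asm_udev_rule, the symlink directory oracleasm/, and the owner, group and mode values are all assumptions for illustration and have to match the local Grid Infrastructure setup.

```shell
#!/bin/sh
# Build one udev rule line that maps a SCSI ID to a fixed symlink.
# asm_udev_rule is a hypothetical helper; owner, group and mode below
# are assumptions and must match the local Grid Infrastructure setup.
asm_udev_rule() {
    scsi_id=$1      # value printed by /usr/lib/udev/scsi_id -gud <device>
    link=$2         # symlink name to create under /dev/oracleasm/
    printf 'KERNEL=="sd?1", SUBSYSTEM=="block", '
    printf 'PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", '
    printf 'RESULT=="%s", SYMLINK+="oracleasm/%s", ' "$scsi_id" "$link"
    printf 'OWNER="grid", GROUP="asmadmin", MODE="0660"\n'
}

asm_udev_rule 36000c298e708af163b3f024b2b9421df data_disk01
```

The rule matches the first partition of any sd device, runs scsi_id against the parent disk, and creates the symlink only when the result equals the given ID, so the name follows the disk even if the kernel assigns it a different sdX letter after a reboot.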
Test Setup for Hot Add & Hot Removal of non-clustered vmdk(s)
The steps to add new vmdk(s) to a VM online without powering off the VM can be found here. The steps to add new rdm(s) to a VM online without powering off the VM can be found here. The steps to remove vmdk(s) / rdm(s) from a VM are the same.
Care must be taken to make sure that the disk (vmdk / rdm) is dropped / removed from the database first, and that any cleanup operations are then performed on the OS side, before the disk (vmdk / rdm) is removed from the VM.
VM ‘SB-OL-Ora19C-HotDisk’ was created on the ESXi 7.0 platform with OS OEL 7.6 UEK and with Oracle 19c Grid Infrastructure & RDBMS installed.
Oracle ASM was the storage platform with Oracle ASMLIB. Oracle ASMFD can also be used instead of Oracle ASMLIB.
If Linux udev is used for device persistence, please refer to the RedHat article which details how to add Oracle ASM devices in RHEL7 using udev.
The rest of the steps for adding or dropping Oracle ASM disks to / from an Oracle ASM disk group are the same, whether we use Oracle ASMLIB, Oracle ASMFD or Linux udev.
The VM has 3 vmdks:
- Hard Disk 1 of 60G is for the OS
- Hard Disk 2 of 60G is for the Oracle 19c Grid Infrastructure & RDBMS binaries
- Hard Disk 3 of 140G is for the Oracle database which is using Oracle ASM storage with Oracle ASMLIB and is at SCSI position SCSI 1:0
Details of Hard Disk 3 140G Oracle ASM Disk:
OS Details of the Oracle ASM Disk of size 140G :
[root@sb_ol76_ora19c ~]# fdisk -lu /dev/sdc
Disk /dev/sdc: 150.3 GB, 150323855360 bytes, 293601280 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xc262dc7c
Device Boot Start End Blocks Id System
/dev/sdc1 2048 146800639 73399296 83 Linux
[root@sb_ol76_ora19c ~]#
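The 140G vmdk shows up as 150.3 GB in the fdisk output because fdisk prints decimal gigabytes, while the vmdk size is in binary units. A small sketch of the arithmetic, using the sector count from the output above (the function names are made up for this example):

```shell
#!/bin/sh
# Convert an fdisk sector count (512-byte sectors) to bytes and whole GiB.
sectors_to_bytes() { echo $(( $1 * 512 )); }
bytes_to_gib()     { echo $(( $1 / 1073741824 )); }   # 1 GiB = 1024^3 bytes

bytes=$(sectors_to_bytes 293601280)
echo "bytes: $bytes"                    # 150323855360
echo "GiB:   $(bytes_to_gib "$bytes")"  # 140
```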
Oracle ASM and Oracle Database are online:
oracle@sb_ol76_ora19c:ora19c:/home/oracle> smon
0 S grid 11279 1 0 80 0 - 388555 - 21:33 ? 00:00:00 asm_smon_+ASM
0 S oracle 12208 1 0 80 0 - 640228 - 21:44 ? 00:00:00 ora_smon_ora19c
oracle@sb_ol76_ora19c:ora19c:/home/oracle>
List of Oracle ASM disks :
[root@sb_ol76_ora19c ~]# oracleasm listdisks
DATA_DISK01
[root@sb_ol76_ora19c ~]#
Test Steps
- ASM diskgroup DATA_DG has an existing ASM disk ‘DATA_DISK01’ of size 140GB at SCSI 1:0 position
- Add a new ASM disk ‘DATA_DISK02’ of size 300GB at SCSI 2:0 position
- After the new disk ‘DATA_DISK02’ is added to Oracle ASM, relocate all Oracle data to the new ASM disk ‘DATA_DISK02’
- Drop the old ASM disk ‘DATA_DISK01’ from Oracle ASM
- Remove old ASM disk ‘DATA_DISK01’ from VM
Rescan the OS to see the newly added disk:
[root@sb_ol76_ora19c ~]# fdisk -lu
…..
Disk /dev/sdc: 150.3 GB, 150323855360 bytes, 293601280 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xc262dc7c
Device Boot Start End Blocks Id System
/dev/sdc1 2048 293601279 146799616 83 Linux
…….
Disk /dev/sdd: 322.1 GB, 322122547200 bytes, 629145600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
[root@sb_ol76_ora19c ~]#
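If a hot-added disk does not show up on its own, one common way to rescan on Linux is to write "- - -" to the scan file of each SCSI host under sysfs. The sketch below assumes that approach; the helper name rescan_scsi_hosts and its optional directory argument are made up for this example, the latter only so the loop can be tried against a scratch directory.

```shell
#!/bin/sh
# Ask every SCSI host adapter to rescan all channels, targets and LUNs.
# rescan_scsi_hosts is a hypothetical helper; the optional argument exists
# only so the loop can be pointed at a scratch directory for a dry run.
rescan_scsi_hosts() {
    base=${1:-/sys/class/scsi_host}
    for host in "$base"/host*; do
        # Skip when the glob matched nothing or the scan file is missing.
        [ -e "$host/scan" ] || continue
        echo "- - -" > "$host/scan"
        echo "rescanned ${host##*/}"
    done
}

# On the VM, as root:
#   rescan_scsi_hosts
#   fdisk -lu
```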
Partition the new disk ‘/dev/sdd’ using the fdisk or parted utilities. Partitioning is a requirement for ASMLIB / ASMFD; more information on that can be found here. Linux udev, though, does not require the disks to be partitioned. The general best practice is to partition the disks so that no one inadvertently creates a partition table on the new disk and causes an outage.
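One possible way to do the partitioning non-interactively is with parted. The sketch below only prints the commands rather than running them, so they can be reviewed first; the helper name partition_plan, the msdos label and the 1MiB start offset are assumptions for this example.

```shell
#!/bin/sh
# Print (not run) the parted commands that would create one partition
# spanning the whole disk. partition_plan is a hypothetical helper.
partition_plan() {
    disk=$1
    echo "parted -s $disk mklabel msdos"
    # Start at 1MiB so the partition is aligned; end at 100% of the disk.
    echo "parted -s -a optimal $disk mkpart primary 1MiB 100%"
}

partition_plan /dev/sdd
```

Writing a new disk label destroys any existing partition table, so the device name should be double-checked against the fdisk output before the printed commands are run as root.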
Rescan the OS to see the new partition:
[root@sb_ol76_ora19c ~]# fdisk -lu
…..
Disk /dev/sdc: 150.3 GB, 150323855360 bytes, 293601280 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xc262dc7c
Device Boot Start End Blocks Id System
/dev/sdc1 2048 293601279 146799616 83 Linux
…….
Disk /dev/sdd: 322.1 GB, 322122547200 bytes, 629145600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
[root@sb_ol76_ora19c ~]#
Create Oracle ASM disk ‘DATA_DISK02’
[root@sb_ol76_ora19c ~]# oracleasm createdisk DATA_DISK02 /dev/sdd1
Writing disk header: done
Instantiating disk: done
[root@sb_ol76_ora19c ~]#
[root@sb_ol76_ora19c ~]# oracleasm listdisks
DATA_DISK01
DATA_DISK02
[root@sb_ol76_ora19c ~]#
Add the new ASM disk ‘DATA_DISK02’ to the ASM diskgroup ‘DATA_DG’. Then after the operation is successfully completed, drop the old ASM disk ‘DATA_DISK01’ with ASM rebalance power set to maximum to speed up the operation.
grid@sb_ol76_ora19c:+ASM:/home/grid> sqlplus / as sysasm
SQL*Plus: Release 19.0.0.0.0 - Production on Wed Aug 12 22:31:20 2020
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> alter diskgroup DATA_DG add disk 'ORCL:DATA_DISK02' name DATA_DISK02;
Diskgroup altered.
SQL> select * from v$asm_operation;
GROUP_NUMBER OPERA PASS      STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE     CON_ID
------------ ----- --------- ---- ---------- ---------- ---------- ---------- ---------- ----------- ---------- ----------
           1 REBAL COMPACT   WAIT          1          1          0          0          0           0                     0
           1 REBAL REBALANCE RUN           1          1       1091       2846       4338           0                     0
           1 REBAL REBUILD   DONE          1          1          0          0          0           0                     0
SQL>
SQL> select * from v$asm_operation;
no rows selected
SQL>
SQL> alter diskgroup DATA_DG drop disk DATA_DISK01 rebalance power 1024;
Diskgroup altered.
SQL>
SQL> select * from v$asm_operation;
GROUP_NUMBER OPERA PASS      STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE     CON_ID
------------ ----- --------- ---- ---------- ---------- ---------- ---------- ---------- ----------- ---------- ----------
           1 REBAL COMPACT   WAIT       1024       1024          0          0          0           0                     0
           1 REBAL REBALANCE RUN        1024       1024       1040       1333      70849           0                     0
           1 REBAL REBUILD   DONE       1024       1024          0          0          0           0                     0
SQL>
Repeat the above query until it returns no rows, which means the online rebalance operation has completed.
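Instead of rerunning the query by hand, the check can be scripted. The sketch below polls the row count of v$asm_operation until it reaches zero. The function names are made up for this example, it assumes it runs as the grid user with the ASM environment set, and it has no timeout or error handling, so a failing sqlplus call would keep it looping.

```shell
#!/bin/sh
# Count rows in v$asm_operation; zero rows means no rebalance is running.
asm_operation_count() {
    sqlplus -s / as sysasm <<'EOF'
set heading off feedback off pagesize 0
select count(*) from v$asm_operation;
EOF
}

# Poll until the count reaches zero.
# $1: command that prints the row count (defaults to the function above)
# $2: seconds to sleep between polls (defaults to 30)
wait_for_rebalance() {
    count_cmd=${1:-asm_operation_count}
    interval=${2:-30}
    while :; do
        rows=$("$count_cmd" | tr -d '[:space:]')
        [ "$rows" = "0" ] && return 0
        echo "rebalance still running ($rows active rows)"
        sleep "$interval"
    done
}

# As the grid user:
#   wait_for_rebalance
```

Passing the counting command as an argument keeps the loop separate from the sqlplus call, which makes it easy to try the loop with a stand-in command before pointing it at a real ASM instance.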
Now it’s safe to reclaim the old disk ‘DATA_DISK01’ of size 140GB. Remove the old disk from the VM.
Running an fdisk command shows the old 140GB disk has been successfully removed. The database is online with no issues.
Hot Add & Hot Removal of clustered vmdk(s) / rdm(s)
The steps for Hot Add & Hot Removal of clustered vmdk(s) / rdm(s) have been described in detail in the blog articles below:
Add shared vmdk online without downtime for Oracle RAC ASM / OCFS2
https://blogs.vmware.com/apps/2017/09/rac-n-rac-night-oracle-rac-vsphere-6-x.html
Add Shared RDM in Physical/Virtual Compatibility mode for Oracle RAC
https://blogs.vmware.com/apps/2017/08/rdm-oracle-rac-not-question.html
Summary
- We can Hot Add / Hot Remove vmdk(s) / rdm(s) to / from a VM running an Oracle workload without any application downtime.
- Some of the use cases include:
  - Scaling up / down VM storage on the fly without application downtime – Hot Add / Hot Removal of both non-clustered & clustered vmdk(s) / rdm(s) can be done online to an Oracle workload VM without any application downtime or powering off the VM
  - Relocating clustered vmdk(s) using multi-writer attribute from one datastore to another datastore on the same storage array or from one storage array to a different storage array
All Oracle on vSphere white papers, including Oracle licensing on vSphere/vSAN, Oracle best practices, RAC deployment guides and the workload characterization guide, can be found at the URL below:
Oracle on VMware Collateral – One Stop Shop
https://blogs.vmware.com/apps/2017/01/oracle-vmware-collateral-one-stop-shop.html