Application Workload Guidance and Design for Virtualized SAP S/4HANA® on vSphere (Part 3/4)

In part 1 we introduced the concept of SAP HANA application workload guidance and used example business requirements to come up with a workload and vSphere cluster design for the SAP environment. In part 2 we looked at storage, network, and security design for the proposed customer environment. In this part we will look at monitoring and management, backup and recovery, and disaster recovery for SAP S/4HANA.

SAP S/4HANA Monitoring and Management

Nearly every component of the IT stack contributes to application performance, which can make it challenging to identify the cause of issues when they arise. For many organizations, a lack of visibility can lead to mean-time-to-innocence hunts that waste time and create alert storms that drain the productivity of business teams. With a complex application such as SAP S/4HANA, performance issues can be even more difficult to pinpoint because the application requires resources from the virtual environment, the network, and databases. However, integrating monitoring into a single console, such as VMware vRealize Operations Manager, can provide visibility into SAP workloads and the other IT relationships that impact performance.

The Management Pack for SAP S/4HANA enhances vRealize Operations Manager by adding dashboards that include the following features:

  • SAP overview dashboard – See heat maps depicting the overall health of SAP landscapes, host systems, and popular instance types such as Java, ABAP, and dual stack.
  • SAP relationships – Access relationships, badges, health trees, and metrics for a particular SAP resource.
  • SAP host overview – View top alerts, heat maps, and relationships to VMware VMs, parent VM KPIs, and CPU and memory metrics for SAP hosts.
  • SAP landscape overview – See top alerts, heat maps, CPU and memory usage metrics, and services utilization for an SAP landscape.
  • SAP HANA environment – See details about SAP HANA resources, including alerts and key metrics.
  • SAP HANA host details – Access detailed information about SAP HANA hosts, including workload, capacity remaining, statements summary, and connections.
  • SAP HANA overview – Select a system to see properties, workload, capacity remaining, request summary, and topology view of relationships.

Reports and Dashboards

Performance issues, particularly across a sprawling application such as SAP, can be challenging to diagnose, especially with the growing complexity of the IT stack in today’s environment. Having the ability to clearly see where issues develop can be a game changer in ensuring availability and a better experience for end users. Reporting and dashboards can extend visibility into key areas of the SAP environment and can identify issues as soon as they arise rather than when they are wreaking havoc on the system.

Figure 10. Example of the SAP System Overview Dashboard in vRealize Operations

Capacity Planning

Predictive analytics in vRealize Operations can provide the insight required to optimize the capacity and health of an SAP environment. Analysis badges offer a visual indication of the current condition of the virtual environment. Real-time updates help quickly determine whether issues stem from workload, capacity, or stress on the application. Capacity definitions help extend that visibility into specific areas of an SAP application and enable reporting on key elements that help determine trends and how to improve application performance.

Figure 11. Example of Analysis Badge for SAP S/4HANA in vRealize Operations

SAP S/4HANA Automation

VMware Adapter for SAP Landscape Management Solution Overview

VMware Adapter for SAP Landscape Management, part of the VMware private cloud solution for SAP, is a virtual appliance that integrates SAP Landscape Management with VMware management software (vCenter Server and vRealize Automation). This delivers unique automation capabilities that radically simplify how SAP Basis administrators and end users provision and manage SAP landscapes. The appliance accepts application calls from SAP Landscape Management, then uses vRealize Automation or VMware vRealize Orchestrator™ workflows to execute commands to vCenter Server or operations related to VMware products, such as starting and stopping a VM. Furthermore, IT administrators can now leverage SA-API to automatically provision SAP systems from templates with vRealize Automation in conjunction with SAP Landscape Management.

Key Benefits

The deployment of new SAP systems can take days or even weeks before systems are ready for use. Customers have long used various cloning methods to speed up the deployment process. However, these processes are complex and labor intensive. The VMware Adapter for SAP Landscape Management – Connector for vRealize Automation (Connector) greatly simplifies the deployment process by utilizing vSphere cloning and SAP Landscape Management to create new SAP workloads in an automated and repeatable form and from proven templates.

Figure 12. VMware Adapter for SAP Landscape Management

SAP S/4HANA Backup and Recovery

SAP HANA on vSphere Backup and Restore

After reviewing how SAP HANA data persistence works, ensure that the savepoints and logs that SAP HANA uses to persist its data are backed up and stored securely, so that they are available for restoring the database after data loss.

An SAP HANA database backup consists of data backup—that is, snapshots—and transaction log backups. The data backup can be scheduled or started manually within SAP HANA Studio, DBA Cockpit, or via SQL commands. Logs are saved automatically in an asynchronous way whenever a log segment is full or the timeout for log backup has elapsed.
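As a rough illustration of the SQL route mentioned above, the sketch below builds a file-based data backup statement of the form BACKUP DATA USING FILE ('<prefix>'). The helper name build_data_backup_sql and the timestamped prefix convention are assumptions made for this example. Running the statement would additionally need a database connection and a user with backup privileges, which is not shown here.

```python
from datetime import datetime


def build_data_backup_sql(prefix: str, when: datetime) -> str:
    """Return a file-based SAP HANA data backup statement.

    The timestamp suffix is a hypothetical naming convention that keeps
    successive backups from overwriting each other.
    """
    if "'" in prefix:
        # Reject quotes so the prefix cannot break out of the SQL literal.
        raise ValueError("prefix must not contain single quotes")
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return f"BACKUP DATA USING FILE ('{prefix}_{stamp}')"


if __name__ == "__main__":
    print(build_data_backup_sql("FULL", datetime(2024, 1, 15, 2, 0, 0)))
```

A scheduler could call a helper like this before handing the statement to whichever SQL client is in use.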

Transaction redo logs are used to record any changes made to the database. In the case of failure, the most recent consistent state of the database can be restored by replaying the changes recorded in the log, redoing completed transactions, and rolling back incomplete ones. Savepoints are created periodically and represent the data stored in the SAP HANA database. They are coordinated across all processes (called SAP HANA services) and instances of the database to ensure transaction consistency. New savepoints normally overwrite older savepoints, but it is possible to freeze a savepoint for future use; this is called a snapshot. Snapshots can be replicated in the form of full data backups, which can be used to restore a database to a specific point in time. Snapshots can also be used to create a database copy for SAP HANA test-and-development systems. Periodic backup of the snapshots and logs ensures the ability to recover from fatal storage faults with minimal loss of data.
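The recovery idea described above can be sketched with a toy model: start from the last savepoint, replay the redo log, and keep only the changes of transactions that committed. This is a simplified illustration, not SAP HANA's actual implementation. The LogRecord type and the recover function are hypothetical, and skipping uncommitted changes stands in for a real rollback.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LogRecord:
    txn: str         # transaction identifier
    op: str          # "set" (a change) or "commit"
    key: str = ""    # key touched by a "set" record
    value: int = 0   # new value written by a "set" record


def recover(savepoint: Dict[str, int], log: List[LogRecord]) -> Dict[str, int]:
    """Rebuild a consistent state from a savepoint plus the log written after it."""
    # Transactions with a commit record are the completed ones.
    committed = {r.txn for r in log if r.op == "commit"}
    state = dict(savepoint)  # copy, so the savepoint itself stays untouched
    for record in log:
        # Redo changes of committed transactions in log order; changes of
        # incomplete transactions are left out, which models the rollback.
        if record.op == "set" and record.txn in committed:
            state[record.key] = record.value
    return state


if __name__ == "__main__":
    savepoint = {"a": 1, "b": 2}
    log = [
        LogRecord("t1", "set", "a", 10),
        LogRecord("t2", "set", "b", 20),  # t2 never commits
        LogRecord("t1", "commit"),
    ]
    print(recover(savepoint, log))
```

In this toy run only t1's change survives, because t2 has no commit record in the log.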

Backup and recovery of virtualized SAP HANA systems is similar to that of any physically deployed SAP HANA system. The backup of the necessary files can be performed as a normal file system backup to an external NFS server. When a backint-compatible backup solution is used, the backup can be performed directly via the backint interface to a backup server and then to the final backup device. Using storage built-in snapshot functionality to create backups is another option. This method is the fastest way to create a backup. Some vendors work on backups that are vSphere snapshot compatible. Storage systems that today support the new VVOL standard already enable snapshotting VMs with the full awareness of the virtual disks belonging to a VM. Figure 13 provides an overview of the SAP HANA backup and recovery methods.

Figure 13. Backup and Recovery for SAP S/4HANA

Disaster Recovery Solutions with vSphere for SAP S/4HANA

We have already discussed recovery solutions for local failures, such as component or OS failures. In addition to these solutions for local failures, SAP HANA offers disaster recovery solutions supported by vSphere that replicate the data from the primary data center to VMs in a secondary data center. SAP HANA System Replication provides a very robust solution to replicate the SAP HANA database content to a secondary disaster recovery site; storage-based system replication can be used as well. When using SAP HANA System Replication, the same number of SAP HANA VMs must exist at the disaster recovery site. These VMs must be configured and installed similarly to a natively running SAP HANA system with System Replication enabled.

SAP HANA System Replication provides various modes for system replication:

  • Synchronous
  • Synchronous in-memory
  • Asynchronous

Depending on requirements, the disaster recovery VMs can consume higher or lower amounts of resources on the disaster recovery vSphere cluster. For instance, the synchronous in-memory mode consumes the same amount of RAM as the primary systems. This mode is required only if the customer requests the shortest recovery time. In most customer scenarios, using synchronous data replication should be sufficient. SAP states that by replicating only the data, about 10 percent of the system resources are required, enabling up to 90 percent of the resources to continue to be used by other systems such as test or QA systems.
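The arithmetic behind these figures can be sketched as follows. The example assumes a disaster recovery host sized like the primary, takes the roughly 10 percent figure quoted above at face value, and treats the in-memory mode as needing the full primary footprint. The function and constant names are hypothetical, and real sizing should follow SAP's sizing guidance.

```python
from typing import Tuple

# Approximate share of resources needed when only data is replicated,
# taken from the "about 10 percent" figure in the text.
DATA_ONLY_FRACTION = 0.10


def dr_memory_split(dr_host_ram_gb: float, preload_in_memory: bool) -> Tuple[float, float]:
    """Return (RAM used by the DR replica, RAM left for test or QA systems)."""
    fraction = 1.0 if preload_in_memory else DATA_ONLY_FRACTION
    used = dr_host_ram_gb * fraction
    return used, dr_host_ram_gb - used


if __name__ == "__main__":
    for preload in (True, False):
        used, free = dr_memory_split(1024, preload)
        print(f"preload={preload}: replica {used:.1f} GB, available {free:.1f} GB")
```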

Figure 14. SAP HANA Scale-Out Solution with Replication

In this scenario, resource over-commitments are used to enable the co-deployment of such an environment. By using resource pools and resource shares, required resources can be provided to the disaster recovery SAP HANA scale-out VMs. The co-deployed system, with fewer resource shares, experiences performance degradation once the disaster recovery systems take over following a site failover. Alternatively, evacuate these VMs to other available vSphere hosts to free up all resources for the disaster recovery SAP HANA VMs, rather than running the two systems in parallel, with resource limitations, on the same platform.
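To illustrate why the co-deployed system slows down, the sketch below divides a contended resource in proportion to shares. It deliberately ignores reservations, limits, and actual demand, which also influence real vSphere scheduling. The pool names and share values are made up for the example.

```python
from typing import Dict


def allocate_by_shares(capacity_gb: float, shares: Dict[str, int]) -> Dict[str, float]:
    """Split a fully contended resource proportionally to each pool's shares."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one pool needs a positive share value")
    return {name: capacity_gb * value / total for name, value in shares.items()}


if __name__ == "__main__":
    # Hypothetical pools: the DR SAP HANA VMs get four times the shares
    # of the co-deployed test/QA system.
    print(allocate_by_shares(1000, {"hana_dr": 8000, "test_qa": 2000}))
```

Under full contention, the pool with fewer shares receives the smaller slice, which matches the degradation described for the co-deployed system after a failover.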

System replication via storage or the SAP HANA replication solution requires additional steps after a site failover has taken place, to switch the network identity (IP redirect) of the replicated systems from the disaster recovery configuration to the production configuration. This can be done manually or via automated tools such as HP ServiceGuard, SUSE cluster connector, SAP Landscape Virtualization Management (LVM), or other cluster managers. The configuration of such a solution in a virtualized environment is similar to that of natively running systems. Contact your storage vendor to discuss a cluster manager solution supported by its storage solution.

VMware Site Recovery Manager

VMware Site Recovery Manager™ can help reduce the complexity of a system replication disaster recovery solution by automating the complex disaster recovery steps on any level. Site Recovery Manager is designed for disaster recovery of a complete site or a data center failure. It supports both unidirectional and bidirectional failover. It also supports “shared recovery site,” enabling organizations to fail over multiple protected sites into a single, shared recovery site. This site can, for instance, also be a cloud data center provided by VMware vCloud® Air, the VMware cloud service offering.

The following key elements compose a Site Recovery Manager deployment for SAP:

  • Site Recovery Manager – Designed for virtual-to-virtual disaster recovery. Site Recovery Manager requires a vCenter Server management server at each site. These two vCenter Server instances are independent, each managing its own site. Site Recovery Manager informs them of the VMs they must recover if a disaster occurs.
  • Site Recovery Manager manages, updates, and executes disaster recovery plans. It is managed via a vCenter Server plug-in.
  • Site Recovery Manager relies on storage vendors’ array-based replication on Fibre Channel or NFS storage that supports block-level replication of SAP HANA data and log files to the disaster recovery site. Site Recovery Manager communicates with the replication process via storage replication adapters that are offered by the storage vendor and certified for Site Recovery Manager.
  • vSphere Replication has no such restrictions on use of storage type or adapters. It can be used for VMs that are either static or are not performance critical, such as infrastructure services or SAP application servers with RPO of 15 minutes or longer.

Figure 15 shows an example SAP landscape protected by Site Recovery Manager and storage. The VMs running on the primary site contain all needed infrastructure and SAP components such as LDAP, SAP HANA database, and SAP application servers, as in an SAP Business Suite implementation. The VMs can be replicated, depending on the recovery point objective (RPO), via vSphere, SAP HANA, or storage replication. vSphere Replication can be used with VMs that tolerate an RPO of 15 minutes or longer.
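The selection logic described above can be captured in a small decision helper. This is only a sketch: the 15-minute threshold comes from the text, while the function name, the method labels, and the assumption that the SAP HANA database VM is always treated as performance critical (and therefore not a vSphere Replication candidate) are illustrative choices.

```python
from datetime import timedelta
from typing import List

# Minimum RPO stated in the text for using vSphere Replication.
VSPHERE_REPLICATION_MIN_RPO = timedelta(minutes=15)


def candidate_replication_methods(rpo: timedelta, is_hana_database: bool) -> List[str]:
    """List replication options for a VM, given its RPO and role."""
    methods = []
    if is_hana_database:
        methods.append("SAP HANA System Replication")
    # Array-based replication is listed for every VM in this sketch.
    methods.append("array-based storage replication")
    # Assumption: only VMs that are not the HANA database and that tolerate
    # an RPO of 15 minutes or more qualify for vSphere Replication.
    if not is_hana_database and rpo >= VSPHERE_REPLICATION_MIN_RPO:
        methods.append("vSphere Replication")
    return methods


if __name__ == "__main__":
    print(candidate_replication_methods(timedelta(minutes=30), is_hana_database=False))
    print(candidate_replication_methods(timedelta(minutes=1), is_hana_database=True))
```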

Figure 15. SAP Landscape Protected by VMware Site Recovery Manager

Here is a summary of the benefits of using Site Recovery Manager for managing the disaster recovery process for SAP landscapes:

  • Reduce the cost of disaster recovery by up to 50 percent.
  • Application-agnostic protection eliminates the need for application-specific point solutions.
  • Support for vSphere Replication and array-based replication offers choices and options for synchronous replication with zero data loss.
  • Centralized management of recovery plans directly from VMware vSphere Web Client replaces manual runbooks.
  • Self-service, policy-based provisioning via vRealize Automation automates protection.
  • Frequent, nondisruptive testing of recovery plans ensures highly predictable recovery objectives.
  • Automated orchestration of site failover and failback with a single click reliably reduces RTO.
  • Planned migration workflows enable disaster avoidance and data center mobility.

More details from this blog series can be found in this comprehensive white paper. In the final part we will validate the design that was built over the past three parts and conclude the four-part blog series.