We have received several requests from SAP Basis colleagues on how to design SAP systems on VMware. Now that vSphere 5 supports virtual machines with up to 32 vCPUs, larger SAP systems can fit into a single virtual machine (VM), which raises the question: 2-tier or 3-tier? Here are some guidelines.
First, let’s cover sizing, as it affects the final VM architecture. SAP sizing is expressed in the SAP metric “SAPS” (http://www.sap.com/solutions/benchmark/measuring/index.epx). All SAP on VMware sizing is officially conducted by the server vendor’s SAP practice. VMware partners with the server vendors, so we can help, but we are not ultimately responsible for sizing. The background is as follows:
- SAP has officially deferred sizing, on both physical and virtual platforms, to its hardware partners (since approximately 1993). SAP’s position is documented on the SAP Service Marketplace at https://service.sap.com/sizing (logon credentials are required). As of April 2012:
  - Under “Sizing Responsibilities” it says “The hardware vendors are responsible for providing hardware that will meet the customer’s throughput and response time requirements”.
  - Under “Virtualization – Some Statements about Sizing and Virtualization” it says “For the right virtualization strategy you should get in touch with your hardware vendor”.
- The SAPS rating per vCPU depends only on the processor model of the underlying server. The hardware vendor has the most up-to-date SAPS ratings for its servers and can therefore size most accurately. For example, the SAPS rating of a virtual machine with 4 vCPUs will change if it is moved from one server model to another.
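To illustrate why the same VM gets a different SAPS rating on a different server, here is a minimal sketch. It assumes, purely for illustration, that a host's SAPS rating divides evenly across its physical cores and that one vCPU maps to one core; real sizing comes from the hardware vendor. The function names and the host figures are hypothetical, not benchmark results.

```python
def saps_per_vcpu(host_saps: float, host_cores: int) -> float:
    """Rough SAPS per vCPU, assuming one vCPU per physical core (simplification)."""
    if host_cores <= 0:
        raise ValueError("host_cores must be positive")
    return host_saps / host_cores


def vm_saps(vcpus: int, host_saps: float, host_cores: int) -> float:
    """Estimated SAPS of a VM with `vcpus` vCPUs on the given host."""
    return vcpus * saps_per_vcpu(host_saps, host_cores)


# Hypothetical hosts with made-up ratings, both with 16 cores.
print(vm_saps(4, host_saps=20000, host_cores=16))  # 5000.0
print(vm_saps(4, host_saps=30000, host_cores=16))  # 7500.0
```

The 4-vCPU VM itself is unchanged; only the processor underneath it differs, which is why the vendor's current ratings matter.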
The hardware vendor can conduct the sizing and provide the number of ESX servers required to fulfill the business requirements. Once this is available, VMware can work with the hardware vendor and the customer to jointly fine-tune the VM size and layout.
We recommend starting conservatively for business-critical workloads. An initial sizing option is to allocate a total number of vCPUs equal to the number of physical cores on the ESX server. We would do this even on hyper-threaded systems, that is, count cores rather than logical threads.
To achieve higher utilization, the total number of vCPUs running on an ESX server can be higher than the total number of physical cores. The ESX hypervisor is designed to schedule the workload optimally across the available CPUs. Additionally, it can be configured to give more important virtual machines a higher priority. Hardware-supported features like hyper-threading increase CPU scheduling efficiency. No general statement can be made about the optimal CPU over-commitment ratio, as this always depends on the individual utilization patterns of the workload.
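The relationship between vCPUs and physical cores described above can be expressed as a simple ratio. The sketch below is only an illustration; the function name and the VM layouts are hypothetical. A ratio of 1.0 corresponds to the conservative starting point (vCPUs = cores), and anything above 1.0 is over-commitment.

```python
from typing import Sequence


def overcommit_ratio(vm_vcpus: Sequence[int], physical_cores: int) -> float:
    """Total vCPUs of all VMs on a host divided by its physical cores.

    Hyper-threads are deliberately not counted as cores here.
    """
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vm_vcpus) / physical_cores


# Hypothetical 16-core ESX server.
print(overcommit_ratio([8, 4, 4], physical_cores=16))     # 1.0 (conservative start)
print(overcommit_ratio([8, 8, 4, 4], physical_cores=16))  # 1.5 (over-committed)
```

Whether a given ratio above 1.0 is acceptable cannot be read off this number alone; it depends on how busy the individual VMs actually are.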
2-tier versus 3-tier
The architecture of a single SAP system consists of a database instance, application server instances, and a Central Instance (CI), which includes locking and message services and other SAP processes. In newer SAP releases the Central Instance is replaced by Central Services (CS, locking and messaging only) and the Primary Application Server instance (PAS).
2-tier refers to all of these components running in the same guest OS/VM. 3-tier refers to the case where the components of a single SAP system are spread across at least two VMs; each component can be deployed in its own VM. The advantage of 2-tier systems is that there are fewer VMs to manage and there is no network latency between the SAP components.
3-tier has the following advantages:
- Flexibility, better resource management and better overall high availability: if everything is in one VM and the VM, guest OS or ESX server goes down, you lose every component. If the workload is dynamic (e.g. month-end requires more application tier resources), you can add or remove application server VMs as required, so 3-tier is the better fit (same principles as physical).
- You can set up an ESX cluster in an “n+1” configuration, i.e. if one ESX server goes down, all of its VMs can restart on the remaining ESX servers and continue to perform as before (this requires auto-restart scripts for the instances, or “Autostart=1” in the instance startup profile). 3-tier setups allow you to spread the VMs of a single SAP system across multiple ESX servers, so the failure of one ESX server has less impact on that system (of course, if the DB/CS virtual machine is offline the SAP system is down, but hopefully only one component needs to be restarted).
- 3-tier setups allow you to size VMs better so they align with NUMA architecture.
  - DB VM: this needs to scale up vertically, so if sizing requires a large DB you can put it in a large VM. Ideally the VM fits inside a NUMA node, but if it can’t, no big deal: vSphere 5 supports wide VMs that span NUMA nodes (and you can configure virtual NUMA to take advantage of any NUMA optimizations inside the guest OS).
  - Application server VMs can scale out horizontally: size these in smaller blocks so that they fit inside a NUMA node.
  - For more background, see http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf, pages 39-40.
- ABAP+JAVA stack: SAP prefers to separate the ABAP and Java stacks, and we can comply with this in virtual by putting the stacks in separate VMs. See http://wiki.sdn.sap.com/wiki/display/SI/SAPs+Dual+Stack+Strategy, which recommends single stack except where dual stack is a hard requirement of the SAP product, e.g. Solution Manager. The advantage is that you can manage performance tuning separately in each VM; for example, ABAP does not support large pages, but Java does (see SAP note 1681501, “Configure a SAP JVM to use large pages on Linux”). However, if you need to run a dual stack, you can do so in a single VM; just size the VM large enough to handle the memory and CPU of both stacks.
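The NUMA guidance in the list above (keep application server VMs inside a NUMA node, accept a wide VM for a large DB) can be sketched as a simple check. This is a minimal illustration, not a VMware tool; the `NumaNode` class, the function name and all sizes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NumaNode:
    """Hypothetical description of one NUMA node of an ESX server."""
    cores: int
    memory_gb: int


def fits_in_numa_node(vcpus: int, memory_gb: int, node: NumaNode) -> bool:
    """True if the VM's vCPUs and memory both fit inside a single NUMA node."""
    return vcpus <= node.cores and memory_gb <= node.memory_gb


# Hypothetical host: NUMA nodes with 8 cores and 64 GB each.
node = NumaNode(cores=8, memory_gb=64)

print(fits_in_numa_node(4, 32, node))   # True: app server VM stays within one node
print(fits_in_numa_node(12, 96, node))  # False: large DB VM would be a wide VM
```

A `False` result is not an error: it only means the VM would span NUMA nodes, which is the wide-VM case described for the DB VM.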
In the physical world some customers run batch jobs on the CI, which is on the same physical server as the DB instance. The advantage is that the jobs run quicker, as there is no network hop between the app and the DB. In virtual, a similar setup would require a large VM with the DB and the app server/CI instance installed in the same guest OS. The only downside is that if there are long periods when the batch jobs are not running, we end up with an oversized VM with low utilization. Some datacenters may have a security requirement to separate the DB into its own guest OS, in which case your options are limited. VMware supports hot-add of vCPUs for the latest Linux and Windows versions, but hot-remove is not supported. One solution, if the batch job is designed to run in parallel threads (many SAP ABAP batch jobs have this ability), is to increase the degree of parallelism and distribute the batch workload across more app server VMs to decrease the overall runtime of the job (this assumes you have available CPU). These app server VMs can be provisioned and de-provisioned based on the cyclical nature of the workload.
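As a rough illustration of how adding app server VMs can shorten a parallelized batch job, here is an idealized estimate. It assumes equally sized, independent work packages, a fixed number of batch processes per VM, and a DB that is not the bottleneck; none of these assumptions come from SAP or VMware, and the function name and numbers are hypothetical.

```python
import math


def estimated_runtime_minutes(work_packages: int, minutes_per_package: float,
                              app_vms: int, procs_per_vm: int) -> float:
    """Idealized batch runtime: packages are processed in 'waves' of parallel slots."""
    slots = app_vms * procs_per_vm
    if slots <= 0:
        raise ValueError("need at least one batch process")
    waves = math.ceil(work_packages / slots)
    return waves * minutes_per_package


# Hypothetical job: 120 packages of 5 minutes each, 4 batch processes per VM.
print(estimated_runtime_minutes(120, 5, app_vms=2, procs_per_vm=4))  # 75
print(estimated_runtime_minutes(120, 5, app_vms=4, procs_per_vm=4))  # 40
```

In practice the gain flattens out once CPU on the ESX servers or the DB becomes the limiting factor, so this only indicates the direction of the effect.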
Matthias Schlarb (SAP Technical Alliance Engineer)
Michael Hesse (SAP Technical Alliance Manager)
Vas Mitra (SAP Solutions Architect)