Configuring a Basic High Availability Cluster for SAP NetWeaver

Install the Node Software

Setting up the Red Hat High Availability Add-On requires installing the necessary software packages, configuring the firewall, and authenticating the cluster nodes.

Red Hat Enterprise Linux 8 and Red Hat Enterprise Linux 7 cluster nodes are not compatible in a single cluster. All nodes in a Pacemaker cluster must use the same major version of Red Hat Enterprise Linux. Red Hat Enterprise Linux 8 clusters use Corosync 3.x for communication; Red Hat Enterprise Linux 7 Pacemaker clusters use Corosync 2.x.
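
As a quick sanity check before you begin, you can confirm that every node runs the same major release by inspecting the standard release file on each node:

[root@node ~]# cat /etc/redhat-release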

Install Required Software on the Node to be Part of the Cluster

The pcs package provides the cluster configuration software. It requires the corosync and pacemaker packages, which Yum installs automatically as dependencies. The fence-agents-all package pulls in all available fencing agent packages. Alternatively, administrators can install only a specific fence-agents-XYZ package, where XYZ is the fencing agent to use. The pcs and fence-agents-all packages must be installed on all cluster nodes.

[root@node ~]# yum install pcs fence-agents-all

Configure the Firewall for Cluster Communication

You can skip this step if you are not using firewalld, the built-in Linux firewall. If an external firewall sits between the cluster nodes, you must also allow cluster communication through it as applicable in your environment. The standard firewall service on a Red Hat Enterprise Linux 8 system is the firewalld service, which ships with a predefined high-availability service for cluster communication. To activate this firewall service on each cluster node and allow cluster communication through the firewall, execute the following commands:

[root@node ~]# firewall-cmd --permanent --add-service=high-availability
[root@node ~]# firewall-cmd --reload

Enable Pacemaker and Corosync on the Nodes

The pcsd service provides the cluster configuration synchronization and the web front end for cluster configuration. The service is required on all cluster nodes. Use the systemctl command to start and enable the pcsd service on all cluster nodes.

[root@node ~]# systemctl enable --now pcsd

The pcsd service uses the hacluster system user for cluster communication and configuration. You must set the password of the hacluster system user on all cluster nodes. Red Hat recommends using the same password for the hacluster user on all nodes in the cluster. The following example sets the hacluster user password to redhat:

[root@node ~]# echo redhat | passwd --stdin hacluster

You must authenticate the cluster nodes in the pcsd service with the hacluster user and the password that you set up for this user. You need to run the pcs host auth command on only one node to authenticate all nodes in the cluster.

In the following example, the node1.example.com and node2.example.com cluster nodes are authenticated from the node1.example.com system with the hacluster user and the corresponding password.

[root@node ~]# pcs host auth node1.example.com \
> node2.example.com
Username: hacluster
Password: redhat
node1.example.com: Authorized
node2.example.com: Authorized

For automation purposes, you can also supply the credentials with the -u <USERNAME> and -p <PASSWORD> options.
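
For example, a non-interactive authentication of both nodes (using the example password set earlier) might look like this:

[root@node ~]# pcs host auth -u hacluster -p redhat \
> node1.example.com node2.example.com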

Configure Basic Cluster Communication

After you prepare the two nodes for the cluster setup, the pcs cluster setup command creates the cluster. This command takes as arguments the cluster name and fully qualified domain names or IP addresses of the cluster nodes. The optional --start parameter starts the cluster on all supplied cluster nodes.

[root@node ~]# pcs cluster setup mycluster --start \
> node1.example.com \
> node2.example.com

By default, a cluster node that gets rebooted does not automatically rejoin the cluster. You can use the pcs cluster enable command to enable automatic starting of the cluster service. The --all option enables automatic starting of cluster services on every cluster member.

When executed on one of the cluster nodes, the following command enables the cluster service on all cluster nodes so that they automatically start and join the cluster.

[root@node ~]# pcs cluster enable --all

Red Hat recommends that you verify that the cluster is working as expected. The pcs cluster status command provides an overview of the current cluster status.

[root@node ~]# pcs cluster status
Cluster Status:
 Cluster Summary:
   * Stack: corosync
   * Current DC: node2.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
   * Last updated: Fri Mar  5 12:23:08 2021
   * Last change:  Fri Mar  5 12:22:57 2021 by root via cibadmin on node1.example.com
   * 2 nodes configured
   * 0 resource instances configured
 Node List:
   * Online: [ node1.example.com node2.example.com ]

PCSD Status:
  node1.example.com: Online
  node2.example.com: Online

The pcs cluster status command shows the status of all nodes and whether they are communicating with each other. The status indicator is the Online: [ node1.example.com node2.example.com ] statement within the Node List section. Any communication issue between nodes is also indicated in this section.

Configure Cluster Node Fencing

Fencing is a requirement for any high availability cluster. It prevents data corruption from an errant node. Fencing also isolates and restarts a cluster member if the node fails to join the cluster and the remaining cluster members still form a quorum. Depending on the hardware used, the cluster can fence a node by turning off the connection to the shared storage or by power-cycling the node.

The first step to set up fencing is to configure the physical fencing device. Different hardware devices are capable of fencing cluster nodes, for example:

  • Uninterruptible power supplies (UPS)
  • Power distribution units (PDU)
  • Blade power control devices
  • Lights-out devices

The fence devices must be added to the cluster. For physical machine fencing, each cluster node might require its own fence device. Use the pcs stonith create command. The command expects a set of parameter and value pairs that the fence agent requires to fence the cluster node. To use the fence_ipmilan fencing agent, the pcmk_host_list, username, password, and ip parameters are required. The pcmk_host_list parameter lists the corresponding host as the cluster knows it. The ip parameter expects the IP address or hostname of the fencing device.

For example:

[root@node ~]# pcs stonith create <fence_device_name> fence_ipmilan \
> pcmk_host_list=node_private_fqdn \
> ip=node_IP_BMC \
> username=username \
> password=password

The pcs stonith status command shows the status of the fence devices that are attached to the cluster. All fence_ipmilan fence devices should show Started status.

[root@node ~]# pcs stonith status
  * fence_nodea (stonith:fence_ipmilan):     Started node1.example.com
  * fence_nodeb (stonith:fence_ipmilan):     Started node2.example.com

If the status of any fence device is Stopped, then a communication problem likely exists between the fencing agent and the fencing server. Verify the settings of the fence device with the pcs stonith config fence_device command. You can update the settings with the pcs stonith update command.
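
For example, the following commands display the current settings of the illustrative fence_nodea device from the previous output and then update its ip parameter (new_BMC_IP is a placeholder value):

[root@node ~]# pcs stonith config fence_nodea
[root@node ~]# pcs stonith update fence_nodea ip=new_BMC_IP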

Testing fencing is highly recommended even if the devices show the Started state.
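
One way to test fencing, as a minimal example, is to fence one node manually from another node and verify that it is power-cycled and rejoins the cluster afterward (node2.example.com is used here only as an example):

[root@node ~]# pcs stonith fence node2.example.com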

When the testing is complete, you can configure the SAP resources.

Red Hat Enterprise Linux ships with many fence agents. You must verify that your intended fence method is supported for your environment.
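
To list the fence agents that ship with your release, and to display the parameters that a specific agent accepts, you can use the following commands (fence_ipmilan is shown only as an example):

[root@node ~]# pcs stonith list
[root@node ~]# pcs stonith describe fence_ipmilan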

Setting up HA for SAP NetWeaver

Create a resource for the ASCS instance

  • For ENSA1: When the installation and testing described in the earlier chapters are complete, you can integrate the SAP NetWeaver system into the Pacemaker cluster. Assuming that the underlying storage and network resources are configured according to SAP guidelines and are part of the cluster, the following command adds the SAP NetWeaver ASCS resource to the Pacemaker cluster.
[root@node ~]# pcs resource create <sid>_ascs<InstanceNumber> SAPInstance \
> InstanceName="<SID>_ASCS<InstanceNumber>_rhascs" \
> START_PROFILE=/sapmnt/<SID>/profile/<SID>_ASCS<InstanceNumber>_rhascs \
> AUTOMATIC_RECOVER=false meta resource-stickiness=5000 migration-threshold=1 \
> failure-timeout=60 --group <sid>_ASCS<InstanceNumber>_group \
> op monitor interval=20 on-fail=restart timeout=60 \
> op start interval=0 timeout=600 \
> op stop interval=0 timeout=600

The resource-stickiness=5000 meta attribute balances out the failover constraint with ERS, so the resource stays on the node where it started and does not migrate around the cluster uncontrollably. The migration-threshold=1 value ensures that the ASCS instance fails over to another node when an issue is detected, instead of restarting on the same node.
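
After the resource is created, you can verify the configured meta attributes and operations with the pcs resource config command (shown here with the placeholder resource name used above):

[root@node ~]# pcs resource config <sid>_ascs<InstanceNumber>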

  • For ENSA2:

    [root@node ~]# pcs resource create <sid>_ascs<InstanceNumber> SAPInstance \
    > InstanceName="<SID>_ASCS<InstanceNumber>_s4ascs" \
    > START_PROFILE=/sapmnt/<SID>/profile/<SID>_ASCS<InstanceNumber>_s4ascs \
    > AUTOMATIC_RECOVER=false \
    > meta resource-stickiness=5000 \
    > --group <sid>_ASCS<InstanceNumber>_group \
    > op monitor interval=20 on-fail=restart timeout=60 \
    > op start interval=0 timeout=600 \
    > op stop interval=0 timeout=600
    

    Add a resource stickiness value to the group to ensure that the ASCS stays on a node if possible:

    [root@node ~]# pcs resource meta <sid>_ASCS<InstanceNumber>_group \
    > resource-stickiness=3000
    

Create a resource for the ERS instance

Create the ERS instance cluster resource.

The IS_ERS=true attribute is mandatory for ENSA1 deployments. For more information about IS_ERS, see How Does the IS_ERS Attribute Work on an SAP NetWeaver Cluster with Stand-alone Enqueue Server (ENSA1 and ENSA2)?:

  • For ENSA1:

    [root@node ~]# pcs resource create <sid>_ers<InstanceNumber> SAPInstance \
    > InstanceName="<SID>_ERS<InstanceNumber>_rhers" \
    > START_PROFILE=/sapmnt/<SID>/profile/<SID>_ERS<InstanceNumber>_rhers \
    > AUTOMATIC_RECOVER=false IS_ERS=true --group <sid>_ERS<InstanceNumber>_group \
    > op monitor interval=20 on-fail=restart timeout=60 \
    > op start interval=0 timeout=600 \
    > op stop interval=0 timeout=600
    
  • For ENSA2:

    [root@node ~]# pcs resource create s4h_ers29 SAPInstance \
    > InstanceName="S4H_ERS29_s4ers" \
    > START_PROFILE=/sapmnt/S4H/profile/S4H_ERS29_s4ers \
    > AUTOMATIC_RECOVER=false \
    > --group s4h_ERS29_group \
    > op monitor interval=20 on-fail=restart timeout=60 \
    > op start interval=0 timeout=600 \
    > op stop interval=0 timeout=600
    

Create the required constraints

Create a colocation constraint for ASCS and ERS resource groups.

Resource groups <sid>_ASCS<InstanceNumber>_group and <sid>_ERS<InstanceNumber>_group should avoid running on the same node whenever both nodes are available.

[root@node ~]# pcs constraint colocation add <sid>_ERS<InstanceNumber>_group with \
> <sid>_ASCS<InstanceNumber>_group -5000
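
You can verify that the colocation constraint is in place with the pcs constraint command, which lists the configured location, ordering, and colocation constraints:

[root@node ~]# pcs constraint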

This concludes the chapter on Configuring a Basic High Availability Cluster for SAP.

Additional Information

Configuring and Managing High Availability Clusters

Automating SAP HANA Scale-Up System Replication using the RHEL HA Add-On

Sysadmin Blog: How to set up a Pacemaker cluster for high availability Linux