Dear Expert,
I was doing a PoC of an SAP NetWeaver 7.5 ASCS/ERS (enqueue replication) HA configuration on SLES for SAP Applications 12 SP2, but I cannot get it to work. I followed the SUSE official guide step by step, and I have never faced this kind of issue with Red Hat or with other HA products such as VCS or HPE Serviceguard.
SUSE official guide: https://www.suse.com/docrep/documents/tae65krcgz/SLES4SAP-NetWeaver-ha-guide-EnqRepl-12_color_en.pdf
Ideal Situation:
[CODE]root@hanode1:/root>crm status
Stack: corosync
Current DC: hanode1 (version 1.1.15-21.1-e174ec8) - partition with quorum
Last updated: Mon Jul 24 23:43:57 2017
Last change: Mon Jul 24 23:25:16 2017 by root via cibadmin on hanode1
2 nodes configured
14 resources configured
Online: [ hanode1 hanode2 ]
Full list of resources:
 stonith-sbd (stonith:external/sbd): Started hanode1
 admin_addr (ocf:IPaddr2): Started hanode1
 Clone Set: base-clone [base-group]
     Started: [ hanode1 hanode2 ]
 Clone Set: cl-nfsserver [nfsserver]
     Started: [ hanode1 hanode2 ]
 Resource Group: g-nfs
     exportfs_nfs (ocf:exportfs): Started hanode1
     vip_nfs (ocf:IPaddr2): Started hanode1
 Resource Group: grp_sap_as_HA0
     rsc_ip_HA0_sapha0as (ocf:IPaddr2): Started hanode1
 Resource Group: grp_sap_er_HA0
     rsc_ip_HA0_sapha0er (ocf:IPaddr2): Started hanode2
 Master/Slave Set: msl_sap_enqrepl_HA0 [rsc_sap_HA0_ASCS00]
     Masters: [ hanode1 ]
     Slaves: [ hanode2 ]
root@hanode1:/root>
[/CODE]
But when I reboot the primary node, i.e. hanode1, the failover of the ASCS instance never happens.
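To be clear about the test itself, this is essentially all I do (a minimal sketch; crm_mon is simply how I watch the takeover from the surviving node):
[CODE]
# On the primary node that currently runs the ASCS master:
root@hanode1:/root> reboot

# On the surviving node, watch whether the ASCS instance gets promoted there
# (-r shows inactive resources, -f shows failcounts):
root@hanode2:/root> crm_mon -rf
[/CODE]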
My Pacemaker configuration:
[CODE]
node 1: hanode1
node 2: hanode2
primitive admin_addr IPaddr2 \
    params ip=192.168.220.13 \
    op monitor interval=10 timeout=20
primitive dlm ocf:pacemaker:controld \
    op monitor interval=60 timeout=60
primitive exportfs_nfs exportfs \
    params directory="/sapmnt/HA0" options="rw,mountpoint,no_root_squash,sync,no_subtree_check" fsid=1 clientspec="192.168.220.0/24" wait_for_leasetime_on_stop=true \
    op monitor interval=30s
primitive nfsserver systemd:nfs-server \
    op monitor interval=30s
primitive ocfs2-1 Filesystem \
    params device="/dev/mapper/OCFS2" directory="/sapmnt/HA0" fstype=ocfs2 options=acl \
    op monitor interval=20 timeout=40 \
    op start timeout=60 interval=0 \
    op stop timeout=60 interval=0
primitive rsc_ip_HA0_sapha0as IPaddr2 \
    params ip=192.168.220.15 \
    op monitor interval=10s timeout=20s on_fail=restart
primitive rsc_ip_HA0_sapha0er IPaddr2 \
    params ip=192.168.220.16 \
    op monitor interval=10s timeout=20s on_fail=restart
primitive rsc_sap_HA0_ASCS00 SAPInstance \
    operations $id=rsc_sap_HA0_ASCS00-operations \
    op monitor interval=11 role=Slave timeout=60 \
    op monitor interval=13 role=Master timeout=60 \
    params InstanceName=HA0_ASCS00_sapha0as START_PROFILE="/usr/sap/HA0/SYS/profile/HA0_ASCS00_sapha0as" ERS_InstanceName=HA0_ERS10_sapha0er ERS_START_PROFILE="/usr/sap/HA0/SYS/profile/HA0_ERS10_sapha0er"
primitive stonith-sbd stonith:external/sbd \
    params pcmk_delay_max=30s \
    meta target-role=Started
primitive vip_nfs IPaddr2 \
    params ip=192.168.220.14 cidr_netmask=24 \
    op monitor interval=10 timeout=20
group base-group dlm ocfs2-1
group g-nfs exportfs_nfs vip_nfs \
    meta target-role=Started
group grp_sap_as_HA0 rsc_ip_HA0_sapha0as \
    meta target-role=Started is-managed=true resource-stickiness=1000
group grp_sap_er_HA0 rsc_ip_HA0_sapha0er \
    meta target-role=Started is-managed=true resource-stickiness=1000
ms msl_sap_enqrepl_HA0 rsc_sap_HA0_ASCS00 \
    meta clone-max=2 target-role=Started master-max=1 is-managed=true
clone base-clone base-group \
    meta interleave=true target-role=Started
clone cl-nfsserver nfsserver
colocation col_grp_sap_as_HAO_msl_sap_enqrepl_HA0_MASTER 2000: grp_sap_as_HA0 msl_sap_enqrepl_HA0:Master
colocation col_grp_sap_er_HAO_msl_sap_enqrepl_HA0_SLAVE 2000: grp_sap_er_HA0 msl_sap_enqrepl_HA0:Slave
location loc_grp_sap_as_HA0_hanode1 grp_sap_as_HA0 10: hanode1
order ord_grp_sap_as_HA0_msl_sap_enq_repl Optional: grp_sap_as_HA0:start msl_sap_enqrepl_HA0:promote symmetrical=true
property cib-bootstrap-options: \
    have-watchdog=true \
    dc-version=1.1.15-21.1-e174ec8 \
    cluster-infrastructure=corosync \
    cluster-name=sapcluster \
    stonith-enabled=true \
    placement-strategy=balanced \
    no-quorum-policy=ignore \
    stonith-action=reboot \
    startup-fencing=false \
    stonith-timeout=150 \
    last-lrm-refresh=1500918768
rsc_defaults rsc-options: \
    resource-stickiness=1000 \
    migration-threshold=5000
op_defaults op-options: \
    timeout=600 \
    record-pending=true
[/CODE]
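If it helps, these are the kinds of checks I can run and post here (standard crmsh/Pacemaker commands; the resource IDs and profile paths are the ones from the configuration above):
[CODE]
# Validate the live CIB for configuration errors (verbose output)
root@hanode1:/root> crm_verify -L -V

# Show only the SAPInstance primitive and its master/slave wrapper
root@hanode1:/root> crm configure show rsc_sap_HA0_ASCS00 msl_sap_enqrepl_HA0

# Confirm both start profiles are reachable on BOTH nodes via /sapmnt
root@hanode2:/root> ls -l /usr/sap/HA0/SYS/profile/HA0_ASCS00_sapha0as /usr/sap/HA0/SYS/profile/HA0_ERS10_sapha0er
[/CODE]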
Error:
[CODE]root@hanode1:/root>crm status
Stack: corosync
Current DC: hanode1 (version 1.1.15-21.1-e174ec8) - partition with quorum
Last updated: Mon Jul 24 23:21:35 2017
Last change: Mon Jul 24 23:18:47 2017 by root via cibadmin on hanode1
2 nodes configured
14 resources configured
Online: [ hanode1 hanode2 ]
Full list of resources:
 stonith-sbd (stonith:external/sbd): Started hanode1
 admin_addr (ocf:IPaddr2): Started hanode1
 Clone Set: base-clone [base-group]
     Started: [ hanode1 hanode2 ]
 Clone Set: cl-nfsserver [nfsserver]
     Started: [ hanode1 hanode2 ]
 Resource Group: g-nfs
     exportfs_nfs (ocf:exportfs): Started hanode1
     vip_nfs (ocf:IPaddr2): Started hanode1
 Resource Group: grp_sap_as_HA0
     rsc_ip_HA0_sapha0as (ocf:IPaddr2): Started hanode1
 Resource Group: grp_sap_er_HA0
     rsc_ip_HA0_sapha0er (ocf:IPaddr2): Started hanode2
 Master/Slave Set: msl_sap_enqrepl_HA0 [rsc_sap_HA0_ASCS00]
     Stopped: [ hanode1 hanode2 ]
[B]Failed Actions:
- rsc_sap_HA0_ASCS00_start_0 on hanode2 'not configured' (6): call=51, status=complete, exitreason='none',
    last-rc-change='Mon Jul 24 23:21:18 2017', queued=0ms, exec=512ms[/B]
[/CODE]
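Exit code 6 seems to be OCF_ERR_CONFIGURED from the SAPInstance resource agent. If it helps the diagnosis, I can also call the agent by hand on hanode2 with the same parameters the cluster passes (a sketch only; the environment variables mirror the params of rsc_sap_HA0_ASCS00, and validate-all is the standard OCF validation action) and post the output:
[CODE]
# Reproduce the agent's own validation outside the cluster, on the node where the start failed
root@hanode2:/root> export OCF_ROOT=/usr/lib/ocf
root@hanode2:/root> export OCF_RESKEY_InstanceName=HA0_ASCS00_sapha0as
root@hanode2:/root> export OCF_RESKEY_START_PROFILE=/usr/sap/HA0/SYS/profile/HA0_ASCS00_sapha0as
root@hanode2:/root> export OCF_RESKEY_ERS_InstanceName=HA0_ERS10_sapha0er
root@hanode2:/root> export OCF_RESKEY_ERS_START_PROFILE=/usr/sap/HA0/SYS/profile/HA0_ERS10_sapha0er
root@hanode2:/root> /usr/lib/ocf/resource.d/heartbeat/SAPInstance validate-all; echo "rc=$?"

# The agent also logs the concrete reason for 'not configured' to syslog
root@hanode2:/root> grep SAPInstance /var/log/messages | tail -n 20
[/CODE]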
Am I missing something here? Please help.
Thanks
Arunabha