Hello experts,
I am managing a newly built SAP HANA landscape with System Replication, using the HAE resource agents. Recently the system had an issue and we had no choice but to set maintenance-mode=true for the time being. During that time I also rebooted my server, and now crm_mon shows no resources. At the next available downtime, I was ready to return the system to its normal configuration.
When I set maintenance-mode=false, my HANA system stopped. I went ahead with my Linux patching, put both nodes in standby with crm, took pacemaker offline, and rebooted both hosts. When both hosts were back, I ran crm node online node1, followed by crm node online node2. However, I noticed the status was all wrong: crm showed node1 as master, but in SAP replication node2 was the primary.
I put both nodes in standby again and tried to correct the replication by making node1 the primary, replicating to node2.
After correcting the replication, I ran crm node online node1 followed by node2, but the resource agents shut down my HANA. I could not even start HANA manually. I tried a resource cleanup, but it still did not work.
In the end I had to resort to bringing both nodes online with crm node online and setting maintenance-mode=true; only then could HANA be started with replication from node1 to node2. I’m really lost on how to correct the crm status and prevent my resources from failing. Moreover, this is my production site and I can only work on it during downtime.
Hello Shawn,
the information provided is not enough to debug your case.
Most probably you will need to provide the following information:
- OS version & patch level
- output of:
crm configure show
corosync-quorumtool -s
corosync-cfgtool -s
And of course the logs, especially from the DC.
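If it helps, the outputs requested above could be gathered into one file with a small script along these lines. This is only a rough sketch: the output path and the collect helper are made up for illustration, /etc/os-release is assumed to hold the OS version, and the crm_mon -1 call is an extra that is not in the list above (its output names the current DC, which tells you which node's logs matter most). Run it as root on each node.

```shell
#!/bin/sh
# Sketch: gather the requested cluster diagnostics into one file.
# OUT is a hypothetical output path; change it as needed.
OUT="${OUT:-/tmp/cluster-diag.txt}"

# collect is a helper defined here for illustration only.
collect() {
    # Write a header, then run the command if it is installed.
    echo "===== $* =====" >> "$OUT"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >> "$OUT" 2>&1
    else
        echo "(command not found: $1)" >> "$OUT"
    fi
}

: > "$OUT"                      # start with an empty file
collect cat /etc/os-release     # OS version (assumed location)
collect crm configure show      # cluster configuration
collect corosync-quorumtool -s  # quorum status
collect corosync-cfgtool -s     # corosync ring status
collect crm_mon -1              # extra: one-shot status, names the current DC
echo "written to $OUT"
```

The script does not collect the logs themselves; those still need to be attached separately.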