Can't take node out of standby after upgrade to SP2

allenb_1121 · April 18, 2012, 11:44pm

I have a two-node HAE cluster that I am trying to upgrade to SP2. After upgrading one node to SP2, I can’t take it out of standby.

Here is what I have done so far:

Put node1 into standby
Did “zypper update” to bring everything up to date on SP1 and rebooted
Installed the sp2 migration packages for SLES, HAE, and SDK
Did “suse_register -d 2 -L /root/.suse_register.log”
Did “zypper dup”
Rebooted.
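
For reference, the steps above can be written out as a command checklist. This is only a sketch: it prints the commands rather than running them, `NODE` is a hypothetical variable, and the migration package names are left as a comment because they are not spelled out here. The final `crm node online` line is the one-shot equivalent of the interactive "crm", "node", "online" sequence.

```shell
#!/bin/sh
# Print-only checklist of the upgrade steps for one node.
# Nothing is executed; each step is echoed so it can be reviewed first.
NODE="${NODE:-node1}"   # hypothetical: name of the node being upgraded

step() {
    echo "+ $*"
}

upgrade_steps() {
    step crm node standby "$NODE"
    step zypper update
    # reboot, then install the SP2 migration packages for SLES, HAE, and SDK
    # (exact package names are not given in the post)
    step suse_register -d 2 -L /root/.suse_register.log
    step zypper dup
    # reboot again, then try to bring the node back online
    step crm node online "$NODE"
}

upgrade_steps
```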

When I go into “crm”, to “node” and do “online”, I get:
Error setting standby=off (section=nodes, set=): Application of an update diff failed
Error performing operation: Application of an update diff failed

Any ideas?

To sum it up: node1 is in standby and has been upgraded to SP2, while node2 is online and still at SP1. crm status on node1 shows node1 standby and node2 offline, and on node2 it shows node1 standby and node2 online.

Any ideas???
Surely there isn’t a problem that would prevent SP1 and SP2 from co-existing long enough to do an upgrade? If so, how do you do an upgrade without taking the cluster down?
Or, could something have failed on my install on node1?

Any help would be greatly appreciated if anyone has the answer. Otherwise, it looks like I’m going to be opening a support incident.

Allen Beddingfield
Systems Engineer
The University of Alabama

Jens-U · April 19, 2012, 12:57pm

Hi Allen,

According to the documentation (http://www.suse.com/documentation/sle_ha/book_sleha/?page=/documentation/sle_ha/book_sleha/data/part_install.html chapter D.3), a rolling update is supported. The steps given in D.2 (which, according to D.3, are basically correct) do not mention putting the node into standby, but I would have assumed that to work (and that you could get it back into operation afterwards, of course).

crm status on node1 shows node1 standby and node2 offline, and on node2 it shows node1 standby and node2 online

It seems that, for some reason, you are facing a split-brain situation, e.g. caused by communication failures (firewalls, configuration trouble during/after the upgrade, driver problems, etc.). Is there anything in the logs that might point you in the right direction?
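
A minimal sketch of checks that could be run on both nodes to compare their view of the cluster, assuming the standard Pacemaker/Corosync tools are installed and that cluster messages go to /var/log/messages (the `LOG` variable and the grep pattern are assumptions, adjust as needed):

```shell
#!/bin/sh
# Run a few read-only diagnostics; commands that are not installed are skipped.
LOG="${LOG:-/var/log/messages}"   # assumption: syslog file holding cluster messages

check() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "== skipped (not installed): $*"
    fi
}

cluster_checks() {
    check crm_mon -1                    # one-shot cluster status as this node sees it
    check corosync-cfgtool -s           # status of the corosync communication rings
    check rpm -q pacemaker corosync     # package versions, to compare between nodes
    check grep -iE 'corosync|crmd|cib' "$LOG"   # cluster-related log lines
}

cluster_checks
```

If the two nodes report different membership, or the ring status shows faults, that would point toward a communication problem rather than a failed package install.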

Regards,
Jens
