Hi Folks!
We have a 4-node SLES 12 SP3 cluster that has gone bad, and I now want to recover it.
First we bring up only one node by starting corosync and pacemaker, and we want to start the resources on this node only. We want to make sure the resources stay on this node until all the other nodes have rejoined the cluster.
Is there a way to bring up all the resources on the one running node while the other nodes are still down?
Thanks in advance!
It’s not an easy task, but it’s doable.
You have two approaches:
A) Start everything manually: for example, mount filesystems with mount, add IPs via ‘ip addr add’, etc.
B) The second approach is more complicated, but sometimes it’s necessary:
- Power up the ‘good’ node
- Run
crm configure show
to identify the start order of the resources.
For example, let’s imagine that we have GroupA with resources 1, 2 & 3 and another GroupB with resources 4, 5 & 6. Let’s also imagine there is an order constraint saying that GroupA starts first and then GroupB (order rules have ‘order’ in them).
Then the resource order will be: 1, 2, 3, 4, 5, 6
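To make the ordering step concrete, here is a rough sketch that filters the configuration down to the group and order lines. The helper name list_ordering and the sample configuration are made up for illustration; on the real node you would pipe the output of crm configure show into it. It only catches the first line of each definition, so long entries wrapped with a backslash still need a look by eye.

```shell
#!/bin/sh
# Keep only the lines that define groups and order constraints.
list_ordering() {
    grep -E '^(group|order) '
}

# On the cluster node: crm configure show | list_ordering
# Here a hypothetical configuration matching the example above is used instead.
list_ordering <<'EOF'
primitive rsc1 ocf:heartbeat:Dummy
group GroupA rsc1 rsc2 rsc3
group GroupB rsc4 rsc5 rsc6
order GroupA_then_GroupB Mandatory: GroupA GroupB
EOF
```

Members of a group start in the order they are listed, so the group lines plus the order lines are enough to write down the full start sequence.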
Next, put the cluster into maintenance mode, so that Pacemaker stops managing (starting, stopping and monitoring) the resources itself:
crm configure property maintenance-mode=true
Next, get all the parameters for a resource (the ‘params’ part of its definition):
crm configure show RESOURCE
Next export each parameter as described in this article:
https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures
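As a small sketch of the export step: resource agents read their parameters from environment variables named OCF_RESKEY_<parameter>. The helper ocf_exports below is hypothetical, as are the ip and cidr_netmask values; it just prints the export lines for a list of name=value pairs so you can review them before using them. It assumes the values contain no single quotes.

```shell
#!/bin/sh
# Turn "name=value" pairs into export statements for OCF_RESKEY_* variables.
ocf_exports() {
    for pair in "$@"; do
        name=${pair%%=*}    # text before the first "="
        value=${pair#*=}    # text after the first "="
        printf "export OCF_RESKEY_%s='%s'\n" "$name" "$value"
    done
}

# Hypothetical parameters, as they might appear after "params" in the config.
ocf_exports ip=192.0.2.10 cidr_netmask=24
```

Once the printed lines look right, they can be pasted into the shell or loaded with eval "$(ocf_exports ip=192.0.2.10 cidr_netmask=24)".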
Then you can start the resource by running the relevant agent script under /usr/lib/ocf/resource.d. For example, a resource of type ocf:vendor:script can be started via /usr/lib/ocf/resource.d/vendor/script start
Repeat for all resources, following the order you identified earlier.
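Put together, starting one resource by hand could look roughly like the sketch below. The run and start_ocf helpers and the DRY_RUN switch are my own additions for illustration, and the IPaddr2 parameters are placeholder values. With DRY_RUN=1 (the default here) it only prints what it would run; set DRY_RUN=0 on the real node once the commands look right.

```shell
#!/bin/sh
# Sketch: start one OCF resource by hand, then check it with "monitor".
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"
export OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = provider, $2 = agent name (ocf:PROVIDER:AGENT in the cluster config)
start_ocf() {
    agent="$OCF_ROOT/resource.d/$1/$2"
    run "$agent" start && run "$agent" monitor
}

# Hypothetical example: an ocf:heartbeat:IPaddr2 resource.
export OCF_RESKEY_ip=192.0.2.10
export OCF_RESKEY_cidr_netmask=24
start_ocf heartbeat IPaddr2
```

Call start_ocf once per resource, in the start order from the configuration, exporting that resource's parameters before each call.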
Note: the commands are from memory, so verify them before running anything.