SLES 11 SP2 - 2-node cluster, unclean state / resource migration

Hi,

I have just installed SLES 11 SP2 on two servers and then configured the HA pattern.
Because this is a two-node cluster, I set no-quorum-policy to “ignore”.

At the moment this is the state of the cluster:

[CODE]============
Last updated: Wed Jul 25 16:42:12 2012
Last change: Wed Jul 25 16:21:42 2012 by hacluster via crm_attribute on Server2
Current DC: Server2 - partition with quorum
Version: 1.1.6-b988976485d15cb702c9307df55512d323831a5e
2 Nodes configured, unknown expected votes
4 Resources configured.

Online: [ Server1 Server2 ]

stonith-sbd (stonith:external/sbd): Started Server1
Resource Group: web-server
virtual-ip (ocf::heartbeat:IPaddr2): Started Server2
apache (lsb:apache2): Started Server2
hawk (lsb:hawk): Started Server2 [/CODE]

Using the Pacemaker GUI, I can migrate the resources successfully from one node to the other.
I can also put either node into standby and see the resources start on the other node.

Using the Hawk web application, I can simulate what happens when one of the nodes goes into the “unclean” state. The simulation tells me that the resources would be started on the other node.

So far so good.

Then I pulled the Ethernet cable out of Server2.
After that, I checked the cluster state on both servers:

  • Server2 shows this:

[CODE]============
Last updated: Wed Jul 25 16:48:07 2012
Last change: Wed Jul 25 16:48:06 2012 by hacluster via crmd on Server2
Current DC: Server2 - partition WITHOUT quorum
Version: 1.1.6-b988976485d15cb702c9307df55512d323831a5e
2 Nodes configured, 2 expected votes
4 Resources configured.

Node Server1 : UNCLEAN (offline)
Online: [ Server2 ][/CODE]

  • Server1 shows this:

[CODE]============
Last updated: Wed Jul 25 16:48:07 2012
Last change: Wed Jul 25 16:48:06 2012 by hacluster via crmd on Server1
Stack: openais
Current DC: Server1 - partition WITHOUT quorum
Version: 1.1.6-b988976485d15cb702c9307df55512d323831a5e
2 Nodes configured, 2 expected votes
4 Resources configured.

Node Server2 : UNCLEAN (offline)
Online: [ Server1 ]

stonith-sbd (stonith:external/sbd): started server1[/CODE]
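
To make the symmetry in the two outputs explicit, here is a small Python sketch that reads only the node lines copied from the status text above and checks whether each side is online itself while marking the other as UNCLEAN. The function names and the regular expressions are illustrative assumptions that match this particular text layout; this is not a Pacemaker tool or API.

```python
import re

# Node lines copied from the two status outputs above.
SERVER2_VIEW = """Node Server1 : UNCLEAN (offline)
Online: [ Server2 ]"""

SERVER1_VIEW = """Node Server2 : UNCLEAN (offline)
Online: [ Server1 ]"""


def unclean_nodes(status_text):
    """Return the names of nodes reported as UNCLEAN in the status text."""
    return re.findall(r"^Node\s+(\S+)\s*:\s*UNCLEAN", status_text, flags=re.MULTILINE)


def online_nodes(status_text):
    """Return the names listed on the 'Online: [ ... ]' line."""
    match = re.search(r"^Online:\s*\[(.*?)\]", status_text, flags=re.MULTILINE)
    return match.group(1).split() if match else []


def is_mutual_unclean(view_a, view_b):
    """True when each side is online itself and marks the other side UNCLEAN."""
    return (set(unclean_nodes(view_a)) == set(online_nodes(view_b))
            and set(unclean_nodes(view_b)) == set(online_nodes(view_a)))


print(is_mutual_unclean(SERVER1_VIEW, SERVER2_VIEW))
```

For the two views above the check comes out true: each node considers itself online and the other one unclean, so neither view can be taken as authoritative on its own.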

This is driving me crazy. Why aren’t the resources (apache, virtual-ip) migrating automatically to Server1?
Shouldn’t Server1 determine that Server2 is unclean and then start the resources?

Could someone please shed some light on this?

On Wed, 25 Jul 2012 16:04:03 +0000, tchico esteves wrote:
[color=blue]
This is driving me crazy. Why aren’t the resources (apache, virtual-ip)
migrating automatically to Server1?
Shouldn’t Server1 determine that Server2 is unclean and then start the
resources?[/color]

I could be wrong, but I believe that without quorum (which would take a
third vote, e.g. a third server) or another communication channel, Server1
and Server2 are on equal footing and neither can decide what to do about
the other.
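
To put numbers on that, here is a minimal Python sketch of a plain majority rule. It assumes quorum simply means holding more than half of the expected votes; the function names are made up for illustration and this is not how the cluster stack itself is implemented.

```python
def votes_needed(expected_votes):
    """Votes a partition needs for a strict majority of the expected votes."""
    return expected_votes // 2 + 1


def has_quorum(partition_votes, expected_votes):
    """True if this partition holds a strict majority of the expected votes."""
    return partition_votes >= votes_needed(expected_votes)


# Two nodes split into two one-node partitions: neither side has a majority.
print(has_quorum(1, 2))  # False

# Three nodes split 2/1: only the two-node side keeps quorum.
print(has_quorum(2, 3))  # True
print(has_quorum(1, 3))  # False
```

With 2 expected votes each isolated node holds 1 of the 2 it would need, which matches the "partition WITHOUT quorum" line in both outputs.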

David Gersic dgersic_@_niu.edu
Knowledge Partner http://forums.novell.com

Please post questions in the forums. No support provided via email.