Configured resources disappear

I have a 2-node cluster running SLES 11 SP2.
My problem is that resources I add often disappear after the servers are
restarted. I also see resources that I had already deleted myself.
I use the Pacemaker GUI to manage resources.
What am I doing wrong?

Hi Adam,

seems like some sort of issue with cluster db sync:

  • can you confirm that prior to the reboot, both nodes have the same content?
  • does this happen when a single node reboots, or do both nodes need to reboot for this to occur?
  • after the reboot in question, is there anything in the logs that hints at a roll-back of the database?
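
A minimal sketch of how the first check could be done, assuming each node's CIB has been dumped to a file with `cibadmin -Q` and copied to one machine. The root `<cib>` element carries the attributes `admin_epoch`, `epoch` and `num_updates`, which Pacemaker uses to decide which copy is newer. The function names here are hypothetical helpers, not part of any Pacemaker tool.

```python
import xml.etree.ElementTree as ET

VERSION_ATTRS = ("admin_epoch", "epoch", "num_updates")


def cib_version(xml_text):
    """Return (admin_epoch, epoch, num_updates) from a CIB XML dump."""
    root = ET.fromstring(xml_text)
    # Missing attributes are treated as 0 (an assumption for this sketch).
    return tuple(int(root.get(name, "0")) for name in VERSION_ATTRS)


def compare_cibs(xml_a, xml_b):
    """Compare two CIB dumps by version tuple: 'same', 'a_newer' or 'b_newer'."""
    version_a = cib_version(xml_a)
    version_b = cib_version(xml_b)
    if version_a == version_b:
        return "same"
    return "a_newer" if version_a > version_b else "b_newer"
```

For example, run `cibadmin -Q > /tmp/cib-node1.xml` on the first node and the equivalent on the second, then pass the contents of both files to `compare_cibs`. Equal version tuples do not prove the contents are identical, but a mismatch is a clear sign that the nodes are out of sync.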

Regards,
Jens

Both nodes have the same content.
When I take node1 out of the cluster, the changes stay in place; the same
is true when I take only node2 out of the cluster. I can do that many times
and nothing disappears.
When I stop corosync on both nodes and start it again, my latest
changes disappear.

Hi Adam,

in addition to checking the logs during cluster start, you may want to check what state the cluster DB is in while both nodes are down. My guess is that, for a currently unknown reason, the cluster disregards the latest DB version and falls back to the previous one. It might be disk space, an md5 problem or something else.
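
A rough sketch of how the on-disk state could be inspected while the cluster is stopped. It assumes the CIB files live in `/var/lib/heartbeat/crm` (the usual location on SLES 11, but please verify on your system) and that the directory holds `cib.xml` plus older backup copies alongside `.sig` digest files. It only reads the version attributes from each XML file; it does not verify the digests. `list_cib_versions` is a hypothetical helper name.

```python
import glob
import os
import xml.etree.ElementTree as ET

# Assumption: default CIB directory on SLES 11; adjust if yours differs.
CIB_DIR = "/var/lib/heartbeat/crm"
VERSION_ATTRS = ("admin_epoch", "epoch", "num_updates")


def list_cib_versions(cib_dir=CIB_DIR):
    """Return [(version_tuple, mtime, filename), ...], newest version first."""
    results = []
    for path in glob.glob(os.path.join(cib_dir, "cib*")):
        if path.endswith(".sig"):
            continue  # digest files are not XML
        try:
            root = ET.parse(path).getroot()
        except (ET.ParseError, OSError):
            continue  # skip anything that is not a readable XML file
        if root.tag != "cib":
            continue
        version = tuple(int(root.get(name, "0")) for name in VERSION_ATTRS)
        results.append((version, os.path.getmtime(path), os.path.basename(path)))
    return sorted(results, reverse=True)
```

If a backup copy shows a higher `epoch` than `cib.xml` itself, or `cib.xml` differs between the two nodes while both are down, that would support the guess that an older version is being picked up at start.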

Regards,
Jens

It is happening on 2 different cluster installations of mine. One is in my lab,
the other is at a client site. In the lab I am using VMware virtual machines
as nodes. The client is using Dell servers with EMS SAN storage. Both have
the same version of SLES and HA.

Hi Adam,

how do you take out the cluster nodes - maybe the (latest) change hasn’t made it to persistence yet? Does this happen if you simply stop & start cluster services on both nodes?

Regards,
Jens

When I stop both nodes using rcopenais stop, they stop without
errors.
Then I start them, and the latest changes to the cluster disappear.

Hi Adam,

When I stop both nodes using rcopenais stop, they stop without errors.
Then I start them, and the latest changes to the cluster disappear.

Then “lost disk writes” or the like don’t seem to be the problem.

When AIS restarts and loads the cluster DB, do you see any indication in the log that it tries and fails to load the latest DB (that is, the one from just before AIS was stopped) and therefore falls back to an older copy?
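
One way to narrow down the log, sketched under assumptions: the cluster messages are taken to be in `/var/log/messages`, and the keyword list below is only a guess at what a failed load might mention, not a list of real Pacemaker log messages. The helper simply keeps lines that mention the CIB together with one of those keywords.

```python
# Assumption: illustrative keywords only; extend or replace as needed.
KEYWORDS = ("error", "warn", "digest", "corrupt", "invalid")


def suspicious_cib_lines(lines, keywords=KEYWORDS):
    """Return log lines that mention 'cib' and at least one keyword."""
    matches = []
    for line in lines:
        lowered = line.lower()
        if "cib" in lowered and any(word in lowered for word in keywords):
            matches.append(line.rstrip("\n"))
    return matches
```

It could be used as `suspicious_cib_lines(open("/var/log/messages"))` right after starting the cluster services, so that only the entries from the last start need to be read by hand.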

If you have a support contract, this might be a good time to open an incident with Novell/SuSE.

Regards,
Jens