I have been testing SLES 12 with the High Availability Extension installed.
I have 2 servers/nodes with all the latest patches installed.
I have configured DRBD as my shared storage and this is working fine - both nodes can start the VM successfully via Virtual Machine Manager (VMM).
I have configured the VirtualDomain resource with the parameters config=, hypervisor=xen:///, and migration_transport=.
The monitor, start, and stop ops are set to their defaults.
I have tried both xen:///system and xen:///session for the hypervisor setting, but noticed no difference. I have also tried a number of op settings for monitor, start, and stop - again, no difference.
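For reference, here is a minimal sketch of the sort of primitive I mean, in crm shell syntax - the domain name vm1, the config file path, and the op timings below are placeholders rather than my exact values:

    # VirtualDomain primitive (example values - adjust name, path and timeouts)
    primitive vm1 ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/libxl/vm1.xml" \
               hypervisor="xen:///system" \
               migration_transport="ssh" \
        op monitor interval="30s" timeout="60s" \
        op start timeout="120s" interval="0" \
        op stop timeout="120s" interval="0" \
        meta allow-migrate="true" target-role="Started"

The allow-migrate=true meta attribute is only there because I also want live migration to work rather than a stop/start move.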
My two nodes are configured for passwordless SSH login between them - this has been tested successfully.
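As a quick check, something like the following from each node returns the remote hostname without prompting for a password (node2.corp is just an example name):

    ssh root@node2.corp hostname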
When I start the resource it appears in a Stopped state in the Hawk cluster resources view; however, the primitive is shown as Started in its meta-attributes.
When I ‘view details’ on the resource it shows a target role of Started and a fail count of 0 for both nodes. There is no exit reason listed.
When I look at ‘view recent events’ I have 3 entries: ‘Success’ on node 1, ‘Success’ on node 2, and ‘Success’ on node 1 again. No errors are reported anywhere.
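For anyone who prefers the command line, the equivalent checks would be something like this (vm1 is a placeholder resource name):

    crm_mon -1 -r -f                      # one-shot status, including inactive resources and fail counts
    crm resource status vm1               # current state of the primitive
    crm_resource --resource vm1 --locate  # which node, if any, the resource is running on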
I have used the Xen OCF resource agent before without issues, but in SLES 12 there is only the VirtualDomain OCF resource agent.
Can anyone help me get this resource set up so that the VM can be started by the cluster?
One interesting effect I have also noticed: if I try to live migrate a VM from the command line with virsh migrate --live xen+ssh://<node 2>.corp (I also tried the IP address of the node), the VM ends up running on both nodes. There is an error: error: operation failed: Failed to unpause domain.
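For completeness, the full form of the migrate command is roughly the following - the domain name vm1 and the target URI are examples, not my exact values:

    virsh migrate --live vm1 xen+ssh://node2.corp/system

The xen+ssh transport relies on the same passwordless root SSH login mentioned above, so no password prompt appears when the command runs.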
Can anyone shed any light on why this might be?
Thank you for any help,
John