HA Configuration Issue

Hi,

I am facing an issue with my HA cluster. I have 2 nodes, but each server only shows its own node as online.

NODE 1

[CODE]SRVPHN85:~ # crm status
Stack: corosync
Current DC: SRVPHN85 (version 1.1.15-19.15-e174ec8) - partition WITHOUT quorum
Last updated: Tue Jul 25 15:39:36 2017
Last change: Tue Jul 25 14:04:02 2017 by root via cibadmin on SRVPHN85

1 node configured
1 resource configured

Online: [ SRVPHN85 ]

Full list of resources:

admin_addr (ocf::heartbeat:IPaddr2): Stopped[/CODE]


NODE 2

[CODE]SRVPHN87:~ # crm status
Stack: corosync
Current DC: SRVPHN87 (version 1.1.15-21.1-e174ec8) - partition WITHOUT quorum
Last updated: Tue Jul 25 15:35:00 2017
Last change: Tue Jul 25 15:05:57 2017 by root via cibadmin on SRVPHN87

1 node configured
0 resources configured

Online: [ SRVPHN87 ]

Full list of resources:[/CODE]


I need to resolve this issue. Any help is appreciated.

Hi Raju,

Could you please share the output of “crm configure show”? I suspect there is a problem with network communication (multicast) between the nodes. Please change it to unicast and try to join the second node again.
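
If multicast between the nodes is in doubt, omping can test it directly. A minimal sketch, assuming the mcastaddr/mcastport from your corosync.conf (run it on both nodes at the same time):

[CODE]# run simultaneously on SRVPHN85 and SRVPHN87;
# each side should see unicast and multicast replies from the other
omping -m 239.108.147.175 -p 5405 SRVPHN85 SRVPHN87
[/CODE]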

Thanks

Hi,

Thanks for the reply. I think UDP ports 5404 and 5405 are blocked between the nodes. Will this cause any issue?

[CODE]SRVPHN85:~ # crm configure show
node 184357468: SRVPHN85
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.15-21.1-e174ec8 \
        cluster-infrastructure=corosync \
        cluster-name=hacluster \
        show \
        stonith-enabled=false[/CODE]


[CODE]SRVPHN87:~ # crm configure show
node 184357468: SRVPHN87
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.15-21.1-e174ec8 \
        cluster-infrastructure=corosync \
        cluster-name=hacluster \
        show \
        stonith-enabled=false[/CODE]


Sorry, my mistake. Here is the correct output:

[CODE]SRVPHN85:~ # crm configure show
node 184357467: SRVPHN85 \
        attributes standby=off
primitive admin_addr IPaddr2 \
        params ip=xx.xx.xx.xx \
        op monitor interval=10 timeout=20 \
        meta target-role=Started
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.15-19.15-e174ec8 \
        cluster-infrastructure=corosync \
        cluster-name=hacluster \
        stonith-enabled=false \
        placement-strategy=balanced
rsc_defaults rsc-options: \
        resource-stickiness=1 \
        migration-threshold=3
op_defaults op-options: \
        timeout=600 \
        record-pending=true[/CODE]


[CODE]SRVPHN87:~ # crm configure show
node 184357468: SRVPHN87
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.15-21.1-e174ec8 \
        cluster-infrastructure=corosync \
        cluster-name=hacluster \
        show \
        stonith-enabled=false[/CODE]

Please share the “/etc/corosync/corosync.conf” file from both nodes.

NODE 1

[CODE]# Please read the corosync.conf.5 manual page
totem {
        version: 2
        secauth: on
        crypto_hash: sha1
        crypto_cipher: aes256
        cluster_name: hacluster
        clear_node_high_bit: yes

        token: 5000
        token_retransmits_before_loss_const: 10
        join: 60
        consensus: 6000
        max_messages: 20

        interface {
                ringnumber: 0
                bindnetaddr: xx.xx.xx.xx
                mcastaddr: 239.108.147.175
                mcastport: 5405
                ttl: 1
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: no
        logfile: /var/log/cluster/corosync.log
        to_syslog: yes
        debug: off
        timestamp: on
        logger_subsys {
                subsys: QUORUM
                debug: off
        }
}

quorum {
        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
        expected_votes: 3
        two_node: 0
}[/CODE]

NODE 2

[CODE]# Please read the corosync.conf.5 manual page
totem {
        version: 2
        secauth: on
        crypto_hash: sha1
        crypto_cipher: aes256
        cluster_name: hacluster
        clear_node_high_bit: yes

        token: 5000
        token_retransmits_before_loss_const: 10
        join: 60
        consensus: 6000
        max_messages: 20

        interface {
                ringnumber: 0
                bindnetaddr: xx.xx.xx.xx
                mcastaddr: 239.108.147.175
                mcastport: 5405
                ttl: 1
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: no
        logfile: /var/log/cluster/corosync.log
        to_syslog: yes
        debug: off
        timestamp: on
        logger_subsys {
                subsys: QUORUM
                debug: off
        }
}

quorum {
        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
        expected_votes: 3
        two_node: 0
}[/CODE]

You have to change a few things.

1. Change the network transport to unicast (udpu):

[CODE]totem {
        version: 2
        secauth: on
        crypto_hash: sha1
        crypto_cipher: aes256
        cluster_name: hacluster
        clear_node_high_bit: yes
        transport: udpu
        token: 5000
        token_retransmits_before_loss_const: 10
        join: 60
        consensus: 6000
        max_messages: 20

        interface {
                ringnumber: 0
                bindnetaddr: 192.168.220.0
                mcastport: 5405
                ttl: 1
        }
}
[/CODE]
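
Note that with udpu, corosync also needs a nodelist naming every member, since there is no multicast discovery. A minimal sketch (the addresses are placeholders for the real node IPs):

[CODE]nodelist {
        node {
                # placeholder for SRVPHN85's IP
                ring0_addr: xx.xx.xx.85
                nodeid: 1
        }
        node {
                # placeholder for SRVPHN87's IP
                ring0_addr: xx.xx.xx.87
                nodeid: 2
        }
}
[/CODE]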

2. Change the quorum section (for two nodes):

[CODE]quorum {

    # Enable and configure quorum subsystem (default: off)
    # see also corosync.conf.5 and votequorum.5
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1

}

[/CODE]
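
After restarting corosync on both nodes, the membership and quorum state can be verified; with the settings above the flags should include 2Node:

[CODE]# expect: Expected votes: 2, Flags: 2Node Quorate
corosync-quorumtool -s
[/CODE]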

3. Set proper cib-bootstrap options (for two nodes):

[CODE]property cib-bootstrap-options: \
        stonith-enabled=true \
        placement-strategy=balanced \
        no-quorum-policy=ignore \
        stonith-action=reboot \
        startup-fencing=false \
        stonith-timeout=150
[/CODE]
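
These can be set one at a time from the crm shell, for example:

[CODE]crm configure property no-quorum-policy=ignore
crm configure property stonith-action=reboot
crm configure property stonith-enabled=true
[/CODE]

Note that stonith-enabled=true also requires a working STONITH device to be configured; without one, Pacemaker will refuse to start resources.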

[QUOTE=arunabha_banerjee;38879]You have to change a few things. …[/QUOTE]


How can I remove the cluster from both nodes and start the installation from scratch?

It seems both nodes are behaving like individual single-node clusters; please use “sleha-join -c” on the second node to resolve the issue.
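
As a rough sketch, assuming the SLE HA bootstrap scripts are installed and SRVPHN85 is kept as the first node:

[CODE]# on SRVPHN87: stop the cluster stack, then join the existing cluster
systemctl stop pacemaker
sleha-join -c SRVPHN85
[/CODE]

If you want to rebuild from scratch instead, sleha-init can be re-run on the first node before joining the second.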


[QUOTE=raju7258;38869]I am facing an issue with my HA cluster. I have 2 nodes, but each server only shows its own node as online. … Need help.[/QUOTE]


“Please check that the servers are syncing time with NTP properly.”
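
A quick check, assuming classic ntpd is in use (common on SLES at that time):

[CODE]# the selected peer is marked with '*'; the offset should be small
ntpq -p
[/CODE]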

-Nitiratna Nikalje

Have you checked that the firewall ports are open?
I have seen this behaviour when the nodes cannot communicate with each other.

As you haven’t mentioned which version of SLES you are using, I assume SLES 15.
It uses firewalld by default, and firewalld doesn’t include a high-availability service definition out of the box.

On my openSUSE Leap 15.1 test system I am using the following:

[CODE]# cat /etc/firewalld/services/high-availability.xml
<?xml version="1.0" encoding="utf-8"?>
<service>
  <short>Custom High Availability Service</short>
  <description>This allows you to use the High Availability service. Ports are opened for corosync, pacemaker_remote, dlm, hawk and corosync-qnetd.</description>
  <!-- standard ports for the services listed in the description -->
  <port protocol="udp" port="5404-5405"/>
  <port protocol="tcp" port="3121"/>
  <port protocol="tcp" port="21064"/>
  <port protocol="tcp" port="7630"/>
  <port protocol="tcp" port="5403"/>
</service>
[/CODE]
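
With that file in place, the custom service can be enabled, for example (the zone is an assumption; adjust to your setup):

[CODE]# reload so firewalld picks up the new service definition,
# then enable it permanently and apply
firewall-cmd --reload
firewall-cmd --permanent --zone=public --add-service=high-availability
firewall-cmd --reload
[/CODE]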