Bonding mode is not applied after reboot

Hello,

I have configured bonding mode 4 (802.3ad) with 4 physical NICs.

ifcfg-public
STARTMODE='auto'
BOOTPROTO='none'
MTU=9000
BONDING_MASTER='yes'
BONDING_SLAVE_0='eth2'
BONDING_SLAVE_1='eth3'
BONDING_SLAVE_2='eth4'
BONDING_SLAVE_3='eth5'
BONDING_MODULE_OPTS='mode=802.3ad miimon=100'

ifcfg-eth2 (same for ifcfg-eth3~eth5)
BOOTPROTO=None
STARTMODE=hotplug

The problem is that when I reboot the node, the bond mode is reset to round robin (balance-rr, which is mode 0, the driver default), as confirmed in /proc/net/bonding/public.
I suspect SLES fails to apply the configured bond mode (4) during the start-up process.
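
For anyone who wants to check the same thing after a reboot, here is a minimal sketch that pulls the mode line out of the bonding status file. The function name bond_mode is just something made up for this example; the "Bonding Mode:" line is what the bonding driver prints in /proc/net/bonding/<bond>.

```shell
# Print the text after "Bonding Mode: " from a bonding status file.
# $1: path to the status file, e.g. /proc/net/bonding/public
bond_mode() {
    sed -n 's/^Bonding Mode: //p' "$1"
}

# Only query the real file if the bond exists on this machine.
if [ -r /proc/net/bonding/public ]; then
    bond_mode /proc/net/bonding/public
fi
```

With mode 4 active, the line should mention IEEE 802.3ad; with the default mode it mentions round-robin instead.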

The following log sequence just repeats without further progress:
kernel: [ 20.011781] bonding: public is being created...
kernel: [ 20.323366] bonding: public: enslaving eth5 as an active interface with an up link.
kernel: [ 21.490152] bonding: public: enslaving eth2 as an active interface with an up link.
kernel: [ 21.892754] bonding: public: enslaving eth3 as an active interface with an up link.
kernel: [ 22.289116] bonding: public: enslaving eth4 as an active interface with an up link.

As a workaround, I found that running 'systemctl restart wicked wickedd' changes the bond mode back to 4 (802.3ad).
However, I want to avoid this manual step after every reboot.

Has anyone experienced this, or does anyone have a suggestion?

SLES 12 SP1

Thanks,
Jerry

I just set up a bond on a VM and so far I am not seeing this. My bond is
not really effective since it’s just a test of the YaST stuff in the VM,
but it persists with mode=802.3ad across reboots. I’m applying all
patches now to see if I can interfere with things, but so far it seems
happy enough.

For what it is worth, my various /etc/sysconfig/network/ifcfg-* files look
different when compared with yours. For example,
/etc/sysconfig/network/ifcfg-eth1:

BOOTPROTO='none'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR=''
MTU=''
NAME='RTL-8100/8101L/8139 PCI Fast Ethernet Adapter'
NETMASK=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='hotplug'

For /etc/sysconfig/network/ifcfg-bond0 which, yes, has only one device in
it (will test fixing that after patching everything):

BONDING_MASTER='yes'
BONDING_MODULE_OPTS='mode=802.3ad miimon=100'
BONDING_SLAVE0='eth1'
BOOTPROTO='static'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR='1.2.3.4/24'
MTU=''
NAME=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='auto'

Still testing; we’ll see what happens next.


Good luck.


Thanks. I believe it’s not a configuration issue, but rather something in wicked.
I hope to find a proper solution so that the manual workaround is no longer needed.

I still cannot duplicate the problem on my VM, though it’s a VM (KVM) and
I am using a static IP address for the bond. Have you tried setting that
differently so that the boot should set an IP rather than relying on (I
presume) DHCP?

I haven’t seen it in my setup, but that you have BOOTPROTO=None (capital
‘N’) for a device just seems odd; perhaps that’s related, since your other
config file for the bond (‘public’) seems to have it set more like what I
see. Maybe it’s just a copy/paste/retype problem, but considering you see
it and I do not, I’m going into super-pedantic mode. :)
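
To make the pedantry concrete, here is a sketch of a slave file with the value written the way my own files and your bond file have it. It assumes the lowercase, quoted 'none' spelling is the safe one; whether the capital 'N' has anything to do with the problem is unconfirmed.

```shell
# /etc/sysconfig/network/ifcfg-eth2 (sketch; same idea for eth3 to eth5)
# Assumption: lowercase 'none', matching BOOTPROTO in ifcfg-public.
BOOTPROTO='none'
STARTMODE='hotplug'
```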


Good luck.


Hi Jerry,

I see you’re using jumbo frames. Since you probably didn’t copy and paste the full configs of the slave interfaces: have you set MTU=9000 on each slave interface as well?
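
To compare what the running system actually uses, something like the following sketch might help. It reads the mtu file that the kernel exposes per interface under /sys/class/net; the function name print_mtus is made up for this example, and the interface names are the ones from your post.

```shell
# Print "<interface> <mtu>" for each named interface, or "<interface> missing"
# if there is no readable mtu file for it.
# $1: sysfs net directory (normally /sys/class/net); remaining args: interface names
print_mtus() {
    netdir=$1
    shift
    for ifname in "$@"; do
        if [ -r "$netdir/$ifname/mtu" ]; then
            printf '%s %s\n' "$ifname" "$(cat "$netdir/$ifname/mtu")"
        else
            printf '%s missing\n' "$ifname"
        fi
    done
}

print_mtus /sys/class/net public eth2 eth3 eth4 eth5
```

If the bond and its slaves report different values right after boot, that would be worth mentioning here.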

To get more detail, maybe you could set WICKED_DEBUG=ifconfig in /etc/sysconfig/network/config and recheck the output during boot?
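
If you would rather script the change than edit the file by hand, here is a rough sketch. The helper name set_wicked_debug is invented for this example, and it assumes GNU sed (for -i) and that the variable sits on a line of its own, as is usual in sysconfig files.

```shell
# Set WICKED_DEBUG in a sysconfig-style file: replace an existing assignment,
# or append one if the file has none.
# $1: file to edit, $2: value to set
set_wicked_debug() {
    file=$1
    value=$2
    if grep -q '^WICKED_DEBUG=' "$file"; then
        sed -i "s/^WICKED_DEBUG=.*/WICKED_DEBUG=\"$value\"/" "$file"
    else
        printf 'WICKED_DEBUG="%s"\n' "$value" >> "$file"
    fi
}

# Usage (as root, after backing up the file):
#   set_wicked_debug /etc/sysconfig/network/config ifconfig
```

After the next reboot the extra messages should show up in the system log next to the kernel bonding lines, for example via journalctl -b.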

Regards,
J

If it were a config issue, it would not work in the first place :)

Thanks, jmozdzen, that seems worth trying.