Ubuntu 22.04 LTS Network Bonding – Active/Standby

This guide covers setting up network bonding between two 10Gbit NICs in a physical machine. It assumes Active-Standby bonding, so a link-down event on one NIC causes the other to become active; no LACP or switch-side configuration is required in this case. The documentation assumes you are running as root; put sudo in front of the commands if not. These instructions were developed on the aforementioned OS version and have been tested to work. Note that the Active-Standby approach has limitations, but its simplicity means it is often a suitable option for providing full redundancy via a dual edge switch (A and B) configuration.

On Ubuntu 22.04, the recommended approach is to use “netplan” and its associated YAML configuration files to build up and apply a network configuration to the machine.

The example used below shows the configuration for a physical machine that has 4 x 10Gbit NIC ports across two dual-port 10Gbit NIC cards; there are also two on-board 1Gbit copper NIC ports. The 10Gbit ports are named eno1, eno2, ens2f0 and ens2f1; the 1Gbit copper ports are named eno3 and eno4 in this case. Your machine may have a different configuration and/or different names, but the basic bonding configuration shown below is the same.
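
You can confirm the interface names on your own machine before you start; the -br flag gives a brief, one-line-per-interface summary:

ip -br link show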

Install and Activate ifenslave

First, install the “ifenslave” package.

apt-get install ifenslave

Ensure everything is now up to date.

apt-get update && apt-get upgrade

Ensure the bonding kernel module is loaded.

root@host1:/etc/netplan# lsmod | grep bonding
bonding               167936  0

It is now recommended to reboot the server to ensure all upgraded packages are in place and that any kernel upgrades requiring a reboot take effect.

Possible Step

You may need to perform these steps if the module does not load automatically at boot.

modprobe bonding

Then add it to the /etc/modules file to ensure it is loaded automatically at startup: open /etc/modules and add the following line at the bottom of the file.

bonding
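
Alternatively, if you prefer not to open an editor, a one-liner such as the following (run as root, per the note above) appends the module name and confirms it is present:

echo "bonding" >> /etc/modules
grep bonding /etc/modules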

Configure Netplan YAML with Bonding Configuration

The example below includes two NICs, eno1 and ens2f1; these are added to a bonded interface as slave NICs, and a static IP configuration is then applied to the bond. The example is from a server called deep-nfs, a physical host that is dual-homed into two separate edge switches (A and B); in this configuration nothing is required on the edge switches beyond tagging the ports into the correct VLAN. The configuration provides quick failover (and failback) performance: the failover typically happens almost instantly (typically no ICMP ping is dropped), and a failback typically results in a single ICMP ping drop. This makes the approach suitable for the vast majority of services, where the loss of a small number of packets (in the event of a failure) does not affect the applications and services operating on top.

Open, edit and save /etc/netplan/00-installer-config.yaml with the configuration below, adjusting the names of the NIC interfaces and the IP addresses as required for your case.

# This is the network config written by 'subiquity'
network:
  version: 2
  renderer: networkd
  ethernets:
    eno1: {}
    ens2f1: {}
  bonds:
    bond0:
      dhcp4: no
      interfaces:
        - eno1
        - ens2f1
      addresses:
        - 172.16.152.10/24
      routes:
        - to: default
          via: 172.16.152.1
      nameservers:
        addresses:
          - 172.16.10.1
          - 172.16.10.2
          - 172.16.10.3
        search:
          - domain.com
      parameters:
        mode: active-backup
        primary: ens2f1
        mii-monitor-interval: 100
        gratuitous-arp: 100
        up-delay: 100
        down-delay: 100

You’ll notice that there are two slave interfaces in the bond, and the primary is set to the interface “ens2f1”; in this example this NIC was connected to the A side edge switch, which provided the most optimal traffic flow in normal operation. As you can see, the slave interfaces are declared with no configuration of their own; the bond is created and the relevant IP address settings are applied to it. The parameters of the bond are described in more detail below, but you can also see the official documentation, which includes all possible options: https://netplan.io/reference/, specifically the “Properties for device type bonds:” section of the document.

  • mode – The mode of the bond. We are using “active-backup”, which means one interface is active and the other is inactive until the active interface fails.
  • primary – The interface that will be active by default when the machine boots; the active interface will be this one unless it fails.
  • mii-monitor-interval – The interval (in milliseconds) at which MII monitoring takes place, i.e. the interval between checks to verify whether an interface has carrier (is UP or DOWN). Typically 100 ms is suitable.
  • gratuitous-arp – How many ARP packets to send after a failover. Once a link is up on a new slave, a notification is sent, and repeated if this value is greater than 1. The default value is 1 and valid values are between 1 and 255; this only affects active-backup mode. (For historical reasons, the misspelling gratuitious-arp is also accepted and has the same function.) A typical value is 5; you may want to use more, e.g. 100, if you are concerned that the failover/failback may not occur as expected (see FDB Aging and Failback (Intermittent Failback) below for more information). When a NIC goes down, the surviving interface sends these ARP packets so that the switch updates its MAC address table as soon as possible, ensuring traffic is directed to the surviving switch port and minimising the impact of the failover. Without this option, the failover and/or failback may take anywhere from a few seconds up to the FDB (MAC address table) aging time, which could be 300 seconds (5 minutes), leading to a long disruption.
  • up-delay – A delay in milliseconds before MII considers the link up after it becomes physically up. The longer the delay, the better the bond can tolerate a flapping link without initiating a failover, but the delay also lengthens the break in traffic during a failover. It is recommended to set this to 100 ms unless there is a reason not to.
  • down-delay – Like up-delay, but the delay in milliseconds before MII considers the link down after it goes physically down.
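
Once the bond is up (after applying the configuration in the next step), you can read the parameters actually applied back from the kernel via sysfs; a quick sanity check, assuming the bond is named bond0:

cat /sys/class/net/bond0/bonding/mode
cat /sys/class/net/bond0/bonding/primary
cat /sys/class/net/bond0/bonding/miimon
cat /sys/class/net/bond0/bonding/updelay
cat /sys/class/net/bond0/bonding/downdelay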

Note that the “routes” section with a “to: default” entry is required instead of the now-deprecated “gateway4” declaration.

Apply the Netplan Configuration (and Debug the Application of the Configuration)

To apply the configuration, run the following command (netplan reads the YAML files in /etc/netplan regardless of your current directory):

netplan apply

To apply the configuration and see additional “debug” information, use the following command; this can be helpful when troubleshooting a configuration.

netplan --debug apply
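
If you are configuring a remote machine, “netplan try” can be safer: it applies the configuration and automatically rolls it back unless you confirm it within the timeout, so a bad bond configuration will not lock you out.

netplan try --timeout 120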

Once the configuration has been applied, check the configuration by running:

ip addr

You should see an output such as the below:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether ab:66:0d:b0:17:60 brd ff:ff:ff:ff:ff:ff
3: eno3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ab:66:4b:f1:3e:1e brd ff:ff:ff:ff:ff:ff
4: eno4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ab:66:4b:f1:3e:1f brd ff:ff:ff:ff:ff:ff
5: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ab:66:4b:f1:3e:00 brd ff:ff:ff:ff:ff:ff
6: ens2f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ab:66:1e:b5:d0:f0 brd ff:ff:ff:ff:ff:ff
7: ens2f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether ab:66:0d:b0:17:60 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ab:66:0d:b0:17:60 brd ff:ff:ff:ff:ff:ff
    inet 172.16.152.10/24 brd 172.16.152.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet6 fe80::b877:dff:feb0:1760/64 scope link
       valid_lft forever preferred_lft forever

And the bond should now be active. If you are having issues reaching the bond, it is recommended to either reboot the machine or restart the network stack; on Ubuntu 22.04 netplan renders to systemd-networkd, so restart that service:

systemctl restart systemd-networkd
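
You can also check how systemd-networkd sees the bond and its slaves with networkctl, for example:

networkctl status bond0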

Troubleshooting and Monitoring the Bond

Once configured, the bond is fairly simple to monitor: you can use /var/log/syslog to verify port/bond operations, and if you require more in-depth information you can query the bond directly.

Monitoring the Bond

Check the status of the bond with the following command; in this example the bond is called “bond0”, though your bond name may differ.

cat /proc/net/bonding/bond0

You should get something like the following, where you can see that both NICs are up and running at 10Gbit in active-backup mode.

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
 
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: ens2f1 (primary_reselect always)
Currently Active Slave: ens2f1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 100
Down Delay (ms): 100
Peer Notification Delay (ms): 0
 
Slave Interface: ens2f1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: ab:66:1e:b5:d0:f1
Slave queue ID: 0
 
Slave Interface: eno1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: ab:66:4b:f1:3d:fe
Slave queue ID: 0
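
To verify the failover behaviour end to end, one approach (during a test window, as it briefly disturbs traffic) is to watch the bond state in one terminal while forcing the active slave down in another; a sketch, assuming the interface names used above:

# Terminal 1: watch the bond state change in real time
watch -n 1 cat /proc/net/bonding/bond0

# Terminal 2: force the active slave down to trigger a failover...
ip link set dev ens2f1 down
# ...then bring it back up and observe the failback to the primary
ip link set dev ens2f1 up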

Clearing Stuck NIC Configuration

If you are adding the above configuration to a machine that is already in use, settings can get stuck; the following will clear an interface’s configuration if required:

ip addr flush dev ens2f0

You can check the log file for any issues when the NICs are brought up with:

tail -f /var/log/syslog
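
Since Ubuntu 22.04 uses systemd, the same events are also available via the journal; for example:

# Follow systemd-networkd messages (netplan renders to networkd on 22.04)
journalctl -u systemd-networkd -f

# Bonding driver link up/down events are logged by the kernel
dmesg | grep -i bond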

MII Not Identifying Port State

If the port state is not shown correctly on a bond, you’ll also notice that the bond does not operate as expected; this happens when MII cannot determine the state of the link for some reason. For example:

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
 
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: ens2f1 (primary_reselect always)
Currently Active Slave: ens2f1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 100
Down Delay (ms): 100
Peer Notification Delay (ms): 0
 
Slave Interface: ens2f1
MII Status: Unknown
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: ab:66:1e:b5:d0:f1
Slave queue ID: 0
 
Slave Interface: eno1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: ab:66:4b:f1:3d:fe
Slave queue ID: 0

This can be resolved by ensuring that the ifenslave (bonding) module is loaded, ensuring that the latest updates have been installed on the Ubuntu 20.04/22.04 machine, and checking that the MII polling interval, gratuitous-arp and up/down delay are configured and shown in the output of the “cat /proc/net/bonding/bond0” command.
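
It can also be worth querying the NIC directly with ethtool (from the “ethtool” package) to confirm that the driver itself reports carrier, speed and duplex; if ethtool cannot read these, MII monitoring will not be able to either:

ethtool ens2f1 | grep -E 'Speed|Duplex|Link detected'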

FDB Aging and Failback (Intermittent Failback)

It has been observed in certain situations that, after a failover has taken place successfully (i.e. the primary link “ens2f1” in the example above failed and the backup link “eno1” took over), restoring the primary link causes a break in network service. Examining the FDB (MAC address table) of the A and B side switches shows that the entry is not cleared from the B side switch (which has the backup link connected), even though the server itself has determined that the primary interface is restored and has started to use it for traffic.

Essentially, the issue is that the physical switch (specifically the B side) does not realise that the failback has taken place, so the FDB entry persists and keeps directing traffic to the now inactive backup NIC “eno1”. The outage persists until the FDB entry ages out of the B side switch, typically after 300 seconds, at which point traffic is relearned via the A side switch and service is restored.

Here is an example of the FDB table of the A side switch in normal operation, i.e. with the primary link “ens2f1” active. Note the “Age” field: when data is flowing to the host down the port (i.e. the device at the other end is responding), the “Age” value should be fairly low (e.g. 0 – 50), meaning the switch considers the MAC address on the port to be “fresh”.

SwitchA.22 # show fdb port 8
MAC                                      VLAN Name( Tag)  Age  Flags          Port / Virtual Port List
------------------------------------------------------------------------------------------------------
ab:66:0d:a4:17:60                       myvlan(1773) 0002  d m            8

Flags : d - Dynamic, s - Static, p - Permanent, n - NetLogin, m - MAC, i - IP,
        x - IPX, l - lockdown MAC, L - lockdown-timeout MAC, M- Mirror, B - Egress Blackhole,
        b - Ingress Blackhole, v - MAC-Based VLAN, P - Private VLAN, T - VLAN translation,
        D - drop packet, h - Hardware Aging, o - IEEE 802.1ah Backbone MAC,
        S - Software Controlled Deletion, r - MSRP,
        X - VXLAN, Z - OpenFlow, E - EVPN

Total: 70 Static: 0  Perm: 0  Dyn: 70  Dropped: 0  Locked: 0  Locked with Timeout: 0
FDB Aging time: 300

The issue appears to be caused by the gratuitous ARP not being sent by the server, or not being received or honoured by the switch, when the failback takes place; the B side switch is therefore never told to flush its FDB entry for the port the host is attached to. Sending a larger number of gratuitous ARPs when a failover or failback takes place helps ensure the switches learn as quickly as possible that the MAC address (and active port) has moved from one physical host interface to the other, resulting in little or no break to network service, and certainly not one that would affect applications.
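
If you suspect this behaviour, it is worth confirming the gratuitous ARP count actually applied to the running bond; the bonding driver exposes it via sysfs (assuming the bond is named bond0):

cat /sys/class/net/bond0/bonding/num_grat_arp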

Any planned maintenance work should be conducted with the failover of devices actively managed, even though they can cope with the failover automatically; this ensures that if an issue such as the above occurs, remediation can be put in place as soon as possible.
