Upgraded multiple systems to systemd 249.3 and all had eth1 not started / configured

Hello everyone,

So I have around seven Arch Linux based systems.

All systems have 2 or 3 network cards.

eth0 is the LAN side (192.168.x.x/24 range). eth1 and eth2 have WAN (public internet) connectivity.

Today I upgraded all systems to systemd 249.3 and Linux kernel 5.13.10.arch1-1.

Everything worked fine before upgrading. Then, to my horror, I started getting complaints from all sites that the internet was not working.

Then I realized that interface eth1 was down on all machines.

Here is the journal log showing the error after upgrading (journalctl -b 0 -u systemd-networkd):

Aug 16 09:30:18 kk systemd[1]: Starting Network Configuration...
Aug 16 09:30:18 kk systemd-networkd[429]: lo: Link UP
Aug 16 09:30:18 kk systemd-networkd[429]: lo: Gained carrier
Aug 16 09:30:18 kk systemd-networkd[429]: Enumeration completed
Aug 16 09:30:18 kk systemd[1]: Started Network Configuration.
Aug 16 09:30:18 kk systemd-networkd[429]: eth1: Interface name change detected, renamed to eth0.
Aug 16 09:30:18 kk systemd-networkd[429]: Could not process link message: File exists
Aug 16 09:30:18 kk systemd-networkd[429]: eth0: Failed
Aug 16 09:30:18 kk systemd-networkd[429]: eth2: Interface name change detected, renamed to eth1.
Aug 16 09:30:18 kk systemd-networkd[429]: eth0: Interface name change detected, renamed to tmpeth1.
Aug 16 09:30:18 kk systemd-networkd[429]: eth1: Interface name change detected, renamed to tmpeth2.
Aug 16 09:30:18 kk systemd-networkd[429]: eth0: Interface name change detected, renamed to tmpeth0.
Aug 16 09:30:18 kk systemd-networkd[429]: tmpeth0: Interface name change detected, renamed to eth0.
Aug 16 09:30:18 kk systemd-networkd[429]: tmpeth1: Interface name change detected, renamed to eth1.
Aug 16 09:30:18 kk systemd-networkd[429]: tmpeth2: Interface name change detected, renamed to eth2.
Aug 16 09:30:19 kk systemd-networkd[429]: eth0: Link UP
Aug 16 09:30:19 kk systemd-networkd[429]: eth2: Link UP
Aug 16 09:30:19 kk systemd-networkd[429]: eth2: Gained carrier
Aug 16 09:30:22 kk systemd-networkd[429]: eth0: Gained carrier

An explanation of the tmpeth* naming is given in the PS below, but it is probably not related to this issue and can be ignored.

Notice the error about renaming eth1 to eth0. I don't know what is doing this renaming of eth1 to eth0. It did not happen before the upgrade (see the journal log below).

Also notice that there is no line stating eth1: Link UP.
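To make the rename sequence easier to compare between boots, the rename events can be pulled out of the journal text. The following is only a rough sketch: list_renames is a hypothetical helper name (not part of systemd), and it assumes the two message formats that appear in the logs in this mail.

```shell
# Print one "old -> new" line per rename event found in systemd-networkd
# journal output read from stdin. Assumes the two message formats seen in
# this mail:
#   "eth1: Interface name change detected, renamed to eth0."
#   "eth0: Interface name change detected, eth0 has been renamed to tmpeth0."
# list_renames is a hypothetical helper name, not a systemd tool.
list_renames() {
    sed -n 's/^.*systemd-networkd\[[0-9]*]: \([^:]*\): Interface name change detected, .*renamed to \([^ .]*\)\.*$/\1 -> \2/p'
}
```

It could be used as: journalctl -b 0 -u systemd-networkd | list_renames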


Here is the journal log from when the systems worked perfectly:

Aug 13 09:17:20 kk systemd[1]: Starting Network Service...
Aug 13 09:17:21 kk systemd-networkd[421]: Enumeration completed
Aug 13 09:17:21 kk systemd[1]: Started Network Service.
Aug 13 09:17:21 kk systemd-networkd[421]: eth0: Interface name change detected, eth0 has been renamed to tmpeth0.
Aug 13 09:17:21 kk systemd-networkd[421]: eth2: Interface name change detected, eth2 has been renamed to eth0.
Aug 13 09:17:21 kk systemd-networkd[421]: eth0: Interface name change detected, eth0 has been renamed to tmpeth2.
Aug 13 09:17:21 kk systemd-networkd[421]: eth1: Interface name change detected, eth1 has been renamed to eth0.
Aug 13 09:17:21 kk systemd-networkd[421]: eth0: Interface name change detected, eth0 has been renamed to tmpeth1.
Aug 13 09:17:21 kk systemd-networkd[421]: tmpeth1: Interface name change detected, tmpeth1 has been renamed to eth1.
Aug 13 09:17:21 kk systemd-networkd[421]: tmpeth2: Interface name change detected, tmpeth2 has been renamed to eth2.
Aug 13 09:17:21 kk systemd-networkd[421]: tmpeth0: Interface name change detected, tmpeth0 has been renamed to eth0.
Aug 13 09:17:21 kk systemd-networkd[421]: eth1: Link UP
Aug 13 09:17:21 kk systemd-networkd[421]: eth2: Link UP
Aug 13 09:17:21 kk systemd-networkd[421]: eth2: Gained carrier
Aug 13 09:17:21 kk systemd-networkd[421]: eth0: Link UP
Aug 13 09:17:26 kk systemd-networkd[421]: eth0: Gained carrier

Notice that there was no attempt to rename eth1 to eth0 at the beginning (i.e. when everything worked fine).

And notice that all interfaces showed Link UP.
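The "Link UP" comparison can also be checked mechanically instead of by eye. This is a minimal sketch, assuming the journal line format shown above; missing_link_up is a hypothetical helper name, and the interface names to check are passed as arguments.

```shell
# Report which of the expected interfaces never logged "Link UP" in
# systemd-networkd journal output read from stdin.
# missing_link_up is a hypothetical helper; pass the expected interface
# names as arguments.
missing_link_up() {
    log=$(cat)    # keep the whole log so it can be scanned once per interface
    for iface in "$@"; do
        if ! printf '%s\n' "$log" | grep -q "systemd-networkd\[[0-9]*]: $iface: Link UP\$"; then
            printf '%s\n' "$iface"
        fi
    done
}
```

For example: journalctl -b 0 -u systemd-networkd | missing_link_up eth0 eth1 eth2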


So something changed either in systemd or in the Linux kernel.

Any idea what is wrong, and where? And which process is trying to rename eth1 to eth0 at system startup?

All of these are production systems, and after today's long downtime I cannot downgrade any of them to check what is wrong, as management would be on fire if there were another downtime.

Thank you in advance,

Amish


PS:

A little about the tmpeth* naming.

Some old scripts that we have expect interface names starting with eth. But those names are not predictable.

So to get predictable names starting with eth*, I first temporarily rename all interfaces to tmpeth*. This is done via udev rules:

SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:XX", NAME="tmpeth0"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:YY", NAME="tmpeth1"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:ZZ", NAME="tmpeth2"

Then I have a small service (script) which runs before network-pre.target to convert these names back to eth*:

# Search for network interfaces whose names start with "tmpeth" and rename them to "eth"
/usr/bin/find /sys/class/net -maxdepth 1 -name "tmpeth[0-9]" -type l -printf "%f\n" | while read -r tmpiface; do /usr/bin/ip link set dev "$tmpiface" name "eth${tmpiface#tmpeth}"; done

This ensures that I have predictable names starting with eth*. It has been working fine for 2-3 years. Even with the current issue, the name assignment itself works correctly.
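Since the failing boot logged "File exists", which is what a rename to an already-taken name produces, a dry-run version of the rename step can help show whether any target name is occupied at that moment. This is only a sketch and not the script actually in use: plan_renames and its optional directory argument are hypothetical, and it prints the commands instead of running them. It does not by itself explain who renames eth1 to eth0.

```shell
# Dry-run variant of the rename step: print the "ip link set" command for each
# tmpethN entry, or a note if the target ethN name is already taken.
# plan_renames and its directory argument are hypothetical, for illustration;
# the directory defaults to /sys/class/net.
plan_renames() {
    netdir=${1:-/sys/class/net}
    for path in "$netdir"/tmpeth[0-9]; do
        # Skip the unexpanded pattern when no tmpeth* entry exists.
        [ -e "$path" ] || [ -L "$path" ] || continue
        tmpiface=${path##*/}               # strip the directory part
        target="eth${tmpiface#tmpeth}"     # tmpeth1 -> eth1
        if [ -e "$netdir/$target" ] || [ -L "$netdir/$target" ]; then
            printf 'skip %s: %s already exists\n' "$tmpiface" "$target"
        else
            printf 'ip link set dev %s name %s\n' "$tmpiface" "$target"
        fi
    done
}
```

Running plan_renames with no argument inspects /sys/class/net; nothing is changed on the system.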

