On Tuesday, 15.05.2018, at 08:58 -0600, Jason Gunthorpe wrote:
> On Tue, May 15, 2018 at 04:47:22PM +0200, Benjamin Drung wrote:
> > Hi,
> >
> > I have a Debian 9 (stretch) system with a backported rdma-core 17.0-1
> > package. The system has a mlx4 card (mlx4_ib and mlx4_core kernel
> > modules) and the following network configuration in
> > /etc/network/interfaces:
> >
> > ```
> > auto ib0.dddd
> > iface ib0.dddd inet6 static
> >     address fd44:1:5255::
> >     netmask 64
> >     pre-up echo connected > /sys/class/net/$IFACE/mode
> >     dad-attempts 600
> >
> > auto ib1.dddd
> > iface ib1.dddd inet6 static
> >     address fd44:2:5255::
> >     netmask 64
> >     pre-up echo connected > /sys/class/net/$IFACE/mode
> >     dad-attempts 600
> > ```
> >
> > The terminal shows the following ordering:
> >
> > ```
> > [FAILED] Failed to start Raise network interfaces.
> > [  OK  ] Started Load RDMA modules from /etc/rdma/modules/rdma.conf
> > [  OK  ] Started Load RDMA modules from /etc/rdma/modules/infiniband.conf
> > [  OK  ] Reached target RDMA Hardware.
> > ```
> >
> > The networking.service fails with:
> >
> > ```
> > $ journalctl --no-host -u networking.service
> > [...]
> > Mai 15 13:16:40 ifup[1645]: /bin/sh: 1: cannot create /sys/class/net/ib0.dddd/mode: Directory nonexistent
> > Mai 15 13:16:40 ifup[1645]: ifup: failed to bring up ib0.dddd
> > Mai 15 13:16:40 ifup[1645]: /bin/sh: 1: cannot create /sys/class/net/ib1.dddd/mode: Directory nonexistent
> > Mai 15 13:16:40 ifup[1645]: ifup: failed to bring up ib1.dddd
> > Mai 15 13:16:40 systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
> > Mai 15 13:16:40 systemd[1]: Failed to start Raise network interfaces.
> > Mai 15 13:16:40 systemd[1]: networking.service: Unit entered failed state.
> > Mai 15 13:16:40 systemd[1]: networking.service: Failed with result 'exit-code'.
> > ```
> >
> > The networking.service fails because it tries to bring up
> > ib0.dddd/ib1.dddd before rdma-load-modules@infiniband.service loads
> > the ib_ipoib kernel module. networking.service declares that it should
> > run after network-pre.target, and rdma-load-modules@infiniband.service
> > declares that it runs before network-pre.target. Therefore the order
> > should be rdma-load-modules@infiniband.service -> network-pre.target ->
> > networking.service, but this is obviously not the case.
> >
> > I am writing to this mailing list because I got stuck debugging
> > this issue and need your help.
>
> The udev.md explains this:
>
> ## Interaction with legacy non-hotplug services
>
> Services that cannot handle hot plug must be ordered after
> systemd-udev-settle.service, which will wait for udev to complete loading
> modules and scheduling systemd services. This ensures that all RDMA hardware
> present at boot is setup before proceeding to run the legacy service.
>
> Admins using legacy services can also place their RDMA hardware modules
> (e.g. mlx4_ib) directly in /etc/modules-load.d/ or in their initrd which will
> cause systemd to defer passing to sysinit.target until all RDMA hardware is
> setup, this is usually sufficient for legacy services. This is probably the
> default behavior in many configurations.
>
> Since you see the backwards ordering and the errors, it means that
> ifupdown in stretch does not support hotplug. IMHO it is a bug in that
> package that it doesn't order after settle to try and avoid boot time
> hot plug events that it cannot handle.
>
> The modules solution is simplest, add ipoib and HCA drivers to
> modules.conf

I added the systemd-udev-settle.service dependency:

```
$ systemctl cat networking.service
# /lib/systemd/system/networking.service
[Unit]
Description=Raise network interfaces
Documentation=man:interfaces(5)
DefaultDependencies=no
Wants=network.target
After=local-fs.target network-pre.target apparmor.service systemd-sysctl.service systemd-modules-load.service
Before=network.target shutdown.target network-online.target
Conflicts=shutdown.target

[Install]
WantedBy=multi-user.target
WantedBy=network-online.target

[Service]
Type=oneshot
EnvironmentFile=-/etc/default/networking
ExecStartPre=-/bin/sh -c '[ "$CONFIGURE_INTERFACES" != "no" ] && [ -n "$(ifquery --read-environment --list --exclude=lo)" ] && udevadm settle'
ExecStart=/sbin/ifup -a --read-environment
ExecStop=/sbin/ifdown -a --read-environment --exclude=lo
RemainAfterExit=true
TimeoutStartSec=5min

# /etc/systemd/system/networking.service.d/rdma.conf
[Unit]
# See https://marc.info/?l=linux-rdma&m=152639629213650&w=2
After=systemd-udev-settle.service
```

but it is still not working (same error messages).

-- 
Benjamin Drung
System Developer
Debian & Ubuntu Developer

ProfitBricks GmbH
Greifswalder Str. 207
10405 Berlin

Email: benjamin.drung@xxxxxxxxxxxxxxxx
URL: https://www.profitbricks.de

Registered office: Berlin
Court of registration: Amtsgericht Charlottenburg, HRB 125506 B
Managing directors: Achim Weiss, Matthias Steinberg, Christoph Steffens