On Tue, May 15, 2018 at 04:47:22PM +0200, Benjamin Drung wrote:
> Hi,
>
> I have a Debian 9 (stretch) system with a backported rdma-core 17.0-1
> package. The system has a mlx4 card (mlx4_ib and mlx4_core kernel
> modules) and following network configuration in
> /etc/network/interfaces:
>
> ```
> auto ib0.dddd
> iface ib0.dddd inet6 static
>     address fd44:1:5255::
>     netmask 64
>     pre-up echo connected > /sys/class/net/$IFACE/mode
>     dad-attempts 600
>
> auto ib1.dddd
> iface ib1.dddd inet6 static
>     address fd44:2:5255::
>     netmask 64
>     pre-up echo connected > /sys/class/net/$IFACE/mode
>     dad-attempts 600
> ```
>
> The terminal shows following ordering:
>
> ```
> [FAILED] Failed to start Raise network interfaces.
> [  OK  ] Started Load RDMA modules from /etc/rdma/modules/rdma.conf
> [  OK  ] Started Load RDMA modules from /etc/rdma/modules/infiniband.conf
> [  OK  ] Reached target RDMA Hardware.
> ```
>
> the networking.service fails with:
> ```
> $ journalctl --no-host -u networking.service
> [...]
> Mai 15 13:16:40 ifup[1645]: /bin/sh: 1: cannot create /sys/class/net/ib0.dddd/mode: Directory nonexistent
> Mai 15 13:16:40 ifup[1645]: ifup: failed to bring up ib0.dddd
> Mai 15 13:16:40 ifup[1645]: /bin/sh: 1: cannot create /sys/class/net/ib1.dddd/mode: Directory nonexistent
> Mai 15 13:16:40 ifup[1645]: ifup: failed to bring up ib1.dddd
> Mai 15 13:16:40 systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
> Mai 15 13:16:40 systemd[1]: Failed to start Raise network interfaces.
> Mai 15 13:16:40 systemd[1]: networking.service: Unit entered failed state.
> Mai 15 13:16:40 systemd[1]: networking.service: Failed with result 'exit-code'.
> ```
>
> The networking.service fails because it tries to bring up
> ib0.dddd/ib1.dddd before the rdma-load-modules@infiniband.service loads
> the ib_ipoib kernel module.
> networking.service declares that it should
> run after the network-pre.target and
> rdma-load-modules@infiniband.service declares to run before
> network-pre.target. Therefore the order should be
> rdma-load-modules@infiniband.service -> network-pre.target ->
> networking.service, but this is obviously not the case.
>
> I am writing to this mailing list, because got stuck with debugging
> this issue and need your help.

The udev.md explains this:

## Interaction with legacy non-hotplug services

Services that cannot handle hot plug must be ordered after
systemd-udev-settle.service, which will wait for udev to complete
loading modules and scheduling systemd services. This ensures that all
RDMA hardware present at boot is setup before proceeding to run the
legacy service.

Admins using legacy services can also place their RDMA hardware modules
(e.g. mlx4_ib) directly in /etc/modules-load.d/ or in their initrd which
will cause systemd to defer passing to sysinit.target until all RDMA
hardware is setup, this is usually sufficient for legacy services. This
is probably the default behavior in many configurations.

Since you see the backwards ordering and the errors, it means that
ifupdown in stretch does not support hotplug. IMHO it is a bug in that
package that it doesn't order itself after settle to avoid boot-time
hot plug events that it cannot handle.

The modules solution is the simplest: add ib_ipoib and the HCA drivers
(e.g. mlx4_ib) to modules.conf.

The robust and forward-looking solution is to use systemd-networkd
instead of legacy ifupdown. It is a bit annoying today to get the
connected setting there, though.

Jason
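
For reference, a minimal, untested sketch of the two workarounds mentioned above. The file names `rdma-ipoib.conf` and `wait-for-udev.conf` are made up for illustration, and the module list assumes the mlx4 hardware from the report; adjust both for other setups. The first fragment preloads the modules before sysinit.target, the second orders the legacy service after udev has settled. Either one alone is probably enough.

```
# --- file: /etc/modules-load.d/rdma-ipoib.conf (hypothetical name) ---
# Loaded by systemd-modules-load.service early in boot.
mlx4_ib
ib_ipoib

# --- file: /etc/systemd/system/networking.service.d/wait-for-udev.conf (hypothetical name) ---
# Drop-in: pull in udev settle and run networking.service only after it.
[Unit]
Wants=systemd-udev-settle.service
After=systemd-udev-settle.service
```

After adding the drop-in, `systemctl daemon-reload` makes systemd pick it up; the ordering only takes effect on the next boot.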