On Fri, May 18, 2018 at 11:26:07AM +0200, Benjamin Drung wrote:
> On Thursday, 17.05.2018, 11:16 -0600, Jason Gunthorpe wrote:
> > On Thu, May 17, 2018 at 07:02:47PM +0200, Benjamin Drung wrote:
> > > On Tuesday, 15.05.2018, 13:20 -0600, Jason Gunthorpe wrote:
> > > > On Tue, May 15, 2018 at 02:15:54PM -0400, Doug Ledford wrote:
> > > > > > I added the systemd-udev-settle.service dependency:
> > > > > >
> > > > > > ```
> > > > > > $ systemctl cat networking.service
> > > > > > # /lib/systemd/system/networking.service
> > > > > > [Unit]
> > > > > > Description=Raise network interfaces
> > > > > > Documentation=man:interfaces(5)
> > > > > > DefaultDependencies=no
> > > > > > Wants=network.target
> > > > > > After=local-fs.target network-pre.target apparmor.service systemd-sysctl.service systemd-modules-load.service
> > > > > > Before=network.target shutdown.target network-online.target
> > > > > > Conflicts=shutdown.target
> > > > > >
> > > > > > [Install]
> > > > > > WantedBy=multi-user.target
> > > > > > WantedBy=network-online.target
> > > > > >
> > > > > > [Service]
> > > > > > Type=oneshot
> > > > > > EnvironmentFile=-/etc/default/networking
> > > > > > ExecStartPre=-/bin/sh -c '[ "$CONFIGURE_INTERFACES" != "no" ] && [ -n "$(ifquery --read-environment --list --exclude=lo)" ] && udevadm settle'
> > > > >
> > > > > I wouldn't trust that you can run udevadm settle here and get
> > > > > the right results. This will only wait for the current udev
> > > > > hotplug events to complete.
> > > >
> > > > Oh, neat, so udev settle is already called by Debian's
> > > > networking.service (as it should be) - assuming
> > > > CONFIGURE_INTERFACES is set, and whatever that other stuff does
> > > > (Ben is this triggering for you?)
> > >
> > > I should have looked more closely at the service file (I didn't
> > > notice the udevadm settle in there).
> > > CONFIGURE_INTERFACES is not set in /etc/default/networking and
> > > ifquery returns a bunch of interfaces. Therefore 'udevadm settle'
> > > is executed.
> > >
> > > I tried to debug it further by injecting commands into the pre-up
> > > hook. When pre-up runs:
> > >
> > > * lsmod shows that ib_ipoib is loaded
> > > * 'ls -l /sys/class/net/' shows that neither ib0 nor ib1 is
> > >   present
> > >
> > > To me it looks like a race condition between populating
> > > /sys/class/net/ibX after loading ib_ipoib and the networking
> > > service.
> >
> > Is the rdma device present at this point? e.g. /sys/class/infiniband?
>
> /sys/class/infiniband/mlx4_0 is present.
>
> > Are any systemd-modules-load processes still running?
>
> '/lib/systemd/systemd-modules-load /etc/rdma/modules/infiniband.conf'
> is still running.
>
> > Are the mlx IB modules loaded?
>
> Yes: mlx4_ib, mlx4_core, and mlx_compat are loaded (according to
> lsmod). The first two modules are already loaded in the initrd. Also
> ib_ipoib, ib_uverbs, ib_sa, ib_mad, ib_core, ib_addr, ib_netlink are
> loaded.

Hmm, that is very mysterious, then. I can't think how
systemd-modules-load could still be running at this point.

If you load the ib driver in initrd then the above should have been
scheduled very early in boot, and it has a Before=network-pre.target
which should delay networking.service from starting while it is
running.

What does the logging say about when rdma-load-modules was started,
and was the IB device created before the initrd device exited?

Jason
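
To put a number on the race described above, something like the following could be dropped into the pre-up hook. It is only a sketch: wait_for_netdev is a made-up helper name (it is not part of ifupdown or rdma-core), and the timeout and the overridable sysfs path are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical debugging helper, not part of ifupdown or rdma-core.
# Polls a sysfs-style directory once per second until the named netdev
# shows up or the timeout (whole seconds) expires.
#   $1 = interface name, e.g. ib0
#   $2 = timeout in seconds (assumed default: 10)
#   $3 = directory to watch (default /sys/class/net; overridable so the
#        function can be tried against a scratch directory)
wait_for_netdev() {
    name=$1
    timeout=${2:-10}
    sysfs=${3:-/sys/class/net}
    waited=0
    while [ ! -e "$sysfs/$name" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "$name: still missing after ${waited}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "$name: present after ${waited}s"
}
```

Calling `wait_for_netdev ib0 30` and `wait_for_netdev ib1 30` from pre-up would show whether the interfaces turn up shortly after the hook starts or never appear. It only measures the delay; it does not fix the ordering between module loading and networking.service.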