On Friday, 18.05.2018, at 11:31 -0600, Jason Gunthorpe wrote:
> On Fri, May 18, 2018 at 06:22:12PM +0200, Benjamin Drung wrote:
> > > Hmm, that is very mysterious, then, I can't think how
> > > systemd-modules-load could still be running at this point.
> > >
> > > If you load the ib driver in initrd then the above should have been
> > > scheduled very early in boot, and it has a Before=network-pre.target
> > > which should delay networking.service from starting while it is
> > > running.
> > >
> > > What does the logging say about when rdma-load-modules was started
> > > and was the IB device created before the initrd device exited?
> >
> > I opened a bug report against systemd in Debian:
> > https://bugs.debian.org/899002
> >
> > Then I tried to implement a workaround (which does not work):
> >
> > $ cat /etc/systemd/system/networking.service.d/rdma.conf
> > [Service]
> > # Work around systemd bug https://bugs.debian.org/899002
> > # See also https://marc.info/?l=linux-rdma&m=152639629213650&w=2
> > ExecStartPre=/bin/ps auxff
> > ExecStartPre=/bin/ls -l /sys/class/infiniband
> > ExecStartPre=/bin/systemctl status rdma-load-modules@xxxxxxxxxxxxxxvice
> > ExecStartPre=/bin/sh -c 'while pid=$(pidof -s systemd-modules-load); do echo "Waiting for systemd-modules-load process $pid to exit..."; tail --pid=$pid -f /dev/null; done'
> >
> > systemctl status says that rdma-load-modules@infiniband.service was
> > started one second after networking.service.
> >
> > The ps command from ExecStartPre says that only systemd-journald,
> > systemd-udevd, multipathd, and init were running. "ls -l
> > /sys/class/infiniband" says that mlx4_0 is present.
> > And "systemctl status rdma-load-modules@infiniband.service" says:
> >
> > rdma-load-modules@infiniband.service - Load RDMA modules from /etc/rdma/modules/infiniband.conf
> >    Loaded: loaded (/lib/systemd/system/rdma-load-modules@.service; static; vendor preset: enabled)
> >    Active: inactive (dead)
> >      Docs: file:/usr/share/doc/rdma-core/udev.md
> >
> > So it is clear that rdma-load-modules@infiniband.service is not
> > triggered when networking.service is started.
>
> Hum, if you have the modules in the initrd then udev should schedule
> this service to run essentially immediately on boot, and it should
> become ordered properly.
>
> I.e. the rdma device should already be present when udev is started.
>
> Starting *after* networking.service suggests that the mlx4 RDMA device
> was hotplugged into the system a long time after early boot! Which is
> not at all what I expect.
>
> What does dmesg say about the mlx4 driver load?

I booted with break=bottom and listed the loaded modules in the initrd.
They were:

  mlx4_ib ib_sa ib_mad ib_core ib_addr ib_netlink mlx4_core mlx_compat

> Upstream blocks module completion until the driver is done (this takes
> a long time), is it possible that MOFED does this async? That could
> explain everything.
>
> Also, IMHO, the networking.service above is wrong. It should not
> attempt to do udevadm settle internally, but it must depend on
> systemd-udev-settle.service.
>
> The reason is how systemd schedules ordering. Once it starts
> running networking.service 'ExecStartPre' it will not re-consider
> order past that point. So any activations done by udev while settling
> have no impact on networking.service at all.
>
> Having it depend on systemd-udev-settle.service means it gets to
> recheck ordering after settle is done, but before starting
> networking.service - which is the behavior it is really trying to get.
>
> That may be a big part of this bug, go back to doing:
>
> After=systemd-udev-settle.service
> Requires=systemd-udev-settle.service

You are right. I modified networking.service accordingly and it works
as expected now. I sent a patch for ifupdown to Debian, but a
discussion about the fix started: https://bugs.debian.org/899002

-- 
Benjamin Drung
System Developer
Debian & Ubuntu Developer

ProfitBricks GmbH
Greifswalder Str. 207
10405 Berlin

Email: benjamin.drung@xxxxxxxxxxxxxxxx
URL: https://www.profitbricks.de

Registered office: Berlin
Court of registration: Amtsgericht Charlottenburg, HRB 125506 B
Managing directors: Achim Weiss, Matthias Steinberg, Christoph Steffens
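
For reference, the ordering change discussed in this message could also be written as a drop-in instead of an edit to networking.service itself. This is only a sketch under assumptions: the message does not show the exact modification, and reusing the rdma.conf drop-in path from the earlier workaround is my choice for illustration, not something the thread confirms. The two directives are the ones quoted above, and they belong in the [Unit] section rather than [Service].

```ini
# /etc/systemd/system/networking.service.d/rdma.conf
# Assumed location: same drop-in path as the earlier workaround.
[Unit]
# Order networking.service after udev has settled, and pull the
# settle unit into the transaction so the ordering takes effect.
After=systemd-udev-settle.service
Requires=systemd-udev-settle.service
```

After creating or changing such a file, "systemctl daemon-reload" is needed so that systemd rereads the unit configuration.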