Re: rdma-core: Bringing up IPoIB devices on boot fail

On Fri, May 18, 2018 at 06:22:12PM +0200, Benjamin Drung wrote:
> > Hmm, that is very mysterious, then, I can't think how
> > systemd-modules-load could still be running at this point.
> >
> > If you load the ib driver in initrd then the above should have been
> > scheduled very early in boot, and it has a Before=network-pre.target
> > which should delay networking.service from starting while it is
> > running.
> >
> > What does the logging say about when rdma-load-modules was started
> > and was the IB device created before the initrd device exited?
> 
> I opened a bug report against systemd in Debian:
> https://bugs.debian.org/899002
> 
> Then I tried to implement a workaround (which does not work):
> 
> $ cat /etc/systemd/system/networking.service.d/rdma.conf
> [Service]
> # Work around systemd bug https://bugs.debian.org/899002
> # See also https://marc.info/?l=linux-rdma&m=152639629213650&w=2
> ExecStartPre=/bin/ps auxff
> ExecStartPre=/bin/ls -l /sys/class/infiniband
> ExecStartPre=/bin/systemctl status rdma-load-modules@infiniband.service
> ExecStartPre=/bin/sh -c 'while pid=$(pidof -s systemd-modules-load); do echo "Waiting for systemd-modules-load process $pid to exit..."; tail --pid=$pid -f /dev/null; done'
> 
> systemctl status says that rdma-load-modules@infiniband.service was
> started one second after networking.service.
> 
> The ps command from ExecStartPre says that only systemd-journald,
> systemd-udevd, multipathd, and init were running. "ls -l
> /sys/class/infiniband" says that mlx4_0 is present. And "systemctl
> status rdma-load-modules@infiniband.service" says:
> 
> rdma-load-modules@infiniband.service - Load RDMA modules from /etc/rdma/modules/infiniband.conf
>   Loaded: loaded (/lib/systemd/system/rdma-load-modules@.service; static; vendor preset: enabled)
>   Active: inactive (dead)
>     Docs: file:/usr/share/doc/rdma-core/udev.md
> 
> So it is clear, that rdma-load-modules@infiniband.service is not
> triggered when networking.service is started.

Hum, if you have the modules in the initrd then udev should schedule
this service to run essentially immediately on boot, and it should
become ordered properly.

I.e. the rdma device should already be present when udev is started.

Starting *after* networking.service suggests that the mlx4 RDMA device
was hotplugged into the system a long time after early boot! Which is
not at all what I expect.

What does dmesg say about the mlx4 driver load?

Upstream blocks module load completion until the driver has finished
initializing (this takes a long time). Is it possible that MOFED does
this asynchronously? That could explain everything.

Also, IMHO, the networking.service above is wrong. It should not
attempt to do udevadm settle internally, but it must depend on
systemd-udev-settle.service.

The reason is how systemd schedules ordering. Once it starts running
networking.service 'ExecStartPre' it will not reconsider ordering past
that point. So any activations done by udev while settling have no
impact on networking.service at all.

Having it depend on systemd-udev-settle.service means it gets to
recheck ordering after settle is done, but before starting
networking.service - which is the behavior it is really trying to get.

That may be a big part of this bug. Go back to doing:

After=systemd-udev-settle.service
Requires=systemd-udev-settle.service
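
As a rough sketch of where those two lines would go, assuming the unit
is overridden with a drop-in rather than edited in place (the file name
udev-settle.conf is only picked for illustration), they belong in the
[Unit] section:

```ini
# /etc/systemd/system/networking.service.d/udev-settle.conf
# Hypothetical drop-in; the file name is an assumption, not something
# shipped by ifupdown or rdma-core.
[Unit]
# Pull in the settle unit and order networking.service after it, so
# that (per the reasoning above) units activated by udev while settling
# are already known before networking.service begins to start.
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
```

After a 'systemctl daemon-reload', 'systemctl show -p After
networking.service' should list systemd-udev-settle.service if the
drop-in was picked up.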

Jason