Re: rdma-core: Bringing up IPoIB devices on boot fail

On Thu, May 17, 2018 at 07:02:47PM +0200, Benjamin Drung wrote:
> On Tuesday, 15.05.2018 at 13:20 -0600, Jason Gunthorpe wrote:
> > On Tue, May 15, 2018 at 02:15:54PM -0400, Doug Ledford wrote:
> > > > I added the systemd-udev-settle.service dependency:
> > > > 
> > > > ```
> > > > $ systemctl cat networking.service 
> > > > # /lib/systemd/system/networking.service
> > > > [Unit]
> > > > Description=Raise network interfaces
> > > > Documentation=man:interfaces(5)
> > > > DefaultDependencies=no
> > > > Wants=network.target
> > > > After=local-fs.target network-pre.target apparmor.service systemd-sysctl.service systemd-modules-load.service
> > > > Before=network.target shutdown.target network-online.target
> > > > Conflicts=shutdown.target
> > > > 
> > > > [Install]
> > > > WantedBy=multi-user.target
> > > > WantedBy=network-online.target
> > > > 
> > > > [Service]
> > > > Type=oneshot
> > > > EnvironmentFile=-/etc/default/networking
> > > > ExecStartPre=-/bin/sh -c '[ "$CONFIGURE_INTERFACES" != "no" ] && [ -n "$(ifquery --read-environment --list --exclude=lo)" ] && udevadm settle'
> > > > ```
> > > 
> > > I wouldn't trust that you can run udevadm settle here and get the right
> > > results.  This will only wait for the current udev hotplug events to
> > > complete.
> > 
> > Oh, neat, so udev settle is already called by Debian's
> > networking.service (as it should be) - assuming CONFIGURE_INTERFACES
> > is set, and whatever that other stuff does (Ben is this triggering
> > for you?)
> 
> I should have looked more closely at the service file (I didn't notice
> the udevadm settle in there). CONFIGURE_INTERFACES is not set in
> /etc/default/networking and ifquery returns a bunch of interfaces.
> Therefore 'udevadm settle' is executed.
> 
> I tried to debug it further by injecting commands into the pre-up hook.
> When pre-up runs:
> 
> * lsmod shows that ib_ipoib is loaded
> * 'ls -l /sys/class/net/' shows that neither ib0 nor ib1 is present
> 
> To me it looks like a race condition between populating
> /sys/class/net/ibX after loading ib_ipoib and the networking
> service.
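
One way to confirm the race from the pre-up hook would be to poll for the netdevs instead of checking once. The following is only a rough sketch: the function name `wait_for_interfaces`, the 10 second timeout and the `sysfs_root` parameter are made up for illustration and are not part of ifupdown or Mellanox OFED.

```python
import os
import time


def wait_for_interfaces(names, timeout=10.0, interval=0.2, sysfs_root="/sys"):
    """Poll <sysfs_root>/class/net until every name exists or the timeout expires.

    Returns the list of interfaces that are still missing (empty on success).
    `sysfs_root` is a hypothetical knob so the check can be pointed at a fake tree.
    """
    net_dir = os.path.join(sysfs_root, "class", "net")
    deadline = time.monotonic() + timeout
    while True:
        # Re-check every requested interface on each pass.
        missing = [n for n in names if not os.path.exists(os.path.join(net_dir, n))]
        if not missing or time.monotonic() >= deadline:
            return missing
        time.sleep(interval)
```

Calling `wait_for_interfaces(["ib0", "ib1"])` and logging the result would show whether the interfaces turn up a moment after pre-up runs (a race) or never appear at all.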

Is the rdma device present at this point, e.g. under /sys/class/infiniband?

Are any systemd-modules-load processes still running?

Are the mlx IB modules loaded?
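
To collect the answers to these three questions in one go from the pre-up hook, something along these lines might do. It is a sketch under assumptions: the helper names and the path parameters are invented for illustration, and it only reads standard sysfs and procfs locations (/sys/class/infiniband, /proc/modules, /proc/<pid>/cmdline).

```python
import os


def rdma_devices(sysfs_root="/sys"):
    """Return the names under <sysfs_root>/class/infiniband (empty if absent)."""
    path = os.path.join(sysfs_root, "class", "infiniband")
    try:
        return sorted(os.listdir(path))
    except FileNotFoundError:
        return []


def loaded_modules(modules_file="/proc/modules"):
    """Return the set of loaded module names (first field of each line)."""
    with open(modules_file) as f:
        return {line.split()[0] for line in f if line.strip()}


def processes_matching(needle, proc_root="/proc"):
    """Return the PIDs whose command line contains `needle`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # the process exited while we were looking
        if needle in cmdline:
            pids.append(int(entry))
    return sorted(pids)
```

For example, printing `rdma_devices()`, the entries of `loaded_modules()` that start with `mlx` or `ib_`, and `processes_matching("systemd-modules-load")` at pre-up time would cover all three points.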

> Do you have a suggestion how to address this? We are using Mellanox
> OFED on the affected hosts. The mainline ipoib is not affected. Are there
> commits related to this that we should cherry-pick?

Oh, mainline ipoib works? Great.

I have no idea what is in Mellanox OFED, sorry..

I don't think this is anything that was fixed in mainline.

Jason