RE: [PATCH v2 rdma-core] irdma: Add ice and irdma to kernel-boot rules

> -----Original Message-----
> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Monday, September 20, 2021 6:24 PM
> To: Nikolova, Tatyana E <tatyana.e.nikolova@xxxxxxxxx>
> Cc: dledford@xxxxxxxxxx; leon@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 rdma-core] irdma: Add ice and irdma to kernel-boot
> rules
> 
> On Mon, Sep 20, 2021 at 07:41:21PM +0000, Nikolova, Tatyana E wrote:
> >
> >
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Thursday, September 2, 2021 10:40 AM
> > > To: Nikolova, Tatyana E <tatyana.e.nikolova@xxxxxxxxx>
> > > Cc: dledford@xxxxxxxxxx; leon@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH v2 rdma-core] irdma: Add ice and irdma to
> > > kernel-boot rules
> > >
> > > On Thu, Sep 02, 2021 at 03:29:43PM +0000, Nikolova, Tatyana E wrote:
> > > > > Given that ice is both iwarp and roce, is there some better way
> > > > > to detect this? Doesn't the aux device encode it?
> > > >
> > > > Hi Jason,
> > > >
> > > > We tried a few experiments without success. The auxiliary devices
> > > > alias with our driver and not ice, so maybe this is the reason?
> > > >
> > > > Here is an example of what we tried.
> > > >
> > > > > udevadm info /sys/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > > P: /devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > > E: DEVPATH=/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > E: DRIVER=irdma
> > > > E: MODALIAS=auxiliary:ice.roce
> > > > E: SUBSYSTEM=auxiliary
> > > >
> > > > udevadm info /sys/bus/auxiliary/devices/ice.roce.0
> > > > P: /devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > > E: DEVPATH=/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > E: DRIVER=irdma
> > > > E: MODALIAS=auxiliary:ice.roce
> > > > E: SUBSYSTEM=auxiliary
> > > >
> > > > > Given the udevadm output, we put the following line in the udev
> > > > > rdma-description.rules:
> > > > >
> > > > > SUBSYSTEMS=="auxiliary",
> > > > > DEVPATH=="*/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0/*",
> > > > > ENV{ID_RDMA_ROCE}="1"
> > >
> > > What is the SUBSYSTEM=="infiniband" device like?
> > >
> > > This seems like the right direction, you need to wrangle udev though..
> > >
> >
> > Hi Jason,
> >
> > > After more research and given the udevadm output, we revised the irdma
> > > udev rule to make it work. Could you please review the patch below?
> >
> > > diff --git a/kernel-boot/rdma-description.rules b/kernel-boot/rdma-description.rules
> > > index 48a7cede..09deb451 100644
> > > +++ b/kernel-boot/rdma-description.rules
> > > @@ -1,7 +1,7 @@
> > >  # This is a version of net-description.rules for /sys/class/infiniband devices
> > >
> > >  ACTION=="remove", GOTO="rdma_description_end"
> > > -SUBSYSTEM!="infiniband", GOTO="rdma_description_end"
> > > +SUBSYSTEM!="infiniband", GOTO="rdma_infiniband_end"
> > >
> > >  # NOTE: DRIVERS searches up the sysfs path to find the driver that is bound to
> > >  # the PCI/etc device that the RDMA device is linked to. This is not the kernel
> > > @@ -40,4 +40,9 @@ DEVPATH=="*/infiniband/rxe*", ATTR{parent}=="*", ENV{ID_RDMA_ROCE}="1"
> > >  SUBSYSTEMS=="pci", ENV{ID_BUS}="pci", ENV{ID_VENDOR_ID}="$attr{vendor}", ENV{ID_MODEL_ID}="$attr{device}"
> > >  SUBSYSTEMS=="pci", IMPORT{builtin}="hwdb --subsystem=pci"
> >
> > +LABEL="rdma_infiniband_end"
> > +
> > +SUBSYSTEM!="auxiliary", GOTO="rdma_description_end"
> > +KERNEL=="ice.iwarp.?", ENV{ID_RDMA_IWARP}="1"
> > +KERNEL=="ice.roce.?", ENV{ID_RDMA_ROCE}="1"
> >  LABEL="rdma_description_end"
> 
> This doesn't seem right; the ID_* must be applied to an infiniband device or
> the other stuff that consumes this won't work right.

Hi Jason,

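Based on the following output, it seems that some systemd services won't be started: with the rule applied to the auxiliary device, only iwpmd.service appears in SYSTEMD_WANTS. I only tested the port mapper (iwpmd), which worked.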

udevadm info -q all  /sys/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.iwarp.0
P: /devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.iwarp.0
E: DEVPATH=/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.iwarp.0
E: DRIVER=irdma
E: ID_RDMA_IWARP=1
E: MODALIAS=auxiliary:ice.iwarp
E: SUBSYSTEM=auxiliary
E: SYSTEMD_WANTS=iwpmd.service
E: TAGS=:systemd:
E: USEC_INITIALIZED=33683420

The parent of the aux device (and of our ib device) is:

'/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0':
    KERNELS=="0000:2f:00.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="ice"
    ATTRS{ari_enabled}=="1"
    ATTRS{broken_parity_status}=="0"
    ATTRS{class}=="0x020000"
...

If we need to use the aux device name in the udev rules, then I am not aware of a way to reach the aux device from the infiniband or the pci subsystem.

> What does the udev debugging say about these ID tags?
> 
> The SUBSYSTEMS=="" is the right approach, as shown above for the other
> metadata. If you are having trouble I'm wondering if there is some kind of
> kernel problem creating the wrong sysfs?
> 

Previously I was using an RC1 kernel and seeing issues with sysfs. After switching to a GA kernel, it works better. 

udevadm info --attribute-walk /sys/class/infiniband/rocep47s0f0

  looking at device '/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/infiniband/rocep47s0f0':
    KERNEL=="rocep47s0f0"
    SUBSYSTEM=="infiniband"
    DRIVER==""
    ATTR{fw_ver}=="1.48"
    ATTR{node_desc}==""
    ATTR{node_guid}=="6a05:caff:fec1:c790"
    ATTR{node_type}=="1: CA"
    ATTR{sys_image_guid}=="6805:cac1:c790:0000"

  looking at parent device '/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0':
    KERNELS=="0000:2f:00.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="ice"
    ATTRS{ari_enabled}=="1"
    ATTRS{broken_parity_status}=="0"
    ATTRS{class}=="0x020000" 
    ...

So adding the following to rdma-description.rules seems to work. Is this acceptable?

diff --git a/kernel-boot/rdma-description.rules b/kernel-boot/rdma-description.rules
index 48a7ced..9a18b67 100644
--- a/kernel-boot/rdma-description.rules
+++ b/kernel-boot/rdma-description.rules
@@ -33,6 +33,8 @@ DRIVERS=="mlx4_core", ENV{ID_RDMA_ROCE}="1"
 DRIVERS=="mlx5_core", ENV{ID_RDMA_ROCE}="1"
 DRIVERS=="qede", ENV{ID_RDMA_ROCE}="1"
 DRIVERS=="vmw_pvrdma", ENV{ID_RDMA_ROCE}="1"
+KERNEL=="iw*", ENV{ID_RDMA_IWARP}="1"
+KERNEL=="roce*", ENV{ID_RDMA_ROCE}="1"
 DEVPATH=="*/infiniband/rxe*", ATTR{parent}=="*", ENV{ID_RDMA_ROCE}="1"

With this change, udev reports the following properties for the infiniband device:

udevadm info -q all /sys/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/infiniband/iw-ifname
P: /devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/infiniband/iw-ifname
E: DEVPATH=/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/infiniband/iw-ifname
E: ID_BUS=pci
E: ID_MODEL_ID=0x1593
E: ID_PCI_CLASS_FROM_DATABASE=Network controller
E: ID_PCI_SUBCLASS_FROM_DATABASE=Ethernet controller
E: ID_RDMA_IWARP=1
E: ID_VENDOR_FROM_DATABASE=Intel Corporation
E: ID_VENDOR_ID=0x8086
E: NAME=iw-ifname
E: SUBSYSTEM=infiniband
E: SYSTEMD_WANTS=rdma-ndd.service iwpmd.service rdma-hw.target rdma-load-modules@iwarp.service
E: TAGS=:systemd:
E: USEC_INITIALIZED=41070786
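
One property of the two KERNEL rules above is that they match on the device name prefix only, not on the driver. The following Python sketch approximates that matching with shell-style globs. It assumes fnmatch behaves like udev's glob matching for these simple patterns; the RULES table and the tags_for helper are hypothetical, and mlx5_0 is only an example of a name without either prefix.

```python
from fnmatch import fnmatchcase

# The two patterns from the proposed rules, paired with the property they set.
RULES = [
    ("iw*", "ID_RDMA_IWARP"),
    ("roce*", "ID_RDMA_ROCE"),
]


def tags_for(kernel_name):
    """Return the ID_RDMA_* properties whose pattern matches the name."""
    return [env for pattern, env in RULES if fnmatchcase(kernel_name, pattern)]


if __name__ == "__main__":
    for name in ("rocep47s0f0", "iw-ifname", "mlx5_0"):
        print(name, tags_for(name))
```

Since nothing in the two rules checks DRIVERS, any infiniband device whose name starts with iw or roce would be tagged the same way, whichever driver it belongs to.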

Thank you,
Tatyana




