> -----Original Message-----
> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Monday, September 20, 2021 6:24 PM
> To: Nikolova, Tatyana E <tatyana.e.nikolova@xxxxxxxxx>
> Cc: dledford@xxxxxxxxxx; leon@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 rdma-core] irdma: Add ice and irdma to kernel-boot
> rules
>
> On Mon, Sep 20, 2021 at 07:41:21PM +0000, Nikolova, Tatyana E wrote:
> >
> >
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Thursday, September 2, 2021 10:40 AM
> > > To: Nikolova, Tatyana E <tatyana.e.nikolova@xxxxxxxxx>
> > > Cc: dledford@xxxxxxxxxx; leon@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH v2 rdma-core] irdma: Add ice and irdma to
> > > kernel-boot rules
> > >
> > > On Thu, Sep 02, 2021 at 03:29:43PM +0000, Nikolova, Tatyana E wrote:
> > > > > Given that ice is both iwarp and roce, is there some better way
> > > > > to detect this? Doesn't the aux device encode it?
> > > >
> > > > Hi Jason,
> > > >
> > > > We tried a few experiments without success. The auxiliary devices
> > > > alias with our driver and not ice, so maybe this is the reason?
> > > >
> > > > Here is an example of what we tried.
> > > >
> > > > udevadm info /sys/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > P: /devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > E: DEVPATH=/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > E: DRIVER=irdma
> > > > E: MODALIAS=auxiliary:ice.roce
> > > > E: SUBSYSTEM=auxiliary
> > > >
> > > > udevadm info /sys/bus/auxiliary/devices/ice.roce.0
> > > > P: /devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > E: DEVPATH=/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0
> > > > E: DRIVER=irdma
> > > > E: MODALIAS=auxiliary:ice.roce
> > > > E: SUBSYSTEM=auxiliary
> > > >
> > > > Given the udevadm output, we put the following line in the udev
> > > > rdma-description.rules:
> > > >
> > > > SUBSYSTEMS=="auxiliary", DEVPATH=="*/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.roce.0/*", ENV{ID_RDMA_ROCE}="1"
> > >
> > > What is the SUBSYSTEM=="infiniband" device like?
> > >
> > > This seems like the right direction, you need to wrangle udev though..
> > >
> >
> > Hi Jason,
> >
> > After more research and given the udevadm output, we revised the irdma
> > udev rule to make it work. Could you please review the patch below?
> >
> > diff --git a/kernel-boot/rdma-description.rules b/kernel-boot/rdma-description.rules
> > index 48a7cede..09deb451 100644
> > +++ b/kernel-boot/rdma-description.rules
> > @@ -1,7 +1,7 @@
> >  # This is a version of net-description.rules for /sys/class/infiniband devices
> >
> >  ACTION=="remove", GOTO="rdma_description_end"
> > -SUBSYSTEM!="infiniband", GOTO="rdma_description_end"
> > +SUBSYSTEM!="infiniband", GOTO="rdma_infiniband_end"
> >
> >  # NOTE: DRIVERS searches up the sysfs path to find the driver that is bound to
> >  # the PCI/etc device that the RDMA device is linked to.
This is not the kernel
> > @@ -40,4 +40,9 @@ DEVPATH=="*/infiniband/rxe*", ATTR{parent}=="*", ENV{ID_RDMA_ROCE}="1"
> >  SUBSYSTEMS=="pci", ENV{ID_BUS}="pci", ENV{ID_VENDOR_ID}="$attr{vendor}", ENV{ID_MODEL_ID}="$attr{device}"
> >  SUBSYSTEMS=="pci", IMPORT{builtin}="hwdb --subsystem=pci"
> >
> > +LABEL="rdma_infiniband_end"
> > +
> > +SUBSYSTEM!="auxiliary", GOTO="rdma_description_end"
> > +KERNEL=="ice.iwarp.?", ENV{ID_RDMA_IWARP}="1"
> > +KERNEL=="ice.roce.?", ENV{ID_RDMA_ROCE}="1"
> >  LABEL="rdma_description_end"
>
> This doesn't seem right, the ID_* must be applied to an infiniband device or
> the other stuff that consumes this won't work right.

Hi Jason,

Based on the following output, it seems that some systemd services won't work. I just tested with the port mapper, which worked.

udevadm info -q all /sys/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.iwarp.0
P: /devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.iwarp.0
E: DEVPATH=/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/ice.iwarp.0
E: DRIVER=irdma
E: ID_RDMA_IWARP=1
E: MODALIAS=auxiliary:ice.iwarp
E: SUBSYSTEM=auxiliary
E: SYSTEMD_WANTS=iwpmd.service
E: TAGS=:systemd:
E: USEC_INITIALIZED=33683420

The parent of the aux device (and our ib device) is '/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0':
    KERNELS=="0000:2f:00.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="ice"
    ATTRS{ari_enabled}=="1"
    ATTRS{broken_parity_status}=="0"
    ATTRS{class}=="0x020000"
    ...

If we need to use the aux device name in the udev rules, then I am not aware of how to get to the aux device through the infiniband or the pci subsystem.

> What does the udev debugging say about these ID tags?
>
> The SUBSYSTEMS=="" is the right approach, as shown above for the other
> metadata. If you are having trouble I'm wondering if there is some kind of
> kernel problem creating the wrong sysfs?
>

Previously I was using an RC1 kernel and seeing issues with sysfs. After switching to a GA kernel, it works better.

udevadm info --attribute-walk /sys/class/infiniband/rocep47s0f0

  looking at device '/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/infiniband/rocep47s0f0':
    KERNEL=="rocep47s0f0"
    SUBSYSTEM=="infiniband"
    DRIVER==""
    ATTR{fw_ver}=="1.48"
    ATTR{node_desc}==""
    ATTR{node_guid}=="6a05:caff:fec1:c790"
    ATTR{node_type}=="1: CA"
    ATTR{sys_image_guid}=="6805:cac1:c790:0000"

  looking at parent device '/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0':
    KERNELS=="0000:2f:00.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="ice"
    ATTRS{ari_enabled}=="1"
    ATTRS{broken_parity_status}=="0"
    ATTRS{class}=="0x020000"
    ...

So adding the following to rdma-description.rules seems to work. Is this acceptable?

diff --git a/kernel-boot/rdma-description.rules b/kernel-boot/rdma-description.rules
index 48a7ced..9a18b67 100644
--- a/kernel-boot/rdma-description.rules
+++ b/kernel-boot/rdma-description.rules
@@ -33,6 +33,8 @@ DRIVERS=="mlx4_core", ENV{ID_RDMA_ROCE}="1"
 DRIVERS=="mlx5_core", ENV{ID_RDMA_ROCE}="1"
 DRIVERS=="qede", ENV{ID_RDMA_ROCE}="1"
 DRIVERS=="vmw_pvrdma", ENV{ID_RDMA_ROCE}="1"
+KERNEL=="iw*", ENV{ID_RDMA_IWARP}="1"
+KERNEL=="roce*", ENV{ID_RDMA_ROCE}="1"
 DEVPATH=="*/infiniband/rxe*", ATTR{parent}=="*", ENV{ID_RDMA_ROCE}="1"

This script results in the following settings:

udevadm info -q all /sys/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/infiniband/iw-ifname
P: /devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/infiniband/iw-ifname
E: DEVPATH=/devices/pci0000:2e/0000:2e:00.0/0000:2f:00.0/infiniband/iw-ifname
E: ID_BUS=pci
E: ID_MODEL_ID=0x1593
E: ID_PCI_CLASS_FROM_DATABASE=Network controller
E: ID_PCI_SUBCLASS_FROM_DATABASE=Ethernet controller
E: ID_RDMA_IWARP=1
E: ID_VENDOR_FROM_DATABASE=Intel Corporation
E: ID_VENDOR_ID=0x8086
E: NAME=iw-ifname
E: SUBSYSTEM=infiniband
E: SYSTEMD_WANTS=rdma-ndd.service iwpmd.service rdma-hw.target rdma-load-modules@iwarp.service
E: TAGS=:systemd:
E: USEC_INITIALIZED=41070786

Thank you,
Tatyana
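
For anyone wanting to sanity-check which device names the two proposed KERNEL== globs would tag, here is a minimal Python sketch. It is not part of rdma-core and does not run udev; it assumes that Python's fnmatch behaves like udev's glob matching for the simple `*` patterns used here. The names `rocep47s0f0` and `iw-ifname` come from the udevadm output above, while `mlx5_0` and `rxe0` are hypothetical names added only for contrast. The helper `tags_for` is likewise illustrative.

```python
from fnmatch import fnmatchcase

# Glob patterns and ID_* keys copied from the two proposed KERNEL== rules.
RULES = [
    ("iw*", "ID_RDMA_IWARP"),
    ("roce*", "ID_RDMA_ROCE"),
]


def tags_for(kernel_name):
    """Return the ID_* keys whose KERNEL glob matches kernel_name.

    Assumption: fnmatchcase approximates udev's matching for these
    simple prefix globs; it is not a full udev rule evaluator.
    """
    return [key for pattern, key in RULES if fnmatchcase(kernel_name, pattern)]


if __name__ == "__main__":
    # First two names appear in the thread; the last two are hypothetical.
    for name in ("rocep47s0f0", "iw-ifname", "mlx5_0", "rxe0"):
        print(name, tags_for(name))
```

The point of the sketch is only that these rules key on the infiniband device's name prefix, so a device whose name does not start with `iw` or `roce` would get neither tag from them.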