Re: [PATCH] kernel-boot: Do not perform device rename on OPA devices

On 2/5/2020 3:54 PM, Jason Gunthorpe wrote:
On Wed, Feb 05, 2020 at 03:35:13PM -0500, Dennis Dalessandro wrote:
On 2/5/2020 2:12 PM, Jason Gunthorpe wrote:
On Tue, Feb 04, 2020 at 08:55:20AM -0500, Goldman, Adam wrote:
From: "Goldman, Adam" <adam.goldman@xxxxxxxxx>

PSM2 will not run with recent rdma-core releases. Several tools and
libraries, such as PSM2, require the hfi1 name to be present.

Recent rdma-core releases added a new feature to rename kernel devices,
but the default configuration will not work with hfi1 fabrics.

Related opa-psm2 github issue:
    https://github.com/intel/opa-psm2/issues/43

Fixes: 5b4099d47be3 ("kernel-boot: Perform device rename to make stable names")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
Signed-off-by: Goldman, Adam <adam.goldman@xxxxxxxxx>
   kernel-boot/rdma-persistent-naming.rules | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel-boot/rdma-persistent-naming.rules b/kernel-boot/rdma-persistent-naming.rules
index 9b61e16..95d6851 100644
--- a/kernel-boot/rdma-persistent-naming.rules
+++ b/kernel-boot/rdma-persistent-naming.rules
@@ -25,4 +25,4 @@
   #   Device type = RoCE
   #   mlx5_0 -> rocex525400c0fe123455
   #
-ACTION=="add", SUBSYSTEM=="infiniband", PROGRAM="rdma_rename %k NAME_FALLBACK"
+ACTION=="add", SUBSYSTEM=="infiniband", KERNEL!="hfi1*", PROGRAM="rdma_rename %k NAME_FALLBACK"

We are moving to the new names by default slowly. When wrong
assumptions are found in other packages, they need to be updated and
their fixes pushed out.

At some point the major distros will default this to On. People using
leading edge distros can turn it off with the global switch Leon
mentioned.

This is the same process netdev went through when they introduced
persistent names.

If I recall, hfi was one of the reasons this work was done. HFI has
problems generating consistent names for its multi-function devices in
various cases and I NAK'd the kernel hack to try and 'fix' that.

So are you saying you won't take this patch then?

No, this is not a long-term solution. The point of upstream here is to
highlight what needs to be fixed so leading edge distros can fix their
stuff.

I guess we can work with distros to get the right rules in place outside of
rdma-core so that things continue to work.

I would actively block an attempt to try and do an end-run around
upstream like this. rdma-core is supposed to be the de facto
configuration, not be modified randomly by distros as before.

No, but users should be free to name their devices how they want, should they not?

You can request distros delay enabling renaming until psm/etc are
fixed.

Not an end-run around upstream at all. I didn't mean to imply anything about how it's done, delaying the enabling, or whatever is fine for now. I just meant something that does *not* change/impact rdma-core.

The distros know the users/cases where renaming is needed and can
decide if they are more or less important than psm for default
enablement.

Exactly. We are on the same page here.

You are correct, someone tried to put forth a hack for the flip-flop name
thing [1]. However, even if this was used as a solution for that issue, we
would still have the same problem of the library looking for hfi1_0.

It was always a bad design to hardwire strings like this; that library
needs to be fixed up.
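
As a rough illustration of the alternative to hardwiring a name such as
hfi1_0, a tool could enumerate the RDMA devices in sysfs and select them by
the kernel driver they are bound to, so that a renamed device is still found.
This is only a sketch, not how PSM2 or rdma-core does it: the function name
find_rdma_devices_by_driver is hypothetical, and it assumes the usual sysfs
layout in which /sys/class/infiniband/<name>/device/driver is a symlink whose
last component is the driver name.

```python
import os


def find_rdma_devices_by_driver(driver, sysfs_root="/sys/class/infiniband"):
    """Return the names of RDMA devices bound to the given kernel driver.

    Hypothetical helper: it assumes each entry under sysfs_root has a
    device/driver symlink whose final path component is the driver name.
    """
    matches = []
    try:
        names = sorted(os.listdir(sysfs_root))
    except FileNotFoundError:
        # No RDMA class directory present, so there are no devices to report.
        return matches
    for name in names:
        link = os.path.join(sysfs_root, name, "device", "driver")
        if not os.path.islink(link):
            continue
        # Compare against the driver, not the (possibly renamed) device name.
        if os.path.basename(os.readlink(link)) == driver:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # Lists hfi1 devices whatever name the udev rename rule gave them.
    print(find_rdma_devices_by_driver("hfi1"))
```

Matching on the driver instead of the device name keeps such a lookup
independent of whichever naming policy the udev rule applies.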

Do you remember when I was so annoyed that HFI1 created its own char
dev, and told you not to do it? This is yet another reason why...

Why isn't psm keying off its own chardev anyhow? There should be back
links to the RDMA device in sysfs from there.

No arguments here. No sense in going down this road though at this point in the game.

-Denny


