Re: RDMA device renames and node description

On 2/19/2020 11:58 AM, Jason Gunthorpe wrote:
> On Wed, Feb 19, 2020 at 09:14:06AM -0500, Dennis Dalessandro wrote:
>
>>> ABI breakage is a strong word, luckily enough it is not defined at all.
>>> We never considered dmesg prints, device names, device ordering as an
>>> ABI. You can't rely on debug features either, they can disappear too.
>>
>> Agree, it is a strong word and we can call it what you want. The point is
>> you should be able to rely on the node description not being changed out
>> from under you unnecessarily though. We aren't talking about a debug feature
>> here but a core feature to real world deployments.
>
> People really use the node description as some stable name? And then
> they put the HCA name in it? Why?

I've seen it in multiple places, including storage configuration files. Suffice it to say, yes, people use it.

> Is that something unique to the OPA subnet manager?

I don't think so.

> I don't recall people complaining about this when we introduced
> rdma-ndd by default and changed all the node descriptions away from
> the kernel default.

Sure, but the reason rdma-ndd exists is that people care about the node descriptions. I can't really speak to the historical adoption of rdma-ndd, but I believe it was a standalone package/feature, and using it was a conscious decision, as opposed to the one-package-to-rule-them-all rdma-core we have now.

> Also don't forget the whole thing about the node description is
> inherently racy, so relying on it is Rather A Bad Idea.

I think that point is well taken and I don't think anyone is against the idea of fixing the "hacky" things as you like to say. This one just caught people by surprise is all.

> Should we change the default format string of rdma-ndd to something
> else?

I'm not sure. I can envision situations where a user has updated libraries that are happy with the new persistent names but still wants the node description to stay the same. If rdma-ndd could do something to keep the node desc unchanged, then in situations like this the device rename would not have to be disabled.
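
To make the concern concrete, here is a minimal sketch of the idea, not rdma-ndd's actual code. It assumes a format string of "%h %d" (short hostname, then device name), and the device names, the LEGACY_NAMES table, and both helper functions are hypothetical, made up for illustration only.

```python
import re

# Hypothetical mapping from a new persistent device name back to the
# name the device had before renaming. Illustrative values only.
LEGACY_NAMES = {"ibp1s0f0": "mlx5_0"}


def expand_node_desc(fmt: str, hostname: str, device: str) -> str:
    """Expand %h (hostname without domain) and %d (device name)."""
    values = {"h": hostname.split(".")[0], "d": device}
    # Single pass, so substituted text is never expanded a second time.
    return re.sub(r"%([hd])", lambda m: values[m.group(1)], fmt)


def stable_node_desc(fmt: str, hostname: str, device: str,
                     legacy_names: dict) -> str:
    """Expand the format using the pre-rename device name when known."""
    return expand_node_desc(fmt, hostname, legacy_names.get(device, device))


if __name__ == "__main__":
    fmt = "%h %d"  # assumed format string
    host = "node01.example.com"
    # What a tool keying on the node description saw before the rename.
    print(expand_node_desc(fmt, host, "mlx5_0"))
    # After the rename the description changes along with the device name.
    print(expand_node_desc(fmt, host, "ibp1s0f0"))
    # With a legacy-name lookup the description stays what it was.
    print(stable_node_desc(fmt, host, "ibp1s0f0", LEGACY_NAMES))
```

The lookup table is the part that would need a real source of truth; the point is only that the device can carry its new name while the node desc keeps the old one.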

We have seen problems with MVAPICH (even with mlx5), libfabric, and psm2, and I believe Open MPI has a similar issue. Intel, Amazon, Red Hat, and SUSE are all experiencing issues from this, so I think we should make things as flexible as possible to protect users from breakages.

We do want to move in a forward direction though, so we don't want to go back to the old way unilaterally. I think distros can handle their upgrade situations if we build protection into rdma-ndd, something like a specific udev rule for keeping the node desc the same. That gives us flexibility until all the software and use cases catch up.

-Denny


