Re: RDMA subsystem namespace related questions (was Re: Finding the namespace of a struct ib_device)

On 10/5/20 9:16 PM, Jason Gunthorpe wrote:
On Mon, Oct 05, 2020 at 06:27:39PM +0800, Ka-Cheong Poon wrote:
On 10/2/20 10:04 PM, Jason Gunthorpe wrote:
that namespace to use it.  If there are a large number of namespaces,
there won't be enough devices to assign to all of them (e.g. the
hardware I have access to only supports up to 24 VFs).  The shared
mode can be used in this case.  Could you please explain what needs
to be done to support a large number of namespaces in exclusive
mode?

Modern HW supports many more than 24 VFs, this is the expected
interface

Do you have a ballpark on how many VFs are supported?  Is it in
the range of many thousands?

Yes

BTW, while the shared mode is still here, can there be a simple
way for a client to find out which mode the RDMA subsystem is using?

Return NULL for the namespace


OK, will add that to rdma_dev_to_netns().


The new cm_id starts with the same ->context as the listener, the ULP should
use this to pass any data, such as the namespace.

This is what I suspected as mentioned in the previous email.  But
this makes it inconvenient if the context is already used for
something else.

I don't see why. The context should be allocated memory, so the ULP
can put several things in there.
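
For illustration, the suggestion above can be modelled in user space.
This is only a sketch: struct net_model, struct cm_id_model, struct
ulp_ctx, make_child() and ulp_ns_of() are hypothetical stand-ins made
up for this example, not kernel definitions. The only behaviour taken
from the thread is that a new cm_id starts with the same ->context as
its listener.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct net_model { int id; };
struct cm_id_model { void *context; };

/* One allocation that carries everything the ULP needs per listener. */
struct ulp_ctx {
    struct net_model *ns;   /* namespace the listener was created in */
    void *ulp_private;      /* whatever the context was used for before */
};

/* Models the behaviour described in the thread: the new cm_id starts
 * with the same ->context pointer as the listener. */
static struct cm_id_model make_child(const struct cm_id_model *listener)
{
    assert(listener != NULL);
    struct cm_id_model child = { .context = listener->context };
    return child;
}

/* The ULP recovers the namespace from the inherited context. */
static struct net_model *ulp_ns_of(const struct cm_id_model *id)
{
    const struct ulp_ctx *ctx = id->context;
    return ctx ? ctx->ns : NULL;
}
```

The cost is one extra level of indirection for a ULP that previously
stored a single pointer directly in the context.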

I'm skeptical ULPs should be doing per-ns stuff like that. A ns-aware
ULP should fundamentally be linked to some FD, and the ns to use should
be derived from the process that FD is linked to. Keeping per-ns stuff
seems wrong.


It is a kernel module.  Which FD are you referring to?  It is
unclear why a kernel module must associate itself with a user
space FD.  Is there a particular reason that rdma_create_id()
needs to behave differently than sock_create_kern() in this
regard?

Somehow the kernel module has to be commanded to use this namespace,
and generally I expect that command to be connected to an FD.


It is an unnecessary restriction on what a kernel module
can do.  Is it a problem if a kernel module initiates its
own RDMA connection for doing various stuff in a namespace?
Any kernel module can initiate a TCP connection to do various
stuff without worrying about the namespace deletion problem.
It does not cause a problem AFAICT.  If the module needs to make
sure that the namespace does not go away, it can add its own
reference.  Is there a particular reason that the RDMA subsystem
needs to behave differently?
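
To make the "add its own reference" point concrete, here is a minimal
user-space model of the idea.  It is a sketch only: struct ns_model,
ns_get() and ns_put() are hypothetical names for this example.  In the
kernel the corresponding calls for a network namespace are get_net()
and put_net(); the model just shows why holding a reference keeps the
namespace alive after its creator lets go.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a reference-counted namespace. */
struct ns_model {
    int refs;        /* number of holders */
    bool destroyed;  /* set once the last holder drops its reference */
};

/* Take an extra reference; only valid while the namespace is alive. */
static struct ns_model *ns_get(struct ns_model *ns)
{
    assert(!ns->destroyed);
    ns->refs++;
    return ns;
}

/* Drop a reference; the namespace is torn down when none remain. */
static void ns_put(struct ns_model *ns)
{
    assert(ns->refs > 0);
    if (--ns->refs == 0)
        ns->destroyed = true;
}
```

With this pattern the module decides how long the namespace must
stay around, and teardown is deferred until the module calls the put
side.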


We don't have many use cases where the kernel operates namespaces
independently.


FWIW, I am adding code to do that.  It works fine using
a TCP kernel socket.  It has the namespace deletion problem
with an RDMA connection.


While we are discussing per-namespace stuff, what is the reason
that cma_wq is a global shared by all namespaces instead of being
per namespace?  Is there a problem with having a per-namespace cma_wq?

Why would we want to do that?


For scalability and namespace separation reasons, as cma_wq is
single threaded.  For example, there can be a lot of work to be done
in one namespace.  But this should not have an adverse effect on
other namespaces (as long as there are resources available).
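
The concern can be illustrated with a small user-space model of a
single FIFO shared by all namespaces versus one FIFO per namespace.
This is only a sketch of the queueing argument, not of cma_wq itself:
struct queue_model, queue_push() and foreign_items_ahead() are
hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 64

/* Hypothetical model: a work item tagged with its namespace. */
struct work_model { int ns_id; };

/* A single-threaded queue processes items strictly in order. */
struct queue_model {
    struct work_model items[MAX_WORK];
    size_t len;
};

/* Append one work item; returns -1 if the queue is full. */
static int queue_push(struct queue_model *q, int ns_id)
{
    assert(q != NULL);
    if (q->len == MAX_WORK)
        return -1;
    q->items[q->len++].ns_id = ns_id;
    return 0;
}

/* Count the items from other namespaces that must run before the
 * first item of ns_id.  Returns -1 if ns_id has nothing queued. */
static int foreign_items_ahead(const struct queue_model *q, int ns_id)
{
    int foreign = 0;

    for (size_t i = 0; i < q->len; i++) {
        if (q->items[i].ns_id == ns_id)
            return foreign;
        foreign++;
    }
    return -1;
}
```

If namespace 1 queues ten items on a shared queue before namespace 2
queues one, namespace 2 waits behind all ten; with a queue of its
own, it waits behind none of them.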


--
K. Poon
ka-cheong.poon@xxxxxxxxxx




