Re: RDMA subsystem namespace related questions (was Re: Finding the namespace of a struct ib_device)

> On Oct 9, 2020, at 11:07 AM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> 
> On Fri, Oct 09, 2020 at 11:00:22AM -0400, Chuck Lever wrote:
>> 
>> 
>>> On Oct 9, 2020, at 10:57 AM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>>> 
>>> On Fri, Oct 09, 2020 at 10:48:55AM -0400, Chuck Lever wrote:
>>>> Hi Jason-
>>>> 
>>>>> On Oct 9, 2020, at 10:39 AM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>>>>> 
>>>>> On Fri, Oct 09, 2020 at 12:49:30PM +0800, Ka-Cheong Poon wrote:
>>>>>> As I mentioned before, this is a very serious restriction on how
>>>>>> the RDMA subsystem can be used in a namespace environment by kernel
>>>>>> modules.  The reason given for this restriction is that any kernel
>>>>>> socket without a corresponding user space file descriptor is "rogue".
>>>>>> All Internet protocol code creates kernel sockets without user
>>>>>> interaction.  Are they all "rogue"?
>>>>> 
>>>>> You should work with Chuck to make NFS use namespaces properly and
>>>>> then you can propose what changes might be needed with a proper
>>>>> justification.
>>>> 
>>>> The NFS server code already uses namespaces for creating listener
>>>> endpoints, already has a user space component that drives the
>>>> creation of listeners, and already passes an appropriate struct
>>>> net to rdma_create_id. As far as I am aware, it is namespace-aware
>>>> and -friendly all the way down to rdma_create_id().
>>>> 
>>>> What more needs to be done?
>>> 
>>> I have no idea, if you are able to pass a namespace all the way down
>>> to the listening cm_id and everything works right (I'm skeptical) then
>>> there is nothing more to worry about - why are we having this thread?
>> 
>> The thread is about RDS, not NFS. NFS has some useful examples to
>> crib, but it's not the main point.
>> 
>> I don't think NFS/RDMA namespacing works today, but it's not because
>> NFS isn't ready. I agree that is another thread.
> 
> Exactly, so instead of talking about RDS stuff without any patches,

Roger that. Maybe Ka-Cheong and team can propose some patches to
help the discussion along.


> let's talk about NFS with patches - if you can make NFS work then I
> assume RDS will be happy.

Perhaps not a valid assumption :-)

NFS is a traditional client-server model, and has a user space tool
that drives the creation of endpoints, just as you expect.

With RDS, listener endpoints are not visible in user space. They
are a globally-managed shared resource, more like network interfaces
than listener sockets.

Therefore I think the approach is going to be "one RDS listener per
net namespace". The problem Ka-Cheong is trying to address is how to
manage the destruction of a listener-namespace pair. The extra
reference count on the cm_id is pinning the namespace so it cannot
be destroyed.
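
To make the pinning concrete, here is a toy userspace model in C. It is
only a sketch under stated assumptions: struct toy_net, struct
toy_listener, and the toy_* helpers are hypothetical names made up for
illustration, not kernel APIs, and the model ignores locking and
everything else the real CM does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a network namespace; NOT the kernel's struct net. */
struct toy_net {
    int refcount;   /* creator's reference plus one per holder */
    bool alive;     /* false once the last reference is dropped */
};

/* Toy stand-in for a listening cm_id that pins its namespace. */
struct toy_listener {
    struct toy_net *net;
};

void toy_net_init(struct toy_net *net)
{
    net->refcount = 1;  /* reference owned by whoever created the namespace */
    net->alive = true;
}

/* Drop one reference; the namespace goes away only when none remain. */
void toy_net_put(struct toy_net *net)
{
    assert(net->refcount > 0);
    if (--net->refcount == 0)
        net->alive = false;
}

/* Creating the listener takes an extra reference on the namespace. */
void toy_listen(struct toy_listener *l, struct toy_net *net)
{
    net->refcount++;
    l->net = net;
}

/* Destroying the listener releases the reference it was holding. */
void toy_listener_destroy(struct toy_listener *l)
{
    struct toy_net *net = l->net;

    l->net = NULL;
    toy_net_put(net);
}
```

In this model, dropping the creator's reference with toy_net_put() while
the listener still exists leaves the namespace alive. Only
toy_listener_destroy() lets the count reach zero, and nothing triggers
that call when the listener is not visible to user space.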


> NFS has an established model for using namespaces that the other
> transports use, so I'd rather focus on this.

Understood, but it doesn't seem like there is enough useful overlap
between the NFS and RDS usage scenarios. With NFS, I would expect
an explicit listener shutdown from userland prior to namespace
destruction.


>>>>> The rules for lifetime on IB clients are tricky, and the interaction
>>>>> with namespaces makes it all a lot more murky.
>>>> 
>>>> I think what Ka-Cheong is asking for is a detailed explanation of
>>>> these lifetime rules so we can understand why rdma_create_id bumps
>>>> the namespace reference count.
>>> 
>>> It is because the CM has no code to revoke a CM ID before the
>>> namespace goes away and the pointer becomes invalid.
>> 
>> Is it just a question of "no-one has yet written this code" or is
>> there a deeper technical reason why this has not been done?
> 
> It is hard to know without taking a long, deep look at this
> stuff.

Fair enough.

--
Chuck Lever
chucklever@xxxxxxxxx





