On 11/05/2020 18:35, Yishai Hadas wrote:
> On 5/11/2020 5:31 PM, Gal Pressman wrote:
>> On 11/05/2020 16:12, Yishai Hadas wrote:
>>> Introduce import verbs for device, PD, MR, it enables processes to share
>>> their ibv_context and then share PD and MR that is associated with.
>>>
>>> A process is creating a device and then uses some of the Linux system
>>> calls to dup its 'cmd_fd' member which lets another process obtain
>>> ownership of it.
>>>
>>> Once other process obtains the 'cmd_fd' it can call ibv_import_device()
>>> which returns an ibv_context on the original RDMA device.
>>>
>>> On the imported device there is an option to import PD(s) and MR(s) to
>>> achieve a sharing on those objects.
>>>
>>> This is the responsibility of the application to coordinate between all
>>> ibv_context(s) that use the imported objects, such that once destroy is
>>> done no other process can touch the object except for unimport. All
>>> users of the context must collaborate to ensure this.
>>>
>>> Matching unimport verbs were introduced for PD and MR, for the device
>>> the ibv_close_device() API should be used.
>>>
>>> Detailed man pages are introduced as part of this RFC patch to clarify
>>> the expected usage and notes.
>>>
>>> Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxxxx>
>>
>> Hi Yishai,
>>
>> A few questions:
>> Can you please explain the use case? I remember there was a discussion on the
>> previous shared PD kernel submission (by Yuval and Shamir) but I'm not sure if
>> there was a conclusion.
>>
>
> The expected flow and use case are as follows.
>
> One process creates an ibv_context by calling ibv_open_device() and then enables
> owning of its 'cmd_fd' with other processes by some Linux system call, (see man
> page as part of this RFC for some alternatives). Then other process that owns
> this 'cmd_fd' will be able to have its own ibv_context for the same RDMA device
> by calling ibv_import_device().
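Inline note on the fd hand-off described above: one common way to let another
process own a descriptor is SCM_RIGHTS over a Unix-domain socket. The sketch
below is only an illustration and is not taken from the RFC. send_fd(),
recv_fd() and demo_share_fd() are hypothetical helper names, and a pipe stands
in for the real 'cmd_fd' so that it runs without an RDMA device.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one open descriptor over a connected Unix-domain socket. */
static int send_fd(int sock, int fd)
{
    char byte = 'F';                    /* one data byte carries the cmsg */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent by send_fd(); returns the new fd or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;
    int fd = -1;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Hypothetical demo: a pipe read end stands in for ibv_context->cmd_fd. */
static int demo_share_fd(void)
{
    int sv[2], pfd[2], status, rc = -1;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                     /* importing process */
        char buf[2] = { 0 };
        int fd;

        close(sv[0]);
        fd = recv_fd(sv[1]);
        /* A real importer would call ibv_import_device(fd) here. */
        _exit(fd >= 0 && read(fd, buf, 2) == 2 &&
              memcmp(buf, "ok", 2) == 0 ? 0 : 1);
    }
    /* Exporting process: the pipe is created after fork(), so the child
     * can reach it only through the descriptor passed over the socket. */
    close(sv[1]);
    if (pipe(pfd) == 0) {
        if (write(pfd[1], "ok", 2) == 2 && send_fd(sv[0], pfd[0]) == 0)
            rc = 0;
        close(pfd[0]);
        close(pfd[1]);
    }
    close(sv[0]);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (rc == 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling demo_share_fd() from main() returns 0 when the child managed to read
through the received descriptor. In the real flow the receiver would hand that
descriptor to ibv_import_device() instead of reading from it.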
>
> At that point those processes really work on same kernel context and PD(s),
> MR(s) and potentially other objects in the future can be shared by calling
> ibv_import_pd()/mr() assuming that the initiator process lets the other ones
> know the kernel handle value.
>
> Once a PD and MR which points to this PD were shared it enables a memory that
> was registered by one process to be used by others with the matching lkey/rkey
> for RDMA operations.

Thanks Yishai.
Which type of applications need this kind of functionality?

>> Could you please elaborate more how the process cleanup flow (e.g. killed
>> process) is going to change? I know it's a very broad question but I'm just
>> trying to get the general idea.
>>
>
> For now the model in those suggested APIs is that cleanup will be done either
> explicitly by calling the relevant destroy command or alternatively once all
> processes that own the cmd_fd will be closed.
>
> From kernel side there is only one object and its ref count is not increased as
> part of the import_xxx() functions, see in the man pages some notes regarding
> this point.

ACK.

>> What's expected to happen in a case where we have two processes P1 & P2, both
>> use a shared PD, but separate MRs and QPs (created under the same shared PD).
>> Now when an RDMA read request arrives at P2's QP, but refers to an MR of P1
>> (which was not imported, but under the same PD), how would you expect the device
>> to handle that?
>>
>
> The processes are behaving almost like 2 threads each have a QP and an MR, if
> you mix them around it will work just like any buggy software.
> In this case I would expect the device to scatter to the MR that was pointed by
> the RDMA read request, any reason that it will behave differently?

I meant that the process is the RDMA read responder, not the requester (although
it's very similar). Are we OK with one process accessing the memory of a
different process even though the MR isn't exported?
I'm wondering whether there are any assumptions about the "security" model of
this feature, or whether both processes are considered exactly the same,
especially since neither the kernel nor the device is aware of the shared
resources. It's a bit confusing that some of the resources are shared while
others aren't, even though all of them are created using the same PD.
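
To make the P1/P2 case concrete, here is a toy model of the check I have in
mind. It is not a verbs API and not how any particular device is implemented;
toy_mr, toy_qp and toy_rkey_allowed() are made-up names, and access flags and
bounds checks are left out. The only assumption is the usual one that the
responder compares the PD of the MR named by the rkey with the PD of the QP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy objects: each one records its PD plus the process that created it. */
struct toy_mr {
    uint32_t rkey;
    uint32_t pd;
    int owner_pid;
};

struct toy_qp {
    uint32_t pd;
    int owner_pid;
};

/* Returns true if a request arriving on `qp` and naming `rkey` passes the
 * PD comparison. The creating process is never consulted. */
static bool toy_rkey_allowed(const struct toy_qp *qp,
                             const struct toy_mr *mrs, size_t n,
                             uint32_t rkey)
{
    for (size_t i = 0; i < n; i++) {
        if (mrs[i].rkey == rkey)
            return mrs[i].pd == qp->pd;
    }
    return false;                       /* unknown rkey */
}
```

With an MR created by P1 under PD 1 and a QP created by P2 under the same PD,
the check passes even though P2 never imported that MR, which is the case I'm
asking about.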