On Mon, May 07, 2018 at 09:38:53AM +0800, 858585 jemmy wrote:
> On Sat, May 5, 2018 at 2:23 AM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > On Fri, May 04, 2018 at 04:51:15PM +0800, 858585 jemmy wrote:
> >> On Fri, May 4, 2018 at 11:14 AM, 858585 jemmy <jemmy858585@xxxxxxxxx> wrote:
> >> > On Thu, May 3, 2018 at 11:33 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >> >> On Thu, May 03, 2018 at 10:04:34PM +0800, Lidong Chen wrote:
> >> >>> The userspace may invoke ibv_reg_mr and ibv_dereg_mr by different threads.
> >> >>> If when ibv_dereg_mr invoke and the thread which invoked ibv_reg_mr has
> >> >>> exited, get_pid_task will return NULL, ib_umem_release does not decrease
> >> >>> mm->pinned_vm. This patch fixes it by use tgid.
> >> >>>
> >> >>> Signed-off-by: Lidong Chen <lidongchen@xxxxxxxxxxx>
> >> >>> drivers/infiniband/core/umem.c | 12 ++++++------
> >> >>> include/rdma/ib_umem.h | 2 +-
> >> >>> 2 files changed, 7 insertions(+), 7 deletions(-)
> >> >>
> >> >> Why are we even using a struct pid for this? Does anyone know?
> >> >
> >> > commit 87773dd56d5405ac28119fcfadacefd35877c18f add pid in ib_umem structure.
> >> >
> >> > and the comment has such information:
> >> > Later a different process with a different mm_struct than the one that
> >> > allocated the ib_umem struct
> >> > ends up releasing it which results in decrementing the new processes
> >> > mm->pinned_vm count past
> >> > zero and wrapping.
> >>
> >> I think a different process should not have the permission to release ib_umem.
> >> so maybe the reason is not a different process?
> >> can ib_umem_release be invoked in interrupt context?
> >
> > We plan to restore fork support and add some way to share MRs between
> > processes, so we must consider having a different process release the
> > umem than acquired it.
>
> If restore fork support, what is the expected behavior?
> If parent process pinned_vm is x, what is the child process pinned_vm
> value after fork? It reset to zero now.
> If the parent process call ibv_dereg_mr after fork, should the child
> process decrease pinned_vm?
> If the child process call ibv_dereg_mr after fork, should the parent
> process decrease pinned_vm?

If I recall, the purpose of accessing the MM during de-register is to
undo the pinned pages change (pinned_vm) that register performed.

So, the semantic is simple: during deregister we must access exactly
the same MM that was used during register and undo the change to
pinned_vm.

The approach should be to find the most reliable way to hold a
reference to the MM that was used during register.

Apparently we can't just hold a ref on the mm (according to mm_get's
comment at least).

tgid is clearly a better indirect reference to the mm than pid (pid is
so obviously wrong).

But I am wondering why not just hold struct task here instead of tgid?
Isn't task->mm going to be more reliable than tgid->task->mm?

Jason