> -----Original Message-----
> From: Leon Romanovsky <leon@xxxxxxxxxx>
> Sent: Tuesday, October 8, 2019 2:52 AM
> To: Parav Pandit <parav@xxxxxxxxxxxx>
> Cc: Doug Ledford <dledford@xxxxxxxxxx>; Jason Gunthorpe
> <jgg@xxxxxxxxxxxx>; RDMA mailing list <linux-rdma@xxxxxxxxxxxxxxx>
> Subject: Re: [PATCH rdma-next 2/2] RDMA/core: Check that process is still
> alive before sending it to the users
>
> On Mon, Oct 07, 2019 at 06:58:13PM +0000, Parav Pandit wrote:
> >
> >
> > > -----Original Message-----
> > > From: linux-rdma-owner@xxxxxxxxxxxxxxx
> > > <linux-rdma-owner@xxxxxxxxxxxxxxx> On Behalf Of Leon Romanovsky
> > > Sent: Wednesday, October 2, 2019 7:33 AM
> > > To: Doug Ledford <dledford@xxxxxxxxxx>; Jason Gunthorpe
> > > <jgg@xxxxxxxxxxxx>
> > > Cc: Leon Romanovsky <leonro@xxxxxxxxxxxx>; RDMA mailing list
> > > <linux-rdma@xxxxxxxxxxxxxxx>
> > > Subject: [PATCH rdma-next 2/2] RDMA/core: Check that process is
> > > still alive before sending it to the users
> > >
> > > From: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > >
> > > The PID information can disappear asynchronically because task can
> > > be killed and moved to zombie state. In such case, PID will be zero
> > > in similar way to the kernel tasks. Recognize such situation where
> > > we are asking to return orphaned object and simply skip filling PID
> > > attribute.
> > >
> > > As part of this change, document the same scenario in counter.c code.
> > >
> > > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > > ---
> > >  drivers/infiniband/core/counters.c | 14 ++++++++++++--
> > >  drivers/infiniband/core/nldev.c    | 31 ++++++++++++++++++++++--------
> > >  2 files changed, 35 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/core/counters.c
> > > b/drivers/infiniband/core/counters.c
> > > index 12ba2685abcf..47c551a0bcb0 100644
> > > --- a/drivers/infiniband/core/counters.c
> > > +++ b/drivers/infiniband/core/counters.c
> > > @@ -149,8 +149,18 @@ static bool auto_mode_match(struct ib_qp *qp,
> > > struct rdma_counter *counter,
> > >  	struct auto_mode_param *param = &counter->mode.param;
> > >  	bool match = true;
> > >
> > > -	/* Ensure that counter belongs to the right PID */
> > > -	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task))
> > > +	/*
> > > +	 * Ensure that counter belongs to the right PID.
> > > +	 * This operation can race with user space which kills
> > > +	 * the process and leaves QP and counters orphans.
> > > +	 *
> > > +	 * It is not a big deal because exitted task will leave both
> > > +	 * QP and counter in the same bucket of zombie process. Just ensure
> > > +	 * that process is still alive before procedding.
> > > +	 *
> > > +	 */
> > > +	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task) ||
> > > +	    !task_pid_nr(qp->res.task))
> > >  		return false;
> > >
> > >  	if (auto_mask & RDMA_COUNTER_MASK_QP_TYPE)
> > > diff --git a/drivers/infiniband/core/nldev.c
> > > b/drivers/infiniband/core/nldev.c
> > > index 71bc08510064..c6fe0c52f6dc 100644
> > > --- a/drivers/infiniband/core/nldev.c
> > > +++ b/drivers/infiniband/core/nldev.c
> > > @@ -399,20 +399,35 @@ static int fill_res_info(struct sk_buff *msg,
> > > struct ib_device *device)
> > >  static int fill_res_name_pid(struct sk_buff *msg,
> > >  			     struct rdma_restrack_entry *res)
> > >  {
> > > +	int err = 0;
> > > +	pid_t pid;
> > > +
> > >  	/*
> > >  	 * For user resources, user is should read /proc/PID/comm to get the
> > >  	 * name of the task file.
> > >  	 */
> > >  	if (rdma_is_kernel_res(res)) {
> > > -		if (nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME,
> > > -		    res->kern_name))
> > > -			return -EMSGSIZE;
> > > -	} else {
> > > -		if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID,
> > > -		    task_pid_vnr(res->task)))
> > > -			return -EMSGSIZE;
> > > +		err = nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME,
> > > +				     res->kern_name);
> > > +		goto out;
> > >  	}
> > > -	return 0;
> > > +
> > > +	pid = task_pid_vnr(res->task);
> > > +	/*
> > > +	 * PID == 0 returns in two scenarios:
> > > +	 * 1. It is kernel task, but because we checked above, it won't be possible.
> > Please drop above comment point 1. See more below.
> >
> > > +	 * 2. Task is dead and in zombie state. There is no need to print PID anymore.
> > > +	 */
> > > +	if (pid)
> > > +		/*
> > > +		 * This part is racy, task can be killed and PID will be zero right
> > > +		 * here but it is ok, next query won't return PID. We don't promise
> > > +		 * real-time reflection of SW objects.
> > > +		 */
> > > +		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
> > > +
> > > +out:
> > > +	return err ? -EMSGSIZE : 0;
> > >  }
> >
> > Below code reads better along with rest of the comments in the patch.
> >
> > if (kern_resource) {
> > 	err = nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME,
> > 			     res->kern_name);
> > } else {
> > 	pid_t pid;
> >
> > 	pid = task_pid_vnr(res->task);
> > 	if (pid)
> > 		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
> > }
>
> Why do you think that nested "if" reads better?
>
Because the PID is needed only for a non-kernel resource. Hence task_pid_vnr() shouldn't be called for a kernel resource at all; it doesn't matter whether it returns zero or not.
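
To make the point concrete, here is a rough userspace sketch of the if/else shape suggested above. It is not kernel code: struct mock_res, mock_task_pid_vnr(), mock_put_name(), mock_put_pid() and the two counters are hypothetical stand-ins for rdma_restrack_entry, task_pid_vnr(), nla_put_string() and nla_put_u32(), invented only so the control flow can be compiled and checked outside the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rdma_restrack_entry. */
struct mock_res {
	bool is_kernel;		/* models rdma_is_kernel_res() */
	const char *kern_name;	/* name of a kernel-owned resource */
	int pid;		/* 0 models a task that has already exited */
};

static int pid_lookups;	/* how many times the PID was looked up */
static int last_attr;	/* 0 = nothing emitted, 1 = name, 2 = PID */

/* Hypothetical stand-in for task_pid_vnr(); counts its callers. */
static int mock_task_pid_vnr(const struct mock_res *res)
{
	pid_lookups++;
	return res->pid;
}

/* Hypothetical stand-ins for nla_put_string() and nla_put_u32(). */
static int mock_put_name(const char *name)
{
	(void)name;
	last_attr = 1;
	return 0;
}

static int mock_put_pid(int pid)
{
	(void)pid;
	last_attr = 2;
	return 0;
}

/* The suggested shape: the PID is looked up only in the user branch. */
static int fill_res_name_pid_sketch(const struct mock_res *res)
{
	int err = 0;

	if (res->is_kernel) {
		err = mock_put_name(res->kern_name);
	} else {
		int pid = mock_task_pid_vnr(res);

		/* pid == 0: the task is gone, so skip the PID attribute. */
		if (pid)
			err = mock_put_pid(pid);
	}

	return err ? -EMSGSIZE : 0;
}
```

With this shape a kernel resource never reaches the PID lookup, a live user task gets its PID attribute, and an exited user task gets no attribute at all, without needing the goto or a comment explaining why PID == 0 cannot mean "kernel task" at that point.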