On Mon, Jan 14, 2019 at 3:09 PM Moni Shoua <monis@xxxxxxxxxxxx> wrote:
>
> On Sun, Jan 13, 2019 at 8:18 PM Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Moni Shoua <monis@xxxxxxxxxxxx>
> > > Sent: Sunday, January 13, 2019 10:57 AM
> > > To: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> > > Cc: Parav Pandit <parav@xxxxxxxxxxxx>; Leon Romanovsky
> > > <leon@xxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH for-rc 2/2] IB/mlx5: Fix how advise_mr() launches async
> > > work
> > >
> > > > Ah! I *thought* I checked this, yes, using the system work queue is
> > > > why I added the put_device :)
> > > >
> > > > > It should be done to advise_mr_wq(). This will give chance to flush
> > > > > the wq when IB device is unregistered by the core.
> > > >
> > > > Good question - Moni??
> > > >
> > > > Jason
> > > Thanks.
> > > Using schedule_work() is obviously a bug. The intention was to use the
> > > advise_mr_wq. I'll send a fix for that.
> > > However, After this will be fixed there is no need to get_device() since it is
> > > assured that no work items will be processed after
> > > mlx5_ib_stage_init_cleanup() finishes.
> >
> > Also you either need to flush the advise_mr_wq or cancel the pending work item when MR is destroyed to make sure that whatever page fault user triggered are canceled. Otherwise a MR can get recycled (dealloc, alloc) (looked up by MR key) for same PD which is page faulting now, which user didn't ask to.

There is also a check for a valid address. So, if the new MR has a different
address range then the operation will not take place, and if the address is in
range of the new MR then there is no difference.
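
To make the address-check argument concrete, here is a rough userspace sketch of it. This is not the mlx5_ib code: the struct and function names (model_mr, model_prefetch, range_in_mr, run_prefetch) are made up for illustration, and the real driver may perform the check differently. It only models the claim above: a deferred prefetch that finds a recycled MR under the same key is skipped when its range falls outside the new MR, and otherwise touches pages that belong to the new MR anyway.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of an MR; not the mlx5_ib structures. */
struct model_mr {
	uint32_t key;    /* MR key used for the lookup */
	uint64_t start;  /* first valid address of the registration */
	uint64_t length; /* size of the registered range in bytes */
};

/* Hypothetical deferred prefetch request, as queued by an advise call. */
struct model_prefetch {
	uint32_t key;
	uint64_t addr;
	uint64_t length;
};

/* True when [addr, addr + length) lies entirely inside the MR. */
static bool range_in_mr(const struct model_mr *mr, uint64_t addr,
			uint64_t length)
{
	uint64_t off;

	if (length == 0 || addr < mr->start)
		return false;
	off = addr - mr->start;
	/* Written this way so that addr + length cannot overflow. */
	return off < mr->length && length <= mr->length - off;
}

/*
 * Runs the deferred request against whichever MR currently owns the key.
 * Returns true if the prefetch would proceed, false if it is skipped.
 */
static bool run_prefetch(const struct model_mr *cur,
			 const struct model_prefetch *req)
{
	if (cur->key != req->key)
		return false;
	return range_in_mr(cur, req->addr, req->length);
}
```

With a request for key 7 covering 0x1000..0x2000, a recycled MR registered at 0x200000 makes run_prefetch() return false (the stale work is a no-op), while a recycled MR covering 0x0..0x4000 makes it return true. Note that this only models the address check; it does not address Parav's point that the user never asked for a prefetch on the new MR.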