RE: [PATCH for-rc 2/2] IB/mlx5: Fix how advise_mr() launches async work

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Jason Gunthorpe
> Sent: Saturday, January 12, 2019 11:39 AM
> To: Parav Pandit <parav@xxxxxxxxxxxx>
> Cc: Leon Romanovsky <leon@xxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Moni
> Shoua <monis@xxxxxxxxxxxx>
> Subject: Re: [PATCH for-rc 2/2] IB/mlx5: Fix how advise_mr() launches async
> work
> 
> On Sat, Jan 12, 2019 at 10:32:54AM -0700, Parav Pandit wrote:
> >
> >
> > > From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-
> > > owner@xxxxxxxxxxxxxxx> On Behalf Of Jason Gunthorpe
> > > Sent: Saturday, January 12, 2019 11:06 AM
> > > To: Leon Romanovsky <leon@xxxxxxxxxx>
> > > Cc: linux-rdma@xxxxxxxxxxxxxxx; Moni Shoua <monis@xxxxxxxxxxxx>
> > > Subject: Re: [PATCH for-rc 2/2] IB/mlx5: Fix how advise_mr()
> > > launches async work
> > >
> > > On Sat, Jan 12, 2019 at 05:06:02PM +0200, Leon Romanovsky wrote:
> > > > On Fri, Jan 11, 2019 at 07:31:24PM -0700, Jason Gunthorpe wrote:
> > > > > From: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> > > > >
> > > > > Work must hold a kref on the ib_device otherwise the dev pointer
> > > > > can become free before the work runs.
> > > >
> > > > How can it occur exactly?
> > > >
> > > > "dev" will be released after mlx5_ib will exist, mlx5_ib exist
> > > > will drain advise_mr queue before it. Such drain will ensure that
> > > > no works are running.
> > >
> > > Okay, that is fine for the get_device, it can be removed.
> > >
> > > Are we OK with the reg_state hunk?
> > >
> > > Jason
> >
> > Unrelated to your question, I noticed that advise_mr does
> > uverbs_attr_ptr_get_array_size() for the SGEs.  I think this is
> > incorrect.  It should have been mr handles whose ownership can be
> > check against the PID or ucontext that registered those MRs.
> 
> The SGE's have lkeys in them, and the lkeys are local to the PD passed in as
> argument.
> 
> So the caller cannot touch resources outside the PD, hopefully the driver
> implementation checks this..
>
pagefault_single_data_segment () invokes __mlx5_mr_lookup() to lookup a MR per device and not per PD.
mlx5_ib_advise_mr_prefetch() doesn't touch pd.
work item doesn't have PD in it.
Lets say PD check is added and it passes in pagefault_single_data_segment() under device overloaded srcu lock in work item context.

After that MR got reallocated to other process before work item can do the work and now it fault's some other process.

To avoid this, instead of overloading dev->mr_srcu, this work item should be enqueued to advise_mr_wq and that should be flushed during destroy_mr() which ensures that nothing dangling for this MR.
Such code will create right synchronization without depending on randomness of key, number of MR etc parameters.

> > Who enqueues work to dev->advise_mr_wq? schedule_work() in
> > mlx5_ib_advise_mr_prefetch() does to global workqueue.
> 
> Ah! I *thought* I checked this, yes, using the system work queue is why I
> added the put_device :)
> 
> > It should be done to advise_mr_wq().  This will give chance to flush
> > the wq when IB device is unregistered by the core.
> 
> Good question - Moni??
> 
> Jason




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux