Re: [PATCH rdma-next] RDMA/i40w: Hold read semaphore while looking after VMA

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 26, 2018 at 10:46:09AM -0600, Jason Gunthorpe wrote:
> On Thu, Jul 26, 2018 at 11:34:42AM -0500, Shiraz Saleem wrote:
> > On Mon, Jul 23, 2018 at 04:29:04PM -0600, Jason Gunthorpe wrote:
> > > On Mon, Jul 23, 2018 at 05:19:41PM -0500, Shiraz Saleem wrote:
> > > > On Mon, Jul 23, 2018 at 02:39:56PM -0600, Jason Gunthorpe wrote:
> > > > > On Sat, Jul 21, 2018 at 08:31:30AM -0500, Shiraz Saleem wrote:
> > > > > > On Sun, Jul 01, 2018 at 10:36:24AM -0600, Leon Romanovsky wrote:
> > > > > > > From: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > > > > > > 
> > > > > > > VMA lookup is supposed to be performed while mmap_sem is held.
> > > > > > > 
> > > > > > > Fixes: f26c7c83395b ("i40iw: Add 2MB page support")
> > > > > > > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > > > > > >
> > > > > > Thanks for fixing this bug.
> > > > > > 
> > > > > > Acked-by: Shiraz Saleem <shiraz.saleem@xxxxxxxxx> 
> > > > > 
> > > > > This code in the driver all looks wrong to me, it should not rely on
> > > > > the huge pages flag and it shouldn't do this VM walking.
> > > > >
> > > > Hi Jason - Why is that we cant use umem->hugetlb to determine if
> > > > MR is composed of huge pages? Isnt that why the flag was defined for
> > > > in the first place. All the VMAs w.r.t the user-space buffer are
> > > > scanned to make this determination in ib_umem_get.
> > > > Isnt it better to use it than implement it in individual driver.
> > > 
> > > It was a mistake to have this flag and all drivers except i40iw and
> > > bnxt don't use it. I wish to remove it.
> > I see. What was the drawback to the approach?
> 
> It is just wongly designed.. page merging is about combining physical
> addresses within a HW supported size list, not about matching the CPU
> huge page size(s) to HW.

Well ok. That is a different way of looking at it.
Our assumption was tied to the fact that the OS makes
available only a few page sizes (depending on the CPU arch) and
huge pages fall in that bracket. So it made sense for it be
supported by core. But your approach seems to give more
flexibility and gets rid of the locking. 

> We should never even care about 'huge pages' in the VM sense, and
> certainly never scan a VM, and it is buggy/racy to do so, I
> think. Leon already noticed missing locking, but I think there is
> worse here since we have desync'd with the lock that get_user_pages
> was run under.

OK. At this point, we are looking to fix in i40iw with an algo that
looks at physical page address.

Not sure I follow the desync argument as the pages are pinned and given
to us by the core. But if this is really a concern, there are other places
where the VMA scan is done too.

https://elixir.bootlin.com/linux/v4.18-rc7/source/drivers/infiniband/hw/mlx4/mr.c#L389
https://elixir.bootlin.com/linux/v4.18-rc6/source/drivers/infiniband/core/umem_odp.c#L341

Shiraz
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux