Re: [patch 2/6] mm: fix vma_is_anonymous() false-positives

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Jul 22, 2018 at 01:21:06PM -0700, Linus Torvalds wrote:
> On Sat, Jul 21, 2018 at 3:49 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > IOW, Kirill's patch now just boils down to the attached trial patch.
> 
> Side note: we have a few rdma drivers that mess with vm_ops.
> 
> One does so intentionally, see mlx4_ib_vma_open() in
> 
>     drivers/infiniband/hw/mlx4/main.c

Both drivers should be doing this for identical reasons, the long
comment in mlx4/main.c appears to be the rational and expected
behavior..

mlx5/main.c also does this.

Most likely this entire process should be part of the core support
library and not open coded like this in drivers.

> and one *may* be intentional but it's not entirely clear:
> hns_roce_vma_open() may bve the same issue, but
> hns_roce_disassociate_ucontext() just looks pointless in
> 
>     drivers/infiniband/hw/hns/hns_roce_main.c

All three drivers should have code like this as well, it is triggered
during something like a PCI hot unplug,

What the drivers are doing is mapping PCI BAR memory in response to
mmap().

When the device becomes unplugged then the BAR memory map is to be
revoked and replaced with a zero page, so the physical device can be
physically un-plugged, or hard reset. (ie access to the BAR memory may
now MCE or something)

This is probably the strongest reason why these pages need to be
protected, as the driver must be able to hunt down all user space
mappings of the BAR page and do this zeroing, so "don't copy" and
other flags seem necessary with the current vma tracking approach.

Would be glad to hear if this algorithm with zap_vma_ptes makes any
sense or is totally wrong.

However, I believe, at least for mlx4/5, it is actually actively
tested and does work in the straightfoward case.

> Afaik, they should just use  VM_DONTCOPY to not see the fork open
> case, VM_DONTEXPAND to fail a size-changing mremap, and set a
> vm_ops->split() function that returns an error.

This does makes more sense, I certainly would prefer well defined
flags over this strange manipulation.

If you say this approach will do what is needed I'll ask the driver
maintainer to code and test with it..

Jason



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux