Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 21, 2021 at 04:19:33PM +0100, David Hildenbrand wrote:

> >> Note that I am trying to make also any kind of R/O pins on an anonymous
> >> page work as expected as well, to fix any kind of GUP after fork() and
> >> GUP before fork(). So taking a R/O pin on an !PageAnonExclusive() page
> >> similarly has to make sure that the page is exclusive -- even if it's
> >> mapped R/O (!).
> > 
> > Why? AFAIK we don't have bugs here. If the page is RO and has an
> > elevated refcount it cannot be 'PageAnonExclusive' and so any place
> > that wants to drop the WP just cannot. What is the issue?

> But what I think you actually mean is if we want to get R/O pins
> right.

What I ment was a page that is GUP'd RO, is not PageAnonExclusive and
has an elevated refcount. Those cannot be transformed to
PageAnonExclusive, or re-used during COW, but also they don't have
problems today. Either places are like O_DIRECT read and are tolerant
of a false COW, or they are broken like VFIO and should be using
FOLL_FORCE|FOLL_WRITE, which turns them into a WRITE and then we know
they get PageAnonExclusive.

So, the swap issue is fixed directly with PageAnonExclusive and no
change to READ GUP is required, at least in your #1 scenario, AFAICT..

> There are 2 models, leaving FOLL_FAULT_UNSHARE out of the picture for now:
> 
> 1) Whenever mapping an anonymous page R/W (after COW, during ordinary
> fault, on swapin), we mark the page exclusive. We must never lose the
> PageAnonExclusive bit, not during migration, not during swapout.

I prefer this one as well.

It allows us to keep Linus's simple logic that refcount == 1 means
always safe to re-use, no matter what.

And refcount != 1 goes on to consider the additional bit to decide
what to do. The simple bit really means 'we know this page has one PTE
so ignore the refcount for COW reuse decisions'.

> fork() will process the bit for each and every process, even if there
> was no GUP, and will copy if there are additional references.

Yes, just like it does today already for mapcount.

> 2) Whenever GUP wants to pin/ref a page, we try marking it exclusive. We
> can lose the PageAnonExclusive bit during migration and swapout, because
> that can only happen when there are no additional references.

I haven't thought of a way this is achievable.

At least not without destroying GUP fast..

Idea #2 is really a "this page is GUP'd" flag with some sneaky logic
to clear it.  That comes along with all the races too because as an
idea it is fundamentally about GUP which runs without locks.

Jason



[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux