Re: [patch 128/192] mm: improve mprotect(R|W) efficiency on pages referenced once

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



(sorry for a very late reply)

On Thu, Jul 01, 2021 at 11:29:50AM -0700, Linus Torvalds wrote:
> On Wed, Jun 30, 2021 at 6:27 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > On Wed, Jun 30, 2021 at 11:03:25AM -0700, Linus Torvalds wrote:
> > >
> > > What about that page_count() test, for example: it has a comment, it
> > > looks obvious, but it's very different from what do_wp_page() does. So
> > > what happens if we have a page-out at the same time that turns that
> > > page into a swap cache page, and increments the page count? What about
> > > that race? Do we end up with a writable page that is shared with a
> > > swap cache entry? Is that ok? Why isn't it ok in the page fault case?
> >
> > That looks fine to me: when the race happens we must have checked page_count==1
> > first and granted the write bit, then add_to_swap_cache() happens after the
> > page_count==1 check (as it takes another refcount, so >2 otherwise).  Then it
> > also means unmap mappings should happen even after that point.  If my above
> > understanding is correct, our newly installed pte will be zapped safely soon,
> > but of course after we release the pgtable lock in change_pte_range().
> 
> So if this is fine, then maybe we should just remove the page lock in
> the do_wp_page() path (and remove the PageKSM check while at it)?

I could be wrong, but I thought the page lock in do_wp_page() is more for the
PageKsm() race (e.g., to make sure we don't grant write to a page that is
becoming a ksm page in parallel).

> 
> If it's not required by mprotect() to say "I can make the page
> writable directly", then it really shouldn't be required by the page
> fault path either.
> 
> Which I'd love to do, and was really itching to do (it's a nasty
> lock), but I worried about it..
> 
> I'd hate to have mprotect do one thing, and page faulting do another
> thing, and not have some logic to why they have to be different.

Agreed; perhaps no need to be identical - I think the mprotect path can be even
stricter than the fault page, as it's a fast-path only. It should never apply
the write bit when the page fault path won't.  So I think the original patch
does need a justification on why it didn't handle ksm page while do_wp_page
handled it.

Thanks,

-- 
Peter Xu




[Index of Archives]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux