Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 17, 2021 at 9:03 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> Why are we comparing page_count() against 1 and not 1 + PageLRU(page)?
> Having a reference from the LRU should be expected.  Is it because of
> some race that we'd need to take the page lock to protect against?

The LRU doesn't actually count towards a reference - the LRU list is
maintained independently of the lifetime of the page (and is torn down
on last release - which wouldn't work if the LRU list itself held a
ref to the page).

But atr least some of the code that gathers up pages to then put them
on the LRU list takes a ref to the pages before passing them off, just
to guarantee to keep them around during the operation.

So yes, various things can increment page counts in a transitory manner.

I still *much* prefer a reliable COW over one that doesn't happen enough.

The page count can have these (on the whole fairly rare) blips. That's
ok. The page count is still *reliable*, in ways that teh mapcount can
never be. The mapcount fundamentally doesn't show "other non-mapped
users".

So Nadav is correct that unnecessary cow events will cause extra work
(and the TLB flush is a good point - just marking a page writable
as-is is much cheaper).

But I'm looking at teh actual code, and the actual logic, and I am
dismissign the whole mapcount games completely.

David has a 10-patch series (plus one test) of complex, grotty,
hard-to-understand code with new flags.

I posted a patch that removed 10 lines, and fixes the problem case his
test-case was designed for.

I think that really speaks to the issues.

My approach is *simpler* and a hell of a lot more robust. And as
mentioned, I can explain it.

And christ the thing I'm advocating for is WHAT WE ALREADY DO FOR
99.99% of all cases. Why? Because it's literally how the regular COW
paths work TODAY.

And we had benchmarks show performance improvements (or no movement at
all) from when we made those changes. Not the downsides that people
claim.

It's only the THP paths that are broken (and possibly some individual
mis-uses of GUP - people have mentioned virtio).

So nbow people are trying to do a fragile, complex thing that was
shown to be problematic for the common case, and they are doing it for
the insanely rare case? When a ten-line removal patch fixes that one
too?

              Linus

PS. Yes, yes, that 10-line removal patch is obviously still not
tested, it's still likely incomplete because the THP case needs to do
the page-pinning logic on the other side too, so I'm very obviously
over-simplifying. But the fact that the *normal* pages already do this
correctly - and don't use mapcount - should really make people go
"Hmm".




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux