Re: folio mapcount

On Thu, Dec 16, 2021 at 03:54:36PM +0000, Matthew Wilcox wrote:
> On Thu, Dec 16, 2021 at 11:19:17AM -0400, Jason Gunthorpe wrote:
> > On Thu, Dec 16, 2021 at 01:56:57PM +0000, Matthew Wilcox wrote:
> > > p = mmap(x, 2MB, PROT_READ|PROT_WRITE, ...): THP allocated
> > > mprotect(p, 4KB, PROT_READ): THP split.
> > > 
> > > And in that case, I would say the THP now has mapcount of 2 because
> > > there are 2 VMAs mapping it.
> > 
> > At least today mapcount is only loosely connected to VMAs. It really
> > counts the number of PUD/PTEs that point at parts of the memory. 
> 
> Careful.  Currently, you need to distinguish between total_mapcount(),
> page_trans_huge_mapcount() and page_mapcount().  Take a look at
> __page_mapcount() to be sure you really know what the mapcount "really"
> counts today ...

Right, I was mostly trying to describe one of the difficult problems
all this different stuff is trying to solve.

> > If the above ends up with a mapcount of 2 then COW will copy not
> > re-use, which will cause unexpected data corruption in all those
> > annoying side cases.
> 
> As I understood David's presentation yesterday, we actually have
> data corruption issues in all the annoying side cases with THPs
> in current upstream, so that's no worse than we have now.  But let's
> see if we can avoid them.

Possibly, I'm not sure :)
 
> It feels like what we want from a COW perspective is a count of the
> number of MMs mapping a page, not the number of VMAs, PTEs or PMDs
> mapping the page.  Right?

Interesting..

For the COW the interesting question is if wp_page_reuse() happens in
do_wp_page(), and it looks like mapcount is only used to make that
decision for anonymous pages. So, at least for COW's use of mapcount
we can focus entirely on anon pages?

For anon pages.. At least the number of VMAs pointing to anon memory
is an upper bound on mapcount - I assume there is some way we can copy
and double-map anonymous VMAs into the same mm? Still, if we can know
the number of VMAs is 1 then we are safe to allow wp_page_reuse().

However, it needs to be more exact than that: if the number of VMAs
is > 1 we then have to query at a per-page granularity.

Though, it seems interesting: if we knew how many anonymous VMAs
pointed at an anonymous page (at a 4k granularity), would that replace
mapcount for COW?

I wonder if we could somehow know that only 1 VMA is pointing at the
pages as the normal fast path and if COW encounters more than 1 VMA it
does some more expensive calculation?
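
The fast-path/slow-path idea above can be sketched as a userspace toy.
This is not kernel code: struct toy_folio, its fields and
toy_can_reuse() are hypothetical names, and the sketch simply assumes
that a per-folio count of anonymous VMAs and a per-4k mapping count are
both available, which is the open question here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SUBPAGES 512	/* 2MB THP divided into 4k pages */

/* Hypothetical bookkeeping, not an existing kernel structure. */
struct toy_folio {
	unsigned int nr_vmas;				/* anon VMAs mapping any part */
	unsigned char subpage_maps[TOY_SUBPAGES];	/* mappings per 4k page */
};

/* Returns true if a write fault on 4k page idx may reuse it in place. */
static bool toy_can_reuse(const struct toy_folio *f, size_t idx)
{
	assert(idx < TOY_SUBPAGES);

	/* Fast path: only one VMA maps the folio, so nobody else can see it. */
	if (f->nr_vmas == 1)
		return true;

	/* Slow path: more than one VMA, so ask about this 4k page only. */
	return f->subpage_maps[idx] == 1;
}
```

The point of the split is that the common case costs one comparison,
and the per-page lookup is only paid once a second VMA shows up.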

> p = mmap(x, 2MB, PROT_READ|PROT_WRITE, ...): THP allocated
> mremap(p + 128K, 128K, 128K, MREMAP_MAYMOVE | MREMAP_FIXED, p + 2MB):
> PMD split

> Should mapcount be 1 or 2 at this point?

If I read this right it should be 1, because each 4k page is pointed to
by only 1 PTE/PMD. mremap moves, it does not copy.
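
The "moves, not copies" reasoning can be checked with a toy page table.
Again a userspace sketch under stated assumptions, not kernel code: the
toy_* names are made up, the table holds one slot per 4k virtual page,
and mapcount is taken to mean "number of slots pointing at a given 4k
physical page". The source and destination ranges are assumed not to
overlap.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_VPAGES	1024	/* 4MB of virtual space in 4k pages */
#define TOY_THP_PAGES	512	/* 2MB THP in 4k pages */
#define TOY_UNMAPPED	(-1)

/* toy_pte[v] is the physical 4k page index mapped at virtual page v. */
static int toy_pte[TOY_NR_VPAGES];

/* Map one THP at virtual pages 0..511, leave the rest unmapped. */
static void toy_map_thp(void)
{
	for (size_t v = 0; v < TOY_NR_VPAGES; v++)
		toy_pte[v] = v < TOY_THP_PAGES ? (int)v : TOY_UNMAPPED;
}

/* Model of an mremap() move: the entry is transferred, never duplicated. */
static void toy_mremap_move(size_t src, size_t dst, size_t n)
{
	assert(src + n <= TOY_NR_VPAGES && dst + n <= TOY_NR_VPAGES);
	for (size_t i = 0; i < n; i++) {
		toy_pte[dst + i] = toy_pte[src + i];
		toy_pte[src + i] = TOY_UNMAPPED;
	}
}

/* Count how many virtual pages point at physical page phys. */
static unsigned int toy_mapcount(int phys)
{
	unsigned int count = 0;

	for (size_t v = 0; v < TOY_NR_VPAGES; v++)
		if (toy_pte[v] == phys)
			count++;
	return count;
}
```

With 4k pages, p + 128K is virtual page 32 and p + 2MB is virtual page
512, so the mremap() in the example corresponds to
toy_mremap_move(32, 512, 32). Afterwards every physical page still has
exactly one slot pointing at it, even though two VMAs now cover the THP.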

Jason



