On 08.03.22 15:14, David Hildenbrand wrote:
> The basic question we would like to have a reliable and efficient answer
> to is: is this anonymous page exclusive to a single process or might it
> be shared?
>
> In an ideal world, we'd have a spare pageflag. Unfortunately, pageflags
> don't grow on trees, so we have to get a little creative for the time
> being.
>
> Introduce a way to mark an anonymous page as exclusive, with the
> ultimate goal of teaching our COW logic to not do "wrong COWs", whereby
> GUP pins lose consistency with the pages mapped into the page table,
> resulting in reported memory corruptions.
>
> Most pageflags already have semantics for anonymous pages, so we're left
> with reusing PG_slab for our purpose: for PageAnon() pages PG_slab now
> translates to PG_anon_exclusive. Teach some in-kernel code that manually
> handles PG_slab about that.
>
> Add a spoiler on the semantics of PG_anon_exclusive as documentation. More
> documentation will be contained in the code that actually makes use of
> PG_anon_exclusive.
>
> We won't be clearing PG_anon_exclusive on destructive unmapping (i.e.,
> zapping) of page table entries; page freeing code will handle that when
> it also invalidates page->mapping to not indicate PageAnon() anymore.
> Letting information about exclusivity stick around will be an important
> property when adding sanity checks to unpinning code.
>
> RFC notes: in-tree tools/cgroup/memcg_slabinfo.py looks like it might need
> some care. We'd have to look up the head page and check if
> PageAnon() is set. Similarly, tools living outside the kernel
> repository like crash and makedumpfile might need adaptations.
>
> Cc: Roman Gushchin <guro@xxxxxx>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
> ---

I'm currently testing with the following. My tests with a swapfile on
all kinds of weird filesystems (excluding network filesystems, though)
have revealed no surprises so far: