If we'd really want to identify whether a zeropage was deduplicated by
KSM, we could try storing that information inside the PTE instead of
inside the RMAP.

This is interesting, but needs caution, for the reason you mention below.

Then, we could directly adjust the counter when zapping
the shared zeropage or during MADV_DONTNEED/when unmerging.
Eventually, we could simply say that:
* !pte_dirty(): zeropage placed during fault
* pte_dirty(): zeropage placed by KSM
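
To make the convention above concrete, here is a minimal userspace sketch. It is only a model: zp_pte_t, the ZP_PTE_* bits, zp_classify(), zp_zap() and the ksm_zero_pages counter are all hypothetical names made up for illustration, not kernel API. ZP_PTE_ZEROPAGE stands in for whatever "this PTE maps the shared zeropage" check the real code would use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of a PTE; this is not the kernel's pte_t. */
typedef struct { uint64_t val; } zp_pte_t;

#define ZP_PTE_PRESENT  (UINT64_C(1) << 0)
#define ZP_PTE_DIRTY    (UINT64_C(1) << 1)
#define ZP_PTE_ZEROPAGE (UINT64_C(1) << 2) /* assumed: PTE maps the shared zeropage */

enum zp_origin { ZP_NOT_ZEROPAGE, ZP_FROM_FAULT, ZP_FROM_KSM };

/* Apply the proposed rule: a dirty zeropage PTE was placed by KSM. */
static enum zp_origin zp_classify(zp_pte_t pte)
{
    if (!(pte.val & ZP_PTE_PRESENT) || !(pte.val & ZP_PTE_ZEROPAGE))
        return ZP_NOT_ZEROPAGE;
    return (pte.val & ZP_PTE_DIRTY) ? ZP_FROM_KSM : ZP_FROM_FAULT;
}

/* Zap a PTE and adjust the (hypothetical) KSM zeropage counter directly. */
static void zp_zap(zp_pte_t *pte, long *ksm_zero_pages)
{
    assert(pte && ksm_zero_pages);
    if (zp_classify(*pte) == ZP_FROM_KSM)
        (*ksm_zero_pages)--;
    pte->val = 0;
}
```

The point of the sketch is that the zap path needs nothing but the PTE itself to decide whether the counter has to change; no RMAP lookup is involved.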
Then it would also be easy to adjust counters and unmerge. We'd limit
this handling to known-working architectures initially (spec64 still has
the issue that pte_mkdirty() will set a pte writable ... and my patch to
fix that has not been merged yet).

^ I meant sparc64, not spec64. We can (and should) have a testcase that
deduplicates the shared zeropage using KSM and makes sure that writes
properly lead to a write fault.

We'd have to double-check all pte_mkdirty/pte_mkclean() callsites.
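
The property that matters here (a dirty zeropage PTE must stay non-writable, so that a write still faults) can be sketched as a tiny userspace model. Everything below is hypothetical and only for illustration: wf_pte_t, the WF_PTE_* bits and the wf_* helpers are not kernel functions, and wf_mkdirty_sets_write() merely mimics the "pte_mkdirty() will set a pte writable" behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a PTE; this is not the kernel's pte_t. */
typedef struct { uint64_t val; } wf_pte_t;

#define WF_PTE_WRITE (UINT64_C(1) << 0)
#define WF_PTE_DIRTY (UINT64_C(1) << 1)

/* Behaviour the scheme relies on: marking dirty leaves write permission alone. */
static wf_pte_t wf_mkdirty(wf_pte_t pte)
{
    pte.val |= WF_PTE_DIRTY;
    return pte;
}

/* Models the problematic behaviour: marking dirty also makes the PTE writable. */
static wf_pte_t wf_mkdirty_sets_write(wf_pte_t pte)
{
    pte.val |= WF_PTE_DIRTY | WF_PTE_WRITE;
    return pte;
}

/* In this model, a write faults exactly when the PTE is not writable. */
static bool wf_write_would_fault(wf_pte_t pte)
{
    return !(pte.val & WF_PTE_WRITE);
}

/* Start from a read-only PTE, mark it dirty, and check a write still faults. */
static bool wf_dirty_zeropage_is_safe(wf_pte_t (*mkdirty)(wf_pte_t))
{
    wf_pte_t pte = { 0 };

    assert(mkdirty);
    return wf_write_would_fault(mkdirty(pte));
}
```

With wf_mkdirty() the check passes; with wf_mkdirty_sets_write() it fails, which is the situation a real testcase would have to catch on an affected architecture.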
this will be... interesting
IIRC, most code that touches the dirty bit makes sure that it operates
on a proper struct page (via vm_normal_folio()).
madvise_free_pte_range() is one such user. But we have to double-check.
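
As a rough illustration of why such users would leave a dirty zeropage PTE alone, here is a userspace model of the pattern. It assumes that the folio lookup returns NULL for the shared zeropage; nf_pte_t, NF_ZERO_PFN, nf_normal_folio() and nf_clean_range() are hypothetical stand-ins, not the real vm_normal_folio() or madvise_free_pte_range().

```c
#include <assert.h>
#include <stddef.h>

#define NF_ZERO_PFN 0UL /* hypothetical pfn of the shared zeropage */

struct nf_folio { int id; };

/* Hypothetical model of a PTE: a pfn plus a dirty flag. */
typedef struct { unsigned long pfn; unsigned int dirty; } nf_pte_t;

/* Stand-in for the folio lookup: NULL for the zeropage or an unknown pfn. */
static struct nf_folio *nf_normal_folio(nf_pte_t pte, struct nf_folio *folios,
                                        size_t nfolios)
{
    if (pte.pfn == NF_ZERO_PFN || pte.pfn >= nfolios)
        return NULL;
    return &folios[pte.pfn];
}

/*
 * Clear the dirty flag only for PTEs backed by a normal folio.
 * Returns how many PTEs were cleaned.
 */
static size_t nf_clean_range(nf_pte_t *ptes, size_t nptes,
                             struct nf_folio *folios, size_t nfolios)
{
    size_t cleaned = 0;

    for (size_t i = 0; i < nptes; i++) {
        if (!nf_normal_folio(ptes[i], folios, nfolios))
            continue; /* zeropage PTE: dirty bit is left untouched */
        if (ptes[i].dirty) {
            ptes[i].dirty = 0;
            cleaned++;
        }
    }
    return cleaned;
}
```

Any callsite that does not follow this pattern is one that would need a closer look.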
--
Thanks,
David / dhildenb