On 18.12.22 04:32, Huacai Chen wrote:
Hi, David,
What is the opposite of exclusive here? Shared or inclusive? I prefer
pte_swp_mkshared() or pte_swp_mkinclusive() rather than
pte_swp_clear_exclusive(). Existing examples: dirty/clean, young/old
...
Hi Huacai,
thanks for having a look!
Please note that this series doesn't add these primitives but merely
implements them on all remaining architectures.
That said, the semantics are "exclusive" vs. "maybe shared", not
"exclusive" vs. "shared" or something else. The name would have to be
pte_swp_mkmaybe_shared().
Note that this naming matches the way we handle the other
pte_swp_ flags we have, namely:
pte_swp_mksoft_dirty()
pte_swp_soft_dirty()
pte_swp_clear_soft_dirty()
and
pte_swp_mkuffd_wp()
pte_swp_uffd_wp()
pte_swp_clear_uffd_wp()
For example, we also (thankfully) didn't call it pte_mksoft_clean().
Grepping for "pte_swp.*soft_dirty" gives you the full picture.
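To illustrate the mk/test/clear naming pattern, here is a minimal user-space sketch. It is not taken from any architecture: the pte_t stand-in and the bit position behind _PAGE_SWP_EXCLUSIVE are assumptions made up for the example; only the three helper names come from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a swap PTE; not a real arch layout. */
typedef struct { uint64_t pte; } pte_t;

/* Assumption: bit 2 happens to be unused in this made-up swap PTE format. */
#define _PAGE_SWP_EXCLUSIVE (UINT64_C(1) << 2)

/* Set the marker: the page is known to be exclusive to this process. */
static inline pte_t pte_swp_mkexclusive(pte_t pte)
{
	pte.pte |= _PAGE_SWP_EXCLUSIVE;
	return pte;
}

/* Test the marker. */
static inline bool pte_swp_exclusive(pte_t pte)
{
	return (pte.pte & _PAGE_SWP_EXCLUSIVE) != 0;
}

/* Clear the marker: the page is now "maybe shared", not "shared". */
static inline pte_t pte_swp_clear_exclusive(pte_t pte)
{
	pte.pte &= ~_PAGE_SWP_EXCLUSIVE;
	return pte;
}
```

A cleared marker only means that exclusivity is no longer guaranteed, which is why the helper is named "clear" and no pte_swp_mkshared() counterpart is needed.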
Thanks!
David
Huacai
On Tue, Dec 6, 2022 at 10:48 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
This is the follow-up on [1]:
[PATCH v2 0/8] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of
anonymous pages
After we implemented __HAVE_ARCH_PTE_SWP_EXCLUSIVE on the most prominent
enterprise architectures, implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all
remaining architectures that support swap PTEs.
This makes sure that exclusive anonymous pages will stay exclusive, even
after they were swapped out -- for example, making GUP R/W FOLL_GET of
anonymous pages reliable. Details can be found in [1].
This primarily fixes remaining known O_DIRECT memory corruptions that can
happen on concurrent swapout, whereby we can lose DMA reads to a page
(modifying the user page by writing to it).
To verify, there are two test cases (requiring swap space, obviously):
(1) The O_DIRECT+swapout test case [2] from Andrea. This test case tries
triggering a race condition.
(2) My vmsplice() test case [3] that tries to detect if the exclusive
marker was lost during swapout, not relying on a race condition.
For example, on 32bit x86 (with and without PAE), my test case fails
without these patches:
$ ./test_swp_exclusive
FAIL: page was replaced during COW
But succeeds with these patches:
$ ./test_swp_exclusive
PASS: page was not replaced during COW
Why implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE for all architectures, even
the ones where swap support might be in a questionable state? This is the
first step towards removing "readable_exclusive" migration entries and
using pte_swp_exclusive() with (readable) migration entries as well
(as suggested by Peter). The only missing piece for that is
supporting pmd_swp_exclusive() on relevant architectures with THP
migration support.
As all relevant architectures now implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE,
we can drop __HAVE_ARCH_PTE_SWP_EXCLUSIVE in the last patch.
RFC because some of the swap PTE layouts are really tricky and I really
need some feedback related to deciphering these layouts and "using yet
unused PTE bits in swap PTEs". I tried cross-compiling all relevant setups
(phew, I might only miss some power/nohash variants), but only tested on
x86 so far.
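To make "using yet unused PTE bits in swap PTEs" more concrete, here is a rough user-space sketch of the kind of check that applies to each layout. The 32-bit layout, the demo_ helper names and the bit positions are all invented for illustration and do not match any architecture in this series. The point is only that the exclusive marker must not overlap the present bit, the swap type or the swap offset.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Made-up 32-bit swap PTE layout (an assumption, not a real arch):
 *   bit  0     : present (must stay 0 for a swap PTE)
 *   bit  1     : exclusive marker
 *   bits 2..6  : swap type   (5 bits)
 *   bits 7..31 : swap offset (25 bits)
 */
#define DEMO_PRESENT_BIT      (UINT32_C(1) << 0)
#define DEMO_EXCLUSIVE_BIT    (UINT32_C(1) << 1)
#define DEMO_TYPE_SHIFT       2
#define DEMO_TYPE_MASK        UINT32_C(0x1f)
#define DEMO_OFFSET_SHIFT     7
#define DEMO_OFFSET_MASK      UINT32_C(0x1ffffff)

static inline uint32_t demo_swp_encode(uint32_t type, uint32_t offset)
{
	return ((type & DEMO_TYPE_MASK) << DEMO_TYPE_SHIFT) |
	       ((offset & DEMO_OFFSET_MASK) << DEMO_OFFSET_SHIFT);
}

static inline uint32_t demo_swp_type(uint32_t pte)
{
	return (pte >> DEMO_TYPE_SHIFT) & DEMO_TYPE_MASK;
}

static inline uint32_t demo_swp_offset(uint32_t pte)
{
	return (pte >> DEMO_OFFSET_SHIFT) & DEMO_OFFSET_MASK;
}

static inline uint32_t demo_swp_mkexclusive(uint32_t pte)
{
	return pte | DEMO_EXCLUSIVE_BIT;
}

static inline uint32_t demo_swp_clear_exclusive(uint32_t pte)
{
	return pte & ~DEMO_EXCLUSIVE_BIT;
}

static inline bool demo_swp_exclusive(uint32_t pte)
{
	return (pte & DEMO_EXCLUSIVE_BIT) != 0;
}

/*
 * Returns true if setting and clearing the marker leaves the present bit,
 * the swap type and the swap offset untouched.
 */
static bool demo_swp_exclusive_roundtrip_ok(uint32_t type, uint32_t offset)
{
	uint32_t pte = demo_swp_encode(type, offset);
	uint32_t excl = demo_swp_mkexclusive(pte);

	if (!demo_swp_exclusive(excl) || (excl & DEMO_PRESENT_BIT))
		return false;
	if (demo_swp_type(excl) != demo_swp_type(pte) ||
	    demo_swp_offset(excl) != demo_swp_offset(pte))
		return false;
	return demo_swp_clear_exclusive(excl) == pte;
}
```

Running such a round trip with the maximum type and offset values is a cheap way to spot a marker bit that collides with another field.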
CCing arch maintainers only on this cover letter and on the respective
patch(es).
[1] https://lkml.kernel.org/r/20220329164329.208407-1-david@xxxxxxxxxx
[2] https://gitlab.com/aarcange/kernel-testcases-for-v5.11/-/blob/main/page_count_do_wp_page-swap.c
[3] https://gitlab.com/davidhildenbrand/scratchspace/-/blob/main/test_swp_exclusive.c
David Hildenbrand (26):
mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks
alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
m68k/mm: remove dummy __swp definitions for nommu
m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
nios2/mm: refactor swap PTE layout
nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s
powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit
sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit
um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit
xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE
arch/alpha/include/asm/pgtable.h | 40 ++++++++-
arch/arc/include/asm/pgtable-bits-arcv2.h | 26 +++++-
arch/arm/include/asm/pgtable-2level.h | 3 +
arch/arm/include/asm/pgtable-3level.h | 3 +
arch/arm/include/asm/pgtable.h | 34 ++++++--
arch/arm64/include/asm/pgtable.h | 1 -
arch/csky/abiv1/inc/abi/pgtable-bits.h | 13 ++-
arch/csky/abiv2/inc/abi/pgtable-bits.h | 19 ++--
arch/csky/include/asm/pgtable.h | 17 ++++
arch/hexagon/include/asm/pgtable.h | 36 ++++++--
arch/ia64/include/asm/pgtable.h | 31 ++++++-
arch/loongarch/include/asm/pgtable-bits.h | 4 +
arch/loongarch/include/asm/pgtable.h | 38 +++++++-
arch/m68k/include/asm/mcf_pgtable.h | 35 +++++++-
arch/m68k/include/asm/motorola_pgtable.h | 37 +++++++-
arch/m68k/include/asm/pgtable_no.h | 6 --
arch/m68k/include/asm/sun3_pgtable.h | 38 +++++++-
arch/microblaze/include/asm/pgtable.h | 44 +++++++---
arch/mips/include/asm/pgtable-32.h | 86 ++++++++++++++++---
arch/mips/include/asm/pgtable-64.h | 23 ++++-
arch/mips/include/asm/pgtable.h | 35 ++++++++
arch/nios2/include/asm/pgtable-bits.h | 3 +
arch/nios2/include/asm/pgtable.h | 37 ++++++--
arch/openrisc/include/asm/pgtable.h | 40 +++++++--
arch/parisc/include/asm/pgtable.h | 40 ++++++++-
arch/powerpc/include/asm/book3s/32/pgtable.h | 37 ++++++--
arch/powerpc/include/asm/book3s/64/pgtable.h | 1 -
arch/powerpc/include/asm/nohash/32/pgtable.h | 22 +++--
arch/powerpc/include/asm/nohash/32/pte-40x.h | 6 +-
arch/powerpc/include/asm/nohash/32/pte-44x.h | 18 +---
arch/powerpc/include/asm/nohash/32/pte-85xx.h | 4 +-
arch/powerpc/include/asm/nohash/64/pgtable.h | 24 +++++-
arch/powerpc/include/asm/nohash/pgtable.h | 15 ++++
arch/powerpc/include/asm/nohash/pte-e500.h | 1 -
arch/riscv/include/asm/pgtable-bits.h | 3 +
arch/riscv/include/asm/pgtable.h | 28 ++++--
arch/s390/include/asm/pgtable.h | 1 -
arch/sh/include/asm/pgtable_32.h | 53 +++++++++---
arch/sparc/include/asm/pgtable_32.h | 26 +++++-
arch/sparc/include/asm/pgtable_64.h | 37 +++++++-
arch/sparc/include/asm/pgtsrmmu.h | 14 +--
arch/um/include/asm/pgtable.h | 36 +++++++-
arch/x86/include/asm/pgtable-2level.h | 26 ++++--
arch/x86/include/asm/pgtable-3level.h | 26 +++++-
arch/x86/include/asm/pgtable.h | 3 -
arch/xtensa/include/asm/pgtable.h | 31 +++++--
include/linux/pgtable.h | 29 -------
mm/debug_vm_pgtable.c | 25 +++++-
mm/memory.c | 4 -
mm/rmap.c | 11 ---
50 files changed, 943 insertions(+), 227 deletions(-)
--
2.38.1
--
Thanks,
David / dhildenb