On Sun, Dec 18, 2022 at 5:59 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 18.12.22 04:32, Huacai Chen wrote:
> > Hi, David,
> >
> > What is the opposite of exclusive here? Shared or inclusive? I prefer
> > pte_swp_mkshared() or pte_swp_mkinclusive() rather than
> > pte_swp_clear_exclusive(). Existing examples: dirty/clean, young/old
> > ...
>
> Hi Huacai,
>
> thanks for having a look!
>
> Please note that this series doesn't add these primitives but merely
> implements them on all remaining architectures.
>
> Having that said, the semantics are "exclusive" vs. "maybe shared", not
> "exclusive" vs. "shared" or sth. else. It would have to be
> pte_swp_mkmaybe_shared().
>
>
> Note that this naming matches just the way we handle it for the other
> pte_swp_ flags we have, namely:
>
> pte_swp_mksoft_dirty()
> pte_swp_soft_dirty()
> pte_swp_clear_soft_dirty()
>
> and
>
> pte_swp_mkuffd_wp()
> pte_swp_uffd_wp()
> pte_swp_clear_uffd_wp()
>
>
> For example, we also (thankfully) didn't call it pte_mksoft_clean().
> Grepping for "pte_swp.*soft_dirty" gives you the full picture.
>
> Thanks!
OK, got it.

Huacai
>
> David
>
> >
> > Huacai
> >
> > On Tue, Dec 6, 2022 at 10:48 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >> This is the follow-up on [1]:
> >>         [PATCH v2 0/8] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of
> >>         anonymous pages
> >>
> >> After we implemented __HAVE_ARCH_PTE_SWP_EXCLUSIVE on most prominent
> >> enterprise architectures, implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all
> >> remaining architectures that support swap PTEs.
> >>
> >> This makes sure that exclusive anonymous pages will stay exclusive, even
> >> after they were swapped out -- for example, making GUP R/W FOLL_GET of
> >> anonymous pages reliable. Details can be found in [1].
> >>
> >> This primarily fixes remaining known O_DIRECT memory corruptions that can
> >> happen on concurrent swapout, whereby we can lose DMA reads to a page
> >> (modifying the user page by writing to it).
> >>
> >> To verify, there are two test cases (requiring swap space, obviously):
> >> (1) The O_DIRECT+swapout test case [2] from Andrea. This test case tries
> >>     triggering a race condition.
> >> (2) My vmsplice() test case [3] that tries to detect if the exclusive
> >>     marker was lost during swapout, not relying on a race condition.
> >>
> >>
> >> For example, on 32bit x86 (with and without PAE), my test case fails
> >> without these patches:
> >>         $ ./test_swp_exclusive
> >>         FAIL: page was replaced during COW
> >> But succeeds with these patches:
> >>         $ ./test_swp_exclusive
> >>         PASS: page was not replaced during COW
> >>
> >>
> >> Why implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE for all architectures, even
> >> the ones where swap support might be in a questionable state? This is the
> >> first step towards removing "readable_exclusive" migration entries, and
> >> instead using pte_swp_exclusive() also with (readable) migration entries
> >> (as suggested by Peter). The only missing piece for that is
> >> supporting pmd_swp_exclusive() on relevant architectures with THP
> >> migration support.
> >>
> >> As all relevant architectures now implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE,
> >> we can drop __HAVE_ARCH_PTE_SWP_EXCLUSIVE in the last patch.
> >>
> >>
> >> RFC because some of the swap PTE layouts are really tricky and I really
> >> need some feedback related to deciphering these layouts and "using yet
> >> unused PTE bits in swap PTEs". I tried cross-compiling all relevant setups
> >> (phew, I might only miss some power/nohash variants), but only tested on
> >> x86 so far.
> >>
> >> CCing arch maintainers only on this cover letter and on the respective
> >> patch(es).
> >>
> >>
> >> [1] https://lkml.kernel.org/r/20220329164329.208407-1-david@xxxxxxxxxx
> >> [2] https://gitlab.com/aarcange/kernel-testcases-for-v5.11/-/blob/main/page_count_do_wp_page-swap.c
> >> [3] https://gitlab.com/davidhildenbrand/scratchspace/-/blob/main/test_swp_exclusive.c
> >>
> >> David Hildenbrand (26):
> >>   mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks
> >>   alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   m68k/mm: remove dummy __swp definitions for nommu
> >>   m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   nios2/mm: refactor swap PTE layout
> >>   nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s
> >>   powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit
> >>   sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit
> >>   um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit
> >>   xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>   mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> >>
> >>  arch/alpha/include/asm/pgtable.h              | 40 ++++++++-
> >>  arch/arc/include/asm/pgtable-bits-arcv2.h     | 26 +++++-
> >>  arch/arm/include/asm/pgtable-2level.h         |  3 +
> >>  arch/arm/include/asm/pgtable-3level.h         |  3 +
> >>  arch/arm/include/asm/pgtable.h                | 34 ++++++--
> >>  arch/arm64/include/asm/pgtable.h              |  1 -
> >>  arch/csky/abiv1/inc/abi/pgtable-bits.h        | 13 ++-
> >>  arch/csky/abiv2/inc/abi/pgtable-bits.h        | 19 ++--
> >>  arch/csky/include/asm/pgtable.h               | 17 ++++
> >>  arch/hexagon/include/asm/pgtable.h            | 36 ++++++--
> >>  arch/ia64/include/asm/pgtable.h               | 31 ++++++-
> >>  arch/loongarch/include/asm/pgtable-bits.h     |  4 +
> >>  arch/loongarch/include/asm/pgtable.h          | 38 +++++++-
> >>  arch/m68k/include/asm/mcf_pgtable.h           | 35 +++++++-
> >>  arch/m68k/include/asm/motorola_pgtable.h      | 37 +++++++-
> >>  arch/m68k/include/asm/pgtable_no.h            |  6 --
> >>  arch/m68k/include/asm/sun3_pgtable.h          | 38 +++++++-
> >>  arch/microblaze/include/asm/pgtable.h         | 44 +++++++---
> >>  arch/mips/include/asm/pgtable-32.h            | 86 ++++++++++++++++---
> >>  arch/mips/include/asm/pgtable-64.h            | 23 ++++-
> >>  arch/mips/include/asm/pgtable.h               | 35 ++++++++
> >>  arch/nios2/include/asm/pgtable-bits.h         |  3 +
> >>  arch/nios2/include/asm/pgtable.h              | 37 ++++++--
> >>  arch/openrisc/include/asm/pgtable.h           | 40 +++++++--
> >>  arch/parisc/include/asm/pgtable.h             | 40 ++++++++-
> >>  arch/powerpc/include/asm/book3s/32/pgtable.h  | 37 ++++++--
> >>  arch/powerpc/include/asm/book3s/64/pgtable.h  |  1 -
> >>  arch/powerpc/include/asm/nohash/32/pgtable.h  | 22 +++--
> >>  arch/powerpc/include/asm/nohash/32/pte-40x.h  |  6 +-
> >>  arch/powerpc/include/asm/nohash/32/pte-44x.h  | 18 +---
> >>  arch/powerpc/include/asm/nohash/32/pte-85xx.h |  4 +-
> >>  arch/powerpc/include/asm/nohash/64/pgtable.h  | 24 +++++-
> >>  arch/powerpc/include/asm/nohash/pgtable.h     | 15 ++++
> >>  arch/powerpc/include/asm/nohash/pte-e500.h    |  1 -
> >>  arch/riscv/include/asm/pgtable-bits.h         |  3 +
> >>  arch/riscv/include/asm/pgtable.h              | 28 ++++--
> >>  arch/s390/include/asm/pgtable.h               |  1 -
> >>  arch/sh/include/asm/pgtable_32.h              | 53 +++++++++---
> >>  arch/sparc/include/asm/pgtable_32.h           | 26 +++++-
> >>  arch/sparc/include/asm/pgtable_64.h           | 37 +++++++-
> >>  arch/sparc/include/asm/pgtsrmmu.h             | 14 +--
> >>  arch/um/include/asm/pgtable.h                 | 36 +++++++-
> >>  arch/x86/include/asm/pgtable-2level.h         | 26 ++++--
> >>  arch/x86/include/asm/pgtable-3level.h         | 26 +++++-
> >>  arch/x86/include/asm/pgtable.h                |  3 -
> >>  arch/xtensa/include/asm/pgtable.h             | 31 +++++--
> >>  include/linux/pgtable.h                       | 29 -------
> >>  mm/debug_vm_pgtable.c                         | 25 +++++-
> >>  mm/memory.c                                   |  4 -
> >>  mm/rmap.c                                     | 11 ---
> >>  50 files changed, 943 insertions(+), 227 deletions(-)
> >>
> >> --
> >> 2.38.1
> >>
> >>
> >
>
> --
> Thanks,
>
> David / dhildenb
>
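
For readers following the naming discussion above, here is a minimal userspace sketch of the mk/test/clear triple applied to the exclusive marker, in the same pattern as the soft_dirty and uffd_wp helpers. The helper names follow the pte_swp_ pattern discussed in the thread. Everything else is an assumption made for illustration: the pte_t model, the bit positions, mk_swap_pte(), swp_type(), swp_offset() and swp_exclusive_selftest() are hypothetical and do not match any real architecture's swap PTE layout or the series' actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical swap PTE model; not any real architecture's layout. */
typedef struct { uint32_t val; } pte_t;

/*
 * Assumed layout (illustration only):
 *   bits 0-4   swap type
 *   bit  5     exclusive marker
 *   bits 6-31  swap offset
 */
#define SWP_TYPE_MASK      0x1fu
#define SWP_EXCLUSIVE_BIT  (1u << 5)
#define SWP_OFFSET_SHIFT   6

/* Hypothetical helper: build a swap PTE with the exclusive marker clear. */
static inline pte_t mk_swap_pte(uint32_t type, uint32_t offset)
{
	pte_t pte = { (type & SWP_TYPE_MASK) | (offset << SWP_OFFSET_SHIFT) };
	return pte;
}

static inline uint32_t swp_type(pte_t pte)
{
	return pte.val & SWP_TYPE_MASK;
}

static inline uint32_t swp_offset(pte_t pte)
{
	return pte.val >> SWP_OFFSET_SHIFT;
}

/* Test: is the page behind this swap PTE marked exclusive? */
static inline bool pte_swp_exclusive(pte_t pte)
{
	return (pte.val & SWP_EXCLUSIVE_BIT) != 0;
}

/* Set the marker: "exclusive". */
static inline pte_t pte_swp_mkexclusive(pte_t pte)
{
	pte.val |= SWP_EXCLUSIVE_BIT;
	return pte;
}

/* Clear the marker: "maybe shared", which is why there is no mkshared(). */
static inline pte_t pte_swp_clear_exclusive(pte_t pte)
{
	pte.val &= ~SWP_EXCLUSIVE_BIT;
	return pte;
}

/* Sanity check: toggling the marker must not disturb type or offset. */
static inline void swp_exclusive_selftest(void)
{
	pte_t pte = mk_swap_pte(3, 0x12345);

	assert(!pte_swp_exclusive(pte));
	pte = pte_swp_mkexclusive(pte);
	assert(pte_swp_exclusive(pte));
	assert(swp_type(pte) == 3 && swp_offset(pte) == 0x12345);
	pte = pte_swp_clear_exclusive(pte);
	assert(!pte_swp_exclusive(pte));
	assert(swp_type(pte) == 3 && swp_offset(pte) == 0x12345);
}
```

Calling swp_exclusive_selftest() from a test harness exercises the round trip. The point of the sketch is that a cleared bit only says the page may be shared, so a "clear" helper describes the operation more accurately than a hypothetical pte_swp_mkshared() would.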