Yu Xu <xuyu@xxxxxxxxxxxxxxxxx> writes: > On 10/21/21 1:02 AM, Ankur Arora wrote: >> Expose the low-level uncached primitives (clear_page_movnt(), >> clear_page_clzero()) as alternatives via clear_page_uncached(). >> Also fallback to clear_page(), if X86_FEATURE_MOVNT_SLOW is set >> and the CPU does not have X86_FEATURE_CLZERO. >> Both the uncached primitives use stores which are weakly ordered >> with respect to other instructions accessing the memory hierarchy. >> To ensure that callers don't mix accesses to different types of >> address_spaces, annotate clear_user_page_uncached(), and >> clear_page_uncached() as taking __incoherent pointers as arguments. >> Also add clear_page_uncached_make_coherent() which provides the >> necessary store fence to flush out the uncached regions. >> Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx> >> --- >> Notes: >> This patch adds the fallback definitions of clear_user_page_uncached() >> etc in include/linux/mm.h which is likely not the right place for it. >> I'm guessing these should be moved to include/asm-generic/page.h >> (or maybe a new include/asm-generic/page_uncached.h) and for >> architectures that do have arch/$arch/include/asm/page.h (which >> seems like all of them), also replicate there? >> Anyway, wanted to first check if that's the way to do it, before >> doing that. >> arch/x86/include/asm/page.h | 10 ++++++++++ >> arch/x86/include/asm/page_32.h | 9 +++++++++ >> arch/x86/include/asm/page_64.h | 32 ++++++++++++++++++++++++++++++++ >> include/linux/mm.h | 14 ++++++++++++++ >> 4 files changed, 65 insertions(+) >> diff --git a/arch/x86/include/asm/page_32.h b/arch/x86/include/asm/page_32.h >> index 94dbd51df58f..163be03ac422 100644 >> --- a/arch/x86/include/asm/page_32.h >> +++ b/arch/x86/include/asm/page_32.h >> @@ -39,6 +39,15 @@ static inline void clear_page(void *page) >> memset(page, 0, PAGE_SIZE); >> } >> +static inline void clear_page_uncached(__incoherent void *page) >> +{ >> + clear_page((__force void *) page); >> +} >> + >> +static inline void clear_page_uncached_make_coherent(void) >> +{ >> +} >> + >> static inline void copy_page(void *to, void *from) >> { >> memcpy(to, from, PAGE_SIZE); >> diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h >> index 3c53f8ef8818..d7946047c70f 100644 >> --- a/arch/x86/include/asm/page_64.h >> +++ b/arch/x86/include/asm/page_64.h >> @@ -56,6 +56,38 @@ static inline void clear_page(void *page) >> : "cc", "memory", "rax", "rcx"); >> } >> +/* >> + * clear_page_uncached: only allowed on __incoherent memory regions. >> + */ >> +static inline void clear_page_uncached(__incoherent void *page) >> +{ >> + alternative_call_2(clear_page_movnt, >> + clear_page, X86_FEATURE_MOVNT_SLOW, >> + clear_page_clzero, X86_FEATURE_CLZERO, >> + "=D" (page), >> + "0" (page) >> + : "cc", "memory", "rax", "rcx"); >> +} >> + >> +/* >> + * clear_page_uncached_make_coherent: executes the necessary store >> + * fence after which __incoherent regions can be safely accessed. >> + */ >> +static inline void clear_page_uncached_make_coherent(void) >> +{ >> + /* >> + * Keep the sfence for oldinstr and clzero separate to guard against >> + * the possibility that a cpu-model both has X86_FEATURE_MOVNT_SLOW >> + * and X86_FEATURE_CLZERO. >> + * >> + * The alternatives need to be in the same order as the ones >> + * in clear_page_uncached(). >> + */ >> + alternative_2("sfence", >> + "", X86_FEATURE_MOVNT_SLOW, >> + "sfence", X86_FEATURE_CLZERO); >> +} >> + >> void copy_page(void *to, void *from); >> #ifdef CONFIG_X86_5LEVEL >> diff --git a/include/linux/mm.h b/include/linux/mm.h >> index 73a52aba448f..b88069d1116c 100644 >> --- a/include/linux/mm.h >> +++ b/include/linux/mm.h >> @@ -3192,6 +3192,20 @@ static inline bool vma_is_special_huge(const struct vm_area_struct *vma) >> #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLBFS */ >> +#ifndef clear_user_page_uncached > > Hi Ankur Arora, > > I've been looking for where clear_user_page_uncached is defined in this > patchset, but failed. > > There should be something like follows in arch/x86, right? > > static inline void clear_user_page_uncached(__incoherent void *page, > unsigned long vaddr, struct page *pg) > { > clear_page_uncached(page); > } > > > Did I miss something? > Hi Yu Xu, Defined in include/linux/mm.h. Just below :). >> +/* >> + * clear_user_page_uncached: fallback to the standard clear_user_page(). >> + */ >> +static inline void clear_user_page_uncached(__incoherent void *page, >> + unsigned long vaddr, struct page *pg) >> +{ >> + clear_user_page((__force void *)page, vaddr, pg); >> +} That said, as this note in the patch mentions, this isn't really a great place for this definition. As you also mention, the right place for this would be somewhere in the arch/.../include and include/asm-generic hierarchy. >> This patch adds the fallback definitions of clear_user_page_uncached() >> etc in include/linux/mm.h which is likely not the right place for it. >> I'm guessing these should be moved to include/asm-generic/page.h >> (or maybe a new include/asm-generic/page_uncached.h) and for >> architectures that do have arch/$arch/include/asm/page.h (which >> seems like all of them), also replicate there? >> Anyway, wanted to first check if that's the way to do it, before >> doing that. Recommendations on how to handle this, welcome. Thanks -- ankur