On 9/20/24 14:12, Anshuman Khandual wrote: > > > On 9/20/24 12:09, David Hildenbrand wrote: >> On 20.09.24 07:30, Anshuman Khandual wrote: >>> set_pxd_safe() helpers that serve a specific purpose for both x86 and riscv >>> platforms, do not need to be in the common memory code. Otherwise they just >>> unnecessarily make the common API more complicated. This moves the helpers >>> from common code to platform instead. >>> >>> Cc: Paul Walmsley <paul.walmsley@xxxxxxxxxx> >>> Cc: Palmer Dabbelt <palmer@xxxxxxxxxxx> >>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> >>> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> >>> Cc: David Hildenbrand <david@xxxxxxxxxx> >>> Cc: Ryan Roberts <ryan.roberts@xxxxxxx> >>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> >>> Cc: x86@xxxxxxxxxx >>> Cc: linux-mm@xxxxxxxxx >>> Cc: linux-riscv@xxxxxxxxxxxxxxxxxxx >>> Cc: linux-kernel@xxxxxxxxxxxxxxx >>> Suggested-by: David Hildenbrand <david@xxxxxxxxxx> >>> Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx> >>> --- >>> arch/riscv/include/asm/pgtable.h | 19 ++++++++++++++++ >>> arch/x86/include/asm/pgtable.h | 37 +++++++++++++++++++++++++++++++ >>> include/linux/pgtable.h | 38 -------------------------------- >>> 3 files changed, 56 insertions(+), 38 deletions(-) >>> >>> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h >>> index 089f3c9f56a3..39ca652c5ebe 100644 >>> --- a/arch/riscv/include/asm/pgtable.h >>> +++ b/arch/riscv/include/asm/pgtable.h >>> @@ -957,6 +957,25 @@ void misc_mem_init(void); >>> extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]; >>> #define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page)) >>> +/* >>> + * Use set_p*_safe(), and elide TLB flushing, when confident that *no* >>> + * TLB flush will be required as a result of the "set". For example, use >>> + * in scenarios where it is known ahead of time that the routine is >>> + * setting non-present entries, or re-setting an existing entry to the >>> + * same value. Otherwise, use the typical "set" helpers and flush the >>> + * TLB. >>> + */ >>> +#define set_p4d_safe(p4dp, p4d) \ >>> +({ \ >>> + WARN_ON_ONCE(p4d_present(*p4dp) && !p4d_same(*p4dp, p4d)); \ >>> + set_p4d(p4dp, p4d); \ >>> +}) >>> + >>> +#define set_pgd_safe(pgdp, pgd) \ >>> +({ \ >>> + WARN_ON_ONCE(pgd_present(*pgdp) && !pgd_same(*pgdp, pgd)); \ >>> + set_pgd(pgdp, pgd); \ >>> +}) >>> #endif /* !__ASSEMBLY__ */ >>> #endif /* _ASM_RISCV_PGTABLE_H */ >>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h >>> index e39311a89bf4..fefb52bb6b4d 100644 >>> --- a/arch/x86/include/asm/pgtable.h >>> +++ b/arch/x86/include/asm/pgtable.h >>> @@ -1701,6 +1701,43 @@ bool arch_is_platform_page(u64 paddr); >>> #define arch_is_platform_page arch_is_platform_page >>> #endif >>> +/* >>> + * Use set_p*_safe(), and elide TLB flushing, when confident that *no* >>> + * TLB flush will be required as a result of the "set". For example, use >>> + * in scenarios where it is known ahead of time that the routine is >>> + * setting non-present entries, or re-setting an existing entry to the >>> + * same value. Otherwise, use the typical "set" helpers and flush the >>> + * TLB. >>> + */ >>> +#define set_pte_safe(ptep, pte) \ >>> +({ \ >>> + WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \ >>> + set_pte(ptep, pte); \ >>> +}) >>> + >>> +#define set_pmd_safe(pmdp, pmd) \ >>> +({ \ >>> + WARN_ON_ONCE(pmd_present(*pmdp) && !pmd_same(*pmdp, pmd)); \ >>> + set_pmd(pmdp, pmd); \ >>> +}) >>> + >>> +#define set_pud_safe(pudp, pud) \ >>> +({ \ >>> + WARN_ON_ONCE(pud_present(*pudp) && !pud_same(*pudp, pud)); \ >>> + set_pud(pudp, pud); \ >>> +}) >>> + >>> +#define set_p4d_safe(p4dp, p4d) \ >>> +({ \ >>> + WARN_ON_ONCE(p4d_present(*p4dp) && !p4d_same(*p4dp, p4d)); \ >>> + set_p4d(p4dp, p4d); \ >>> +}) >>> + >>> +#define set_pgd_safe(pgdp, pgd) \ >>> +({ \ >>> + WARN_ON_ONCE(pgd_present(*pgdp) && !pgd_same(*pgdp, pgd)); \ >>> + set_pgd(pgdp, pgd); \ >>> +}) >>> #endif /* __ASSEMBLY__ */ >> >> I'm wondering if we can completely get rid of these, for example via: >> >> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c >> index d8dbeac8b206..bc71c25930bb 100644 >> --- a/arch/x86/mm/init_64.c >> +++ b/arch/x86/mm/init_64.c >> @@ -79,10 +79,8 @@ DEFINE_POPULATE(pmd_populate_kernel, pmd, pte, init) >> static inline void set_##type1##_init(type1##_t *arg1, \ >> type2##_t arg2, bool init) \ >> { \ >> - if (init) \ >> - set_##type1##_safe(arg1, arg2); \ >> - else \ >> - set_##type1(arg1, arg2); \ >> + WARN_ON_ONCE(init && ##type1##_present(*arg1) && !##type1##_same(*arg1, arg2)); \ >> + set_##type1(arg1, arg2); \ >> } >> >> >> We might be able to handle the pgd_populate etc part similarly, possibly getting >> rid of the pgd_populate_safe etc as well. >> >> Assuming I don't miss anything important :) > > Sounds feasible but will just leave that upto the x86 platform folks to > change later on, after this patch which just moves these helpers inside > the platform code. > >> >> Ideally, we get rid of the macros here and just use inline functions ... >> > > Sure, makes sense. Will change these as inline functions. > > -#define set_pte_safe(ptep, pte) \ > -({ \ > - WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \ > - set_pte(ptep, pte); \ > -}) > +static inline void set_pte_safe(pte_t *ptep, pte_t pte) > +{ > + WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); > + set_pte(ptep, pte); > +} > This has hit a road block in converting these macros as static inline functions as suggested earlier, because pmd/pud/p4d/pgd_same() macros are defined in generic header include/linux/pgtable.h, but way after <asm/pgtable.h> gets included. I guess then the current patch should be left as it is.