On Fri, Oct 18, 2024 at 2:23 PM maobibo <maobibo@xxxxxxxxxxx> wrote: > > > > On 2024/10/18 下午12:23, Huacai Chen wrote: > > On Fri, Oct 18, 2024 at 12:16 PM maobibo <maobibo@xxxxxxxxxxx> wrote: > >> > >> > >> > >> On 2024/10/18 下午12:11, Huacai Chen wrote: > >>> On Fri, Oct 18, 2024 at 11:44 AM maobibo <maobibo@xxxxxxxxxxx> wrote: > >>>> > >>>> > >>>> > >>>> On 2024/10/18 上午11:14, Huacai Chen wrote: > >>>>> Hi, Bibo, > >>>>> > >>>>> I applied this patch but drop the part of arch/loongarch/mm/kasan_init.c: > >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/commit/?h=loongarch-next&id=15832255e84494853f543b4c70ced50afc403067 > >>>>> > >>>>> Because kernel_pte_init() should operate on page-table pages, not on > >>>>> data pages. You have already handle page-table page in > >>>>> mm/kasan/init.c, and if we don't drop the modification on data pages > >>>>> in arch/loongarch/mm/kasan_init.c, the kernel fail to boot if KASAN is > >>>>> enabled. > >>>>> > >>>> static inline void set_pte(pte_t *ptep, pte_t pteval) > >>>> { > >>>> WRITE_ONCE(*ptep, pteval); > >>>> - > >>>> - if (pte_val(pteval) & _PAGE_GLOBAL) { > >>>> - pte_t *buddy = ptep_buddy(ptep); > >>>> - /* > >>>> - * Make sure the buddy is global too (if it's !none, > >>>> - * it better already be global) > >>>> - */ > >>>> - if (pte_none(ptep_get(buddy))) { > >>>> -#ifdef CONFIG_SMP > >>>> - /* > >>>> - * For SMP, multiple CPUs can race, so we need > >>>> - * to do this atomically. > >>>> - */ > >>>> - __asm__ __volatile__( > >>>> - __AMOR "$zero, %[global], %[buddy] \n" > >>>> - : [buddy] "+ZB" (buddy->pte) > >>>> - : [global] "r" (_PAGE_GLOBAL) > >>>> - : "memory"); > >>>> - > >>>> - DBAR(0b11000); /* o_wrw = 0b11000 */ > >>>> -#else /* !CONFIG_SMP */ > >>>> - WRITE_ONCE(*buddy, __pte(pte_val(ptep_get(buddy)) | _PAGE_GLOBAL)); > >>>> -#endif /* CONFIG_SMP */ > >>>> - } > >>>> - } > >>>> + DBAR(0b11000); /* o_wrw = 0b11000 */ > >>>> } > >>>> > >>>> No, please hold on. This issue exists about twenty years, Do we need be > >>>> in such a hurry now? > >>>> > >>>> why is DBAR(0b11000) added in set_pte()? > >>> It exists before, not added by this patch. The reason is explained in > >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.12-rc3&id=f93f67d06b1023313ef1662eac490e29c025c030 > >> why speculative accesses may cause spurious page fault in kernel space > >> with PTE enabled? speculative accesses exists anywhere, it does not > >> cause spurious page fault. > > Confirmed by Ruiyang Wu, and even if DBAR(0b11000) is wrong, that > > means another patch's mistake, not this one. This one just keeps the > > old behavior. > > +CC Ruiyang Wu here. > Also from Ruiyang Wu, the information is that speculative accesses may > insert stale TLB, however no page fault exception. > > So adding barrier in set_pte() does not prevent speculative accesses. > And you write patch here, however do not know the actual reason? > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.12-rc3&id=f93f67d06b1023313ef1662eac490e29c025c030 I have CCed Ruiyang, whether the description is correct can be judged by him. Huacai > > Bibo Mao > > > > Huacai > > > >> > >> Obvious you do not it and you write wrong patch. > >> > >>> > >>> Huacai > >>> > >>>> > >>>> Regards > >>>> Bibo Mao > >>>>> Huacai > >>>>> > >>>>> On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao <maobibo@xxxxxxxxxxx> wrote: > >>>>>> > >>>>>> Unlike general architectures, there are two pages in one TLB entry > >>>>>> on LoongArch system. For kernel space, it requires both two pte > >>>>>> entries with PAGE_GLOBAL bit set, else HW treats it as non-global > >>>>>> tlb, there will be potential problems if tlb entry for kernel space > >>>>>> is not global. Such as fail to flush kernel tlb with function > >>>>>> local_flush_tlb_kernel_range() which only flush tlb with global bit. > >>>>>> > >>>>>> With function kernel_pte_init() added, it can be used to init pte > >>>>>> table when it is created for kernel address space, and the default > >>>>>> initial pte value is PAGE_GLOBAL rather than zero at beginning. > >>>>>> > >>>>>> Kernel address space areas includes fixmap, percpu, vmalloc, kasan > >>>>>> and vmemmap areas set default pte entry with PAGE_GLOBAL set. > >>>>>> > >>>>>> Signed-off-by: Bibo Mao <maobibo@xxxxxxxxxxx> > >>>>>> --- > >>>>>> arch/loongarch/include/asm/pgalloc.h | 13 +++++++++++++ > >>>>>> arch/loongarch/include/asm/pgtable.h | 1 + > >>>>>> arch/loongarch/mm/init.c | 4 +++- > >>>>>> arch/loongarch/mm/kasan_init.c | 4 +++- > >>>>>> arch/loongarch/mm/pgtable.c | 22 ++++++++++++++++++++++ > >>>>>> include/linux/mm.h | 1 + > >>>>>> mm/kasan/init.c | 8 +++++++- > >>>>>> mm/sparse-vmemmap.c | 5 +++++ > >>>>>> 8 files changed, 55 insertions(+), 3 deletions(-) > >>>>>> > >>>>>> diff --git a/arch/loongarch/include/asm/pgalloc.h b/arch/loongarch/include/asm/pgalloc.h > >>>>>> index 4e2d6b7ca2ee..b2698c03dc2c 100644 > >>>>>> --- a/arch/loongarch/include/asm/pgalloc.h > >>>>>> +++ b/arch/loongarch/include/asm/pgalloc.h > >>>>>> @@ -10,8 +10,21 @@ > >>>>>> > >>>>>> #define __HAVE_ARCH_PMD_ALLOC_ONE > >>>>>> #define __HAVE_ARCH_PUD_ALLOC_ONE > >>>>>> +#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL > >>>>>> #include <asm-generic/pgalloc.h> > >>>>>> > >>>>>> +static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm) > >>>>>> +{ > >>>>>> + pte_t *pte; > >>>>>> + > >>>>>> + pte = (pte_t *) __get_free_page(GFP_KERNEL); > >>>>>> + if (!pte) > >>>>>> + return NULL; > >>>>>> + > >>>>>> + kernel_pte_init(pte); > >>>>>> + return pte; > >>>>>> +} > >>>>>> + > >>>>>> static inline void pmd_populate_kernel(struct mm_struct *mm, > >>>>>> pmd_t *pmd, pte_t *pte) > >>>>>> { > >>>>>> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h > >>>>>> index 9965f52ef65b..22e3a8f96213 100644 > >>>>>> --- a/arch/loongarch/include/asm/pgtable.h > >>>>>> +++ b/arch/loongarch/include/asm/pgtable.h > >>>>>> @@ -269,6 +269,7 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pm > >>>>>> extern void pgd_init(void *addr); > >>>>>> extern void pud_init(void *addr); > >>>>>> extern void pmd_init(void *addr); > >>>>>> +extern void kernel_pte_init(void *addr); > >>>>>> > >>>>>> /* > >>>>>> * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that > >>>>>> diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c > >>>>>> index 8a87a482c8f4..9f26e933a8a3 100644 > >>>>>> --- a/arch/loongarch/mm/init.c > >>>>>> +++ b/arch/loongarch/mm/init.c > >>>>>> @@ -198,9 +198,11 @@ pte_t * __init populate_kernel_pte(unsigned long addr) > >>>>>> if (!pmd_present(pmdp_get(pmd))) { > >>>>>> pte_t *pte; > >>>>>> > >>>>>> - pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE); > >>>>>> + pte = memblock_alloc_raw(PAGE_SIZE, PAGE_SIZE); > >>>>>> if (!pte) > >>>>>> panic("%s: Failed to allocate memory\n", __func__); > >>>>>> + > >>>>>> + kernel_pte_init(pte); > >>>>>> pmd_populate_kernel(&init_mm, pmd, pte); > >>>>>> } > >>>>>> > >>>>>> diff --git a/arch/loongarch/mm/kasan_init.c b/arch/loongarch/mm/kasan_init.c > >>>>>> index 427d6b1aec09..34988573b0d5 100644 > >>>>>> --- a/arch/loongarch/mm/kasan_init.c > >>>>>> +++ b/arch/loongarch/mm/kasan_init.c > >>>>>> @@ -152,6 +152,8 @@ static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr, > >>>>>> phys_addr_t page_phys = early ? > >>>>>> __pa_symbol(kasan_early_shadow_page) > >>>>>> : kasan_alloc_zeroed_page(node); > >>>>>> + if (!early) > >>>>>> + kernel_pte_init(__va(page_phys)); > >>>>>> next = addr + PAGE_SIZE; > >>>>>> set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL)); > >>>>>> } while (ptep++, addr = next, addr != end && __pte_none(early, ptep_get(ptep))); > >>>>>> @@ -287,7 +289,7 @@ void __init kasan_init(void) > >>>>>> set_pte(&kasan_early_shadow_pte[i], > >>>>>> pfn_pte(__phys_to_pfn(__pa_symbol(kasan_early_shadow_page)), PAGE_KERNEL_RO)); > >>>>>> > >>>>>> - memset(kasan_early_shadow_page, 0, PAGE_SIZE); > >>>>>> + kernel_pte_init(kasan_early_shadow_page); > >>>>>> csr_write64(__pa_symbol(swapper_pg_dir), LOONGARCH_CSR_PGDH); > >>>>>> local_flush_tlb_all(); > >>>>>> > >>>>>> diff --git a/arch/loongarch/mm/pgtable.c b/arch/loongarch/mm/pgtable.c > >>>>>> index eb6a29b491a7..228ffc1db0a3 100644 > >>>>>> --- a/arch/loongarch/mm/pgtable.c > >>>>>> +++ b/arch/loongarch/mm/pgtable.c > >>>>>> @@ -38,6 +38,28 @@ pgd_t *pgd_alloc(struct mm_struct *mm) > >>>>>> } > >>>>>> EXPORT_SYMBOL_GPL(pgd_alloc); > >>>>>> > >>>>>> +void kernel_pte_init(void *addr) > >>>>>> +{ > >>>>>> + unsigned long *p, *end; > >>>>>> + unsigned long entry; > >>>>>> + > >>>>>> + entry = (unsigned long)_PAGE_GLOBAL; > >>>>>> + p = (unsigned long *)addr; > >>>>>> + end = p + PTRS_PER_PTE; > >>>>>> + > >>>>>> + do { > >>>>>> + p[0] = entry; > >>>>>> + p[1] = entry; > >>>>>> + p[2] = entry; > >>>>>> + p[3] = entry; > >>>>>> + p[4] = entry; > >>>>>> + p += 8; > >>>>>> + p[-3] = entry; > >>>>>> + p[-2] = entry; > >>>>>> + p[-1] = entry; > >>>>>> + } while (p != end); > >>>>>> +} > >>>>>> + > >>>>>> void pgd_init(void *addr) > >>>>>> { > >>>>>> unsigned long *p, *end; > >>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h > >>>>>> index ecf63d2b0582..6909fe059a2c 100644 > >>>>>> --- a/include/linux/mm.h > >>>>>> +++ b/include/linux/mm.h > >>>>>> @@ -3818,6 +3818,7 @@ void *sparse_buffer_alloc(unsigned long size); > >>>>>> struct page * __populate_section_memmap(unsigned long pfn, > >>>>>> unsigned long nr_pages, int nid, struct vmem_altmap *altmap, > >>>>>> struct dev_pagemap *pgmap); > >>>>>> +void kernel_pte_init(void *addr); > >>>>>> void pmd_init(void *addr); > >>>>>> void pud_init(void *addr); > >>>>>> pgd_t *vmemmap_pgd_populate(unsigned long addr, int node); > >>>>>> diff --git a/mm/kasan/init.c b/mm/kasan/init.c > >>>>>> index 89895f38f722..ac607c306292 100644 > >>>>>> --- a/mm/kasan/init.c > >>>>>> +++ b/mm/kasan/init.c > >>>>>> @@ -106,6 +106,10 @@ static void __ref zero_pte_populate(pmd_t *pmd, unsigned long addr, > >>>>>> } > >>>>>> } > >>>>>> > >>>>>> +void __weak __meminit kernel_pte_init(void *addr) > >>>>>> +{ > >>>>>> +} > >>>>>> + > >>>>>> static int __ref zero_pmd_populate(pud_t *pud, unsigned long addr, > >>>>>> unsigned long end) > >>>>>> { > >>>>>> @@ -126,8 +130,10 @@ static int __ref zero_pmd_populate(pud_t *pud, unsigned long addr, > >>>>>> > >>>>>> if (slab_is_available()) > >>>>>> p = pte_alloc_one_kernel(&init_mm); > >>>>>> - else > >>>>>> + else { > >>>>>> p = early_alloc(PAGE_SIZE, NUMA_NO_NODE); > >>>>>> + kernel_pte_init(p); > >>>>>> + } > >>>>>> if (!p) > >>>>>> return -ENOMEM; > >>>>>> > >>>>>> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c > >>>>>> index edcc7a6b0f6f..c0388b2e959d 100644 > >>>>>> --- a/mm/sparse-vmemmap.c > >>>>>> +++ b/mm/sparse-vmemmap.c > >>>>>> @@ -184,6 +184,10 @@ static void * __meminit vmemmap_alloc_block_zero(unsigned long size, int node) > >>>>>> return p; > >>>>>> } > >>>>>> > >>>>>> +void __weak __meminit kernel_pte_init(void *addr) > >>>>>> +{ > >>>>>> +} > >>>>>> + > >>>>>> pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node) > >>>>>> { > >>>>>> pmd_t *pmd = pmd_offset(pud, addr); > >>>>>> @@ -191,6 +195,7 @@ pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node) > >>>>>> void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node); > >>>>>> if (!p) > >>>>>> return NULL; > >>>>>> + kernel_pte_init(p); > >>>>>> pmd_populate_kernel(&init_mm, pmd, p); > >>>>>> } > >>>>>> return pmd; > >>>>>> -- > >>>>>> 2.39.3 > >>>>>> > >>>> > >>>> > >> > >> > >