Re: [PATCH] riscv: mm: Fixup spurious fault of kernel vaddr

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 21, 2023 at 4:01 PM Alexandre Ghiti <alex@xxxxxxxx> wrote:
>
>
> On 21/07/2023 18:08, Guo Ren wrote:
> > On Fri, Jul 21, 2023 at 11:19 PM Alexandre Ghiti <alex@xxxxxxxx> wrote:
> >>
> >> On 21/07/2023 16:51, guoren@xxxxxxxxxx wrote:
> >>> From: Guo Ren <guoren@xxxxxxxxxxxxxxxxx>
> >>>
> >>> RISC-V specification permits the caching of PTEs whose V (Valid)
> >>> bit is clear. Operating systems must be written to cope with this
> >>> possibility, but implementers are reminded that eagerly caching
> >>> invalid PTEs will reduce performance by causing additional page
> >>> faults.
> >>>
> >>> So we must keep vmalloc_fault for the spurious page faults of kernel
> >>> virtual address from an OoO machine.
> >>>
> >>> Signed-off-by: Guo Ren <guoren@xxxxxxxxxxxxxxxxx>
> >>> Signed-off-by: Guo Ren <guoren@xxxxxxxxxx>
> >>> ---
> >>>    arch/riscv/mm/fault.c | 3 +--
> >>>    1 file changed, 1 insertion(+), 2 deletions(-)
> >>>
> >>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> >>> index 85165fe438d8..f662c9eae7d4 100644
> >>> --- a/arch/riscv/mm/fault.c
> >>> +++ b/arch/riscv/mm/fault.c
> >>> @@ -258,8 +258,7 @@ void handle_page_fault(struct pt_regs *regs)
> >>>         * only copy the information from the master page table,
> >>>         * nothing more.
> >>>         */
> >>> -     if ((!IS_ENABLED(CONFIG_MMU) || !IS_ENABLED(CONFIG_64BIT)) &&
> >>> -         unlikely(addr >= VMALLOC_START && addr < VMALLOC_END)) {
> >>> +     if (unlikely(addr >= TASK_SIZE)) {
> >>>                vmalloc_fault(regs, code, addr);
> >>>                return;
> >>>        }
> >>
> >> Can you share what you are trying to fix here?
> > We met a spurious page fault panic on an OoO machine.
> >
> > 1. The processor speculative execution brings the V=0 entries into the
> > TLB in the kernel virtual address.
> > 2. Linux kernel installs the kernel virtual address with the page, and V=1
> > 3. When kernel code access the kernel virtual address, it would raise
> > a page fault as the V=0 entry in the tlb.
> > 4. No vmalloc_fault, then panic.
> >
> >> I have a fix (that's currently running our CI) for commit 7d3332be011e
> >> ("riscv: mm: Pre-allocate PGD entries for vmalloc/modules area") that
> >> implements flush_cache_vmap() since we lost the vmalloc_fault.
> > Could you share that patch?
>
>
> Here we go:
>
>
> Author: Alexandre Ghiti <alexghiti@xxxxxxxxxxxx>
> Date:   Fri Jul 21 08:43:44 2023 +0000
>
>      riscv: Implement flush_cache_vmap()
>
>      The RISC-V kernel needs a sfence.vma after a page table
> modification: we
>      used to rely on the vmalloc fault handling to emit an sfence.vma, but
>      commit 7d3332be011e ("riscv: mm: Pre-allocate PGD entries for
>      vmalloc/modules area") got rid of this path for 64-bit kernels, so
> now we
>      need to explicitly emit a sfence.vma in flush_cache_vmap().
>
>      Note that we don't need to implement flush_cache_vunmap() as the
> generic
>      code should emit a flush tlb after unmapping a vmalloc region.
>
>      Fixes: 7d3332be011e ("riscv: mm: Pre-allocate PGD entries for
> vmalloc/modules area")
>      Signed-off-by: Alexandre Ghiti <alexghiti@xxxxxxxxxxxx>
>
> diff --git a/arch/riscv/include/asm/cacheflush.h
> b/arch/riscv/include/asm/cacheflush.h
> index 8091b8bf4883..b93ffddf8a61 100644
> --- a/arch/riscv/include/asm/cacheflush.h
> +++ b/arch/riscv/include/asm/cacheflush.h
> @@ -37,6 +37,10 @@ static inline void flush_dcache_page(struct page *page)
>   #define flush_icache_user_page(vma, pg, addr, len) \
>          flush_icache_mm(vma->vm_mm, 0)
>
> +#ifdef CONFIG_64BIT
> +#define flush_cache_vmap(start, end) flush_tlb_kernel_range(start, end)
> +#endif
I don't want that, and flush_tlb_kernel_range is flush_tlb_all. In
addition, it would call IPI, which is a performance killer.

What's the problem of spurious fault replay? It only costs a
local_tlb_flush with vaddr.

> +
>   #ifndef CONFIG_SMP
>
>   #define flush_icache_all() local_flush_icache_all()
>
>
> Let me know if that works for you!
>
>
> >
> >

--
Best Regards
 Guo Ren




[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux