Re: [PATCH v2 1/3] LoongArch: Set initial pte entry with PAGE_GLOBAL for kernel space

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 21, 2024 at 9:23 AM maobibo <maobibo@xxxxxxxxxxx> wrote:
>
>
>
> On 2024/10/18 下午2:32, Huacai Chen wrote:
> > On Fri, Oct 18, 2024 at 2:23 PM maobibo <maobibo@xxxxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 2024/10/18 下午12:23, Huacai Chen wrote:
> >>> On Fri, Oct 18, 2024 at 12:16 PM maobibo <maobibo@xxxxxxxxxxx> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 2024/10/18 下午12:11, Huacai Chen wrote:
> >>>>> On Fri, Oct 18, 2024 at 11:44 AM maobibo <maobibo@xxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 2024/10/18 上午11:14, Huacai Chen wrote:
> >>>>>>> Hi, Bibo,
> >>>>>>>
> >>>>>>> I applied this patch but drop the part of arch/loongarch/mm/kasan_init.c:
> >>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/commit/?h=loongarch-next&id=15832255e84494853f543b4c70ced50afc403067
> >>>>>>>
> >>>>>>> Because kernel_pte_init() should operate on page-table pages, not on
> >>>>>>> data pages. You have already handle page-table page in
> >>>>>>> mm/kasan/init.c, and if we don't drop the modification on data pages
> >>>>>>> in arch/loongarch/mm/kasan_init.c, the kernel fail to boot if KASAN is
> >>>>>>> enabled.
> >>>>>>>
> >>>>>> static inline void set_pte(pte_t *ptep, pte_t pteval)
> >>>>>>      {
> >>>>>>            WRITE_ONCE(*ptep, pteval);
> >>>>>> -
> >>>>>> -       if (pte_val(pteval) & _PAGE_GLOBAL) {
> >>>>>> -               pte_t *buddy = ptep_buddy(ptep);
> >>>>>> -               /*
> >>>>>> -                * Make sure the buddy is global too (if it's !none,
> >>>>>> -                * it better already be global)
> >>>>>> -                */
> >>>>>> -               if (pte_none(ptep_get(buddy))) {
> >>>>>> -#ifdef CONFIG_SMP
> >>>>>> -                       /*
> >>>>>> -                        * For SMP, multiple CPUs can race, so we need
> >>>>>> -                        * to do this atomically.
> >>>>>> -                        */
> >>>>>> -                       __asm__ __volatile__(
> >>>>>> -                       __AMOR "$zero, %[global], %[buddy] \n"
> >>>>>> -                       : [buddy] "+ZB" (buddy->pte)
> >>>>>> -                       : [global] "r" (_PAGE_GLOBAL)
> >>>>>> -                       : "memory");
> >>>>>> -
> >>>>>> -                       DBAR(0b11000); /* o_wrw = 0b11000 */
> >>>>>> -#else /* !CONFIG_SMP */
> >>>>>> -                       WRITE_ONCE(*buddy, __pte(pte_val(ptep_get(buddy)) | _PAGE_GLOBAL));
> >>>>>> -#endif /* CONFIG_SMP */
> >>>>>> -               }
> >>>>>> -       }
> >>>>>> +       DBAR(0b11000); /* o_wrw = 0b11000 */
> >>>>>>      }
> >>>>>>
> >>>>>> No, please hold on. This issue exists about twenty years, Do we need be
> >>>>>> in such a hurry now?
> >>>>>>
> >>>>>> why is DBAR(0b11000) added in set_pte()?
> >>>>> It exists before, not added by this patch. The reason is explained in
> >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.12-rc3&id=f93f67d06b1023313ef1662eac490e29c025c030
> >>>> why speculative accesses may cause spurious page fault in kernel space
> >>>> with PTE enabled?  speculative accesses exists anywhere, it does not
> >>>> cause spurious page fault.
> >>> Confirmed by Ruiyang Wu, and even if DBAR(0b11000) is wrong, that
> >>> means another patch's mistake, not this one. This one just keeps the
> >>> old behavior.
> >>> +CC Ruiyang Wu here.
> >> Also from Ruiyang Wu, the information is that speculative accesses may
> >> insert stale TLB, however no page fault exception.
> >>
> >> So adding barrier in set_pte() does not prevent speculative accesses.
> >> And you write patch here, however do not know the actual reason?
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.12-rc3&id=f93f67d06b1023313ef1662eac490e29c025c030
> > I have CCed Ruiyang, whether the description is correct can be judged by him.
>
> There are some problems to add barrier() in set_pte():
>
> 1. There is such issue only for HW ptw enabled and kernel address space,
> is that? Also it may be two heavy to add barrier in set_pte(), comparing
> to do this in flush_cache_vmap().
So adding a barrier in set_pte() may not be the best solution for
performance, but you cannot say it is a wrong solution. And yes, we
can only care the kernel space, which is also the old behavior before
this patch, so set_pte() should be:

static inline void set_pte(pte_t *ptep, pte_t pteval)
{
        WRITE_ONCE(*ptep, pteval);
#ifdef CONFIG_SMP
        if (pte_val(pteval) & _PAGE_GLOBAL)
                DBAR(0b11000); /* o_wrw = 0b11000 */
#endif
}

Putting a dbar unconditionally in set_pte() is my mistake, I'm sorry for  that.

>
> 2. LoongArch is different with other other architectures, two pages are
> included in one TLB entry. If there is two consecutive page mapped and
> memory access, there will page fault for the second memory access. Such
> as:
>     addr1 =percpu_alloc(pagesize);
>     val1 = *(int *)addr1;
>       // With page table walk, addr1 is present and addr2 is pte_none
>       // TLB entry includes valid pte for addr1, invalid pte for addr2
>     addr2 =percpu_alloc(pagesize); // will not flush tlb in first time
>     val2 = *(int *)addr2;
>       // With page table walk, addr1 is present and addr2 is present also
>       // TLB entry includes valid pte for addr1, invalid pte for addr2
>     So there will be page fault when accessing address addr2
>
> There there is the same problem with user address space. By the way,
> there is HW prefetching technology, negative effective of HW prefetching
> technology will be tlb added. So there is potential page fault if memory
> is allocated and accessed in the first time.
As discussed internally, there may be three problems related to
speculative access in detail: 1) a load/store after set_pte() is
prioritized before, which can be prevented by dbar, 2) a instruction
fetch after set_pte() is prioritized before, which can be prevented by
ibar, 3) the buddy tlb problem you described here, if I understand
Ruiyang's explanation correctly this can only be prevented by the
filter in do_page_fault().


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux