Re: [PATCH v2 4/6] LoongArch: KVM: Add memory barrier before update pmd entry

Hi,

On Mon, Jun 24, 2024 at 12:18 PM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
>
> On Mon, Jun 24, 2024 at 10:21 AM maobibo <maobibo@xxxxxxxxxxx> wrote:
> >
> >
> >
> > > On 2024/6/24 9:56 AM, Huacai Chen wrote:
> > > On Mon, Jun 24, 2024 at 9:37 AM maobibo <maobibo@xxxxxxxxxxx> wrote:
> > >>
> > >>
> > >>
> > >> On 2024/6/23 6:18 PM, Huacai Chen wrote:
> > >>> Hi, Bibo,
> > >>>
> > >>> On Wed, Jun 19, 2024 at 4:09 PM Bibo Mao <maobibo@xxxxxxxxxxx> wrote:
> > >>>>
> > >>>> When updating a pmd entry, such as allocating a new pmd page or
> > >>>> splitting a huge page into normal pages, it is necessary to update
> > >>>> all pte entries first, and then update the pmd entry.
> > >>>>
> > >>>> LoongArch systems are weakly ordered, so there will be a problem if
> > >>>> other vCPUs see the pmd update before the pte updates. Here smp_wmb()
> > >>>> is added to ensure this ordering.
> > >>> Memory barriers should be used in pairs in most cases. That means you
> > >>> may be missing an smp_rmb() in another place.
> > >> The idea of adding smp_wmb() comes from the function
> > >> __split_huge_pmd_locked() in mm/huge_memory.c, and the explanation
> > >> there is reasonable.
> > >>
> > >>                   ...
> > >>                   set_ptes(mm, haddr, pte, entry, HPAGE_PMD_NR);
> > >>           }
> > >>           ...
> > >>           smp_wmb(); /* make pte visible before pmd */
> > >>           pmd_populate(mm, pmd, pgtable);
> > >>
> > >> It is strange to me why smp_rmb() should be paired with smp_wmb();
> > >> I have never heard of this rule -:(
> > > https://docs.kernel.org/core-api/wrappers/memory-barriers.html
> > >
> > > SMP BARRIER PAIRING
> > > -------------------
> > >
> > > When dealing with CPU-CPU interactions, certain types of memory barrier should
> > > always be paired.  A lack of appropriate pairing is almost certainly an error.
> >          CPU 1                 CPU 2
> >          ===============       ===============
> >          WRITE_ONCE(a, 1);
> >          <write barrier>
> >          WRITE_ONCE(b, 2);     x = READ_ONCE(b);
> >                                <read barrier>
> >                                y = READ_ONCE(a);
> >
> > In the split_huge scenario of updating pte/pmd entries, there is no
> > address dependency between the pteX entries and the pmd on the writer
> > side:
> > CPU1
> >       WRITE_ONCE(pte0, 1);
> >       WRITE_ONCE(pte511, 1);
> >       <write barrier>
> >       WRITE_ONCE(pmd, 2);
> >
> > However, in the page table walk scenario, the address ptep depends on
> > the contents of the pmd, so it is not necessary to add smp_rmb():
> >          ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
> >          if (!ptep)
> >                  return no_page_table(vma, flags, address);
> >          pte = ptep_get(ptep);
> >          if (!pte_present(pte))
> >
> > This is just my opinion; or do you think an smp_rmb() barrier should
> > be added somewhere in the page table reader path?
> There are several possibilities:
> 1. A read barrier is missing in some places;
> 2. The write barrier is also unnecessary here;
> 3. The read barrier is really unnecessary, but there is a better API to
> replace the write barrier;
> 4. The read barrier is really unnecessary, and the write barrier is
> really the best API here.
>
> Maybe Rui Wang knows better here.

It appears that reading the pte address is data-dependent on the pmd,
rather than control-dependent. This creates an opportunity to omit the
read-side memory barrier.
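
A minimal sketch of the pairing as I understand it (purely illustrative;
make_pte() and pte_offset() below are stand-in names, not the real
kernel helpers):

        /* Writer (e.g. kvm_split_huge): publish ptes before the pmd. */
        for (i = 0; i < PTRS_PER_PTE; i++)
                WRITE_ONCE(pte[i], make_pte(i)); /* fill the new pte table */
        smp_wmb();                    /* order pte stores before pmd store */
        WRITE_ONCE(*pmdp, __pa(pte)); /* publish the new pte table */

        /* Reader (page table walk): no explicit smp_rmb() needed. */
        pmdval = READ_ONCE(*pmdp);        /* load the pmd entry */
        ptep = pte_offset(pmdval, addr);  /* ptep computed from pmdval */
        pteval = READ_ONCE(*ptep);        /* address dependency orders this
                                             load after the pmd load */

Since the address of the reader's second load is computed from the value
of the first, the CPU cannot reorder the two loads, and READ_ONCE()
also covers the dependency barrier needed on Alpha. So the writer's
smp_wmb() pairs with the reader's address dependency rather than with
an smp_rmb().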

Cheers,
-Rui


>
> Huacai
>
> >
> > Regards
> > Bibo Mao
> > >
> > >
> > > Huacai
> > >
> > >>
> > >> Regards
> > >> Bibo Mao
> > >>>
> > >>> Huacai
> > >>>
> > >>>>
> > >>>> Signed-off-by: Bibo Mao <maobibo@xxxxxxxxxxx>
> > >>>> ---
> > >>>>    arch/loongarch/kvm/mmu.c | 2 ++
> > >>>>    1 file changed, 2 insertions(+)
> > >>>>
> > >>>> diff --git a/arch/loongarch/kvm/mmu.c b/arch/loongarch/kvm/mmu.c
> > >>>> index 1690828bd44b..7f04edfbe428 100644
> > >>>> --- a/arch/loongarch/kvm/mmu.c
> > >>>> +++ b/arch/loongarch/kvm/mmu.c
> > >>>> @@ -163,6 +163,7 @@ static kvm_pte_t *kvm_populate_gpa(struct kvm *kvm,
> > >>>>
> > >>>>                           child = kvm_mmu_memory_cache_alloc(cache);
> > >>>>                           _kvm_pte_init(child, ctx.invalid_ptes[ctx.level - 1]);
> > >>>> +                       smp_wmb(); /* make pte visible before pmd */
> > >>>>                           kvm_set_pte(entry, __pa(child));
> > >>>>                   } else if (kvm_pte_huge(*entry)) {
> > >>>>                           return entry;
> > >>>> @@ -746,6 +747,7 @@ static kvm_pte_t *kvm_split_huge(struct kvm_vcpu *vcpu, kvm_pte_t *ptep, gfn_t g
> > >>>>                   val += PAGE_SIZE;
> > >>>>           }
> > >>>>
> > >>>> +       smp_wmb();
> > >>>>           /* The later kvm_flush_tlb_gpa() will flush hugepage tlb */
> > >>>>           kvm_set_pte(ptep, __pa(child));
> > >>>>
> > >>>> --
> > >>>> 2.39.3
> > >>>>
> > >>
> > >>
> >
> >
>
