Hi,

On Mon, Jun 24, 2024 at 12:18 PM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
>
> On Mon, Jun 24, 2024 at 10:21 AM maobibo <maobibo@xxxxxxxxxxx> wrote:
> >
> >
> >
> > On 2024/6/24 9:56 AM, Huacai Chen wrote:
> > > On Mon, Jun 24, 2024 at 9:37 AM maobibo <maobibo@xxxxxxxxxxx> wrote:
> > >>
> > >>
> > >>
> > >> On 2024/6/23 6:18 PM, Huacai Chen wrote:
> > >>> Hi, Bibo,
> > >>>
> > >>> On Wed, Jun 19, 2024 at 4:09 PM Bibo Mao <maobibo@xxxxxxxxxxx> wrote:
> > >>>>
> > >>>> When updating a pmd entry, such as allocating a new pmd page or
> > >>>> splitting a huge page into normal pages, it is necessary to first
> > >>>> update all pte entries and then update the pmd entry.
> > >>>>
> > >>>> LoongArch systems are weakly ordered, so there will be a problem
> > >>>> if other vcpus see the pmd update before the pte updates. Add
> > >>>> smp_wmb() to ensure the ordering.
> > >>> Memory barriers should be in pairs in most cases. That means you may
> > >>> lose smp_rmb() in another place.
> > >> The idea of adding smp_wmb() comes from function __split_huge_pmd_locked()
> > >> in file mm/huge_memory.c, and the explanation there is reasonable.
> > >>
> > >>         ...
> > >>         set_ptes(mm, haddr, pte, entry, HPAGE_PMD_NR);
> > >>     }
> > >>     ...
> > >>     smp_wmb(); /* make pte visible before pmd */
> > >>     pmd_populate(mm, pmd, pgtable);
> > >>
> > >> It is strange that smp_rmb() should be paired with smp_wmb();
> > >> I have never heard of this rule :-(
> > > https://docs.kernel.org/core-api/wrappers/memory-barriers.html
> > >
> > > SMP BARRIER PAIRING
> > > -------------------
> > >
> > > When dealing with CPU-CPU interactions, certain types of memory barrier should
> > > always be paired. A lack of appropriate pairing is almost certainly an error.
> >
> >     CPU 1                 CPU 2
> >     ===============       ===============
> >     WRITE_ONCE(a, 1);
> >     <write barrier>
> >     WRITE_ONCE(b, 2);     x = READ_ONCE(b);
> >                           <read barrier>
> >                           y = READ_ONCE(a);
> >
> > In the split_huge scenario of updating pte/pmd entries, there is no
> > address dependency between the pte entries and the pmd:
> >
> >     CPU 1
> >     WRITE_ONCE(pte0, 1);
> >     WRITE_ONCE(pte511, 1);
> >     <write barrier>
> >     WRITE_ONCE(pmd, 2);
> >
> > However, in the page table walk scenario the address ptep depends on
> > the contents of the pmd, so it is not necessary to add smp_rmb():
> >
> >     ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
> >     if (!ptep)
> >         return no_page_table(vma, flags, address);
> >     pte = ptep_get(ptep);
> >     if (!pte_present(pte))
> >
> > This is just my opinion; or where do you think the smp_rmb() barrier
> > should be added in the page table reader path?
> There are some possibilities:
> 1. Read barrier is missing in some places;
> 2. Write barrier is also unnecessary here;
> 3. Read barrier is really unnecessary, but there is a better API to
> replace the write barrier;
> 4. Read barrier is really unnecessary, and write barrier is really the
> best API here.
>
> Maybe Rui Wang knows better here.

It appears that reading the pte is data-dependent on the pmd (the pte
address is computed from the pmd value just loaded), rather than
control-dependent. This creates an opportunity to omit the read-side
memory barrier.
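To illustrate, here is a rough sketch of the two sides (hypothetical
helper names, not the actual kvm_populate_gpa()/kvm_split_huge() code):

/* Writer side: initialize the pte table, then publish it. */
static void publish_pte_table(kvm_pte_t *pmdp, kvm_pte_t *child)
{
	/* ... all pte entries of 'child' initialized before this point ... */
	smp_wmb();                      /* order pte stores before the pmd store */
	kvm_set_pte(pmdp, __pa(child)); /* publish the new pte table */
}

/* Reader side: walk from the pmd down to a pte. */
static kvm_pte_t read_pte(kvm_pte_t *pmdp, unsigned long idx)
{
	kvm_pte_t pmd = READ_ONCE(*pmdp);
	/* The pte address is computed from the pmd value just loaded... */
	kvm_pte_t *ptep = (kvm_pte_t *)__va(pmd) + idx;

	/*
	 * ...so this load is address-dependent on the pmd load.
	 * READ_ONCE() preserves that dependency, which is enough to
	 * order the two loads on every architecture Linux supports,
	 * so no smp_rmb() is needed on this side.
	 */
	return READ_ONCE(*ptep);
}

This is the same publish/consume pattern as
rcu_assign_pointer()/rcu_dereference(): the write side needs a barrier
to order its stores, while the read side gets its ordering for free
from the address dependency.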
Cheers,
-Rui

>
> Huacai
>
> >
> > Regards
> > Bibo Mao
> >
> > >
> > > Huacai
> > >
> > >>
> > >> Regards
> > >> Bibo Mao
> > >>>
> > >>> Huacai
> > >>>
> > >>>>
> > >>>> Signed-off-by: Bibo Mao <maobibo@xxxxxxxxxxx>
> > >>>> ---
> > >>>>  arch/loongarch/kvm/mmu.c | 2 ++
> > >>>>  1 file changed, 2 insertions(+)
> > >>>>
> > >>>> diff --git a/arch/loongarch/kvm/mmu.c b/arch/loongarch/kvm/mmu.c
> > >>>> index 1690828bd44b..7f04edfbe428 100644
> > >>>> --- a/arch/loongarch/kvm/mmu.c
> > >>>> +++ b/arch/loongarch/kvm/mmu.c
> > >>>> @@ -163,6 +163,7 @@ static kvm_pte_t *kvm_populate_gpa(struct kvm *kvm,
> > >>>>
> > >>>>                 child = kvm_mmu_memory_cache_alloc(cache);
> > >>>>                 _kvm_pte_init(child, ctx.invalid_ptes[ctx.level - 1]);
> > >>>> +               smp_wmb(); /* make pte visible before pmd */
> > >>>>                 kvm_set_pte(entry, __pa(child));
> > >>>>         } else if (kvm_pte_huge(*entry)) {
> > >>>>                 return entry;
> > >>>> @@ -746,6 +747,7 @@ static kvm_pte_t *kvm_split_huge(struct kvm_vcpu *vcpu, kvm_pte_t *ptep, gfn_t g
> > >>>>                 val += PAGE_SIZE;
> > >>>>         }
> > >>>>
> > >>>> +       smp_wmb();
> > >>>>         /* The later kvm_flush_tlb_gpa() will flush hugepage tlb */
> > >>>>         kvm_set_pte(ptep, __pa(child));
> > >>>>
> > >>>> --
> > >>>> 2.39.3
> > >>>>