On 2024/6/24 9:56 AM, Huacai Chen wrote:
On Mon, Jun 24, 2024 at 9:37 AM maobibo <maobibo@xxxxxxxxxxx> wrote:
On 2024/6/23 6:18 PM, Huacai Chen wrote:
Hi, Bibo,
On Wed, Jun 19, 2024 at 4:09 PM Bibo Mao <maobibo@xxxxxxxxxxx> wrote:
When updating a pmd entry, such as when allocating a new pmd page or
splitting a huge page into normal pages, it is necessary to update all
pte entries first and then update the pmd entry.
LoongArch has a weakly ordered memory model, so there is a problem if
another vcpu sees the pmd update before the pte updates are visible.
smp_wmb() is added here to guarantee this ordering.
Memory barriers should be in pairs in most cases. That means you may
be missing an smp_rmb() somewhere else.
The idea of adding smp_wmb() comes from the function
__split_huge_pmd_locked() in mm/huge_memory.c, and the explanation
there is reasonable.
...
set_ptes(mm, haddr, pte, entry, HPAGE_PMD_NR);
}
...
smp_wmb(); /* make pte visible before pmd */
pmd_populate(mm, pmd, pgtable);
It is strange that smp_rmb() should have to be paired with smp_wmb();
I have never heard of this rule :-(
https://docs.kernel.org/core-api/wrappers/memory-barriers.html
SMP BARRIER PAIRING
-------------------
When dealing with CPU-CPU interactions, certain types of memory barrier should
always be paired. A lack of appropriate pairing is almost certainly an error.
        CPU 1                 CPU 2
        ===============       ===============
        WRITE_ONCE(a, 1);
        <write barrier>
        WRITE_ONCE(b, 2);     x = READ_ONCE(b);
                              <read barrier>
                              y = READ_ONCE(a);
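
The pairing quoted above can be tried in userspace. What follows is a minimal sketch and not kernel code: it assumes C11 fences are an acceptable stand-in for the kernel primitives (a release fence for smp_wmb(), an acquire fence for smp_rmb()), and the names cpu1(), cpu2() and run_pairing_once() are made up for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int a, b;
static int x, y;

/* CPU 1 in the quoted diagram: write a, write barrier, write b. */
static void *cpu1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&a, 1, memory_order_relaxed);	/* WRITE_ONCE(a, 1) */
	atomic_thread_fence(memory_order_release);		/* stand-in for smp_wmb() */
	atomic_store_explicit(&b, 2, memory_order_relaxed);	/* WRITE_ONCE(b, 2) */
	return NULL;
}

/* CPU 2 in the quoted diagram: read b, read barrier, read a. */
static void *cpu2(void *arg)
{
	(void)arg;
	x = atomic_load_explicit(&b, memory_order_relaxed);	/* x = READ_ONCE(b) */
	atomic_thread_fence(memory_order_acquire);		/* stand-in for smp_rmb() */
	y = atomic_load_explicit(&a, memory_order_relaxed);	/* y = READ_ONCE(a) */
	return NULL;
}

/*
 * Run both sides once. Returns 1 if the ordering held, that is, if
 * seeing the new value of b (x == 2) implied seeing the new value of
 * a (y == 1). Returns 0 if the forbidden outcome was observed.
 */
static int run_pairing_once(void)
{
	pthread_t t1, t2;

	atomic_store(&a, 0);
	atomic_store(&b, 0);
	x = 0;
	y = 0;

	pthread_create(&t1, NULL, cpu1, NULL);
	pthread_create(&t2, NULL, cpu2, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	return !(x == 2 && y != 1);
}
```

Calling run_pairing_once() in a loop should always return 1 while both fences are present. Dropping either fence removes the guarantee, although the bad outcome may still be hard to reproduce on a strongly ordered host.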
In the split-huge scenario that updates pte/pmd entries, there is no
strong relationship between the addresses of the ptes (pteX) and the pmd.
CPU1
WRITE_ONCE(pte0, 1);
WRITE_ONCE(pte511, 1);
<write barrier>
WRITE_ONCE(pmd, 2);
However, in the page table walk scenario, the address ptep depends on
the contents of pmd, so it is not necessary to add smp_rmb().
ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
if (!ptep)
return no_page_table(vma, flags, address);
pte = ptep_get(ptep);
if (!pte_present(pte))
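
The address-dependency argument can be sketched in userspace C11 as well. This is only an illustration under stated assumptions, not the KVM code: a release store stands in for smp_wmb() followed by kvm_set_pte(), a memory_order_consume load stands in for the dependent read of pmd in the walker (current compilers typically implement consume as acquire, whereas the kernel relies on READ_ONCE() and the hardware ordering of dependent loads), and pmd_entry, pte_table, PTE_VALID and run_walk_once() are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define NR_PTES		512
#define PTE_VALID	1UL	/* hypothetical "initialised" pte value */

static unsigned long pte_table[NR_PTES];	/* hypothetical child table */
static _Atomic(unsigned long *) pmd_entry;	/* NULL means "not populated" */

/* Writer: initialise every pte first, then publish the table via the pmd. */
static void *writer(void *arg)
{
	(void)arg;
	for (size_t i = 0; i < NR_PTES; i++)
		pte_table[i] = PTE_VALID;
	/* Release store: plays the role of smp_wmb() + kvm_set_pte(). */
	atomic_store_explicit(&pmd_entry, pte_table, memory_order_release);
	return NULL;
}

/*
 * Walker: the pte addresses are derived from the value loaded from the
 * pmd, so the pte reads carry an address dependency on that load.
 */
static void *walker(void *arg)
{
	unsigned long *table;
	int *bad = arg;

	do {
		table = atomic_load_explicit(&pmd_entry, memory_order_consume);
	} while (!table);

	for (size_t i = 0; i < NR_PTES; i++)
		if (table[i] != PTE_VALID)
			(*bad)++;	/* saw the pmd but a stale pte */
	return NULL;
}

/* Returns the number of stale ptes the walker observed (0 expected). */
static int run_walk_once(void)
{
	pthread_t w, r;
	int bad = 0;

	memset(pte_table, 0, sizeof(pte_table));
	atomic_store(&pmd_entry, (unsigned long *)NULL);

	pthread_create(&r, NULL, walker, &bad);
	pthread_create(&w, NULL, writer, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);

	return bad;
}
```

The walker issues no separate read barrier between loading the pmd and loading the ptes; the ordering comes from the dependent load itself, which is the point made above about the reader path.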
This is just my opinion. Or do you think an smp_rmb() barrier should be
added somewhere in the page table reader path?
Regards
Bibo Mao
Huacai
Regards
Bibo Mao
Huacai
Signed-off-by: Bibo Mao <maobibo@xxxxxxxxxxx>
---
arch/loongarch/kvm/mmu.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/loongarch/kvm/mmu.c b/arch/loongarch/kvm/mmu.c
index 1690828bd44b..7f04edfbe428 100644
--- a/arch/loongarch/kvm/mmu.c
+++ b/arch/loongarch/kvm/mmu.c
@@ -163,6 +163,7 @@ static kvm_pte_t *kvm_populate_gpa(struct kvm *kvm,
child = kvm_mmu_memory_cache_alloc(cache);
_kvm_pte_init(child, ctx.invalid_ptes[ctx.level - 1]);
+ smp_wmb(); /* make pte visible before pmd */
kvm_set_pte(entry, __pa(child));
} else if (kvm_pte_huge(*entry)) {
return entry;
@@ -746,6 +747,7 @@ static kvm_pte_t *kvm_split_huge(struct kvm_vcpu *vcpu, kvm_pte_t *ptep, gfn_t g
val += PAGE_SIZE;
}
+ smp_wmb();
/* The later kvm_flush_tlb_gpa() will flush hugepage tlb */
kvm_set_pte(ptep, __pa(child));
--
2.39.3