On Tue, 2023-05-16 at 20:45 +0800, Huacai Chen wrote:
> Traditionally, LoongArch uses "dbar 0" (full completion barrier) for
> everything. But the full completion barrier is a performance killer, so
> Loongson-3A6000 and newer processors introduce different hints:
>
> Bit4: ordering or completion (0: completion, 1: ordering)
> Bit3: barrier for previous read (0: true, 1: false)
> Bit2: barrier for previous write (0: true, 1: false)
> Bit1: barrier for succeeding read (0: true, 1: false)
> Bit0: barrier for succeeding write (0: true, 1: false)
>
> Hint 0x700: barrier for "read after read" from the same address, which
> is needed by LL-SC loops.

Great!  I guess Xuerui will add this to his weekly news :).

I don't really understand these C++-memory-model-like concepts, so I'll
not review the "orwrw" parts, but...

/* snip */

> diff --git a/arch/loongarch/mm/tlbex.S b/arch/loongarch/mm/tlbex.S
> index 244e2f5aeee5..240ced55586e 100644
> --- a/arch/loongarch/mm/tlbex.S
> +++ b/arch/loongarch/mm/tlbex.S
> @@ -184,7 +184,7 @@ tlb_huge_update_load:
>  	ertn
>  
>  nopage_tlb_load:
> -	dbar	0
> +	dbar	0x700

There is no LL/SC loop here.  I guess this shares the same internal
uarch logic as the LL/SC loop, but the commit message should clarify
what 0x700 means in this context.

>  	csrrd	ra, EXCEPTION_KS2
>  	la_abs	t0, tlb_do_page_fault_0
>  	jr	t0
> @@ -333,7 +333,7 @@ tlb_huge_update_store:
>  	ertn
>  
>  nopage_tlb_store:
> -	dbar	0
> +	dbar	0x700
>  	csrrd	ra, EXCEPTION_KS2
>  	la_abs	t0, tlb_do_page_fault_1
>  	jr	t0
> @@ -480,7 +480,7 @@ tlb_huge_update_modify:
>  	ertn
>  
>  nopage_tlb_modify:
> -	dbar	0
> +	dbar	0x700
>  	csrrd	ra, EXCEPTION_KS2
>  	la_abs	t0, tlb_do_page_fault_1
>  	jr	t0

-- 
Xi Ruoyao <xry111@xxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University
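
For reference, the five-bit hint layout quoted from the commit message can
be made concrete with a small C sketch. This is only an illustration of the
bit table as quoted; the helper name dbar_hint() and its parameters are
hypothetical and do not come from the patch. Note the inverted sense of
bits 3..0: a cleared bit means the barrier applies. The sketch covers only
bits 4..0, so it says nothing about how 0x700 is interpreted, since that
value sets only bits above the quoted table.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): compose the low five
 * bits of a dbar hint from the layout quoted in the commit message.
 *
 *   Bit4: 0 = completion barrier, 1 = ordering barrier
 *   Bit3: 0 = barrier applies to previous reads
 *   Bit2: 0 = barrier applies to previous writes
 *   Bit1: 0 = barrier applies to succeeding reads
 *   Bit0: 0 = barrier applies to succeeding writes
 */
static unsigned int dbar_hint(bool ordering,
                              bool prev_read, bool prev_write,
                              bool succ_read, bool succ_write)
{
    unsigned int hint = 0;

    if (ordering)
        hint |= 1u << 4;
    /* Bits 3..0 are "active low": set the bit to drop that barrier. */
    if (!prev_read)
        hint |= 1u << 3;
    if (!prev_write)
        hint |= 1u << 2;
    if (!succ_read)
        hint |= 1u << 1;
    if (!succ_write)
        hint |= 1u << 0;

    return hint;
}
```

Under this reading, dbar_hint(false, true, true, true, true) yields 0, which
matches the traditional "dbar 0" full completion barrier, and
dbar_hint(true, true, true, true, true) yields 0x10, an ordering barrier
over all previous and succeeding reads and writes. That the latter is what
the "orwrw" name refers to is an assumption based on the name alone.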