Hello Catalin,

Thanks for sharing your CnP ASID experience. See my comment below.

On Mon, Jul 1, 2019 at 5:17 PM Catalin Marinas wrote:
> From the ASID reservation/allocation perspective, the mechanism is the
> same between multi-threaded with a shared TLB and multi-core. On arm64,
> a local_flush_tlb_all() on a thread invalidates the TLB for the other
> threads of the same core.
>
> The actual problem with multi-threaded CPUs is a lot more subtle.
> Digging some internal email from 1.5 years ago and pasting it below
> (where "current ASID algorithm" refers to the one prior to the fix and
> CnP - Common Not Private - means shared TLBs on a multi-threaded CPU):
>
> The current ASID roll-over algorithm allows for a small window where
> active_asids for a CPU (P1) is different from the actual ASID in TTBR0.
> This can lead to a roll-over on a different CPU (P2) allocating an ASID
> (for a different task) which is still hardware-active on P1.
>
> A TLBI on a CPU (or a peer CPU with CnP) does not guarantee that all the
> entries corresponding to a valid TTBRx are removed as they can still be
> speculatively loaded immediately after TLBI.
>
> While having two different page tables with the same ASID on different
> CPUs should be fine without CnP, it becomes problematic when CnP is
> enabled:
>
>   P1                                  P2
>   --                                  --
>   TTBR0.BADDR = T1
>   TTBR0.ASID = A1
>   check_and_switch_context(T2,A2)
>     asid_maps[P1] = A2
>     goto fastpath
>                                       check_and_switch_context(T3,A0)
>                                         new_context
>                                           ASID roll-over allocates A1
>                                           since it is not active
>                                         TLBI ALL
>   speculate TTBR0.ASID = A1 entry
>                                       TTBR0.BADDR = T3
>                                       TTBR0.ASID = A1
>   TTBR0.BADDR = T2
>   TTBR0.ASID = A2
>
> After this, the common TLB on P1 and P2 (CnP) contains entries
> corresponding to the old T1 and A1. Task T3 using the same ASID A1 can
> hit such entries. (T1,A1) will eventually be removed from the TLB on the
> next context switch on P1 since tlb_flush_pending was set but this is
> not guaranteed to happen.
>
> The fix on arm64 (as part of 5ffdfaedfa0a - "arm64: mm: Support Common
> Not Private translations") was to set the reserved TTBR0 in
> check_and_switch_context(), preventing speculative loads into the TLB
> being tagged with the wrong ASID. So this is specific to the ARM CPUs
> behaviour w.r.t. speculative TLB loads, it may not be the case (yet) for
> your architecture.

The key point is that the TLBI ALL occurs between "asid_maps[P1] = A2"
and "TTBR0.BADDR = T2". Any speculative access to user-space code or data
after the TLBI is re-filled into the TLB by the page table walker as an
entry tagged with an ASID that is still considered valid.

A similar problem would exist if the C-SKY ISA supported SMT. Although
C-SKY prevents the kernel from speculatively executing user-space code
directly, ld/st instructions in kernel mode can still access user-space
memory, so the same problem would occur if copy_from/to_user code were
speculatively executed inside that window.

The RISC-V ISA has the SUM setting bit, which prevents the kernel from
speculatively accessing user space, so this problem is avoided by design.

I saw that arm64 prevents the speculation by temporarily setting
TTBR0_EL1 to a zero page table. Is that used to prevent speculative
execution of user-space code, or just to prevent ld/st in
copy_from/to_user?

--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/
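
[For reference: the fix Catalin mentions above is the cpu_set_reserved_ttbr0()
call added to check_and_switch_context() in arch/arm64/mm/context.c by commit
5ffdfaedfa0a. The sketch below is a heavily simplified paraphrase for
illustration, not the verbatim kernel code; the ASID generation check, the
cpu_asid_lock slow path and the TTBR0-PAN handling are elided.]

/*
 * Heavily simplified sketch of arm64's check_and_switch_context() after
 * commit 5ffdfaedfa0a ("arm64: mm: Support Common Not Private
 * translations").  Paraphrased for illustration; several details of the
 * real function are elided.
 */
void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
{
        /*
         * The CnP fix: point TTBR0_EL1 at the reserved (all-invalid)
         * page table before touching the ASID book-keeping.  From here
         * until cpu_switch_mm() below, speculative table walks through
         * TTBR0 only see the reserved table, so they cannot re-fill the
         * shared TLB with entries tagged by an ASID that a concurrent
         * roll-over on the other hardware thread may be handing out to
         * another task.
         */
        if (system_supports_cnp())
                cpu_set_reserved_ttbr0();

        /*
         * Fast path: if mm->context.id still belongs to the current
         * ASID generation, atomically mark it active in the per-CPU
         * active_asids[cpu] and go straight to the TTBR0 update;
         * otherwise take the cpu_asid_lock slow path, which may roll
         * the ASID space over and flush the local TLB.  (Elided here.)
         */

        /* Finally install the real pgd and ASID into TTBR0_EL1. */
        cpu_switch_mm(mm->pgd, mm);
}

As far as I understand, because the reserved table contains no valid user
mappings, parking TTBR0_EL1 there blocks speculative TLB refills for both
instruction fetches and kernel ld/st accesses to user addresses during that
window, though the arm64 maintainers can confirm the intent.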