> -----Original Message----- > From: Catalin Marinas [mailto:catalin.marinas@xxxxxxx] > Sent: 19 January 2022 11:50 > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx> > Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; kvmarm@xxxxxxxxxxxxxxxxxxxxx; > linux-kernel@xxxxxxxxxxxxxxx; maz@xxxxxxxxxx; will@xxxxxxxxxx; > james.morse@xxxxxxx; julien.thierry.kdev@xxxxxxxxx; > suzuki.poulose@xxxxxxx; jean-philippe@xxxxxxxxxx; > Alexandru.Elisei@xxxxxxx; qperret@xxxxxxxxxx; Jonathan Cameron > <jonathan.cameron@xxxxxxxxxx>; Linuxarm <linuxarm@xxxxxxxxxx> > Subject: Re: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid > > On Wed, Jan 19, 2022 at 09:23:31AM +0000, Shameerali Kolothum Thodi > wrote: > > > On Mon, Nov 22, 2021 at 12:18:40PM +0000, Shameer Kolothum wrote: > > > > -TLA+ model. Modified the asidalloc model to incorporate the new > > > > VMID algo. The main differences are, > > > > -flush_tlb_all() instead of local_tlb_flush_all() on rollover. > > > > -Introduced INVALID_VMID and vCPU Sched Out logic. > > > > -No CnP (Removed UniqueASIDAllCPUs & UniqueASIDActiveTask > invariants). > > > > -Removed UniqueVMIDPerCPU invariant for now as it looks like > > > > because of the speculative fetching with flush_tlb_all() there > > > > is a small window where this gets triggered. If I change the > > > > logic back to local_flush_tlb_all(), UniqueVMIDPerCPU seems to > > > > be fine. With my limited knowledge on TLA+ model, it is not > > > > clear to me whether this is a problem with the above logic > > > > or the VMID model implementation. Really appreciate any help > > > > with the model. > > > > The initial VMID TLA+ model is here, > > > > > https://github.com/shamiali2008/kernel-tla/tree/private-vmidalloc-v1 > > > > > > I only had a brief look at the TLA+ model and I don't understand why you > > > have a separate 'shed_out' process. It would run in parallel with the > > > 'sched' but AFAICT you can't really schedule a guest out while you are > > > in the middle of scheduling it in. I'd rather use the same 'sched' > > > process and either schedule in an inactive task or schedule out an > > > active one for a given CPU. > > > > > > Also active_vmids[] for example is defined on the CPUS domain but you > > > call vcpu_sched_out() from a process that's not in the CPUS domain but > > > the SCHED_OUT one. > > > > Many thanks for taking a look. My bad!. The 'sched_out' would indeed run in > parallel > > and defeat the purpose. I must say I was really confused by the TLA+ syntax > and > > is still not very confident about it. > > Yeah, it can be confusing. If you have time, you could give CBMC a try > and the 'spec' would be pretty close to your C version. Each CPU would > be modelled as a thread with an extra thread that simulates the > speculative TLB look-ups for all CPUS together with the asserts for the > invariants. The spinlocks would be pthread_mutexes. Ok, will take a look. > > Based on the above suggestion, I have modified it as below, > > > > \* vCPU is scheduled out by KVM > > macro vcpu_sched_out() { > > active_kvm[self].task := 0; > > active_vmids[self] := INVALID_VMID; > > } > > Could you call cpu_switch_kvm(0, INVALID_VMID) instead? You could do > this directly below and avoid another macro. Well, whatever you find > clearer. Sure, will change that. > What confuses me is that your INVALID_VMID looks like a valid one: vmid > 0, generation 1. Do you ensure that you never allocate VMID 0? This is same as in asidalloc wherein the cur_idx starts at 1 for any new allocation. I think that guarantees VMID 0 is never allocated. > > > \* About to run a Guest VM > > process (sched \in CPUS) > > { > > sched_loop: > > while (TRUE) { > > with (t \in TASKS) { > > if (t # ActiveTask(self)) > > call kvm_arm_vmid_update(t); > > else > > vcpu_sched_out(); > > } > > } > > } > > Yes, that's what I meant. Ok. > > > > The corresponding > > > UniqueASIDPerCPU meant that for any two TLB entries on a single CPU, if > > > they correspond to different tasks (pgd), they should have different > > > ASIDs. That's a strong requirement, otherwise we end up with the wrong > > > translation. > > > > Yes, I understand that it is a strong requirement. Also, I thought this is > something > > that will trigger easily with the test setup I had with the real hardware. But > testing > > stayed on for days, so I was not sure it is a problem with the TLA+ > implementation > > or not. > > Well, you'd have to check the TLA+ state trace and see how it got > there, whether the last state would be a valid one. It's either > something missing in the spec that the hardware enforces or the > algorithm is wrong and just hard to hit in practice. If this condition > is violated in hardware for a very brief period, e.g. due to some TLBI, > you'd not notice an issue under normal circumstances. But it's still > incorrect. Hmm..I had a quick implementation with a separate thread for speculative load as done in the arm64kpti.tla, but still get the UniqueVMIDPerCPU violation error. I will go through the state trace and see whether I can interpret something :) > > > Why did you remove the CnP? Do we have this disabled for KVM guests? > > > > I removed CnP related Invariants to simplify things for the first version. Also > not sure > > what specific changes we need to do for CnP here. Do we still need that > switching to > > global pg_dir to prevent any speculative reading? I didn't see that being > done in KVM > > anywhere at the moment. Maybe I am missing something. > > It make sense to ignore CnP for now. Maybe KVM doesn't even bother and > sets VTTBR_EL2.CnP to 0 (I haven't checked). I think KVM sets the CnP here, https://elixir.bootlin.com/linux/latest/source/arch/arm64/include/asm/kvm_mmu.h#L264 But I haven't figured out what else we need to do in this new VMID allocator for CnP case. Marc, Will? > > > On a side note, In my setup, the CnP=TRUE case for asidalloc.tla now fails > with, > > "Error: Invariant TLBEmptyInvalPTE is violated.". Please check. > > This was added later as part of try-to-unmap and I only checked this > with CnP = FALSE. I'll need to look into, it's possible that > flush_tlb_all() doesn't take into account that the pte is unmapped (as > cpu_switch_mm() does). I'll add a separate thread for speculative TLB > loads, it's easier to have them in one place. Thanks. Ok. Thanks, Shameer _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/mailman/listinfo/kvmarm