[messed up my initial reply, resending]

On Tue, Nov 01 2016 at 09:04:08 AM, Christoffer Dall <christoffer.dall@xxxxxxxxxx> wrote:
> On Fri, Oct 28, 2016 at 11:27:50AM +0100, Marc Zyngier wrote:
>> Architecturally, TLBs are private to the (physical) CPU they're
>> associated with. But when multiple vcpus from the same VM are
>> being multiplexed on the same CPU, the TLBs are not private
>> to the vcpus (and are actually shared across the VMID).
>>
>> Let's consider the following scenario:
>>
>> - vcpu-0 maps PA to VA
>> - vcpu-1 maps PA' to VA
>>
>> If run on the same physical CPU, vcpu-1 can hit TLB entries generated
>> by vcpu-0 accesses, and access the wrong physical page.
>>
>> The solution to this is to keep a per-VM map of which vcpu ran last
>> on each given physical CPU, and invalidate local TLBs when switching
>> to a different vcpu from the same VM.
>>
>> Reviewed-by: Mark Rutland <mark.rutland@xxxxxxx>
>> Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx>
>> ---
>> Fixed comments, added Mark's RB.
>>
>>  arch/arm/include/asm/kvm_host.h   | 11 ++++++++++-
>>  arch/arm/include/asm/kvm_hyp.h    |  1 +
>>  arch/arm/kvm/arm.c                | 35 ++++++++++++++++++++++++++++++++++-
>>  arch/arm/kvm/hyp/switch.c         |  9 +++++++++
>>  arch/arm64/include/asm/kvm_host.h | 11 ++++++++++-
>>  arch/arm64/kvm/hyp/switch.c       |  8 ++++++++
>>  6 files changed, 72 insertions(+), 3 deletions(-)
>>
>> [...]
>>
>> @@ -310,6 +322,27 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>>  	return 0;
>>  }
>>
>> +void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu)
>> +{
>
> why is calling this from here sufficient?
>
> You only get a notification from preempt notifiers if you were preempted
> while running (or rather while the vcpu was loaded).

Arghh. I completely misread the code when writing that patch.

> I think this needs to go in kvm_arch_vcpu_load, but be aware that the
> vcpu_load gets called for other vcpu ioctls and doesn't necessarily
> imply that the vcpu will actually run, which is also the case for the
> sched_in notification, btw. The worst that will happen in that case is
> a bit of extra TLB invalidation, so sticking with kvm_arch_vcpu_load is
> probably fine.

Indeed. I don't mind the extra invalidation, as long as it is rare
enough. Another possibility would be to do this test on the entry path,
once preemption is disabled (a rough sketch is appended at the end of
this mail).

>
>> +	int *last_ran;
>> +
>> +	last_ran = per_cpu_ptr(vcpu->kvm->arch.last_vcpu_ran, cpu);
>> +
>> +	/*
>> +	 * We might get preempted before the vCPU actually runs, but
>> +	 * this is fine. Our TLBI stays pending until we actually make
>> +	 * it to __activate_vm, so we won't miss a TLBI. If another
>> +	 * vCPU gets scheduled, it will see our vcpu_id in last_ran,
>> +	 * and pend a TLBI for itself.
>> +	 */
>> +	if (*last_ran != vcpu->vcpu_id) {
>> +		if (*last_ran != -1)
>> +			vcpu->arch.tlb_vmid_stale = true;
>> +
>> +		*last_ran = vcpu->vcpu_id;
>> +	}
>> +}
>> +
>>  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>  {
>>  	vcpu->cpu = cpu;
>> diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c
>> index 92678b7..a411762 100644
>> --- a/arch/arm/kvm/hyp/switch.c
>> +++ b/arch/arm/kvm/hyp/switch.c
>> @@ -75,6 +75,15 @@ static void __hyp_text __activate_vm(struct kvm_vcpu *vcpu)
>>  {
>>  	struct kvm *kvm = kern_hyp_va(vcpu->kvm);
>>  	write_sysreg(kvm->arch.vttbr, VTTBR);
>> +	if (vcpu->arch.tlb_vmid_stale) {
>> +		/* Force vttbr to be written */
>> +		isb();
>> +		/* Local invalidate only for this VMID */
>> +		write_sysreg(0, TLBIALL);
>> +		dsb(nsh);
>> +		vcpu->arch.tlb_vmid_stale = false;
>> +	}
>> +
>
> why not call this directly when you notice it via kvm_call_hyp as
> opposed to adding another conditional in the critical path?

Because the cost of a hypercall is very likely to be a lot higher than
that of testing a variable. Not to mention that at this point we're
absolutely sure that we're going to run the guest, while the hook in
vcpu_load is only probabilistic.

Thanks,

	M.

-- 
Jazz is not dead. It just smells funny.
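
A minimal sketch of the entry-path alternative mentioned above, reusing
the last_vcpu_ran and tlb_vmid_stale fields introduced by this patch.
The helper name and the exact call site are illustrative assumptions,
not part of the series:

#include <linux/kvm_host.h>	/* struct kvm_vcpu */
#include <linux/percpu.h>	/* per_cpu_ptr() */
#include <linux/smp.h>		/* smp_processor_id() */

/*
 * Hypothetical helper: do the last_ran check on the entry path, once
 * preemption is disabled, so the physical CPU can no longer change
 * underneath us before the guest is entered.
 */
static void vcpu_check_stale_tlb(struct kvm_vcpu *vcpu)
{
	int cpu = smp_processor_id();	/* preemption already disabled */
	int *last_ran = per_cpu_ptr(vcpu->kvm->arch.last_vcpu_ran, cpu);

	if (*last_ran != vcpu->vcpu_id) {
		/* Another vCPU of this VM ran here: local TLBs may be stale */
		if (*last_ran != -1)
			vcpu->arch.tlb_vmid_stale = true;

		*last_ran = vcpu->vcpu_id;
	}
}

Called from the run loop in kvm_arch_vcpu_ioctl_run() after
preempt_disable() and before the world switch, this would only fire
when the guest is genuinely about to run, at the cost of an extra check
on every entry rather than only when the scheduler moves vCPUs around.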