On Tue, Apr 10, 2018 at 04:37:12PM +0100, Marc Zyngier wrote:
> On 10/04/18 16:24, Mark Rutland wrote:
> > On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> >> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> >>> I think we also need to update kvm->arch.vttbr before updating
> >>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> >>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with
> >>> the old VMID).
> >>>
> >>> With the smp_wmb() and the update of kvm->arch.vmid_gen moved to the
> >>> end of the critical section, I think that works, modulo using
> >>> READ_ONCE() and WRITE_ONCE() to ensure single-copy atomicity of the
> >>> fields we access locklessly.
> >>
> >> Indeed, you're right. It would look something like this, then:
> >>
> >> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> >> index 2e43f9d42bd5..6cb08995e7ff 100644
> >> --- a/virt/kvm/arm/arm.c
> >> +++ b/virt/kvm/arm/arm.c
> >> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
> >>   */
> >>  static bool need_new_vmid_gen(struct kvm *kvm)
> >>  {
> >> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> >> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> >> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> >> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
> >>  }
> >>
> >>  /**
> >> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
> >>  		kvm_call_hyp(__kvm_flush_vm_context);
> >>  	}
> >>
> >> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>  	kvm->arch.vmid = kvm_next_vmid;
> >>  	kvm_next_vmid++;
> >>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> >> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
> >>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
> >>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> >>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> >> -	kvm->arch.vttbr = pgd_phys | vmid;
> >> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> >> +
> >> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> >> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>
> >>  	spin_unlock(&kvm_vmid_lock);
> >>  }
> >
> > I think that's right, yes.
> >
> > We could replace the smp_{r,w}mb() barriers with an acquire of
> > kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> > trying to optimize things, there are larger algorithmic changes
> > necessary anyhow.
> >
> >> It's probably easier to convince ourselves of the correctness of
> >> Marc's code using a rwlock instead, though. Thoughts?
> >
> > I believe that Marc's preference was the rwlock; I have no preference
> > either way.
>
> I don't mind either way. If you can be bothered to write a proper commit
> log for this, I'll take it.

You've already done the work, and your patch is easier to read, so let's
just go ahead with that. I was mostly curious about the degree to which
my original implementation was broken: was I trying to achieve something
impossible, or was I just writing buggy code? Seems it was the latter.
Oh well.

> What I'd really want is for Shannon to indicate whether or not this
> solves the issue he was seeing.

Agreed, I would like to see that too.

Thanks (and sorry for being noisy),
-Christoffer
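
[For concreteness, the acquire/release variant Mark mentions above might
look roughly like the sketch below. This is an illustration only, not
part of the posted patch: it assumes atomic64_read_acquire() on the
global kvm_vmid_gen counter and smp_store_release() on
kvm->arch.vmid_gen provide the same ordering as the explicit barriers,
and it glosses over 64-bit single-copy atomicity on 32-bit arm.]

static bool need_new_vmid_gen(struct kvm *kvm)
{
	/*
	 * The acquire orders the read of the global generation counter
	 * before the read of this VM's generation, standing in for the
	 * smp_rmb() in the patch above.
	 */
	u64 current_vmid_gen = atomic64_read_acquire(&kvm_vmid_gen);

	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
}

and, at the end of update_vttbr():

	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);

	/*
	 * The release orders the vttbr update before the vmid_gen
	 * update, standing in for the smp_wmb() in the patch above.
	 */
	smp_store_release(&kvm->arch.vmid_gen, atomic64_read(&kvm_vmid_gen));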