On Thu, Jun 21, 2012 at 4:13 AM, Avi Kivity <avi@xxxxxxxxxx> wrote:
> On 06/20/2012 07:40 AM, Christoffer Dall wrote:
>>
>>>>
>>>> cpu0                          cpu1
>>>> (vmid=0, gen=1)               (gen=0)
>>>> --------------------------    ----------------------
>>>> gen == global_gen, return
>>>>
>>>>                               gen != global_gen
>>>>                               increment global_gen
>>>>                               smp_call_function
>>>> reset_vm_context
>>>>                               vmid=0
>>>>
>>>> enter with vmid=0             enter with vmid=0
>>>
>>> I can't see how the scenario above can happen. First, no one can run
>>> with vmid 0 - it is reserved for the host. If we bump global_gen on
>>> cpuN, then since we do smp_call_function(x, x, wait=1) we are now sure
>>> that after this call, all cpus (N-1) potentially being inside a VM will
>>> have exited and called reset_vm_context; but before they can re-enter
>>> the guest, they will call update_vttbr, and if their generation
>>> counter differs from global_gen, they will try to grab that spinlock
>>> and everything should happen in order.
>>>
>>
>> The whole vmid=0 confused me a bit. The point is that since we moved the
>> generation check outside the spin_lock we have to re-check, I see:
>
> In fact I think the problem occurred with the original code as well. The
> problem is that the check is not atomic wrt guest entry, so
>
>   spin_lock
>   check/update
>   spin_unlock
>
>   enter guest
>
> has a hole between spin_unlock and guest entry.
>

You are absolutely right.

>>
>>  /**
>> + * check_new_vmid_gen - check that the VMID is still valid
>> + * @kvm: The VM whose VMID to check
>> + *
>> + * Return true if there is a new generation of VMIDs being used.
>> + *
>> + * The hardware supports only 256 values with the value zero reserved for the
>> + * host, so we check if an assigned value belongs to a previous generation,
>> + * which requires us to assign a new value. If we're the first to use a
>> + * VMID for the new generation, we must flush necessary caches and TLBs on all
>> + * CPUs.
>> + */
>> +static bool check_new_vmid_gen(struct kvm *kvm)
>> +{
>> +	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>> +}
>> +
>> +/**
>>   * update_vttbr - Update the VTTBR with a valid VMID before the guest runs
>>   * @kvm The guest that we are about to run
>>   *
>> @@ -324,15 +342,7 @@ static void update_vttbr(struct kvm *kvm)
>>  {
>>  	phys_addr_t pgd_phys;
>>
>> -	/*
>> -	 * Check that the VMID is still valid.
>> -	 * (The hardware supports only 256 values with the value zero
>> -	 * reserved for the host, so we check if an assigned value belongs to
>> -	 * a previous generation, which which requires us to assign a new
>> -	 * value. If we're the first to use a VMID for the new generation,
>> -	 * we must flush necessary caches and TLBs on all CPUs.)
>> -	 */
>> -	if (likely(kvm->arch.vmid_gen == atomic64_read(&kvm_vmid_gen)))
>> +	if (!check_new_vmid_gen(kvm))
>>  		return;
>>
>>  	spin_lock(&kvm_vmid_lock);
>> @@ -504,6 +514,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>  	 */
>>  	preempt_disable();
>>  	local_irq_disable();
>> +
>> +	if (check_new_vmid_gen(kvm)) {
>> +		local_irq_enable();
>> +		preempt_enable();
>> +		continue;
>> +	}
>> +
>
> I see the same race with signals. Your signal_pending() check needs to
> be after the local_irq_disable(), otherwise we can enter a guest with a
> pending signal.
>

That's not functionally incorrect though, is it? It may simply increase the
latency of signal delivery as far as I can see, but I definitely don't mind
changing this path in any case.

>
> Better place the signal check before the vmid_gen check, to avoid the
> possibility of a signal being held up by other guests.
>

nice ;)
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html