On Tue, Jun 19, 2012 at 11:27 PM, Christoffer Dall
<c.dall@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Jun 19, 2012 at 5:16 AM, Avi Kivity <avi@xxxxxxxxxx> wrote:
>> On 06/19/2012 01:05 AM, Christoffer Dall wrote:
>>>> Premature, but this is sad. I suggest you split vmid generation from
>>>> next available vmid. This allows you to make the generation counter
>>>> atomic so it may be read outside the lock.
>>>>
>>>> You can do
>>>>
>>>>	if (likely(kvm->arch.vmid_gen == atomic_read(&kvm_vmid_gen)))
>>>>		return;
>>>>
>>>>	spin_lock(...
>>>>
>>>
>>> I knew you were going to say something here :), please take a look at
>>> this and see if you agree:
>>
>> It looks reasonable wrt locking.
>>
>>> +
>>> +	/* First user of a new VMID generation? */
>>> +	if (unlikely(kvm_next_vmid == 0)) {
>>> +		atomic64_inc(&kvm_vmid_gen);
>>> +		kvm_next_vmid = 1;
>>> +
>>> +		/* This does nothing on UP */
>>> +		smp_call_function(reset_vm_context, NULL, 1);
>>
>> Does this imply a memory barrier? If not, smp_mb__after_atomic_inc().
>>
>
> yes, it implies a memory barrier.
>
>>> +
>>> +		/*
>>> +		 * On SMP we know no other CPUs can use this CPU's or
>>> +		 * each other's VMID since the kvm_vmid_lock blocks
>>> +		 * them from reentry to the guest.
>>> +		 */
>>> +
>>> +		reset_vm_context(NULL);
>>
>> These two lines can be folded as on_each_cpu().
>>
>> Don't you have a race here where you can increment the generation just
>> before guest entry?
>
> I don't think I do.
>

uh, strike that, I do.

>>
>> cpu0                        cpu1
>> (vmid=0, gen=1)             (gen=0)
>> --------------------------  ----------------------
>> gen == global_gen, return
>>
>>                             gen != global_gen
>>                             increment global_gen
>>                             smp_call_function
>>                             reset_vm_context
>>                             vmid=0
>>
>> enter with vmid=0           enter with vmid=0
>
> I can't see how the scenario above can happen. First, no-one can run
> with vmid 0 - it is reserved for the host.
> If we bump global_gen on cpuN, then since we do
> smp_call_function(x, x, wait=1) we are now sure that after this call,
> all cpus (N-1) potentially being inside a VM will have exited, called
> reset_vm_context, but before they can re-enter the guest they will
> call update_vttbr, and if their generation counter differs from
> global_gen they will try to grab that spinlock and everything should
> happen in order.
>

the whole vmid=0 thing confused me a bit. The point is that since we
moved the generation check outside the spin_lock we have to re-check,
I see:

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 19fe3b0..74760af 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -313,6 +313,23 @@ static void reset_vm_context(void *info)
 }
 
 /**
+ * check_new_vmid_gen - check that the VMID is still valid
+ * @kvm: The VM whose VMID to check
+ *
+ * return true if there is a new generation of VMIDs being used
+ *
+ * The hardware supports only 256 values with the value zero reserved for the
+ * host, so we check if an assigned value belongs to a previous generation,
+ * which requires us to assign a new value. If we're the first to use a
+ * VMID for the new generation, we must flush necessary caches and TLBs on all
+ * CPUs.
+ */
+static bool check_new_vmid_gen(struct kvm *kvm)
+{
+	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+}
+
+/**
  * update_vttbr - Update the VTTBR with a valid VMID before the guest runs
  * @kvm	The guest that we are about to run
  *
@@ -324,15 +342,7 @@ static void update_vttbr(struct kvm *kvm)
 {
 	phys_addr_t pgd_phys;
 
-	/*
-	 * Check that the VMID is still valid.
-	 * (The hardware supports only 256 values with the value zero
-	 * reserved for the host, so we check if an assigned value belongs to
-	 * a previous generation, which which requires us to assign a new
-	 * value. If we're the first to use a VMID for the new generation,
-	 * we must flush necessary caches and TLBs on all CPUs.)
-	 */
-	if (likely(kvm->arch.vmid_gen == atomic64_read(&kvm_vmid_gen)))
+	if (!check_new_vmid_gen(kvm))
 		return;
 
 	spin_lock(&kvm_vmid_lock);
@@ -504,6 +514,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	 */
 	preempt_disable();
 	local_irq_disable();
+
+	if (check_new_vmid_gen(kvm)) {
+		local_irq_enable();
+		preempt_enable();
+		continue;
+	}
+
 	kvm_guest_enter();
 	vcpu->mode = IN_GUEST_MODE;

>>
>> You must recheck gen after disabling interrupts to ensure global_gen
>> didn't bump after update_vttbr but before entry. x86 has a lot of
>> this, see vcpu_enter_guest() and vcpu->requests (doesn't apply
>> directly to your case but may come in useful later).
>>
>>>
>>>>> +
>>>>> +/**
>>>>> + * kvm_arch_vcpu_ioctl_run - the main VCPU run function to execute guest code
>>>>> + * @vcpu:	The VCPU pointer
>>>>> + * @run:	The kvm_run structure pointer used for userspace state exchange
>>>>> + *
>>>>> + * This function is called through the VCPU_RUN ioctl called from user space. It
>>>>> + * will execute VM code in a loop until the time slice for the process is used
>>>>> + * or some emulation is needed from user space in which case the function will
>>>>> + * return with return value 0 and with the kvm_run structure filled in with the
>>>>> + * required data for the requested emulation.
>>>>> + */
>>>>>  int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>>>  {
>>>>> -	return -EINVAL;
>>>>> +	int ret = 0;
>>>>> +	sigset_t sigsaved;
>>>>> +
>>>>> +	if (vcpu->sigset_active)
>>>>> +		sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
>>>>> +
>>>>> +	run->exit_reason = KVM_EXIT_UNKNOWN;
>>>>> +	while (run->exit_reason == KVM_EXIT_UNKNOWN) {
>>>>
>>>> It's not a good idea to read stuff from run unless it's part of the
>>>> ABI, since userspace can play with it while you're reading it. It's
>>>> harmless here but in other places it can lead to a vulnerability.
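[The pattern under discussion, as a minimal user-space sketch for readers following along: a generation check before taking the lock, plus the re-check after "disabling interrupts" that closes the race in the diagram above. All names are hypothetical stand-ins (C11 atomics in place of the kernel's atomic64 API; locking, TLB flushing and irq disabling elided), not the actual KVM/ARM code.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the global kvm_vmid_gen counter. */
static atomic_uint_least64_t global_gen = 1;

struct vm {
	uint64_t gen;	/* generation this VM's VMID was assigned in */
	uint8_t vmid;	/* 8-bit VMID; 0 is reserved for the host */
};

/* true if the VM's VMID belongs to a stale generation */
static bool check_new_vmid_gen(struct vm *vm)
{
	return vm->gen != atomic_load(&global_gen);
}

static void update_vttbr(struct vm *vm)
{
	if (!check_new_vmid_gen(vm))
		return;
	/* The kernel would take kvm_vmid_lock here, possibly bump
	 * global_gen and broadcast a TLB/cache flush, then hand out
	 * the next free VMID. */
	vm->gen = atomic_load(&global_gen);
	vm->vmid = 1;
}

/* One run-loop iteration: entry is safe only if the generation did
 * not move between update_vttbr and the point where interrupts are
 * off; otherwise back off and retry, as in the @@ -504 hunk. */
static bool try_enter_guest(struct vm *vm)
{
	update_vttbr(vm);
	/* local_irq_disable() would go here */
	if (check_new_vmid_gen(vm))
		return false;	/* raced with a generation bump: retry */
	return true;		/* enter the guest with a valid VMID */
}
```

The key property the test below exercises: once another CPU rolls the generation, the cheap lock-free check flags the VMID as stale again, which is exactly why the check must be repeated with interrupts off.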
>>>>
>>>
>>> ok, so in this case, userspace can 'suppress' an MMIO or interrupt
>>> window operation.
>>>
>>> I can change to keep some local variable to maintain the state and
>>> have some API convention for emulation functions, if you feel
>>> strongly about it, but otherwise it feels to me like the code will
>>> be more complicated. Do you have other ideas?
>>
>> x86 uses:
>>
>>  0 - return to userspace (run prepared)
>>  1 - return to guest (run untouched)
>>  -ESOMETHING - return to userspace
>>
>
> yeah, we can do that I guess, that's fair.
>
>> as return values from handlers and for locals (usually named 'r').
>>
>>
>> --
>> error compiling committee.c: too many arguments to function
>>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
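[For readers of the archive: the x86-style convention Avi describes can be sketched like this. The loop keys off a local `ret` instead of re-reading `run->exit_reason`, which is what closes the userspace-tampering hole discussed above. Everything here (the exit kinds, `struct run_state`, the handler) is an illustrative mock, not the real KVM code.]

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical exit kinds and run structure, for illustration only. */
enum exit_kind { EXIT_HLT, EXIT_MMIO, EXIT_BAD };

struct run_state {
	int exit_reason;	/* shared with userspace, write-only here */
};

/* Handler convention:
 *   1   - return to the guest (run untouched)
 *   0   - return to userspace (run prepared)
 *   < 0 - error, return to userspace
 */
static int handle_exit(struct run_state *run, enum exit_kind kind)
{
	switch (kind) {
	case EXIT_HLT:
		return 1;		/* nothing for userspace, keep running */
	case EXIT_MMIO:
		run->exit_reason = 2;	/* prepare run for userspace emulation */
		return 0;
	default:
		return -EINVAL;
	}
}

/* Run loop keyed off the local 'ret', never off run->exit_reason,
 * so userspace scribbling on run cannot suppress an exit. */
static int run_loop(struct run_state *run, const enum exit_kind *exits, int n)
{
	int ret = 1;

	for (int i = 0; i < n && ret > 0; i++)
		ret = handle_exit(run, exits[i]);
	return ret;
}
```

The exit sequence here is fed in as an array only so the sketch is testable; in the real ioctl the "exits" come from hardware.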