On 06/19/2012 01:05 AM, Christoffer Dall wrote:
>> Premature, but this is sad.  I suggest you split vmid generation from
>> next available vmid.  This allows you to make the generation counter
>> atomic so it may be read outside the lock.
>>
>> You can do
>>
>>         if (likely(kvm->arch.vmid_gen == atomic64_read(&kvm_vmid_gen)))
>>                 return;
>>
>>         spin_lock(...
>>
>
> I knew you were going to say something here :), please take a look at
> this and see if you agree:

It looks reasonable wrt locking.

> +
> +	/* First user of a new VMID generation? */
> +	if (unlikely(kvm_next_vmid == 0)) {
> +		atomic64_inc(&kvm_vmid_gen);
> +		kvm_next_vmid = 1;
> +
> +		/* This does nothing on UP */
> +		smp_call_function(reset_vm_context, NULL, 1);

Does this imply a memory barrier?  If not, smp_mb__after_atomic_inc().

> +
> +		/*
> +		 * On SMP we know no other CPUs can use this CPU's or
> +		 * each other's VMID since the kvm_vmid_lock blocks
> +		 * them from reentry to the guest.
> +		 */
> +
> +		reset_vm_context(NULL);

These two lines can be folded as on_each_cpu().

Don't you have a race here where you can increment the generation just
before guest entry?

  cpu0                        cpu1
  (vmid=0, gen=1)             (gen=0)
  --------------------------  ----------------------
  gen == global_gen, return
                              gen != global_gen
                              increment global_gen
                              smp_call_function
  reset_vm_context
                              vmid=0
                              enter with vmid=0
  enter with vmid=0

You must recheck gen after disabling interrupts to ensure global_gen
didn't bump after update_vttbr but before entry.

x86 has a lot of this, see vcpu_enter_guest() and vcpu->requests
(doesn't apply directly to your case but may come in useful later).

>
>>> +
>>> +/**
>>> + * kvm_arch_vcpu_ioctl_run - the main VCPU run function to execute guest code
>>> + * @vcpu:	The VCPU pointer
>>> + * @run:	The kvm_run structure pointer used for userspace state exchange
>>> + *
>>> + * This function is called through the VCPU_RUN ioctl called from user space.
>>> + * It will execute VM code in a loop until the time slice for the process is
>>> + * used or some emulation is needed from user space in which case the
>>> + * function will return with return value 0 and with the kvm_run structure
>>> + * filled in with the required data for the requested emulation.
>>> + */
>>> int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>> {
>>> -	return -EINVAL;
>>> +	int ret = 0;
>>> +	sigset_t sigsaved;
>>> +
>>> +	if (vcpu->sigset_active)
>>> +		sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
>>> +
>>> +	run->exit_reason = KVM_EXIT_UNKNOWN;
>>> +	while (run->exit_reason == KVM_EXIT_UNKNOWN) {
>>
>> It's not a good idea to read stuff from run unless it's part of the ABI,
>> since userspace can play with it while you're reading it.  It's harmless
>> here but in other places it can lead to a vulnerability.
>>
>
> ok, so in this case, userspace can 'suppress' an MMIO or interrupt
> window operation.
>
> I can change to keep some local variable to maintain the state and
> have some API convention for emulation functions, if you feel strongly
> about it, but otherwise it feels to me like the code will be more
> complicated.  Do you have other ideas?

x86 uses:

  0 - return to userspace (run prepared)
  1 - return to guest (run untouched)
  -ESOMETHING - return to userspace

as return values from handlers and for locals (usually named 'r').

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html