Re: [PATCH RFC 0/2] KVM: use RCU to allow dynamic kvm->vcpus array

Marc Zyngier <marc.zyngier@xxxxxxx> · Fri, 18 Aug 2017 15:22:18 +0100

On 18/08/17 15:10, Radim Krčmář wrote:
> 2017-08-17 21:17+0200, Alexander Graf:
>> On 17.08.17 16:54, Radim Krčmář wrote:
>>> 2017-08-17 09:04+0200, Alexander Graf:
>>>> What if we just sent a "vcpu move" request to all vcpus with the new pointer
>>>> after it moved? That way the vcpu thread itself would be responsible for the
>>>> migration to the new memory region. Only if all vcpus successfully moved,
>>>> keep rolling (and allow foreign get_vcpu again).
>>>
>>> I'm not sure if I understood this.  You propose to cache kvm->vcpus in
>>> vcpu->vcpus and do an extension of this,
>>>
>>>    int vcpu_create(...) {
>>>      if (resize_needed(kvm->vcpus)) {
>>>        old_vcpus = kvm->vcpus
>>>        kvm->vcpus = make_bigger(kvm->vcpus)
>>
>> if (kvm->vcpus != old_vcpus) :)
>>
>>>        kvm_make_all_cpus_request(kvm, KVM_REQ_UPDATE_VCPUS)
>>
>> IIRC you'd need some manual bookkeeping to ensure that all users have
>> switched to the new array. Or set the KVM_REQUEST_WAIT flag :).
> 
> Absolutely.  I was thinking about synchronous execution, which might
> need extra work to expedite halted VCPUs.  Letting the last user free it
> is plausible and would need more protection against races.
> 
>>>        free(old_vcpus)
>>>      }
>>>      vcpu->vcpus = kvm->vcpus
>>>    }
>>>
>>> with added extra locking, (S)RCU, on accesses that do not come from
>>> VCPUs (irqfd and VM ioctl)?
>>
>> Well, in an ideal world we wouldn't have any users of vcpu structs outside
>> of the vcpus obviously. Every time we do, we should either reconsider
>> whether the design is smart and if we think it is, protect them accordingly.
> 
> And there would be no linear access to all VCPUs. :)
> 
> The main user of kvm->vcpus is kvm_for_each_vcpu(), which is well suited
> for a list, so we can change the design of kvm_for_each_vcpu() to use a
> list head in struct kvm_vcpu with head/tail in struct kvm.
> (The list is trivial to make lockless as we only append.)
> 
> This would allow more flexibility with the remaining uses.
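
For what it's worth, a rough userspace sketch of such an append-only list,
assuming appends are already serialized by the caller (e.g. under a mutex
taken during VCPU creation). struct vm, struct vcpu, vm_append_vcpu() and
vm_for_each_vcpu() are made-up names for illustration, and C11 atomics
stand in for whatever primitives the kernel would actually use:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct kvm_vcpu / struct kvm. */
struct vcpu {
    int vcpu_id;
    _Atomic(struct vcpu *) next;    /* set once, when a successor is appended */
};

struct vm {
    _Atomic(struct vcpu *) head;    /* first VCPU, NULL while the list is empty */
    struct vcpu *tail;              /* only touched by the serialized appender */
};

/* Append a fully initialized VCPU. Assumption: callers serialize appends. */
static void vm_append_vcpu(struct vm *vm, struct vcpu *v)
{
    atomic_init(&v->next, NULL);
    /* Release store: a reader that sees the pointer also sees *v initialized. */
    if (vm->tail)
        atomic_store_explicit(&vm->tail->next, v, memory_order_release);
    else
        atomic_store_explicit(&vm->head, v, memory_order_release);
    vm->tail = v;
}

/* Lockless walk: nodes are never removed, so readers need no lock. */
#define vm_for_each_vcpu(v, vm)                                             \
    for ((v) = atomic_load_explicit(&(vm)->head, memory_order_acquire);     \
         (v) != NULL;                                                       \
         (v) = atomic_load_explicit(&(v)->next, memory_order_acquire))
```

A reader racing with an append either sees the new VCPU or stops one
element earlier; it never sees a half-linked node. Removal is deliberately
not handled, which is what keeps this simple.
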
> 
>> Maybe even hard code separate request mechanisms for the few cases where
>> it's reasonable?
> 
> All non-kvm_for_each_vcpu() users seem to need access outside of VCPU scope.
> 
> We have a few awkward accesses that can be handled by keeping track of
> kvm state, and all remaining uses need some kind of
> "int -> struct kvm_vcpu" mapping, where the integer is arbitrary.
> 
> All users of kvm_get_vcpu_by_id() need a vcpu_id mapping, but hijack
> kvm->vcpus for O(1) access if lucky, with fallback to
> kvm_for_each_vcpu().  Adding a vcpu_id mapping seems reasonable.
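
To make the "O(1) if lucky" part concrete, here is a simplified userspace
sketch of that lookup pattern. It is not the kernel's kvm_get_vcpu_by_id();
struct vm, struct vcpu, MAX_VCPUS and get_vcpu_by_id() are illustrative
names, and the array is assumed to be filled in creation order:

```c
#include <stddef.h>

#define MAX_VCPUS 8                 /* arbitrary limit for the sketch */

struct vcpu {
    int vcpu_id;                    /* chosen by userspace, may be sparse */
};

struct vm {
    struct vcpu *vcpus[MAX_VCPUS];  /* indexed by creation order */
    int online;                     /* number of valid entries */
};

static struct vcpu *get_vcpu_by_id(struct vm *vm, int id)
{
    if (id < 0)
        return NULL;

    /* Lucky case: the VCPU with this id also sits at index == id. */
    if (id < vm->online && vm->vcpus[id]->vcpu_id == id)
        return vm->vcpus[id];

    /* Fallback: linear scan over every created VCPU. */
    for (int i = 0; i < vm->online; i++)
        if (vm->vcpus[i]->vcpu_id == id)
            return vm->vcpus[i];

    return NULL;
}
```

A dedicated vcpu_id mapping would replace both branches with a single
lookup that does not depend on the layout of the array.
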
> 
> s390 __floating_irq_kick() and x86 kvm_irq_delivery_to_apic() are
> keeping a bitmap for kvm->vcpus indices.  They want compact indices,
> which cannot be provided by vcpu_id mapping.
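
A small illustration of why compact indices matter for such bitmaps. This
is a generic sketch, not the s390 or x86 code; BITS_PER_WORD, WORDS_FOR(),
bitmap_set() and bitmap_test() are local helpers written for this example:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD    (sizeof(unsigned long) * CHAR_BIT)
/* Number of unsigned longs needed to hold nbits bits. */
#define WORDS_FOR(nbits) (((nbits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Mark one destination in the bitmap. */
static void bitmap_set(unsigned long *map, size_t bit)
{
    map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Check whether a destination is marked. */
static bool bitmap_test(const unsigned long *map, size_t bit)
{
    return (map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}
```

With four VCPUs whose vcpu_ids are 0, 64, 128 and 192, a bitmap keyed by
array index needs WORDS_FOR(4) words, while one keyed by vcpu_id needs
WORDS_FOR(193) words, most of which stay empty.
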
> 
> I think that MIPS and ARM use the index in kvm->vcpus for userspace
> communication, which looks dangerous as userspace shouldn't know the
> position.  Not much we can do because of that.

I think (at least for the ARM side) that we could switch whatever use we
have of the index to a vcpu_id. The worst offender (as far as I can
remember) is when injecting an interrupt, and that could be creatively
re-purposed to describe an affinity value in a backward compatible way.

Probably.

	N,
-- 
Jazz is not dead. It just smells funny...