On 17/08/2017 12:20, David Hildenbrand wrote:
> On 17.08.2017 12:18, Paolo Bonzini wrote:
>> On 17/08/2017 11:55, David Hildenbrand wrote:
>>> On 17.08.2017 11:44, Paolo Bonzini wrote:
>>>> On 17/08/2017 11:28, Cornelia Huck wrote:
>>>>> On Thu, 17 Aug 2017 11:16:59 +0200
>>>>> Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>>>>
>>>>>> On 17/08/2017 09:36, Cornelia Huck wrote:
>>>>>>>> What if we just sent a "vcpu move" request to all vcpus with the new
>>>>>>>> pointer after it moved? That way the vcpu thread itself would be
>>>>>>>> responsible for the migration to the new memory region. Only if all
>>>>>>>> vcpus successfully moved, keep rolling (and allow foreign get_vcpu again).
>>>>>>>>
>>>>>>>> That way we should be basically lock-less and scale well. For additional
>>>>>>>> icing, feel free to increase the vcpu array x2 every time it grows to
>>>>>>>> not run into the slow path too often.
>>>>>>>
>>>>>>> I'd prefer the rcu approach: This is a mechanism already understood
>>>>>>> well, no need to come up with a new one that will likely have its own
>>>>>>> share of problems.
>>>>>>
>>>>>> What Alex is proposing _is_ RCU, except with a homegrown
>>>>>> synchronize_rcu. Using kvm->srcu seems to be the best of both worlds.
>>>>>
>>>>> I'm worried a bit about the 'homegrown' part, though.
>>>>
>>>> I agree, that's why I'm suggesting SRCU instead. But it's a trick that
>>>> has its uses. For example, if you were only doing reads from a work
>>>> queue, flush_work_queue could be used as the "homegrown
>>>> synchronize_rcu". In KVM you might use kvm_make_all_cpus_request, I guess.
>>>>
>>>>> I also may be misunderstanding what Alex means with "vcpu move"...
>>>>
>>>> My interpretation was "resizing the array" (so it moves in memory).
>>>
>>> Unpopular opinion: Let's keep it simple first (straight rcu) and
>>> optimize later on.
>>
>> RCU vs. SRCU is about correctness, not optimization...
>
> Guess I am still missing the point why RCU cannot be used here.

Because the body of kvm_foreach_vcpu might sleep.

Paolo
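
To make the sleeping problem concrete, here is a rough userspace sketch of the rule: a plain RCU read-side critical section must not block, while an SRCU one may. This is a toy model and not kernel code; every model_* name is hypothetical and only stands in for the real read-side primitives and for a loop body that might block.

```c
#include <assert.h>

/* Toy userspace model, NOT kernel code. All model_* names are hypothetical. */

enum model_flavor { MODEL_RCU, MODEL_SRCU };

static int model_rcu_depth;   /* > 0 while inside a plain-RCU read section */
static int model_sleep_bugs;  /* sleeps attempted where sleeping is illegal */

static void model_read_lock(enum model_flavor f)
{
	if (f == MODEL_RCU)
		model_rcu_depth++;
	/* SRCU readers are allowed to block, so nothing is tracked for them. */
}

static void model_read_unlock(enum model_flavor f)
{
	if (f == MODEL_RCU) {
		assert(model_rcu_depth > 0);
		model_rcu_depth--;
	}
}

/* Stand-in for a loop body that may block (assumed, e.g. taking a mutex). */
static void model_body_that_may_sleep(void)
{
	if (model_rcu_depth > 0)
		model_sleep_bugs++;  /* blocking inside a plain-RCU reader */
}

/*
 * Walk nr_vcpus entries under the given flavor and return how many times
 * the body tried to sleep where the model forbids it.
 */
static int model_foreach_vcpu(enum model_flavor f, int nr_vcpus)
{
	int i;

	model_sleep_bugs = 0;
	model_read_lock(f);
	for (i = 0; i < nr_vcpus; i++)
		model_body_that_may_sleep();
	model_read_unlock(f);
	return model_sleep_bugs;
}
```

In this model, model_foreach_vcpu(MODEL_RCU, n) reports one violation per iteration, while model_foreach_vcpu(MODEL_SRCU, n) reports none. That is the correctness difference, as opposed to an optimization.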