Re: [PATCH RFC 04/15] KVM: Implement ring-based dirty memory tracking

Paolo Bonzini <pbonzini@xxxxxxxxxx> · Tue, 3 Dec 2019 14:48:10 +0100

On 02/12/19 22:50, Sean Christopherson wrote:
>>
>> I discussed this with Paolo, but I think Paolo preferred the per-vm
>> ring because there's no good reason to choose vcpu0 as what (1)
>> suggested.  While if to choose (2) we probably need to lock even for
>> per-cpu ring, so could be a bit slower.
> Ya, per-vm is definitely better than dumping on vcpu0.  I'm hoping we can
> find a third option that provides comparable performance without using any
> per-vcpu rings.
> 

The advantage of per-vCPU rings is that it naturally: 1) parallelizes
the processing of dirty pages; 2) makes userspace vCPU thread do more
work on vCPUs that dirty more pages.

I agree that on the producer side we could reserve multiple entries in
the case of PML (and without PML only one entry should be added at a
time).  But I'm afraid that things get ugly when the ring is full,
because you'd have to wait for all vCPUs to finish publishing the
entries they have reserved.

It's ugly that we _also_ need a per-VM ring, but unfortunately some
operations do not really have a vCPU that they can refer to.

Paolo