Re: [PATCH 09/10] Exit loop if we have been there too long

On 11/30/2010 08:12 AM, Paolo Bonzini wrote:
On 11/30/2010 02:47 PM, Anthony Liguori wrote:
On 11/30/2010 01:15 AM, Paolo Bonzini wrote:
On 11/30/2010 03:11 AM, Anthony Liguori wrote:

BufferedFile should hit the qemu_file_rate_limit check when the socket
buffer gets filled up.

The problem is that the file rate limit is not hit because work is
done elsewhere. The rate can limit the bandwidth used and makes QEMU
aware that socket operations may block (because that's what the
buffered file freeze/unfreeze logic does); but it cannot be used to
limit the _time_ spent in the migration code.

Yes, it can, if you set the rate limit sufficiently low.

You mean, just like you can drive a car without brakes by keeping the speed sufficiently low.

[..] accounting zero pages as full sized
pages should "fix" the problem.

I know you used quotes, but it's a very very generous definition of fix. Both these proposed "fixes" are nothing more than workarounds, and even particularly ugly ones. The worst thing about them is that there is no guarantee of migration finishing in a reasonable time, or at all.

If you account zero pages as full, you don't effectively use the bandwidth that was allotted to you; you use only 0.2% of it (8/4096). It then takes an exaggerated amount of time to start iterating on pages that matter. If you set the bandwidth low, instead, you do not have the bandwidth you need in order to converge.
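
To make the 8/4096 figure concrete, here is a rough arithmetic sketch in C. The 8-byte wire cost and the 4096-byte page size come from the paragraph above; the helper names are made up for illustration, and the 1 GiB of zero pages and 32 MiB/s limit in the usage note are hypothetical numbers, not measurements.

```c
#include <stdint.h>

/* Fraction of the accounted bandwidth that actually reaches the wire
 * when a page costing wire_bytes is charged as accounted_bytes. */
double wire_fraction(uint64_t wire_bytes, uint64_t accounted_bytes)
{
    return (double)wire_bytes / (double)accounted_bytes;
}

/* Seconds the rate limiter allows for walking `pages` pages when each
 * page is charged `charged_bytes` against `limit_bytes_per_s`.
 * This models only the limiter, not the CPU cost of the scan itself. */
double scan_seconds(uint64_t pages, uint64_t charged_bytes,
                    uint64_t limit_bytes_per_s)
{
    return (double)pages * (double)charged_bytes / (double)limit_bytes_per_s;
}
```

With these helpers, wire_fraction(8, 4096) is 0.001953125, i.e. about 0.2%. For a hypothetical 1 GiB of zero pages (262144 pages) under a 32 MiB/s limit, scan_seconds() gives 32 s when each page is charged as 4096 bytes and 0.0625 s when it is charged as 8 bytes. The second figure only says the limiter would not throttle the scan, which is why the time spent has to be bounded some other way.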

Even from an aesthetic point of view, if there is such a thing, I don't understand why you advocate conflating network bandwidth and CPU usage into a single measurement. Nobody disagrees that all you propose is nice to have, and that what Juan sent is a stopgap measure (though a very effective one). However, this doesn't negate that Juan's accounting patches make a lot of sense in the current design.

Juan's patch, IIUC, does the following: If you've been iterating in a tight loop, return to the main loop for *one* iteration every 50ms.
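
For readers following along, the "leave the loop after a fixed time" idea can be sketched roughly as below. This is not the code from Juan's patch: run_bounded(), step_fn and count_down() are hypothetical names, and the sketch assumes a POSIX system with clock_gettime(). The caller would pass 50ms as the budget and return to the main loop when the function comes back.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* One unit of work; returns false once nothing is left to do. */
typedef bool (*step_fn)(void *opaque);

static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Run steps until the work is done or budget_ns has elapsed.
 * Returns the number of steps completed. */
size_t run_bounded(step_fn step, void *opaque, int64_t budget_ns)
{
    int64_t start = now_ns();
    size_t done = 0;

    while (step(opaque)) {
        done++;
        if (now_ns() - start > budget_ns) {
            break;  /* been here too long: hand control back to the caller */
        }
    }
    return done;
}

/* Hypothetical stand-in for "send one page": counts an int down to zero. */
bool count_down(void *opaque)
{
    int *left = opaque;

    if (*left == 0) {
        return false;
    }
    (*left)--;
    return true;
}
```

Note that the check bounds only how long one visit to the loop lasts. It says nothing about how soon the loop is entered again, which is the point the following paragraphs argue about.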

But this means that during this 50ms period of time, a VCPU may be blocked from running. If the guest isn't doing a lot of device I/O *and* you're on a relatively low link speed, then this will mean that you don't hold qemu_mutex for more than 50ms at a time.

But in the degenerate case where you have a high speed link and you have a guest doing a lot of device I/O, you'll see the guest VCPU being blocked for 50ms, then getting to run for a very brief period of time, followed by another block for 50ms. The guest's execution will be extremely sporadic.

This isn't fixable with this approach. The only way to really fix this is to say that over a given period of time, migration may only consume XX amount of CPU time, which guarantees that the VCPUs get the qemu_mutex for the rest of the time.

This is exactly what rate limiting does. Yes, it results in a longer migration time but that's the trade-off we have to make if we want deterministic VCPU execution until we can implement threading properly.
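
The "XX amount of CPU time per period" guarantee can be pictured as a simple quota per accounting window. The sketch below is only an illustration of that idea, with hypothetical names (struct cpu_budget, budget_may_run(), budget_charge()); it is not how QEMU's rate limiting is implemented, which accounts bytes rather than time. Timestamps are passed in by the caller so the logic stays independent of any particular clock.

```c
#include <stdbool.h>
#include <stdint.h>

struct cpu_budget {
    int64_t period_ns;        /* length of one accounting window */
    int64_t quota_ns;         /* time migration may consume per window */
    int64_t window_start_ns;  /* start of the current window */
    int64_t used_ns;          /* time consumed in the current window */
};

/* Start a fresh window once the current one has fully elapsed. */
static void budget_roll(struct cpu_budget *b, int64_t now_ns)
{
    int64_t elapsed = now_ns - b->window_start_ns;

    if (elapsed >= b->period_ns) {
        b->window_start_ns += (elapsed / b->period_ns) * b->period_ns;
        b->used_ns = 0;
    }
}

/* May migration do more work right now? */
bool budget_may_run(struct cpu_budget *b, int64_t now_ns)
{
    budget_roll(b, now_ns);
    return b->used_ns < b->quota_ns;
}

/* Record that migration just spent spent_ns working. */
void budget_charge(struct cpu_budget *b, int64_t now_ns, int64_t spent_ns)
{
    budget_roll(b, now_ns);
    b->used_ns += spent_ns;
}
```

With a 100ms period and a 10ms quota, for example, migration that has already been charged 10ms is refused for the rest of the window and allowed again when the next window starts, so the VCPUs are left roughly 90% of every window.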

If you want a simple example, do I/O with the rtl8139 adapter while doing your migration test and run a tight loop in the guest running gettimeofday(). Graph the results to see how much execution time the guest is actually getting.
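
A guest-side probe along those lines might look like the sketch below. It is a minimal illustration, not a tool from this thread: max_gap_us() is a hypothetical helper that spins on gettimeofday() for a given duration and reports the largest jump between two consecutive readings, which approximates the longest stretch during which the guest did not get to run.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

static int64_t now_us(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (int64_t)tv.tv_sec * 1000000 + tv.tv_usec;
}

/* Spin on gettimeofday() for run_us microseconds and return the
 * largest gap, in microseconds, between two consecutive readings. */
int64_t max_gap_us(int64_t run_us)
{
    int64_t start = now_us();
    int64_t prev = start;
    int64_t worst = 0;

    for (;;) {
        int64_t t = now_us();

        if (t - prev > worst) {
            worst = t - prev;
        }
        prev = t;
        if (t - start >= run_us) {
            break;
        }
    }
    return worst;
}
```

Calling max_gap_us(1000000) once per second from a small main() inside the guest and printing the result gives one data point per second to graph. Gaps approaching 50ms during migration would match the stalls described above.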


In the long term, we need a new dirty bit interface from kvm.ko that
uses a multi-level table. That should dramatically improve scan
performance. We also need to implement live migration in a separate
thread that doesn't carry qemu_mutex while it runs.

This may be a good way to fix it, but it's also basically a rewrite.

The only correct short-term solution I can see is rate limiting, unfortunately.

Regards,

Anthony Liguori

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
