On Tue, Sep 7, 2010 at 3:51 PM, Anthony Liguori
<aliguori@xxxxxxxxxxxxxxxxxx> wrote:
> On 09/07/2010 09:33 AM, Stefan Hajnoczi wrote:
>> On Tue, Sep 7, 2010 at 2:41 PM, Anthony Liguori
>> <aliguori@xxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> The interface for copy-on-read is just an option within qemu-img
>>> create. Streaming, on the other hand, requires a bit more thought.
>>> Today, I have a monitor command that does the following:
>>>
>>> stream <device> <sector offset>
>>>
>>> which will try to stream the minimal amount of data for a single I/O
>>> operation and then return how many sectors were successfully streamed.
>>>
>>> The idea about how to drive this interface is a loop like:
>>>
>>> offset = 0
>>> while offset < image_size:
>>>     wait_for_idle_time()
>>>     count = stream(device, offset)
>>>     offset += count
>>>
>>> Obviously, "wait_for_idle_time()" requires wide system awareness. The
>>> thing I'm not sure about is 1) would libvirt want to expose a similar
>>> stream interface and let management software determine idle time, or
>>> 2) should it attempt to detect idle time on its own and provide a
>>> higher-level interface. If (2), the question then becomes whether we
>>> should try to do this within qemu and provide libvirt a higher-level
>>> interface.
>>
>> A self-tuning solution is attractive because it reduces the need for
>> other components (the management stack) or the user to get involved.
>> In this case self-tuning should be possible. We need to detect
>> periods of I/O inactivity, for example by tracking the number of
>> in-flight requests and setting a grace timer when it reaches zero.
>> When the grace timer expires, we start streaming until the guest
>> initiates I/O again.
>
> That detects idle I/O within a single QEMU guest, but you might have
> another guest running that is I/O bound, which means that from an
> overall system throughput perspective you really don't want to stream.
>
> I think libvirt might be able to do a better job here by looking at
> overall system I/O usage. But I'm not sure, hence this RFC :-)

Isn't this what the block I/O controller cgroup is meant to solve? If
you give vm-1 50% of the block bandwidth and vm-2 the other 50%, then
vm-1 can stream without eating into vm-2's guaranteed bandwidth.

Also, I'm not sure we should worry too much about the priority of the
I/O: perhaps the user wants their VM to stream more than they want an
unimportant, currently I/O-bound local VM to have all the resources to
itself.

So I think it makes sense to defer this and not try for system-wide
knowledge inside a QEMU process.

Stefan

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
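
To make the self-tuning idea above concrete, here is a minimal sketch
of the grace-timer loop Stefan describes. It is an illustration only,
not QEMU code: StreamingDriver, stream_fn, GRACE_SECONDS and the
guest_io_started()/guest_io_finished() hooks are all hypothetical
names, and a real implementation would hang off QEMU's event loop
rather than polling in a thread.

    import threading
    import time

    class StreamingDriver:
        """Stream an image in the background, but only while the
        guest's own I/O has been quiet for a grace period."""

        GRACE_SECONDS = 1.0  # idle time required before we stream
        POLL_SECONDS = 0.1   # back-off while the guest is busy

        def __init__(self, stream_fn, image_size):
            self.stream_fn = stream_fn    # stream_fn(offset) -> sectors done
            self.image_size = image_size  # total sectors to stream
            self.in_flight = 0            # outstanding guest requests
            self.idle_since = time.monotonic()
            self.lock = threading.Lock()

        # Called from the I/O path as guest requests start and finish.
        def guest_io_started(self):
            with self.lock:
                self.in_flight += 1

        def guest_io_finished(self):
            with self.lock:
                self.in_flight -= 1
                if self.in_flight == 0:
                    self.idle_since = time.monotonic()  # arm grace timer

        def run(self):
            offset = 0
            while offset < self.image_size:
                with self.lock:
                    quiet = (self.in_flight == 0 and
                             time.monotonic() - self.idle_since
                             >= self.GRACE_SECONDS)
                if quiet:
                    # Stream one chunk, then re-check idleness before
                    # the next, so guest I/O preempts us quickly.
                    offset += self.stream_fn(offset)
                else:
                    time.sleep(self.POLL_SECONDS)

As soon as guest_io_started() runs, quiet turns false and streaming
pauses until the disk has again been idle for the full grace period.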
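
On the cgroups point, the blkio controller's proportional weights are
enough to give each guest a guaranteed share whether or not one of them
is streaming. A rough sketch, assuming a cgroup-v1 blkio mount at
/cgroup/blkio and the CFQ I/O scheduler (paths, group names and PIDs
are illustrative):

    import os

    CGROOT = "/cgroup/blkio"  # assumed blkio controller mount point

    def give_share(group, pid, weight=500):
        """Put one QEMU process in its own blkio cgroup with the given
        proportional weight (valid range 100-1000 under CFQ)."""
        path = os.path.join(CGROOT, group)
        os.makedirs(path, exist_ok=True)  # making the dir creates the cgroup
        with open(os.path.join(path, "blkio.weight"), "w") as f:
            f.write(str(weight))
        with open(os.path.join(path, "tasks"), "w") as f:
            f.write(str(pid))  # move the QEMU process into the cgroup

    # Equal 50/50 split between two guests, e.g.:
    # give_share("vm-1", qemu_pid_1); give_share("vm-2", qemu_pid_2)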