On 03/30/2010 01:51 AM, Badari Pulavarty wrote:
Your I/O wait time is twice as long and your throughput is about half.
I think the qemu block submission does an extra attempt at merging
requests. Does blktrace tell you anything interesting?
Yes. I see that in my test case (2M writes) QEMU is picking up 512K
requests from the virtio ring and merging them back into 2M requests
before submitting them.
Unfortunately, I can't do that quite so easily in vhost-blk. QEMU
re-creates the iovecs for the merged IO. I have to come up with
a scheme to do this :(
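A rough sketch of the kind of iovec coalescing that would be needed
(hypothetical structures and names only, not actual vhost-blk or QEMU code):

#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

#define MAX_IOV 128

/* Hypothetical per-request descriptor built from the vring. */
struct pending_req {
	uint64_t sector;		/* start sector (512-byte units) */
	uint32_t nr_sectors;		/* length in sectors */
	struct iovec iov[MAX_IOV];	/* data segments from the guest */
	int iovcnt;
};

/*
 * Append 'next' to 'prev' if the two requests are contiguous on disk
 * and the combined iovec array still fits.  Returns true if merged.
 */
static bool try_merge(struct pending_req *prev, struct pending_req *next)
{
	if (prev->sector + prev->nr_sectors != next->sector)
		return false;
	if (prev->iovcnt + next->iovcnt > MAX_IOV)
		return false;

	memcpy(&prev->iov[prev->iovcnt], next->iov,
	       next->iovcnt * sizeof(struct iovec));
	prev->iovcnt += next->iovcnt;
	prev->nr_sectors += next->nr_sectors;
	return true;
}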
I don't think that either vhost-blk or virtio-blk should do this.
Merging increases latency, and in the case of streaming writes makes it
impossible for the guest to prepare new requests while earlier ones are
being serviced (in effect it reduces the queue depth to 1).
qcow2 does benefit from merging, but it should do so itself without
impacting raw.
It does. I suggest using fio O_DIRECT random access patterns to avoid
such issues.
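For instance, something along these lines (the device path and sizes
are only examples, adjust for your setup):

    fio --name=randwrite --filename=/dev/vdb --rw=randwrite --bs=4k \
        --direct=1 --ioengine=libaio --iodepth=32 --runtime=60 --time_based

With O_DIRECT random 4K accesses there is nothing for the submission
path to merge, so the comparison measures the virtio/vhost paths
themselves rather than the merging behaviour.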
Well, I am not trying to come up with a test case where vhost-blk
performs better than virtio-blk. I am trying to understand where
and why vhost-blk performs worse than virtio-blk.
In this case qemu-virtio is making an incorrect tradeoff. The guest
could easily merge those requests itself. If you want larger writes,
tune the guest to issue them.
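For instance (assuming the guest disk is vda, and subject to the
device's max_hw_sectors_kb limit), raising the guest's request size
cap to 2M would let it issue the 2M writes directly:

    echo 2048 > /sys/block/vda/queue/max_sectors_kb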
Another way to look at it: merging improved bandwidth but increased
latency, yet you are only measuring bandwidth. If you measured only
latency you'd find that vhost-blk is better.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.