On 03/23/2010 03:00 AM, Badari Pulavarty wrote:
Forgot to CC: KVM list earlier
Subject: [RFC] vhost-blk implementation
From: Badari Pulavarty <pbadari@xxxxxxxxxx>
Date: Mon, 22 Mar 2010 17:34:06 -0700
To: virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx, qemu-devel@xxxxxxxxxx
Hi,
Inspired by the vhost-net implementation, I did an initial prototype
of vhost-blk to see if it provides any benefit over QEMU virtio-blk.
I haven't handled all the error cases, fixed naming conventions etc.,
but the implementation is stable enough to play with. I tried not to
deviate from the vhost-net implementation where possible.
NOTE: The only change I had to make to the vhost core code is to
increase VHOST_NET_MAX_SG to 130 (128+2) in vhost.h
Performance:
=============
I have done simple tests to see how it performs. I got very
encouraging results on sequential read tests. But on sequential
write tests, I see a degradation relative to virtio-blk that I can't
explain. Can someone shed light on what's happening here?
Read Results:
=============
The test reads an 84GB file from the host (through virtio). I unmount
and mount the filesystem on the host to make sure there is nothing
in the page cache.
+#define VHOST_BLK_VQ_MAX 1
+
+struct vhost_blk {
+	struct vhost_dev dev;
+	struct vhost_virtqueue vqs[VHOST_BLK_VQ_MAX];
+	struct vhost_poll poll[VHOST_BLK_VQ_MAX];
+};
+
+static int do_handle_io(struct file *file, uint32_t type, uint64_t sector,
+			struct iovec *iov, int in)
+{
+	loff_t pos = sector << 9;	/* virtio-blk sectors are 512 bytes */
+	int ret = 0;
+
+	if (type & VIRTIO_BLK_T_FLUSH) {
+		ret = vfs_fsync(file, file->f_path.dentry, 1);
+	} else if (type & VIRTIO_BLK_T_OUT) {
+		ret = vfs_writev(file, iov, in, &pos);
+	} else {
+		ret = vfs_readv(file, iov, in, &pos);
+	}
+	return ret;
+}
This should be done asynchronously. That is likely the cause of the
write performance degradation. For reads, readahead means that you're
async anyway, but writes/syncs are still synchronous.
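To illustrate what "asynchronously" means structurally, here is a minimal userspace sketch, not kernel code and not taken from the patch. It assumes a fixed-size ring of pending requests and uses an in-memory buffer in place of the backing file; blk_req, blk_queue, submit_write and run_worker are all hypothetical names. The point is only that the submission path returns before any data is written, and completion is reported later by a separate worker step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DISK_BYTES  4096   /* size of the stand-in backing store */
#define QUEUE_DEPTH 8      /* maximum number of requests in flight */

/* Hypothetical request record; these names are not from the patch. */
struct blk_req {
    const void *buf;   /* data to write */
    size_t len;        /* number of bytes */
    size_t pos;        /* byte offset into the backing store */
    int status;        /* valid once done != 0: 0 on success, -1 on error */
    int done;          /* set by the worker when the request completes */
};

struct blk_queue {
    uint8_t disk[DISK_BYTES];            /* stands in for the backing file */
    struct blk_req *ring[QUEUE_DEPTH];   /* pending requests, FIFO order */
    size_t head;                         /* next request to service */
    size_t tail;                         /* next free slot */
};

/* Submission path: queue the request and return without touching the disk. */
static int submit_write(struct blk_queue *q, struct blk_req *req)
{
    assert(q != NULL && req != NULL);
    if (q->tail - q->head == QUEUE_DEPTH)
        return -1;                       /* ring full; caller retries later */
    req->done = 0;
    q->ring[q->tail % QUEUE_DEPTH] = req;
    q->tail++;
    return 0;
}

/* Worker path: service everything queued so far and mark each request done.
 * A real design would run this in a separate context from the submitter;
 * here it is called explicitly so the sketch stays single-threaded. */
static size_t run_worker(struct blk_queue *q)
{
    size_t completed = 0;

    while (q->head != q->tail) {
        struct blk_req *req = q->ring[q->head % QUEUE_DEPTH];

        q->head++;
        if (req->pos > DISK_BYTES || req->len > DISK_BYTES - req->pos) {
            req->status = -1;            /* write falls outside the store */
        } else {
            memcpy(&q->disk[req->pos], req->buf, req->len);
            req->status = 0;
        }
        req->done = 1;
        completed++;
    }
    return completed;
}
```

With this shape the caller can queue several writes back to back and only later check each request's done and status fields, instead of blocking on every write in turn.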
I also think it should be done at the bio layer. File I/O is going to
be slower; if we do vhost-blk, we should concentrate on maximum
performance. The block layer also exposes more functionality we can use
(asynchronous barriers, for example).
btw, for fairness, cpu measurements should be done from the host side
and include the vhost thread.
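One way to collect such host-side numbers, sketched under assumptions: the vhost worker and the QEMU threads are visible as tasks under /proc on the host, and their stat lines follow the layout documented in proc(5), where fields 14 and 15 are utime and stime in clock ticks. parse_cpu_ticks is a hypothetical helper, not something from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract utime and stime (fields 14 and 15, in clock
 * ticks) from one line of /proc/<pid>/stat. Returns 0 on success, -1 if the
 * line cannot be parsed. */
static int parse_cpu_ticks(const char *stat_line,
                           unsigned long *utime, unsigned long *stime)
{
    const char *p;

    assert(stat_line != NULL && utime != NULL && stime != NULL);

    /* The command name (field 2) is parenthesised and may contain spaces,
     * so start parsing after the last ')'. */
    p = strrchr(stat_line, ')');
    if (p == NULL)
        return -1;

    /* Skip field 3 (state), fields 4-8 (signed) and fields 9-13 (unsigned),
     * then read fields 14 and 15. */
    if (sscanf(p + 1,
               " %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
               utime, stime) != 2)
        return -1;
    return 0;
}
```

Sampling the same task before and after a run and subtracting gives the ticks it consumed; dividing by sysconf(_SC_CLK_TCK) converts ticks to seconds. Summing over the QEMU threads and the vhost thread gives a figure that can be compared between the two setups.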
--
error compiling committee.c: too many arguments to function