All blk requests are processed in notify_vq(), which runs in the context of the ioeventfd thread, ioeventfd__thread(). The processing in notify_vq() may take a long time to complete, and all devices share the single ioeventfd thread, so a slow device can block other devices' notify_vq() from being called and starve them. This patch makes virtio-blk's notify_vq() simply notify a dedicated blk thread instead of doing the actual read/write work itself. Tests show that the overhead of this notification operation is small.

The reasons for using a dedicated thread instead of the thread pool are:

1) In the thread pool model, each job-handling operation, thread_pool__do_job(), takes about 6 or 7 mutex_{lock,unlock} operations. Most of these mutexes are global (job_mutex) and are contended by the threads in the pool. That is fine for the non-performance-critical virtio devices, such as console, rng, etc., but it is not optimal for the net and blk devices.

2) Using dedicated threads to handle blk requests opens the door for users to set a different I/O priority for the blk threads.

3) It also reduces contention between the net and blk devices, since they no longer share the thread pool.

Signed-off-by: Asias He <asias.hejun@xxxxxxxxx>
---
 tools/kvm/virtio/blk.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index da92094..e0dc37d 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -49,6 +49,11 @@ struct blk_dev {
 
 	struct virt_queue		vqs[NUM_VIRT_QUEUES];
 	struct blk_dev_req		reqs[VIRTIO_BLK_QUEUE_SIZE];
+
+	pthread_t			io_thread;
+	int				io_efd;
+
+	struct kvm			*kvm;
 };
 
 static LIST_HEAD(bdevs);
@@ -174,11 +179,26 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 pfn)
 	return 0;
 }
 
+static void *virtio_blk_thread(void *dev)
+{
+	struct blk_dev *bdev = dev;
+	u64 data;
+
+	while (1) {
+		read(bdev->io_efd, &data, sizeof(u64));
+		virtio_blk_do_io(bdev->kvm, &bdev->vqs[0], bdev);
+	}
+
+	pthread_exit(NULL);
+	return NULL;
+}
+
 static int notify_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct blk_dev *bdev = dev;
+	u64 data = 1;
 
-	virtio_blk_do_io(kvm, &bdev->vqs[vq], bdev);
+	write(bdev->io_efd, &data, sizeof(data));
 
 	return 0;
 }
@@ -233,6 +253,8 @@ static int virtio_blk__init_one(struct kvm *kvm, struct disk_image *disk)
 			.capacity	= disk->size / SECTOR_SIZE,
 			.seg_max	= DISK_SEG_MAX,
 		},
+		.io_efd			= eventfd(0, 0),
+		.kvm			= kvm,
 	};
 
 	virtio_init(kvm, bdev, &bdev->vdev, &blk_dev_virtio_ops,
@@ -247,6 +269,8 @@ static int virtio_blk__init_one(struct kvm *kvm, struct disk_image *disk)
 
 	disk_image__set_callback(bdev->disk, virtio_blk_complete);
 
+	pthread_create(&bdev->io_thread, NULL, virtio_blk_thread, bdev);
+
 	if (compat_id != -1)
 		compat_id = compat__add_message("virtio-blk device was not detected",
 						"While you have requested a virtio-blk device, "
-- 
1.7.10.2
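
As an illustration of point 2) above, here is a minimal sketch (not part of this patch) of how the dedicated blk thread could set its own I/O priority through the Linux ioprio_set() syscall. The helper name and the chosen best-effort level are arbitrary examples, and the constants are copied from the kernel's ioprio ABI because glibc provides neither a wrapper nor a header for them:

#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values mirror the kernel's ioprio ABI. */
#define IOPRIO_CLASS_SHIFT		13
#define IOPRIO_CLASS_BE			2
#define IOPRIO_WHO_PROCESS		1
#define IOPRIO_PRIO_VALUE(class, data)	(((class) << IOPRIO_CLASS_SHIFT) | (data))

/* Hypothetical helper: would be called from virtio_blk_thread() before its loop. */
static void blk_thread_set_ioprio(int level)
{
	/* who == 0 with IOPRIO_WHO_PROCESS means "the calling thread" */
	if (syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
		    IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, level)) < 0)
		perror("ioprio_set");
}

The best-effort level (0-7, 0 being highest) could then be taken from a per-disk command line option, so each blk thread gets the priority the user asked for.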