Hi,

I stumbled across poor virtio-blk performance while working on a
high-performance network storage protocol. Moving virtio-blk's host
side into the kernel did increase single-queue IOPS, but a multiqueue
disk still did not scale well. It turned out that vhost handles events
from all virtio queues in one helper thread, which is a major
serialization point.

The following patch handles events in a per-queue thread, which
increases I/O concurrency. IOPS numbers:

# num-queues  # bare metal  # virtio-blk  # vhost-blk

   1             171k          148k          195k
   2             328k          249k          349k
   3             479k          179k          501k
   4             622k          143k          620k
   5             755k          136k          737k
   6             887k          131k          830k
   7            1004k          126k          926k
   8            1099k          117k         1001k
   9            1194k          115k         1055k
  10            1278k          109k         1130k
  11            1345k          110k         1119k
  12            1411k          104k         1201k
  13            1466k          106k         1260k
  14            1517k          103k         1296k
  15            1552k          102k         1322k
  16            1480k          101k         1346k

Vitaly Mayatskikh (1):
  vhost: add per-vq worker thread

 drivers/vhost/vhost.c | 123 +++++++++++++++++++++++++++++++----------
 drivers/vhost/vhost.h |  11 +++-
 2 files changed, 100 insertions(+), 34 deletions(-)

-- 
2.17.1