Re: [PATCH 00/10] vhost/qemu: thread per IO SCSI vq

Jason Wang <jasowang@xxxxxxxxxx> · Wed, 18 Nov 2020 13:17:50 +0800

On 2020/11/18 上午12:40, Stefan Hajnoczi wrote:
On Thu, Nov 12, 2020 at 05:18:59PM -0600, Mike Christie wrote:
The following kernel patches were made over Michael's vhost branch:

https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/log/?h=vhost

and the vhost-scsi bug fix patchset:

https://lore.kernel.org/linux-scsi/20201112170008.GB1555653@stefanha-x1.localdomain/T/#t

And the qemu patch was made over the qemu master branch.

vhost-scsi currently supports multiple queues with the num_queues
setting, but we end up with a setup where the guest's scsi/block
layer can do a queue per vCPU and the layers below vhost can do
a queue per CPU. vhost-scsi will then do a num_queue virtqueues,
but all IO gets set on and completed on a single vhost-scsi thread.
After 2 - 4 vqs this becomes a bottleneck.

This patchset allows us to create a worker thread per IO vq, so we
can better utilize multiple CPUs with the multiple queues. It
implments Jason's suggestion to create the initial worker like
normal, then create the extra workers for IO vqs with the
VHOST_SET_VRING_ENABLE ioctl command added in this patchset.
How does userspace find out the tids and set their CPU affinity?

What is the meaning of the new VHOST_SET_VRING_ENABLE ioctl? It doesn't
really "enable" or "disable" the vq, requests are processed regardless.

Actually I think it should do the real "enable/disable" that tries to 
follow the virtio spec.

(E.g both PCI and MMIO have something similar).

The purpose of the ioctl isn't clear to me because the kernel could
automatically create 1 thread per vq without a new ioctl.

It's not necessarily to create or destroy kthread according to 
VRING_ENABLE but could be a hint.

  On the other
hand, if userspace is supposed to control worker threads then a
different interface would be more powerful:

   struct vhost_vq_worker_info {
       /*
        * The pid of an existing vhost worker that this vq will be
        * assigned to. When pid is 0 the virtqueue is assigned to the
        * default vhost worker. When pid is -1 a new worker thread is
        * created for this virtqueue. When pid is -2 the virtqueue's
        * worker thread is unchanged.
        *
        * If a vhost worker no longer has any virtqueues assigned to it
        * then it will terminate.
        *
        * The pid of the vhost worker is stored to this field when the
        * ioctl completes successfully. Use pid -2 to query the current
        * vhost worker pid.
        */
       __kernel_pid_t pid;  /* in/out */

       /* The virtqueue index*/
       unsigned int vq_idx; /* in */
   };

   ioctl(vhost_fd, VHOST_SET_VQ_WORKER, &info);

This seems to leave the question to userspace which I'm not sure it's 
good since it tries to introduce another scheduling layer.

Per vq worker seems be good enough to start with.

Thanks

Stefan