Re: [PATCH] vhost: Add polling mode

Razya Ladelsky <RAZYA@xxxxxxxxxx> · Tue, 29 Jul 2014 13:30:03 +0300

"Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote on 29/07/2014 11:06:40 AM:

> From: "Michael S. Tsirkin" <mst@xxxxxxxxxx>
> To: Razya Ladelsky/Haifa/IBM@IBMIL, 
> Cc: kvm@xxxxxxxxxxxxxxx, abel.gordon@xxxxxxxxx, Joel Nider/Haifa/
> IBM@IBMIL, Yossi Kuperman1/Haifa/IBM@IBMIL, Eran Raichstein/Haifa/
> IBM@IBMIL, Alex Glikson/Haifa/IBM@IBMIL
> Date: 29/07/2014 11:06 AM
> Subject: Re: [PATCH] vhost: Add polling mode
> 
> On Mon, Jul 21, 2014 at 04:23:44PM +0300, Razya Ladelsky wrote:
> > Hello All,
> > 
> > When vhost is waiting for buffers from the guest driver (e.g., more 
> > packets
> > to send in vhost-net's transmit queue), it normally goes to sleep and 
> > waits
> > for the guest to "kick" it. This kick involves a PIO in the guest, and
> > therefore an exit (and possibly userspace involvement in translating 
this 
> > PIO
> > exit into a file descriptor event), all of which hurts performance.
> > 
> > If the system is under-utilized (has cpu time to spare), vhost can 
> > continuously poll the virtqueues for new buffers, and avoid asking 
> > the guest to kick us.
> > This patch adds an optional polling mode to vhost, that can be enabled 

> > via a kernel module parameter, "poll_start_rate".
> > 
> > When polling is active for a virtqueue, the guest is asked to
> > disable notification (kicks), and the worker thread continuously 
checks 
> > for
> > new buffers. When it does discover new buffers, it simulates a "kick" 
by
> > invoking the underlying backend driver (such as vhost-net), which 
thinks 
> > it
> > got a real kick from the guest, and acts accordingly. If the 
underlying
> > driver asks not to be kicked, we disable polling on this virtqueue.
> > 
> > We start polling on a virtqueue when we notice it has
> > work to do. Polling on this virtqueue is later disabled after 3 
seconds of
> > polling turning up no new work, as in this case we are better off 
> > returning
> > to the exit-based notification mechanism. The default timeout of 3 
seconds
> > can be changed with the "poll_stop_idle" kernel module parameter.
> > 
> > This polling approach makes lot of sense for new HW with 
posted-interrupts
> > for which we have exitless host-to-guest notifications. But even with 
> > support 
> > for posted interrupts, guest-to-host communication still causes exits. 

> > Polling adds the missing part.
> > 
> > When systems are overloaded, there won?t be enough cpu time for the 
> > various 
> > vhost threads to poll their guests' devices. For these scenarios, we 
plan 
> > to add support for vhost threads that can be shared by multiple 
devices, 
> > even of multiple vms. 
> > Our ultimate goal is to implement the I/O acceleration features 
described 
> > in:
> > KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon) 
> > https://www.youtube.com/watch?v=9EyweibHfEs
> > and
> > https://www.mail-archive.com/kvm@xxxxxxxxxxxxxxx/msg98179.html
> > 
> > 
> > Comments are welcome, 
> > Thank you,
> > Razya
> > 
> > From: Razya Ladelsky <razya@xxxxxxxxxx>
> > 
> > Add an optional polling mode to continuously poll the virtqueues
> > for new buffers, and avoid asking the guest to kick us.
> > 
> > Signed-off-by: Razya Ladelsky <razya@xxxxxxxxxx>
> 
> This is an optimization patch, isn't it?
> Could you please include some numbers showing its
> effect?
> 
> 

Hi Michael,
Sure. I included them in a reply to Jason Wang in this thread,
Here it is:
http://www.spinics.net/linux/lists/kvm/msg106049.html

> > ---
> >  drivers/vhost/net.c   |    6 +-
> >  drivers/vhost/scsi.c  |    5 +-
> >  drivers/vhost/vhost.c |  247 
> > +++++++++++++++++++++++++++++++++++++++++++++++--
> >  drivers/vhost/vhost.h |   37 +++++++-
> >  4 files changed, 277 insertions(+), 18 deletions(-)
> 
> 
> Whitespace seems mangled to the point of making patch
> unreadable. Can you pls repost?
> 

Sure.

> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > index 971a760..558aecb 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -742,8 +742,10 @@ static int vhost_net_open(struct inode *inode, 
struct 
> > file *f)
> >         }
> >         vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
> > 
> > -       vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, 
POLLOUT, 
> > dev);
> > -       vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, 
POLLIN, 
> > dev);
> > +       vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, 
POLLOUT,
> > +                       vqs[VHOST_NET_VQ_TX]);
> > +       vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, 
POLLIN,
> > +                       vqs[VHOST_NET_VQ_RX]);
> > 
> >         f->private_data = n;
> > 
> > diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
> > index 4f4ffa4..56f0233 100644
> > --- a/drivers/vhost/scsi.c
> > +++ b/drivers/vhost/scsi.c
> > @@ -1528,9 +1528,8 @@ static int vhost_scsi_open(struct inode *inode, 
> > struct file *f)
> >         if (!vqs)
> >                 goto err_vqs;
> > 
> > -       vhost_work_init(&vs->vs_completion_work, 
> > vhost_scsi_complete_cmd_work);
> > -       vhost_work_init(&vs->vs_event_work, tcm_vhost_evt_work);
> > -
> > +       vhost_work_init(&vs->vs_completion_work, NULL, 
> > vhost_scsi_complete_cmd_work);
> > +       vhost_work_init(&vs->vs_event_work, NULL, tcm_vhost_evt_work);
> >         vs->vs_events_nr = 0;
> >         vs->vs_events_missed = false;
> > 
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index c90f437..678d766 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -24,9 +24,17 @@
> >  #include <linux/slab.h>
> >  #include <linux/kthread.h>
> >  #include <linux/cgroup.h>
> > +#include <linux/jiffies.h>
> >  #include <linux/module.h>
> > 
> >  #include "vhost.h"
> > +static int poll_start_rate = 0;
> > +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
> > +MODULE_PARM_DESC(poll_start_rate, "Start continuous polling of 
virtqueue 
> > when rate of events is at least this number per jiffy. If 0, never 
start 
> > polling.");
> > +
> > +static int poll_stop_idle = 3*HZ; /* 3 seconds */
> > +module_param(poll_stop_idle, int, S_IRUGO|S_IWUSR);
> > +MODULE_PARM_DESC(poll_stop_idle, "Stop continuous polling of 
virtqueue 
> > after this many jiffies of no work.");
> > 
> >  enum {
> >         VHOST_MEMORY_MAX_NREGIONS = 64,
> > @@ -58,27 +66,27 @@ static int vhost_poll_wakeup(wait_queue_t *wait, 
> > unsigned mode, int sync,
> >         return 0;
> >  }
> > 
> > -void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn)
> > +void vhost_work_init(struct vhost_work *work, struct vhost_virtqueue 
*vq, 
> > vhost_work_fn_t fn)
> >  {
> >         INIT_LIST_HEAD(&work->node);
> >         work->fn = fn;
> >         init_waitqueue_head(&work->done);
> >         work->flushing = 0;
> >         work->queue_seq = work->done_seq = 0;
> > +       work->vq = vq;
> >  }
> >  EXPORT_SYMBOL_GPL(vhost_work_init);
> > 
> >  /* Init poll structure */
> >  void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
> > -                    unsigned long mask, struct vhost_dev *dev)
> > +                    unsigned long mask, struct vhost_virtqueue *vq)
> >  {
> >         init_waitqueue_func_entry(&poll->wait, vhost_poll_wakeup);
> >         init_poll_funcptr(&poll->table, vhost_poll_func);
> >         poll->mask = mask;
> > -       poll->dev = dev;
> > +       poll->dev = vq->dev;
> >         poll->wqh = NULL;
> > -
> > -       vhost_work_init(&poll->work, fn);
> > +       vhost_work_init(&poll->work, vq, fn);
> >  }
> >  EXPORT_SYMBOL_GPL(vhost_poll_init);
> > 
> > @@ -174,6 +182,86 @@ void vhost_poll_queue(struct vhost_poll *poll)
> >  }
> >  EXPORT_SYMBOL_GPL(vhost_poll_queue);
> > 
> > +/* Enable or disable virtqueue polling (vqpoll.enabled) for a 
virtqueue.
> > + *
> > + * Enabling this mode it tells the guest not to notify ("kick") us 
when 
> > its
> > + * has made more work available on this virtqueue; Rather, we will 
> > continuously
> > + * poll this virtqueue in the worker thread. If multiple virtqueues 
are 
> > polled,
> > + * the worker thread polls them all, e.g., in a round-robin fashion.
> > + * Note that vqpoll.enabled doesn't always mean that this virtqueue 
is
> > + * actually being polled: The backend (e.g., net.c) may temporarily 
> > disable it
> > + * using vhost_disable/enable_notify(), while vqpoll.enabled is 
> > unchanged.
> > + *
> > + * It is assumed that these functions are called relatively rarely, 
when 
> > vhost
> > + * notices that this virtqueue's usage pattern significantly changed 
in a 
> > way
> > + * that makes polling more efficient than notification, or vice 
versa.
> > + * Also, we assume that vhost_vq_disable_vqpoll() is always called on 
vq
> > + * cleanup, so any allocations done by vhost_vq_enable_vqpoll() can 
be
> > + * reclaimed.
> > + */
> > +static void vhost_vq_enable_vqpoll(struct vhost_virtqueue *vq)
> > +{
> > +       if (vq->vqpoll.enabled)
> > +               return; /* already enabled, nothing to do */
> > +       if (!vq->handle_kick)
> > +               return; /* polling will be a waste of time if no 
callback! 
> > */
> > +       if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY)) {
> > +               /* vq has guest notifications enabled. Disable them,
> > +                  and instead add vq to the polling list */
> > +               vhost_disable_notify(vq->dev, vq);
> > +               list_add_tail(&vq->vqpoll.link, 
&vq->dev->vqpoll_list);
> > +       }
> > +       vq->vqpoll.jiffies_last_kick = jiffies;
> > +       __get_user(vq->avail_idx, &vq->avail->idx); 
> > +       vq->vqpoll.enabled = true;
> > +
> > +       /* Map userspace's vq->avail to the kernel's memory space. */
> > +       if (get_user_pages_fast((unsigned long)vq->avail, 1, 0,
> > +               &vq->vqpoll.avail_page) != 1) {
> > +               /* TODO: can this happen, as we check access
> > +               to vq->avail in advance? */
> > +               BUG();
> > +       }
> > +       vq->vqpoll.avail_mapped = (struct vring_avail *) (
> > +               (unsigned long)kmap(vq->vqpoll.avail_page) |
> > +               ((unsigned long)vq->avail & ~PAGE_MASK));
> > +}
> > +
> > +/*
> > + * This function doesn't always succeed in changing the mode. 
Sometimes
> > + * a temporary race condition prevents turning on guest 
notifications, so
> > + * vq should be polled next time again.
> > + */
> > +static void vhost_vq_disable_vqpoll(struct vhost_virtqueue *vq)
> > +{
> > +       if (!vq->vqpoll.enabled) {
> > +               return; /* already disabled, nothing to do */
> > +       }
> > +       vq->vqpoll.enabled = false;
> > +
> > +       if (!list_empty(&vq->vqpoll.link)) {
> > +               /* vq is on the polling list, remove it from this list 
and
> > +                * instead enable guest notifications. */
> > +               list_del_init(&vq->vqpoll.link);
> > +               if (unlikely(vhost_enable_notify(vq->dev,vq))
> > +                       && !vq->vqpoll.shutdown) {
> > +                       /* Race condition: guest wrote before we 
enabled
> > +                        * notification, so we'll never get a 
notification 
> > for
> > +                        * this work - so continue polling mode for a 
> > while. */
> > +                       vhost_disable_notify(vq->dev, vq);
> > +                       vq->vqpoll.enabled = true;
> > +                       vhost_enable_notify(vq->dev, vq);
> > +                       return;
> > +               }
> > +       }
> > +
> > +       if (vq->vqpoll.avail_mapped) {
> > +               kunmap(vq->vqpoll.avail_page);
> > +               put_page(vq->vqpoll.avail_page);
> > +               vq->vqpoll.avail_mapped = 0;
> > +       }
> > +}
> > +
> >  static void vhost_vq_reset(struct vhost_dev *dev,
> >                            struct vhost_virtqueue *vq)
> >  {
> > @@ -199,6 +287,48 @@ static void vhost_vq_reset(struct vhost_dev *dev,
> >         vq->call = NULL;
> >         vq->log_ctx = NULL;
> >         vq->memory = NULL;
> > +       INIT_LIST_HEAD(&vq->vqpoll.link);
> > +       vq->vqpoll.enabled = false;
> > +       vq->vqpoll.shutdown = false;
> > +       vq->vqpoll.avail_mapped = NULL;
> > +}
> > +
> > +/* roundrobin_poll() takes worker->vqpoll_list, and returns one of 
the
> > + * virtqueues which the caller should kick, or NULL in case none 
should 
> > be
> > + * kicked. roundrobin_poll() also disables polling on a virtqueue 
which 
> > has
> > + * been polled for too long without success.
> > + *
> > + * This current implementation (the "round-robin" implementation) 
only
> > + * polls the first vq in the list, returning it or NULL as 
appropriate, 
> > and
> > + * moves this vq to the end of the list, so next time a different one 
is
> > + * polled.
> > + */
> > +static struct vhost_virtqueue *roundrobin_poll(struct list_head 
*list) {
> > +       struct vhost_virtqueue *vq;
> > +       u16 avail_idx;
> > +
> > +
> > +       if (list_empty(list))
> > +               return NULL;
> > +
> > +       vq = list_first_entry(list, struct vhost_virtqueue, 
vqpoll.link);
> > +       WARN_ON(!vq->vqpoll.enabled);
> > +       list_move_tail(&vq->vqpoll.link, list);
> > +
> > +       /* See if there is any new work available from the guest. */
> > +       /* TODO: can check the optional idx feature, and if we haven't
> > +        * reached that idx yet, don't kick... */
> > +       avail_idx = vq->vqpoll.avail_mapped->idx;
> > +       if (avail_idx != vq->last_avail_idx) {
> > +               return vq;
> > +       }
> > +       if (jiffies > vq->vqpoll.jiffies_last_kick + poll_stop_idle) {
> > +               /* We've been polling this virtqueue for a long time 
with 
> > no
> > +                * results, so switch back to guest notification
> > +                */
> > +               vhost_vq_disable_vqpoll(vq);
> > +       }
> > +       return NULL;
> >  }
> > 
> >  static int vhost_worker(void *data)
> > @@ -237,12 +367,66 @@ static int vhost_worker(void *data)
> >                 spin_unlock_irq(&dev->work_lock);
> > 
> >                 if (work) {
> > +                       struct vhost_virtqueue *vq = work->vq;
> >                         __set_current_state(TASK_RUNNING);
> >                         work->fn(work);
> > +                       /* Keep track of the work rate, for deciding 
when 
> > to
> > +                        * enable polling */
> > +                       if (vq) {
> > +                               if (vq->vqpoll.jiffies_last_work != 
> > jiffies) {
> > +                                       vq->vqpoll.jiffies_last_work = 

> > jiffies;
> > +                                       vq->vqpoll.work_this_jiffy = 
0;
> > +                               }
> > +                               vq->vqpoll.work_this_jiffy++;
> > +                       }
> > +                       /* If vq is in the round-robin list of 
virtqueues 
> > being
> > +                        * constantly checked by this thread, move vq 
the 
> > end
> > +                        * of the queue, because it had its fair 
chance 
> > now.
> > +                        */
> > +                       if (vq && !list_empty(&vq->vqpoll.link)) {
> > +                               list_move_tail(&vq->vqpoll.link,
> > +                                       &dev->vqpoll_list);
> > +                       }
> > +                       /* Otherwise, if this vq is looking for 
> > notifications
> > +                        * but vq polling is not enabled for it, do it 

> > now.
> > +                        */
> > +                       else if (poll_start_rate && vq && 
vq->handle_kick 
> > &&
> > +                               !vq->vqpoll.enabled &&
> > +                               !vq->vqpoll.shutdown &&
> > +                               !(vq->used_flags & 
VRING_USED_F_NO_NOTIFY) 
> > &&
> > +                               vq->vqpoll.work_this_jiffy >=
> > +                                       poll_start_rate) {
> > +                               vhost_vq_enable_vqpoll(vq);
> > +                       }
> > +               }
> > +               /* Check one virtqueue from the round-robin list */
> > +               if (!list_empty(&dev->vqpoll_list)) {
> > +                       struct vhost_virtqueue *vq;
> > +
> > +                       vq = roundrobin_poll(&dev->vqpoll_list);
> > +
> > +                       if (vq) {
> > +                               vq->handle_kick(&vq->poll.work);
> > +                               vq->vqpoll.jiffies_last_kick=jiffies;
> > +                       }
> > +
> > +                       /* If our polling list isn't empty, ask to 
> > continue
> > +                        * running this thread, don't yield.
> > +                        */
> > +                       __set_current_state(TASK_RUNNING);
> >                         if (need_resched())
> > +                        schedule();
> > +
> > +               }
> > +               else {
> > +                       if (work)
> > +                       {
> > +                               if (need_resched())
> > +                                       schedule();
> > +                       }
> > +                       else
> >                                 schedule();
> > -               } else
> > -                       schedule();
> > +               }
> > 
> >         }
> >         unuse_mm(dev->mm);
> > @@ -306,6 +490,7 @@ void vhost_dev_init(struct vhost_dev *dev,
> >         dev->mm = NULL;
> >         spin_lock_init(&dev->work_lock);
> >         INIT_LIST_HEAD(&dev->work_list);
> > +   INIT_LIST_HEAD(&dev->vqpoll_list);
> >         dev->worker = NULL;
> > 
> >         for (i = 0; i < dev->nvqs; ++i) {
> > @@ -318,7 +503,7 @@ void vhost_dev_init(struct vhost_dev *dev,
> >                 vhost_vq_reset(dev, vq);
> >                 if (vq->handle_kick)
> >                         vhost_poll_init(&vq->poll, vq->handle_kick,
> > -                                       POLLIN, dev);
> > +                                       POLLIN, vq);
> >         }
> >  }
> >  EXPORT_SYMBOL_GPL(vhost_dev_init);
> > @@ -350,7 +535,7 @@ static int vhost_attach_cgroups(struct vhost_dev 
*dev)
> >         struct vhost_attach_cgroups_struct attach;
> > 
> >         attach.owner = current;
> > -       vhost_work_init(&attach.work, vhost_attach_cgroups_work);
> > +       vhost_work_init(&attach.work, NULL, 
vhost_attach_cgroups_work);
> >         vhost_work_queue(dev, &attach.work);
> >         vhost_work_flush(dev, &attach.work);
> >         return attach.ret;
> > @@ -444,6 +629,25 @@ void vhost_dev_stop(struct vhost_dev *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(vhost_dev_stop);
> > 
> > +/* shutdown_vqpoll() asks the worker thread to shut down virtqueue 
> > polling
> > + * mode for a given virtqueue which is itself being shut down. We ask 
the
> > + * worker thread to do this rather than doing it directly, so that we 

> > don't
> > + * race with the worker thread's use of the queue.
> > + */
> > +static void shutdown_vqpoll_work(struct vhost_work *work)
> > +{
> > +       work->vq->vqpoll.shutdown = true;
> > +       vhost_vq_disable_vqpoll(work->vq);
> > +       WARN_ON(work->vq->vqpoll.avail_mapped);
> > +}
> > +
> > +static void shutdown_vqpoll(struct vhost_virtqueue *vq)
> > +{
> > +       struct vhost_work work;
> > +       vhost_work_init(&work, vq, shutdown_vqpoll_work);
> > +       vhost_work_queue(vq->dev, &work);
> > +       vhost_work_flush(vq->dev, &work);
> > +}
> >  /* Caller should have device mutex if and only if locked is set */
> >  void vhost_dev_cleanup(struct vhost_dev *dev, bool locked)
> >  {
> > @@ -460,6 +664,7 @@ void vhost_dev_cleanup(struct vhost_dev *dev, bool 

> > locked)
> >                         eventfd_ctx_put(dev->vqs[i]->call_ctx);
> >                 if (dev->vqs[i]->call)
> >                         fput(dev->vqs[i]->call);
> > +               shutdown_vqpoll(dev->vqs[i]);
> >                 vhost_vq_reset(dev, dev->vqs[i]);
> >         }
> >         vhost_dev_free_iovecs(dev);
> > @@ -1491,6 +1696,19 @@ bool vhost_enable_notify(struct vhost_dev *dev, 

> > struct vhost_virtqueue *vq)
> >         u16 avail_idx;
> >         int r;
> > 
> > +       /* In polling mode, when the backend (e.g., net.c) asks to 
enable
> > +        * notifications, we don't enable guest notifications. 
Instead, 
> > start
> > +        * polling on this vq by adding it to the round-robin list.
> > +        */
> > +       if (vq->vqpoll.enabled) {
> > +               if (list_empty(&vq->vqpoll.link)) {
> > +                       list_add_tail(&vq->vqpoll.link,
> > +                               &vq->dev->vqpoll_list);
> > +                       vq->vqpoll.jiffies_last_kick = jiffies;
> > +               }
> > +               return false;
> > +       }
> > +
> >         if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
> >                 return false;
> >         vq->used_flags &= ~VRING_USED_F_NO_NOTIFY;
> > @@ -1528,6 +1746,17 @@ void vhost_disable_notify(struct vhost_dev 
*dev, 
> > struct vhost_virtqueue *vq)
> >  {
> >         int r;
> > 
> > +       /* If this virtqueue is vqpoll.enabled, and on the polling 
list, 
> > it
> > +        * will generate notifications even if the guest is asked not 
to 
> > send
> > +        * them. So we must remove it from the round-robin polling 
list.
> > +        * Note that vqpoll.enabled remains set.
> > +        */
> > +       if (vq->vqpoll.enabled) {
> > +               if(!list_empty(&vq->vqpoll.link))
> > +                       list_del_init(&vq->vqpoll.link);
> > +               return;
> > +       }
> > +
> >         if (vq->used_flags & VRING_USED_F_NO_NOTIFY)
> >                 return;
> >         vq->used_flags |= VRING_USED_F_NO_NOTIFY;
> > diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> > index 3eda654..feb16d6 100644
> > --- a/drivers/vhost/vhost.h
> > +++ b/drivers/vhost/vhost.h
> > @@ -24,6 +24,7 @@ struct vhost_work {
> >         int                       flushing;
> >         unsigned                  queue_seq;
> >         unsigned                  done_seq;
> > +       struct vhost_virtqueue    *vq;
> >  };
> > 
> >  /* Poll a file (eventfd or socket) */
> > @@ -37,11 +38,11 @@ struct vhost_poll {
> >         struct vhost_dev         *dev;
> >  };
> > 
> > -void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn);
> > +void vhost_work_init(struct vhost_work *work, struct vhost_virtqueue 
*vq, 
> > vhost_work_fn_t fn);
> >  void vhost_work_queue(struct vhost_dev *dev, struct vhost_work 
*work);
> > 
> >  void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
> > -                    unsigned long mask, struct vhost_dev *dev);
> > +                    unsigned long mask, struct vhost_virtqueue  *vq);
> >  int vhost_poll_start(struct vhost_poll *poll, struct file *file);
> >  void vhost_poll_stop(struct vhost_poll *poll);
> >  void vhost_poll_flush(struct vhost_poll *poll);
> > @@ -54,8 +55,6 @@ struct vhost_log {
> >         u64 len;
> >  };
> > 
> > -struct vhost_virtqueue;
> > -
> >  /* The virtqueue structure describes a queue attached to a device. */
> >  struct vhost_virtqueue {
> >         struct vhost_dev *dev;
> > @@ -110,6 +109,35 @@ struct vhost_virtqueue {
> >         /* Log write descriptors */
> >         void __user *log_base;
> >         struct vhost_log *log;
> > +   struct {
> > +      /* When a virtqueue is in vqpoll.enabled mode, it declares
> > +       * that instead of using guest notifications (kicks) to
> > +       * discover new work, we prefer to continuously poll this
> > +       * virtqueue in the worker thread.
> > +       * If !enabled, the rest of the fields below are undefined.
> > +       */
> > +      bool enabled;
> > +      /* vqpoll.enabled doesn't always mean that this virtqueue is
> > +       * actually being polled: The backend (e.g., net.c) may
> > +       * temporarily disable it using vhost_disable/enable_notify().
> > +       * vqpoll.link is used to maintain the thread's round-robin
> > +       * list of virtqueues that actually need to be polled.
> > +       * Note list_empty(link) means this virtqueue isn't polled.
> > +       */
> > +      struct list_head link;
> > +      /* If this flag is true, the virtqueue is being shut down,
> > +       * so vqpoll should not be re-enabled.
> > +       */
> > +      bool shutdown;
> > +      /* Various counters used to decide when to enter polling mode
> > +       * or leave it and return to notification mode.
> > +       */
> > +      unsigned long jiffies_last_kick;
> > +      unsigned long jiffies_last_work;
> > +      int work_this_jiffy;
> > +      struct page *avail_page;
> > +      volatile struct vring_avail *avail_mapped;
> > +   } vqpoll;
> >  };
> > 
> >  struct vhost_dev {
> > @@ -123,6 +151,7 @@ struct vhost_dev {
> >         spinlock_t work_lock;
> >         struct list_head work_list;
> >         struct task_struct *worker;
> > +        struct list_head vqpoll_list;
> >  };
> > 
> >  void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, 
int 
> > nvqs);
> > -- 
> > 1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html