On Sun, 2013-07-28 at 10:15 +0300, Michael S. Tsirkin wrote:
> On Sun, Jul 28, 2013 at 04:40:50AM +0100, Ben Hutchings wrote:
> > On Sat, 2013-07-27 at 22:51 +0200, Wolfram Gloger wrote:
> > > Ben Hutchings <ben@xxxxxxxxxxxxxxx> writes:
> > >
> > > > This sounds like it could be suitable for stable, but that doesn't seem
> > > > to have been requested by the author.  I'm cc'ing those involved so they
> > > > can make a decision whether this should be included in 3.2.y or other
> > > > stable branches.
> > >
> > > What alarmed me - as an extensive virtio user - was that Michael stated:
> > >
> > >   The race has been there from day 1, but it got especially nasty in
> > >   3.0 when commit a5c262c5fd83ece01bd649fb08416c501d4c59d7
> > >   "virtio_ring: support event idx feature" added more dependency on
> > >   vq state.
> > >
> > > > Thanks, but these are not in quite the right patch format.
> > >
> > > Oh sorry, I didn't think you might take this directly as produced by me.
> > > Hope these are better?
> >
> > David and Michael, please consider these backports for Linux 3.2.  It
> > looks like these should also work for 3.0, while for 3.4
> > virtqueue_enable_cb() already has a kernel-doc comment and the
> > virtio_ring patch would need to be adjusted for that.
> >
> > Ben.
>
> The backports look good to me.

OK, I've queued these up for 3.2.

Greg, please queue up the attached patches for 3.0 and 3.4:

3.0: virtio-race-1of2.diff, virtio-race-2of2.diff
3.4: virtio-race-1of2.diff, virtio-race-2of2-3.4.diff

Ben.

-- 
Ben Hutchings
All extremists should be taken out and shot.
From: Michael S. Tsirkin <mst@xxxxxxxxxx>
Subject: virtio_net: fix race in RX VQ processing

commit cbdadbbf0c790f79350a8f36029208944c5487d0 upstream

virtio net called virtqueue_enable_cb on the RX path after napi_complete,
so with NAPI_STATE_SCHED clear - outside the implicit napi lock.
This violates the requirement to synchronize virtqueue_enable_cb wrt
virtqueue_add_buf.  In particular, the used event can move backwards,
causing us to lose interrupts.
In a debug build, this can trigger a panic within START_USE.

Jason Wang reports that he can trigger the races artificially,
by adding udelay() in virtqueue_enable_cb() after virtio_mb().

However, we must call napi_complete to clear NAPI_STATE_SCHED before
polling the virtqueue for used buffers, otherwise napi_schedule_prep in
a callback will fail, causing us to lose RX events.

To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
set (under the napi lock), and later call virtqueue_poll with
NAPI_STATE_SCHED clear (outside the lock).

Reported-by: Jason Wang <jasowang@xxxxxxxxxx>
Tested-by: Jason Wang <jasowang@xxxxxxxxxx>
Acked-by: Jason Wang <jasowang@xxxxxxxxxx>
Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
[wg: Backported to 3.2]
Signed-off-by: Wolfram Gloger <wmglo@xxxxxxxxxxxxxxxxxxxxxxxx>
---
diff -upr linux-3.2.49/drivers/net/virtio_net.c linux-3.2.49wg/drivers/net/virtio_net.c
--- linux-3.2.49/drivers/net/virtio_net.c	2012-01-05 00:55:44.000000000 +0100
+++ linux-3.2.49wg/drivers/net/virtio_net.c	2013-07-27 13:57:33.000000000 +0200
@@ -508,7 +508,7 @@ static int virtnet_poll(struct napi_stru
 {
 	struct virtnet_info *vi = container_of(napi, struct virtnet_info, napi);
 	void *buf;
-	unsigned int len, received = 0;
+	unsigned int r, len, received = 0;
 
 again:
 	while (received < budget &&
@@ -525,8 +525,9 @@ again:
 
 	/* Out of packets? */
 	if (received < budget) {
+		r = virtqueue_enable_cb_prepare(vi->rvq);
 		napi_complete(napi);
-		if (unlikely(!virtqueue_enable_cb(vi->rvq)) &&
+		if (unlikely(virtqueue_poll(vi->rvq, r)) &&
 		    napi_schedule_prep(napi)) {
 			virtqueue_disable_cb(vi->rvq);
 			__napi_schedule(napi);
From: Michael S. Tsirkin <mst@xxxxxxxxxx>
Subject: virtio: support unlocked queue poll

commit cc229884d3f77ec3b1240e467e0236c3e0647c0c upstream.

This adds a way to check ring empty state after enable_cb outside any
locks. Will be used by virtio_net.

Note: there's room for more optimization: caller is likely to have a
memory barrier already, which means we might be able to get rid of a
barrier here.  Deferring this optimization until we do some
benchmarking.

Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
[wg: Backported to 3.2]
Signed-off-by: Wolfram Gloger <wmglo@xxxxxxxxxxxxxxxxxxxxxxxx>
---
diff -upr linux-3.2.49/drivers/virtio/virtio_ring.c linux-3.2.49wg/drivers/virtio/virtio_ring.c
--- linux-3.2.49/drivers/virtio/virtio_ring.c	2013-07-27 18:13:40.000000000 +0200
+++ linux-3.2.49wg/drivers/virtio/virtio_ring.c	2013-07-27 13:57:28.000000000 +0200
@@ -360,9 +360,22 @@ void virtqueue_disable_cb(struct virtque
 }
 EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
 
-bool virtqueue_enable_cb(struct virtqueue *_vq)
+/**
+ * virtqueue_enable_cb_prepare - restart callbacks after disable_cb
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * This re-enables callbacks; it returns current queue state
+ * in an opaque unsigned value. This value should be later tested by
+ * virtqueue_poll, to detect a possible race between the driver checking for
+ * more work, and enabling callbacks.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
+unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
+	u16 last_used_idx;
 
 	START_USE(vq);
 
@@ -372,15 +385,45 @@ bool virtqueue_enable_cb(struct virtqueu
 	 * either clear the flags bit or point the event index at the next
 	 * entry. Always do both to keep code simple. */
 	vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
-	vring_used_event(&vq->vring) = vq->last_used_idx;
+	vring_used_event(&vq->vring) = last_used_idx = vq->last_used_idx;
+	END_USE(vq);
+	return last_used_idx;
+}
+EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
+
+/**
+ * virtqueue_poll - query pending used buffers
+ * @vq: the struct virtqueue we're talking about.
+ * @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).
+ *
+ * Returns "true" if there are pending used buffers in the queue.
+ *
+ * This does not need to be serialized.
+ */
+bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+
 	virtio_mb();
-	if (unlikely(more_used(vq))) {
-		END_USE(vq);
-		return false;
-	}
+	return (u16)last_used_idx != vq->vring.used->idx;
+}
+EXPORT_SYMBOL_GPL(virtqueue_poll);
 
-	END_USE(vq);
-	return true;
+/**
+ * virtqueue_enable_cb - restart callbacks after disable_cb.
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * This re-enables callbacks; it returns "false" if there are pending
+ * buffers in the queue, to detect a possible race between the driver
+ * checking for more work, and enabling callbacks.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
+bool virtqueue_enable_cb(struct virtqueue *_vq)
+{
+	unsigned last_used_idx = virtqueue_enable_cb_prepare(_vq);
+	return !virtqueue_poll(_vq, last_used_idx);
 }
 EXPORT_SYMBOL_GPL(virtqueue_enable_cb);
diff -upr linux-3.2.49/include/linux/virtio.h linux-3.2.49wg/include/linux/virtio.h
--- linux-3.2.49/include/linux/virtio.h	2012-01-05 00:55:44.000000000 +0100
+++ linux-3.2.49wg/include/linux/virtio.h	2013-07-27 13:57:28.000000000 +0200
@@ -96,6 +96,10 @@ void virtqueue_disable_cb(struct virtque
 
 bool virtqueue_enable_cb(struct virtqueue *vq);
 
+unsigned virtqueue_enable_cb_prepare(struct virtqueue *vq);
+
+bool virtqueue_poll(struct virtqueue *vq, unsigned);
+
 bool virtqueue_enable_cb_delayed(struct virtqueue *vq);
 
 void *virtqueue_detach_unused_buf(struct virtqueue *vq);
From: Michael S. Tsirkin <mst@xxxxxxxxxx>
Date: Tue, 9 Jul 2013 13:19:18 +0300
Subject: virtio: support unlocked queue poll

commit cc229884d3f77ec3b1240e467e0236c3e0647c0c upstream.

This adds a way to check ring empty state after enable_cb outside any
locks. Will be used by virtio_net.

Note: there's room for more optimization: caller is likely to have a
memory barrier already, which means we might be able to get rid of a
barrier here.  Deferring this optimization until we do some
benchmarking.

Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
[wg: Backported to 3.2]
Signed-off-by: Wolfram Gloger <wmglo@xxxxxxxxxxxxxxxxxxxxxxxx>
[bwh: Backported to 3.4: adjust context]
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
diff -upr linux-3.2.49/drivers/virtio/virtio_ring.c linux-3.2.49wg/drivers/virtio/virtio_ring.c
--- linux-3.2.49/drivers/virtio/virtio_ring.c	2013-07-27 18:13:40.000000000 +0200
+++ linux-3.2.49wg/drivers/virtio/virtio_ring.c	2013-07-27 13:57:28.000000000 +0200
@@ -498,16 +498,18 @@ void virtqueue_disable_cb(struct virtque
  * virtqueue_enable_cb - restart callbacks after disable_cb.
  * @vq: the struct virtqueue we're talking about.
  *
- * This re-enables callbacks; it returns "false" if there are pending
- * buffers in the queue, to detect a possible race between the driver
- * checking for more work, and enabling callbacks.
+ * This re-enables callbacks; it returns current queue state
+ * in an opaque unsigned value. This value should be later tested by
+ * virtqueue_poll, to detect a possible race between the driver checking for
+ * more work, and enabling callbacks.
  *
  * Caller must ensure we don't call this with other virtqueue
  * operations at the same time (except where noted).
  */
-bool virtqueue_enable_cb(struct virtqueue *_vq)
+unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
+	u16 last_used_idx;
 
 	START_USE(vq);
 
@@ -517,15 +519,45 @@ bool virtqueue_enable_cb(struct virtqueu
 	 * either clear the flags bit or point the event index at the next
 	 * entry. Always do both to keep code simple. */
 	vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
-	vring_used_event(&vq->vring) = vq->last_used_idx;
+	vring_used_event(&vq->vring) = last_used_idx = vq->last_used_idx;
+	END_USE(vq);
+	return last_used_idx;
+}
+EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
+
+/**
+ * virtqueue_poll - query pending used buffers
+ * @vq: the struct virtqueue we're talking about.
+ * @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).
+ *
+ * Returns "true" if there are pending used buffers in the queue.
+ *
+ * This does not need to be serialized.
+ */
+bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+
 	virtio_mb(vq);
-	if (unlikely(more_used(vq))) {
-		END_USE(vq);
-		return false;
-	}
+	return (u16)last_used_idx != vq->vring.used->idx;
+}
+EXPORT_SYMBOL_GPL(virtqueue_poll);
 
-	END_USE(vq);
-	return true;
+/**
+ * virtqueue_enable_cb - restart callbacks after disable_cb.
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * This re-enables callbacks; it returns "false" if there are pending
+ * buffers in the queue, to detect a possible race between the driver
+ * checking for more work, and enabling callbacks.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
+bool virtqueue_enable_cb(struct virtqueue *_vq)
+{
+	unsigned last_used_idx = virtqueue_enable_cb_prepare(_vq);
+	return !virtqueue_poll(_vq, last_used_idx);
 }
 EXPORT_SYMBOL_GPL(virtqueue_enable_cb);
diff -upr linux-3.2.49/include/linux/virtio.h linux-3.2.49wg/include/linux/virtio.h
--- linux-3.2.49/include/linux/virtio.h	2012-01-05 00:55:44.000000000 +0100
+++ linux-3.2.49wg/include/linux/virtio.h	2013-07-27 13:57:28.000000000 +0200
@@ -44,6 +44,10 @@ void virtqueue_disable_cb(struct virtque
 
 bool virtqueue_enable_cb(struct virtqueue *vq);
 
+unsigned virtqueue_enable_cb_prepare(struct virtqueue *vq);
+
+bool virtqueue_poll(struct virtqueue *vq, unsigned);
+
 bool virtqueue_enable_cb_delayed(struct virtqueue *vq);
 
 void *virtqueue_detach_unused_buf(struct virtqueue *vq);