Re: [PATCH] virtio-net: fix data corruption with OOM

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 26, 2009 at 12:11:51PM +1030, Rusty Russell wrote:
> On Mon, 26 Oct 2009 03:33:40 am Michael S. Tsirkin wrote:
> > virtio net used to unlink skbs from send queues on error,
> > but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
> > we do not do this. This causes guest data corruption and crashes
> > with vhost since net core can requeue the skb or free it without
> > it being taken off the list.
> > 
> > This patch fixes this by queueing the skb after successfull
> > transmit.
> 
> I originally thought that this was racy: as soon as we do add_buf, we need to
> make sure we're ready for the callback (for virtio_pci, it's ->kick, but we
> shouldn't rely on that).

BTW, wanted to note that unlink on error would *also* be racy if we did any
processing in the callback.

> So a comment would be nice.  How's this?
> 
> Subject: virtio-net: fix data corruption with OOM
> Date: Sun, 25 Oct 2009 19:03:40 +0200
> From: "Michael S. Tsirkin" <mst@xxxxxxxxxx>
> 
> virtio net used to unlink skbs from send queues on error,
> but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
> we do not do this. This causes guest data corruption and crashes
> with vhost since net core can requeue the skb or free it without
> it being taken off the list.
> 
> This patch fixes this by queueing the skb after successful
> transmit.
> 
> Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> Signed-off-by: Rusty Russell <rusty@xxxxxxxxxxxxxxx> (+ comment)
> ---
> 
> Rusty, here's a fix for another data corrupter I saw.
> This fixes a regression from 2.6.31, so definitely
> 2.6.32 I think. Comments?
> 
>  drivers/net/virtio_net.c |    8 +++++---
>  1 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -516,8 +516,7 @@ again:
>  	/* Free up any pending old buffers before queueing new ones. */
>  	free_old_xmit_skbs(vi);
>  
> -	/* Put new one in send queue and do transmit */
> -	__skb_queue_head(&vi->send, skb);
> +	/* Try to transmit */
>  	capacity = xmit_skb(vi, skb);
>  
>  	/* This can happen with OOM and indirect buffers. */
> @@ -531,8 +530,17 @@ again:
>  		}
>  		return NETDEV_TX_BUSY;
>  	}
> +	vi->svq->vq_ops->kick(vi->svq);
>  
> -	vi->svq->vq_ops->kick(vi->svq);
> +	/*
> +	 * Put new one in send queue.  You'd expect we'd need this before
> +	 * xmit_skb calls add_buf(), since the callback can be triggered
> +	 * immediately after that.  But since the callback just triggers
> +	 * another call back here, normal network xmit locking prevents the
> +	 * race.
> +	 */
> +	__skb_queue_head(&vi->send, skb);
> +
>  	/* Don't wait up for transmitted skbs to be freed. */
>  	skb_orphan(skb);
>  	nf_reset(skb);
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux