On 05/03/2012 03:56 PM, Michael S. Tsirkin wrote:
On Thu, May 03, 2012 at 03:30:47PM +0800, Asias He wrote:
If we reset the virtio-blk device before the requests already dispatched
to the virtio-blk driver from the block layer are finished, we will get
stuck in blk_cleanup_queue() and the remove will fail.
blk_cleanup_queue() calls blk_drain_queue() to drain all requests queued
before the DEAD marking. However, it will never succeed if the device is
already stopped: we will have q->in_flight[] > 0, so the drain will
never finish.
How to reproduce the race:
1. hot-plug a virtio-blk device
2. keep reading/writing the device in guest
3. hot-unplug while the device is busy serving I/O
Changes in v2:
- Drop req_in_flight
- Use virtqueue_detach_unused_buf to get request dispatched to driver
Signed-off-by: Asias He <asias@xxxxxxxxxx>
---
drivers/block/virtio_blk.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 72fe55d..670c28f 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -443,7 +443,7 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
if (err)
goto out_free_vblk;
- vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req));
+ vblk->pool = mempool_create_kmalloc_pool(1, sizeof(struct virtblk_req));
if (!vblk->pool) {
err = -ENOMEM;
goto out_free_vq;
Would be a bit easier to review if whitespace changes
are avoided, and done in a separate patch targeting 3.5.
Well, I will cook another patch for this.
@@ -576,20 +576,30 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
{
struct virtio_blk *vblk = vdev->priv;
int index = vblk->index;
+ struct virtblk_req *vbr;
/* Prevent config work handler from accessing the device. */
mutex_lock(&vblk->config_lock);
vblk->config_enable = false;
mutex_unlock(&vblk->config_lock);
+ /* Abort all request on the queue. */
All requests
Also, the comment isn't really helpful. Want to explain why we abort
them all here? Won't they be drained by the later detach code?
blk_cleanup_queue() is trying to drain the queue by calling request_fn,
which is do_virtblk_request() in this case. As we have already stopped the
device before calling blk_cleanup_queue(), the drain might never finish if
do_req() fails.
blk_cleanup_queue
  blk_drain_queue
    while (true)
      __blk_run_queue
        q->request_fn(q);
+ blk_abort_queue(vblk->disk->queue);
And now I realize that using blk_abort_queue() here to abort the queue
is not right. It is meant for timeout handling (block/blk-timeout.c).
[ CC'ing Jens and Chris ]
I suspect the btrfs code is using this in the wrong way too:
btrfs_abort_devices()
{
	list_for_each_entry_rcu(dev, head, dev_list) {
		blk_abort_queue(dev->bdev->bd_disk->queue);
	}
}
+ del_gendisk(vblk->disk);
+
/* Stop all the virtqueues. */
vdev->config->reset(vdev);
-
flush_work(&vblk->config_work);
- del_gendisk(vblk->disk);
Is there a reason you move del_gendisk to before reset?
Is it safe to del_gendisk while we might
still be getting callbacks from the device?
The original idea was to make the block layer stop sending requests to
the driver as soon as possible. This is wrong, since
virtblk_config_changed_work might still access vblk->disk.
+ /* Abort request dispatched to driver. */
+ while ((vbr = virtqueue_detach_unused_buf(vblk->vq))) {
+ blk_abort_request(vbr->req);
+ mempool_free(vbr, vblk->pool);
+ }
+
blk_cleanup_queue(vblk->disk->queue);
put_disk(vblk->disk);
+
mempool_destroy(vblk->pool);
vdev->config->del_vqs(vdev);
kfree(vblk);
--
1.7.10
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Asias