On 12/20/2012 05:41 PM, Alexander Graf wrote: > > > On 20.12.2012, at 09:46, Asias He <asias@xxxxxxxxxx> wrote: > >> On 12/20/2012 12:15 PM, Rusty Russell wrote: >>> Alexander Graf <agraf@xxxxxxx> writes: >>> >>>> When a file system is mounted on a virtio-blk disk, we then remove it >>>> and then reattach it, the reattached disk gets the same disk name and >>>> ids as the hot removed one. >>>> >>>> This leads to very nasty effects - mostly rendering the newly attached >>>> device completely unusable. >>>> >>>> Trying what happens when I do the same thing with a USB device, I saw >>>> that the sd node simply doesn't get free'd when a device gets forcefully >>>> removed. >>>> >>>> Imitate the same behavior for vd devices. This way broken vd devices >>>> simply are never free'd and newly attached ones keep working just fine. >>>> >>>> Signed-off-by: Alexander Graf <agraf@xxxxxxx> >>> >>> I think deserves a CC:stable, no? > > I wanted to make sure people can comment on it first. But yes, it should go into stable when accepted. > >>> >>> I've put it in my pending queue for *next* merge window for now... >> >> Thanks for looking into this, Alexander! >> >> I also noticed this problem. The problem is that, if you hot-unplug a >> mounted or opened disk, the disk is in opened state. Next time, when you >> hotplug the same disk. The kernel thought it was opened already. The >> driver will use the wrong gendisk data structure in bdev. > > Exactly :) > >> >> blkdev_open >> blkdev_get >> __blkdev_get >> if (!bdev->bd_openers) { <-- Here, bd_disk not got updated >> still points to old one >> bdev->bd_disk = disk; >> bdev->bd_queue = disk->queue; >> ... >> >> I tried something like this: >> >> @@ -854,6 +862,19 @@ static int __devinit virtblk_probe(struct >> virtio_device *vdev) >> blk_queue_io_opt(q, blk_size * opt_io_size); >> >> add_disk(vblk->disk); >> + >> + for (i = 0; i < 1 << PART_BITS; i++) { >> + bdev = bdget_disk(vblk->disk, i); >> + if (bdev) { >> + bdev->bd_disk = vblk->disk; >> + bdev->bd_queue = q; >> + bdput(bdev); > > Does this stack? What if the device was mounted twice? Or mounted and dd'ed at the same time? Then your refcount here is wrong. This is in the probe path. And there is no extra refcount here. bdev->bd_openers stays what it was before the remove. > Also this means that you're free'ing a bdev even though there's still a reference held to it. That means whoever holds it could later use that pointer and access bogus memory. The bdev is not freed. > Alex > >> + } >> + } >> >> >> 1) Before: >> ---> hot-plug >> [ 35.730183] virtio_blk: virtblk_probe: vblk=ffff880078b0e000, >> disk=ffff880078d54c00, q=ffff88007f88b3d8 >> [ 35.735352] virtio_blk: virtblk_ioctl: vblk=ffff880078b0e000, >> disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8 >> ---> hot-unplug >> >> ---> hot-plug >> [ 83.570480] virtio_blk: virtblk_probe: vblk=ffff880078b0e000, >> disk=ffff880078d55800, q=ffff88007f88bb40 >> [ 83.575614] virtio_blk: virtblk_ioctl: vblk=ffff880078b0e000, >> disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8 >> >> The disk points to old one ffff880078d54c00. The queue also points to >> old one ffff88007f88b3d8. >> >> 2) After: >> ---> hot-plug >> [ 68.035063] virtio_blk: virtblk_probe: vblk=ffff880079b20000, >> disk=ffff88007f9ebc00, q=ffff8800784d8ed0 >> [ 68.041140] virtio_blk: virtblk_ioctl: vblk=ffff880079b20000, >> disk=ffff88007f9ebc00, bdev=ffff88007ab2c000, q=ffff8800784d8ed0 >> ---> hot-unplug >> >> ---> hot-plug >> [ 86.317706] virtio_blk: virtblk_probe: vblk=ffff880079b20000, >> disk=ffff88007f9eb000, q=ffff8800784d9638 >> [ 86.322535] virtio_blk: virtblk_ioctl: vblk=ffff880079b20000, >> disk=ffff88007f9eb000, bdev=ffff88007ab2c000, q=ffff8800784d9638 >> >> The disk and queue are updated correctly. The attached disk works and >> still uses the old name. >> >> -- >> Asias -- Asias -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html