Re: [PATCH -stable] block: destroy bdi before blockdev is unregistered.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 27, 2015 at 12:12 AM, NeilBrown <neilb@xxxxxxx> wrote:
>
> Because of the peculiar way that md devices are created (automatically
> when the device node is opened), a new device can be created and
> registered immediately after the
>         blk_unregister_region(disk_devt(disk), disk->minors);
> call in del_gendisk().
>
> Therefore it is important that all visible artifacts of the previous
> device are removed before this call.  In particular, the 'bdi'.
>
> Since:
> commit c4db59d31e39ea067c32163ac961e9c80198fd37
> Author: Christoph Hellwig <hch@xxxxxx>
>     fs: don't reassign dirty inodes to default_backing_dev_info
>
> moved the
>    device_unregister(bdi->dev);
> call from bdi_unregister() to bdi_destroy() it has been quite easy to
> lose a race and have a new (e.g.) "md127" be created after the
> blk_unregister_region() call and before bdi_destroy() is ultimately
> called by the final 'put_disk', which must come after del_gendisk().
>
> The new device finds that the bdi name is already registered in sysfs
> and complains
>
>> [ 9627.630029] WARNING: CPU: 18 PID: 3330 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5a/0x70()
>> [ 9627.630032] sysfs: cannot create duplicate filename '/devices/virtual/bdi/9:127'
>
> We can fix this by moving the bdi_destroy() call out of
> blk_release_queue() (which can happen very late when a refcount
> reaches zero) and into blk_cleanup_queue() - which happens exactly when the md
> device driver calls it.
>
> Then it is only necessary for md to call blk_cleanup_queue() before
> del_gendisk().  As loop.c devices are also created on demand by
> opening the device node, we make the same change there.
>
> Fixes: c4db59d31e39ea067c32163ac961e9c80198fd37
> Reported-by: Azat Khuzhin <a3at.mail@xxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx (v4.0)
> Signed-off-by: NeilBrown <neilb@xxxxxxx>
>
> --
> Hi Jens,
>  if you could check this and forward on to Linus I'd really appreciate it.
>
> Thanks,
> NeilBrown
>
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index fd154b94447a..7871603f0a29 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -552,6 +552,8 @@ void blk_cleanup_queue(struct request_queue *q)
>                 q->queue_lock = &q->__queue_lock;
>         spin_unlock_irq(lock);
>
> +       bdi_destroy(&q->backing_dev_info);
> +
>         /* @q is and will stay empty, shutdown and put */
>         blk_put_queue(q);
>  }
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index faaf36ade7eb..2b8fd302f677 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -522,8 +522,6 @@ static void blk_release_queue(struct kobject *kobj)
>
>         blk_trace_shutdown(q);
>
> -       bdi_destroy(&q->backing_dev_info);
> -
>         ida_simple_remove(&blk_queue_ida, q->id);
>         call_rcu(&q->rcu_head, blk_free_queue_rcu);
>  }
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index ae3fcb4199e9..d7173cb1ea76 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1620,8 +1620,8 @@ out:
>
>  static void loop_remove(struct loop_device *lo)
>  {
> -       del_gendisk(lo->lo_disk);
>         blk_cleanup_queue(lo->lo_queue);
> +       del_gendisk(lo->lo_disk);
>         blk_mq_free_tag_set(&lo->tag_set);
>         put_disk(lo->lo_disk);
>         kfree(lo);
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index d4f31e195e26..593a02476c78 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -4818,12 +4818,12 @@ static void md_free(struct kobject *ko)
>         if (mddev->sysfs_state)
>                 sysfs_put(mddev->sysfs_state);
>
> +       if (mddev->queue)
> +               blk_cleanup_queue(mddev->queue);
>         if (mddev->gendisk) {
>                 del_gendisk(mddev->gendisk);
>                 put_disk(mddev->gendisk);
>         }
> -       if (mddev->queue)
> -               blk_cleanup_queue(mddev->queue);
>
>         kfree(mddev);
>  }

I've taken this patch into consideration relative to DM, please see:
http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/commit/?h=wip&id=5dac86fb79697e93297edb6a316b429236d00137

Point is in my private snitzer/wip branch DM now calls
blk_cleanup_queue() before del_gendisk().

With your patch:
1) blk_cleanup_queue -> bdi_destroy -> bdi->dev = NULL;
2) del_gendisk -> bdi_unregister -> WARN_ON_ONCE(!bdi->dev)

So, in testing with DM I'm hitting bdi_unregister's WARN_ON(), are you
seeing this WARN_ON?

[  385.134474] WARNING: CPU: 3 PID: 11231 at mm/backing-dev.c:372
bdi_unregister+0x36/0x40()
[  385.143593] Modules linked in: dm_service_time dm_multipath
target_core_iblock tcm_loop target_core_mod sg iscsi_tcp libiscsi_tcp
libiscsi scsi_transport_iscsi core
temp kvm_intel kvm ixgbe igb crct10dif_pclmul crc32_pclmul
crc32c_intel ghash_clmulni_intel mdio aesni_intel ptp pps_core
glue_helper ipmi_si i7core_edac iTCO_wdt lrw
dca iTCO_vendor_support edac_core i2c_i801 pcspkr gf128mul
ipmi_msghandler acpi_power_meter shpchp lpc_ich ablk_helper cryptd
mfd_core acpi_cpufreq xfs libcrc32c sr_mo
d cdrom mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit
drm_kms_helper ata_generic ttm pata_acpi sd_mod ata_piix drm
iomemory_vsl(POE) libata megaraid_sas i2c_c
ore skd dm_mirror dm_region_hash dm_log dm_mod
[  385.213736] CPU: 3 PID: 11231 Comm: dmsetup Tainted: P          IOE
  4.1.0-rc1.snitm+ #55
[  385.222958] Hardware name: FUJITSU
PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.10.2619.N1
     05/24/2011
[  385.237602]  0000000000000000 000000006cf7a5f2 ffff88031bfcfb68
ffffffff8167b7e6
[  385.245897]  0000000000000000 0000000000000000 ffff88031bfcfba8
ffffffff8107bd5a
[  385.254193]  0000000000000000 ffff88032c4b6c00 0000000000000000
0000000000000000
[  385.262488] Call Trace:
[  385.265218]  [<ffffffff8167b7e6>] dump_stack+0x45/0x57
[  385.270955]  [<ffffffff8107bd5a>] warn_slowpath_common+0x8a/0xc0
[  385.277660]  [<ffffffff8107be8a>] warn_slowpath_null+0x1a/0x20
[  385.284171]  [<ffffffff8119e216>] bdi_unregister+0x36/0x40
[  385.290295]  [<ffffffff812fb8c1>] del_gendisk+0x131/0x2b0
[  385.296326]  [<ffffffffa0000eba>] cleanup_mapped_device+0xda/0x130 [dm_mod]
[  385.304101]  [<ffffffffa000283b>] __dm_destroy+0x19b/0x260 [dm_mod]
[  385.311099]  [<ffffffffa00040c3>] dm_destroy+0x13/0x20 [dm_mod]
[  385.317709]  [<ffffffffa0009d0e>] dev_remove+0x11e/0x180 [dm_mod]
[  385.324516]  [<ffffffffa0009bf0>] ? dev_suspend+0x250/0x250 [dm_mod]
[  385.331614]  [<ffffffffa000a3e5>] ctl_ioctl+0x255/0x500 [dm_mod]
[  385.338319]  [<ffffffff8128c8d8>] ? SYSC_semtimedop+0x298/0xea0
[  385.344931]  [<ffffffffa000a6a3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[  385.351733]  [<ffffffff8120d038>] do_vfs_ioctl+0x2f8/0x4f0
[  385.357857]  [<ffffffff811207e4>] ? __audit_syscall_entry+0xb4/0x110
[  385.364950]  [<ffffffff8102367c>] ? do_audit_syscall_entry+0x6c/0x70
[  385.372041]  [<ffffffff8120d2b1>] SyS_ioctl+0x81/0xa0
[  385.377679]  [<ffffffff81025046>] ? syscall_trace_leave+0xc6/0x120
[  385.384578]  [<ffffffff81682f2e>] system_call_fastpath+0x12/0x71
[  385.391282] ---[ end trace af60d8ac7157d319 ]---
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux