On Thu, Jul 28, 2016 at 8:50 AM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > [ adding linux-block ] > > On Wed, Jul 27, 2016 at 8:20 PM, Yi Zhang <yizhan@xxxxxxxxxx> wrote: >> Hello everyone >> >> Could you help check this issue, thanks. >> >> Steps I used: >> 1. Reserve 4*8G of memory for pmem by add kernel parameter "memmap=8G!4G memmap=8G!12G memmap=8G!20G memmap=8G!28G" >> 2. Execute below script >> #!/bin/bash >> pmem_btt_switch() { >> sector_size_list="512 520 528 4096 4104 4160 4224" >> for sector_size in $sector_size_list; do >> ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size >> ndctl create-namespace -f -e namespace${1}.0 --mode=raw >> done >> } >> >> for i in 0 1 2 3; do >> pmem_btt_switch $i & >> done > > Thanks for the report. This looks like del_gendisk() frees the > previous usage of the devt before the bdi is unregistered. This > appears to be a general problem with all block drivers, not just > libnvdimm, since blk_cleanup_queue() is typically called after > del_gendisk(). I.e. it will always be the case that the bdi > registered with the devt allocated at add_disk() will still be alive > when del_gendisk()->disk_release() frees the previous devt number. > > I *think* the path forward is to allow the bdi to hold a reference > against the blk_alloc_devt() allocation until it is done with it. Any > other ideas on fixing this object lifetime problem? Does the attached patch solve this for you?
From 44bcbf8c531e9249d09e6bf502d3696668f3d22c Mon Sep 17 00:00:00 2001 From: Dan Williams <dan.j.williams@xxxxxxxxx> Date: Sat, 30 Jul 2016 08:23:06 -0700 Subject: [PATCH] block: fix bdi vs gendisk lifetime mismatch The bdi for gendisk is named after the gendisk. However, since the gendisk is destroyed before the bdi it leaves a window where a new gendisk could dynamically reuse the same devt while a bdi while a bdi with the same name is still live. Arrange for the bdi to hold a reference against its "owner" disk device while it is registered. Otherwise we can hit sysfs duplicate name collisions like the following: WARNING: CPU: 10 PID: 2078 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80 sysfs: cannot create duplicate filename '/devices/virtual/bdi/259:1' Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015 0000000000000286 0000000002c04ad5 ffff88006f24f970 ffffffff8134caec ffff88006f24f9c0 0000000000000000 ffff88006f24f9b0 ffffffff8108c351 0000001f0000000c ffff88105d236000 ffff88105d1031e0 ffff8800357427f8 Call Trace: [<ffffffff8134caec>] dump_stack+0x63/0x87 [<ffffffff8108c351>] __warn+0xd1/0xf0 [<ffffffff8108c3cf>] warn_slowpath_fmt+0x5f/0x80 [<ffffffff812a0d34>] sysfs_warn_dup+0x64/0x80 [<ffffffff812a0e1e>] sysfs_create_dir_ns+0x7e/0x90 [<ffffffff8134faaa>] kobject_add_internal+0xaa/0x320 [<ffffffff81358d4e>] ? vsnprintf+0x34e/0x4d0 [<ffffffff8134ff55>] kobject_add+0x75/0xd0 [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f [<ffffffff8148b0a5>] device_add+0x125/0x610 [<ffffffff8148b788>] device_create_groups_vargs+0xd8/0x100 [<ffffffff8148b7cc>] device_create_vargs+0x1c/0x20 [<ffffffff811b775c>] bdi_register+0x8c/0x180 [<ffffffff811b7877>] bdi_register_dev+0x27/0x30 [<ffffffff813317f5>] add_disk+0x175/0x4a0 Reported-by: Yi Zhang <yizhan@xxxxxxxxxx> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx> --- block/genhd.c | 2 +- include/linux/backing-dev-defs.h | 1 + include/linux/backing-dev.h | 1 + mm/backing-dev.c | 18 ++++++++++++++++++ 4 files changed, 21 insertions(+), 1 deletion(-) diff --git a/block/genhd.c b/block/genhd.c index 3c9dede4e04f..f6f7ffcd4eab 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -614,7 +614,7 @@ void device_add_disk(struct device *parent, struct gendisk *disk) /* Register BDI before referencing it from bdev */ bdi = &disk->queue->backing_dev_info; - bdi_register_dev(bdi, disk_devt(disk)); + bdi_register_owner(bdi, disk_to_dev(disk)); blk_register_region(disk_devt(disk), disk->minors, NULL, exact_match, exact_lock, disk); diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h index 3f103076d0bf..c357f27d5483 100644 --- a/include/linux/backing-dev-defs.h +++ b/include/linux/backing-dev-defs.h @@ -163,6 +163,7 @@ struct backing_dev_info { wait_queue_head_t wb_waitq; struct device *dev; + struct device *owner; struct timer_list laptop_mode_wb_timer; diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index 491a91717788..43b93a947e61 100644 --- a/include/linux/backing-dev.h +++ b/include/linux/backing-dev.h @@ -24,6 +24,7 @@ __printf(3, 4) int bdi_register(struct backing_dev_info *bdi, struct device *parent, const char *fmt, ...); int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev); +int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner); void bdi_unregister(struct backing_dev_info *bdi); int __must_check bdi_setup_and_register(struct backing_dev_info *, char *); diff --git a/mm/backing-dev.c b/mm/backing-dev.c index efe237742074..7b51cb7905be 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c @@ -825,6 +825,19 @@ int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev) } EXPORT_SYMBOL(bdi_register_dev); +int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner) +{ + int rc; + + rc = bdi_register(bdi, NULL, "%u:%u", MAJOR(owner->devt), + MINOR(owner->devt)); + if (rc) + return rc; + bdi->owner = owner; + get_device(owner); +} +EXPORT_SYMBOL(bdi_register_owner); + /* * Remove bdi from bdi_list, and ensure that it is no longer visible */ @@ -849,6 +862,11 @@ void bdi_unregister(struct backing_dev_info *bdi) device_unregister(bdi->dev); bdi->dev = NULL; } + + if (bdi->owner) { + put_device(bdi->owner); + bdi->owner = NULL; + } } void bdi_exit(struct backing_dev_info *bdi) -- 2.5.5