Re: [BUG] kernel NULL pointer dereference observed during pmem btt switch test

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 28, 2016 at 8:50 AM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> [ adding linux-block ]
>
> On Wed, Jul 27, 2016 at 8:20 PM, Yi Zhang <yizhan@xxxxxxxxxx> wrote:
>> Hello everyone
>>
>> Could you help check this issue, thanks.
>>
>> Steps I used:
>> 1. Reserve 4*8G of memory for pmem by add kernel parameter "memmap=8G!4G memmap=8G!12G memmap=8G!20G memmap=8G!28G"
>> 2. Execute below script
>> #!/bin/bash
>> pmem_btt_switch() {
>>         sector_size_list="512 520 528 4096 4104 4160 4224"
>>         for sector_size in $sector_size_list; do
>>                 ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size
>>                 ndctl create-namespace -f -e namespace${1}.0 --mode=raw
>>         done
>> }
>>
>> for i in 0 1 2 3; do
>>         pmem_btt_switch $i &
>> done
>
> Thanks for the report.  This looks like del_gendisk() frees the
> previous usage of the devt before the bdi is unregistered.  This
> appears to be a general problem with all block drivers, not just
> libnvdimm, since blk_cleanup_queue() is typically called after
> del_gendisk().  I.e. it will always be the case that the bdi
> registered with the devt allocated at add_disk() will still be alive
> when del_gendisk()->disk_release() frees the previous devt number.
>
> I *think* the path forward is to allow the bdi to hold a reference
> against the blk_alloc_devt() allocation until it is done with it.  Any
> other ideas on fixing this object lifetime problem?

Does the attached patch solve this for you?
From 44bcbf8c531e9249d09e6bf502d3696668f3d22c Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams@xxxxxxxxx>
Date: Sat, 30 Jul 2016 08:23:06 -0700
Subject: [PATCH] block: fix bdi vs gendisk lifetime mismatch

The bdi for gendisk is named after the gendisk.  However, since the
gendisk is destroyed before the bdi it leaves a window where a new
gendisk could dynamically reuse the same devt while a bdi while a bdi
with the same name is still live.  Arrange for the bdi to hold a
reference against its "owner" disk device while it is registered.
Otherwise we can hit sysfs duplicate name collisions like the following:

 WARNING: CPU: 10 PID: 2078 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80
 sysfs: cannot create duplicate filename '/devices/virtual/bdi/259:1'

 Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
  0000000000000286 0000000002c04ad5 ffff88006f24f970 ffffffff8134caec
  ffff88006f24f9c0 0000000000000000 ffff88006f24f9b0 ffffffff8108c351
  0000001f0000000c ffff88105d236000 ffff88105d1031e0 ffff8800357427f8
 Call Trace:
  [<ffffffff8134caec>] dump_stack+0x63/0x87
  [<ffffffff8108c351>] __warn+0xd1/0xf0
  [<ffffffff8108c3cf>] warn_slowpath_fmt+0x5f/0x80
  [<ffffffff812a0d34>] sysfs_warn_dup+0x64/0x80
  [<ffffffff812a0e1e>] sysfs_create_dir_ns+0x7e/0x90
  [<ffffffff8134faaa>] kobject_add_internal+0xaa/0x320
  [<ffffffff81358d4e>] ? vsnprintf+0x34e/0x4d0
  [<ffffffff8134ff55>] kobject_add+0x75/0xd0
  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
  [<ffffffff8148b0a5>] device_add+0x125/0x610
  [<ffffffff8148b788>] device_create_groups_vargs+0xd8/0x100
  [<ffffffff8148b7cc>] device_create_vargs+0x1c/0x20
  [<ffffffff811b775c>] bdi_register+0x8c/0x180
  [<ffffffff811b7877>] bdi_register_dev+0x27/0x30
  [<ffffffff813317f5>] add_disk+0x175/0x4a0

Reported-by: Yi Zhang <yizhan@xxxxxxxxxx>
Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
---
 block/genhd.c                    |  2 +-
 include/linux/backing-dev-defs.h |  1 +
 include/linux/backing-dev.h      |  1 +
 mm/backing-dev.c                 | 18 ++++++++++++++++++
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/block/genhd.c b/block/genhd.c
index 3c9dede4e04f..f6f7ffcd4eab 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -614,7 +614,7 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 
 	/* Register BDI before referencing it from bdev */
 	bdi = &disk->queue->backing_dev_info;
-	bdi_register_dev(bdi, disk_devt(disk));
+	bdi_register_owner(bdi, disk_to_dev(disk));
 
 	blk_register_region(disk_devt(disk), disk->minors, NULL,
 			    exact_match, exact_lock, disk);
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 3f103076d0bf..c357f27d5483 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -163,6 +163,7 @@ struct backing_dev_info {
 	wait_queue_head_t wb_waitq;
 
 	struct device *dev;
+	struct device *owner;
 
 	struct timer_list laptop_mode_wb_timer;
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 491a91717788..43b93a947e61 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -24,6 +24,7 @@ __printf(3, 4)
 int bdi_register(struct backing_dev_info *bdi, struct device *parent,
 		const char *fmt, ...);
 int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev);
+int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner);
 void bdi_unregister(struct backing_dev_info *bdi);
 
 int __must_check bdi_setup_and_register(struct backing_dev_info *, char *);
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index efe237742074..7b51cb7905be 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -825,6 +825,19 @@ int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev)
 }
 EXPORT_SYMBOL(bdi_register_dev);
 
+int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner)
+{
+	int rc;
+
+	rc = bdi_register(bdi, NULL, "%u:%u", MAJOR(owner->devt),
+			MINOR(owner->devt));
+	if (rc)
+		return rc;
+	bdi->owner = owner;
+	get_device(owner);
+}
+EXPORT_SYMBOL(bdi_register_owner);
+
 /*
  * Remove bdi from bdi_list, and ensure that it is no longer visible
  */
@@ -849,6 +862,11 @@ void bdi_unregister(struct backing_dev_info *bdi)
 		device_unregister(bdi->dev);
 		bdi->dev = NULL;
 	}
+
+	if (bdi->owner) {
+		put_device(bdi->owner);
+		bdi->owner = NULL;
+	}
 }
 
 void bdi_exit(struct backing_dev_info *bdi)
-- 
2.5.5


[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux