Hi, Ming
在 2023/03/22 15:58, Ming Lei 写道:
On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
From: Yu Kuai <yukuai3@xxxxxxxxxx>
Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
set, and partition scan will be proceed again when blkdev_get_by_dev()
is called. However, this will cause a problem that re-assemble partitioned
raid device will creat partition for underlying disk.
Test procedure:
mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
sgdisk -n 0:0:+100MiB /dev/md0
blockdev --rereadpt /dev/sda
blockdev --rereadpt /dev/sdb
mdadm -S /dev/md0
mdadm -A /dev/md0 /dev/sda /dev/sdb
Test result: underlying disk partition and raid partition can be
observed at the same time
Note that this can still happen in come corner cases that
GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
device.
Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
The issue still can't be avoided completely, such as, after rebooting,
/dev/sda1 & /dev/md0p1 can be observed at the same time. And this one
should be underlying partitions scanned before re-assembling raid, I
guess it may not be easy to avoid.
Yes, this is possible and I don't know how to fix this yet...
Also seems the following change added in e5cfefa97bcc isn't necessary:
/* Make sure the first partition scan will be proceed */
if (get_capacity(disk) && !(disk->flags & GENHD_FL_NO_PART) &&
!test_bit(GD_SUPPRESS_PART_SCAN, &disk->state))
set_bit(GD_NEED_PART_SCAN, &disk->state);
since the following disk_scan_partitions() in device_add_disk() should cover
partitions scan.
This can't be guaranteed if someone else open the device excl after
bdev_add and before disk_scan_partitions:
t1: t2:
device_add_disk
bdev_add
insert_inode_hash
// open device excl
disk_scan_partitions
// will fail
However, this is just in theory, and it's unlikely to happen in
practice.
Thanks,
Kuai
---
block/genhd.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/block/genhd.c b/block/genhd.c
index 08bb1a9ec22c..a72e27d6779d 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -368,7 +368,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
if (disk->open_partitions)
return -EBUSY;
- set_bit(GD_NEED_PART_SCAN, &disk->state);
/*
* If the device is opened exclusively by current thread already, it's
* safe to scan partitons, otherwise, use bd_prepare_to_claim() to
@@ -381,12 +380,19 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
return ret;
}
+ set_bit(GD_NEED_PART_SCAN, &disk->state);
bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
if (IS_ERR(bdev))
ret = PTR_ERR(bdev);
else
blkdev_put(bdev, mode & ~FMODE_EXCL);
+ /*
+ * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
+ * and this will cause that re-assemble partitioned raid device will
+ * creat partition for underlying disk.
+ */
+ clear_bit(GD_NEED_PART_SCAN, &disk->state);
I feel GD_NEED_PART_SCAN becomes a bit hard to follow.
So far, it is only consumed by blkdev_get_whole(), and cleared in
bdev_disk_changed(). That means partition scan can be retried
if bdev_disk_changed() fails.
Another mess is that more drivers start to touch this flag, such as
nbd/sd, probably it is better to change them into one API of
blk_disk_need_partition_scan(), and hide implementation detail
to drivers.
thanks,
Ming
.