Hi,

On 2025/03/04 14:12, Wu Guanghao wrote:
When testing raid1, I found that adding a disk to raid1 fails.
Here's how to reproduce it:

1. modprobe brd rd_nr=3 rd_size=524288
2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1
3. mdadm /dev/md0 -f /dev/ram0
4. mdadm /dev/md0 -r /dev/ram0
5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks
6. echo "write_error" > /sys/block/md0/md/dev-ram1/state
7. mkfs.xfs /dev/md0
8. mdadm --examine-badblocks /dev/ram1
   # Bad block records can be seen
   Bad-blocks on /dev/ram1:
                10000 for 100 sectors
9. mdadm /dev/md0 -a /dev/ram2
   mdadm: add new device failed for /dev/ram2 as 2: Invalid argument
Can you add a new regression test as well?
When adding a disk to a RAID1 array, the metadata is read from the
existing member disks for synchronization. However, only the bad_blocks
flag is copied; the bad_blocks records are not, so the bad_blocks
records on the new disk are all zeros. The kernel function
super_1_load() detects the bad_blocks flag, reads the bad_blocks
records, and then sets the bad blocks using badblocks_set().

After the kernel commit 1726c7746 ("badblocks: improve badblocks_set()
for multiple ranges handling"), badblocks_set() returns a failure if the
length of a bad_blocks record is 0. Therefore the device addition fails.

So, don't set the bad_blocks flag when initializing the metadata; the
kernel can handle it.

Signed-off-by: Wu Guanghao <wuguanghao3@xxxxxxxxxx>
---
 super1.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/super1.c b/super1.c
index fe3c4c64..03578e5b 100644
--- a/super1.c
+++ b/super1.c
@@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st)
 		if (raid0_need_layout)
 			sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT);
 
+		if (sb->feature_map & MD_FEATURE_BAD_BLOCKS)
+			sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS);
There are also other per-rdev flags, like MD_FEATURE_REPLACEMENT and
MD_FEATURE_JOURNAL; they should be excluded as well.

Thanks,
Kuai
+
 		sb->sb_csum = calc_sb_1_csum(sb);
 		rv = store_super1(st, di->fd);
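
For illustration only, here is a rough standalone sketch of clearing all
three per-rdev flags from feature_map in one step. It is not a tested
mdadm change. PER_RDEV_FEATURES, cpu_to_le32_sketch(),
clear_per_rdev_features() and example_usage() are hypothetical names
that do not exist in mdadm, and the MD_FEATURE_* values are restated
from the kernel's md_p.h so the sketch compiles on its own (please check
them against the header). The sketch clears the bits unconditionally,
which avoids testing the little-endian feature_map against a CPU-order
constant.

```c
#include <assert.h>
#include <stdint.h>

/* Restated from md_p.h so the sketch is standalone; verify before use. */
#define MD_FEATURE_BAD_BLOCKS    8
#define MD_FEATURE_REPLACEMENT   16
#define MD_FEATURE_JOURNAL       512
#define MD_FEATURE_RAID0_LAYOUT  4096

/* Hypothetical mask: feature bits that describe one rdev, not the array. */
#define PER_RDEV_FEATURES (MD_FEATURE_BAD_BLOCKS | \
			   MD_FEATURE_REPLACEMENT | \
			   MD_FEATURE_JOURNAL)

/* Stand-in for __cpu_to_le32(); it is its own inverse, so it also
 * converts little-endian values back to CPU order. */
uint32_t cpu_to_le32_sketch(uint32_t v)
{
	const uint16_t probe = 1;

	if (*(const uint8_t *)&probe)	/* little-endian host: no swap */
		return v;
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v >> 8) & 0x0000ff00u) | (v >> 24);
}

/* Takes and returns the on-disk (little-endian) feature_map. */
uint32_t clear_per_rdev_features(uint32_t feature_map_le)
{
	return feature_map_le & ~cpu_to_le32_sketch(PER_RDEV_FEATURES);
}

/* Array-wide bits survive, per-rdev bits are dropped. */
void example_usage(void)
{
	uint32_t map = cpu_to_le32_sketch(MD_FEATURE_RAID0_LAYOUT |
					  MD_FEATURE_BAD_BLOCKS |
					  MD_FEATURE_JOURNAL);

	map = clear_per_rdev_features(map);
	assert(cpu_to_le32_sketch(map) == MD_FEATURE_RAID0_LAYOUT);
}
```

In write_init_super1() the equivalent would presumably be a single
"sb->feature_map &= ~__cpu_to_le32(...)" with the combined mask, placed
before the checksum is calculated.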