Re: [PATCH] mdadm: Don't set bad_blocks flag when initializing metadata

Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> · Mon, 10 Mar 2025 09:38:09 +0800

Hi,

在 2025/03/08 13:27, Xiao Ni 写道:
On Sat, Mar 8, 2025 at 11:06 AM Xiao Ni <xni@xxxxxxxxxx> wrote:

Hi Guanghao

Thanks for your patch.

On Tue, Mar 4, 2025 at 8:27 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

Hi,

在 2025/03/04 14:12, Wu Guanghao 写道:
When testing the raid1, I found that adding disk to raid1 fails.
Here's how to reproduce it:

       1. modprobe brd rd_nr=3 rd_size=524288
       2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1

       3. mdadm /dev/md0 -f /dev/ram0
       4. mdadm /dev/md0 -r /dev/ram0

       5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks
       6. echo "write_error" > /sys/block/md0/md/dev-ram1/state

       7. mkfs.xfs /dev/md0

Do we need this step7 here?

I confirmed myself. The answer is yes.

       8. mdadm --examine-badblocks /dev/ram1  # Bad block records can be seen
          Bad-blocks on /dev/ram1:
                      10000 for 100 sectors

       9. mdadm /dev/md0 -a /dev/ram2
          mdadm: add new device failed for /dev/ram2 as 2: Invalid argument

Can you add a new regression test as well?


When adding a disk to a RAID1 array, the metadata is read from the existing
member disks for synchronization. However, only the bad_blocks flag are copied,
the bad_blocks records are not copied, so the bad_blocks records are all zeros.
The kernel function super_1_load() detects bad_blocks flag and reads the
bad_blocks record, then sets the bad block using badblocks_set().

After the kernel commit 1726c7746("badblocks: improve badblocks_set() for
multiple ranges handling"), if the length of a bad_blocks record is 0, it will
return a failure. Therefore the device addition will fail.

Can you give the specific function replace with "it will return a failure" here?

I know, you mean badblocks_set which is called in super_1_load. It's
better to describe clearly in a commit message.


So, don't set the bad_blocks flag when initializing the metadata, kernel can
handle it.

Signed-off-by: Wu Guanghao <wuguanghao3@xxxxxxxxxx>
---
   super1.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/super1.c b/super1.c
index fe3c4c64..03578e5b 100644
--- a/super1.c
+++ b/super1.c
@@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st)
               if (raid0_need_layout)
                       sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT);

+             if (sb->feature_map & MD_FEATURE_BAD_BLOCKS)
+                     sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS);

There are also other flags that is per rdev, like MD_FEATURE_REPLACEMENT
and MD_FEATURE_JOURNAL, they should be excluded as well.

Hmm, why do we need to remove these flags too? It's better to use a
separate patch rather than using this patch and describe it in detail.
It's better to give an example also.

This MD_FEATURE_REPLACEMENT should be removed, because this is per-rdev
flag, means this rdev is an replacement, and this should never be copied
to new rdev:

        if (le32_to_cpu(sb->feature_map) & MD_FEATURE_REPLACEMENT)
                set_bit(Replacement, &rdev->flags);

This is exactaly the same as MD_FEATURE_BAD_BLOCKS, means this rdev has
bad blocks.

And I'm wrong about MD_FEATURE_JOURNAL, this doesn't not mean this rdev
is journal.

Thanks,
Kuai


Please answer this question.

Regards
Xiao

Best Regards
Xiao

Thanks,
Kuai

+
               sb->sb_csum = calc_sb_1_csum(sb);
               rv = store_super1(st, di->fd);





.