Re: [PATCH] mdadm: Don't set bad_blocks flag when initializing metadata

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Mar 8, 2025 at 11:06 AM Xiao Ni <xni@xxxxxxxxxx> wrote:
>
> Hi Guanghao
>
> Thanks for your patch.
>
> On Tue, Mar 4, 2025 at 8:27 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >
> > Hi,
> >
> > 在 2025/03/04 14:12, Wu Guanghao 写道:
> > > When testing the raid1, I found that adding disk to raid1 fails.
> > > Here's how to reproduce it:
> > >
> > >       1. modprobe brd rd_nr=3 rd_size=524288
> > >       2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1
> > >
> > >       3. mdadm /dev/md0 -f /dev/ram0
> > >       4. mdadm /dev/md0 -r /dev/ram0
> > >
> > >       5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks
> > >       6. echo "write_error" > /sys/block/md0/md/dev-ram1/state
> > >
> > >       7. mkfs.xfs /dev/md0
>
> Do we need this step7 here?

I confirmed myself. The answer is yes.
>
> > >       8. mdadm --examine-badblocks /dev/ram1  # Bad block records can be seen
> > >          Bad-blocks on /dev/ram1:
> > >                      10000 for 100 sectors
> > >
> > >       9. mdadm /dev/md0 -a /dev/ram2
> > >          mdadm: add new device failed for /dev/ram2 as 2: Invalid argument
> >
> > Can you add a new regression test as well?
> >
> > >
> > > When adding a disk to a RAID1 array, the metadata is read from the existing
> > > member disks for synchronization. However, only the bad_blocks flag are copied,
> > > the bad_blocks records are not copied, so the bad_blocks records are all zeros.
> > > The kernel function super_1_load() detects bad_blocks flag and reads the
> > > bad_blocks record, then sets the bad block using badblocks_set().
> > >
> > > After the kernel commit 1726c7746("badblocks: improve badblocks_set() for
> > > multiple ranges handling"), if the length of a bad_blocks record is 0, it will
> > > return a failure. Therefore the device addition will fail.
>
> Can you give the specific function replace with "it will return a failure" here?

I know, you mean badblocks_set which is called in super_1_load. It's
better to describe clearly in a commit message.
>
> > >
> > > So, don't set the bad_blocks flag when initializing the metadata, kernel can
> > > handle it.
> > >
> > > Signed-off-by: Wu Guanghao <wuguanghao3@xxxxxxxxxx>
> > > ---
> > >   super1.c | 3 +++
> > >   1 file changed, 3 insertions(+)
> > >
> > > diff --git a/super1.c b/super1.c
> > > index fe3c4c64..03578e5b 100644
> > > --- a/super1.c
> > > +++ b/super1.c
> > > @@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st)
> > >               if (raid0_need_layout)
> > >                       sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT);
> > >
> > > +             if (sb->feature_map & MD_FEATURE_BAD_BLOCKS)
> > > +                     sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS);
> >
> > There are also other flags that is per rdev, like MD_FEATURE_REPLACEMENT
> > and MD_FEATURE_JOURNAL, they should be excluded as well.
>
> Hmm, why do we need to remove these flags too? It's better to use a
> separate patch rather than using this patch and describe it in detail.
> It's better to give an example also.

Please answer this question.

Regards
Xiao
>
> Best Regards
> Xiao
> >
> > Thanks,
> > Kuai
> >
> > > +
> > >               sb->sb_csum = calc_sb_1_csum(sb);
> > >               rv = store_super1(st, di->fd);
> > >
> >
> >






[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux