Re: [PATCH md-6.12 0/7] md: enhance faulty checking for blocked handling

Hi,

On 2024/10/09 15:14, Mariusz Tkaczyk wrote:
On Fri, 30 Aug 2024 15:27:14 +0800
Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

From: Yu Kuai <yukuai3@xxxxxxxxxx>

The lifetime of badblocks:

- an IO error occurs, and md decides to record badblocks and set sb_flags;
- a write IO finds that the rdev has badblocks that are not yet
acknowledged, so this IO is blocked;
- the daemon finds that sb_flags is set, updates the superblock and
flushes the badblocks;
- the write IO continues;
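
The four steps above can be sketched as a tiny userspace model. This is only
an illustration of the ordering, not kernel code: struct model_rdev and the
model_*() functions are hypothetical names, and the two booleans merely stand
in for the real badblocks state and sb_flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_rdev {
	bool unacked_badblocks;	/* bad blocks recorded in memory only */
	bool sb_dirty;		/* stands in for the sb_flags update request */
};

/* Step 1: an IO error records bad blocks in memory and requests a
 * superblock update. */
static void model_record_badblocks(struct model_rdev *rdev)
{
	assert(rdev != NULL);
	rdev->unacked_badblocks = true;
	rdev->sb_dirty = true;
}

/* Step 2: a write has to wait while bad blocks are unacknowledged. */
static bool model_write_must_wait(const struct model_rdev *rdev)
{
	assert(rdev != NULL);
	return rdev->unacked_badblocks;
}

/* Step 3: the daemon writes the superblock, which acknowledges the
 * bad blocks; after that (step 4) the write may continue. */
static void model_daemon_update_sb(struct model_rdev *rdev)
{
	assert(rdev != NULL);
	if (!rdev->sb_dirty)
		return;
	rdev->sb_dirty = false;
	rdev->unacked_badblocks = false;
}
```

In this model a write is held back only between model_record_badblocks() and
model_daemon_update_sb(), which mirrors the window described in the list.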

The main idea is that badblocks are set in memory first. Until the badblocks
are acknowledged, new write requests must be blocked to prevent reading
old data after a power failure. This behaviour is not necessary if the rdev
is faulty in the first place.
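
As a rough sketch of that rule (an illustration under assumptions, not the
helper added by this series): a write waits for unacknowledged badblocks
unless the device is already faulty. The flag names, struct sketch_rdev and
sketch_write_blocked() are invented for this example and deliberately ignore
every other condition the kernel checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags for illustration; not the kernel's rdev flag bits. */
enum sketch_flag {
	SKETCH_FAULTY            = 1u << 0,	/* device already failed */
	SKETCH_UNACKED_BADBLOCKS = 1u << 1,	/* badblocks not yet on disk */
};

struct sketch_rdev {
	unsigned int flags;
};

/* Return true if a new write to this device should wait. */
static bool sketch_write_blocked(const struct sketch_rdev *rdev)
{
	assert(rdev != NULL);

	/* A faulty device will not be read again, so there is no stale
	 * data to protect and nothing to wait for. */
	if (rdev->flags & SKETCH_FAULTY)
		return false;

	/* Otherwise wait until the badblocks have been acknowledged. */
	return (rdev->flags & SKETCH_UNACKED_BADBLOCKS) != 0;
}
```

Checking the faulty state first is the point of the sketch: a device with both
flags set does not block the write.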

Yu Kuai (7):
   md: add a new helper rdev_blocked()
   md: don't wait faulty rdev in md_wait_for_blocked_rdev()
   md: don't record new badblocks for faulty rdev
   md/raid1: factor out helper to handle blocked rdev from
     raid1_write_request()
   md/raid1: don't wait for Faulty rdev in wait_blocked_rdev()
   md/raid10: don't wait for Faulty rdev in wait_blocked_rdev()
   md/raid5: don't set Faulty rdev for blocked_rdev

  drivers/md/md.c     |  8 +++--
  drivers/md/md.h     | 24 +++++++++++++++
  drivers/md/raid1.c  | 75 +++++++++++++++++++++++----------------------
  drivers/md/raid10.c | 40 +++++++++++-------------
  drivers/md/raid5.c  | 13 ++++----
  5 files changed, 92 insertions(+), 68 deletions(-)



Hi,
We tested this patchset.

mdmon rework:
https://github.com/md-raid-utilities/mdadm/pull/66

Kernel build torvalds/linux.git master:
commit e32cde8d2bd7d251a8f9b434143977ddf13dcec6

I applied this patchset on top of that.

My tests showed that:
- If only the mdmon PR is applied - hangs are reproducible.
- If only this patchset is applied - hangs are reproducible.
- If both the kernel patchset and the mdmon rework are applied - hangs are
   not reproducible (at least so far).

It was a tricky topic (I needed to deal with weird issues related to shared
descriptors in mdmon).

Most importantly, no regression was detected.

Good to hear that. I'll send a V2 then. Normally this set will land in
v6.13, because it doesn't look like a kernel fix. :)

Thanks,
Kuai


Thanks,
Mariusz
