Re: [PATCH 2/2] md/raid10: handle replacement devices in fix_recovery_read_error

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

在 2023/07/07 16:33, Song Liu 写道:
On Tue, Jun 27, 2023 at 11:42 AM <linan666@xxxxxxxxxxxxxxx> wrote:

From: Li Nan <linan122@xxxxxxxxxx>

In fix_recovery_read_error(), the handling of replacement devices is
missing. Add it. If io error is from replacement, error this device
directly. If io error is from other device, just set badblocks for
replacement.

Signed-off-by: Li Nan <linan122@xxxxxxxxxx>
---
  drivers/md/raid10.c | 10 ++++++++--
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 5105273f60e9..6d9025089455 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2551,7 +2551,7 @@ static void fix_recovery_read_error(struct r10bio *r10_bio)

         while (sectors) {
                 int s = sectors;
-               struct md_rdev *rdev;
+               struct md_rdev *rdev, *repl;
                 sector_t addr;
                 int ok;

@@ -2559,6 +2559,7 @@ static void fix_recovery_read_error(struct r10bio *r10_bio)
                         s = PAGE_SIZE >> 9;

                 rdev = conf->mirrors[dr].rdev;
+               repl = conf->mirrors[dw].replacement;
                 addr = r10_bio->devs[0].addr + sect,
                 ok = sync_page_io(rdev,
                                   addr,
@@ -2580,6 +2581,9 @@ static void fix_recovery_read_error(struct r10bio *r10_bio)
                                         set_bit(MD_RECOVERY_NEEDED,
                                                 &rdev->mddev->recovery);
                         }
+                       if (repl && !sync_page_io(repl, addr, s << 9,
+                           pages[idx], REQ_OP_WRITE, false))
+                               md_error(mddev, repl);
                 }
                 if (!ok) {
                         /* We don't worry if we cannot set a bad block -
@@ -2592,7 +2596,9 @@ static void fix_recovery_read_error(struct r10bio *r10_bio)
                                 /* need bad block on destination too */
                                 rdev = conf->mirrors[dw].rdev;
                                 addr = r10_bio->devs[1].addr + sect;
-                               if (!rdev_set_badblocks(rdev, addr, s, 0)) {
+                               if (!rdev_set_badblocks(rdev, addr, s, 0) ||
+                                   (repl &&
+                                   !rdev_set_badblocks(repl, addr, s, 0))) {

Do we really want this in the if () statement? Shall we always set
badblock on both rdev and repl?

I think this is wrong to set repl badblocks inside this, because if
setting badblocks for rdev failed, repl is still not handled.

By the way, I think it's better to at least try to read from all
possible copies before setting badblocks for repl.

Thanks,
Kuai

Thanks,
Song

                                         /* just abort the recovery */
                                         pr_notice("md/raid10:%s: recovery aborted due to read error\n",
                                                   mdname(mddev));
--
2.39.2

.





[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux