On Thu, 12 Jul 2012 18:38:43 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx> wrote: > Hi Neil, > I am testing the following simple scenario: > - RAID1 with two drives > - First drive has a bad block marked in the bad block list > - "repair" is triggered Thanks for testing!! > > What is happening is that the code in raid1.c:sync_request() selects > candidates for reading. If it encounters a bad block recorded, it > skips this particular drive: > if (is_badblock(rdev, sector_nr, good_sectors, > &first_bad, &bad_sectors)) { > if (first_bad > sector_nr) > .../* we don't go here*/ > else { > /* we go here*/ > bad_sectors -= (sector_nr - first_bad); > if (min_bad == 0 || > min_bad > bad_sectors) > min_bad = bad_sectors; > } > } > if (sector_nr < first_bad) { > /* we don't go here */ > ... > But the second drive has no bad blocks recorded, so it is selected. > So we end up with read_targets=1 and min_bad>0. > > Then the following code: > if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) && read_targets > 0) > /* extra read targets are also write targets */ > write_targets += read_targets-1; > leaves write_targets=0 > > Then the following code: > if (write_targets == 0 || read_targets == 0) { > /* There is nowhere to write, so all non-sync > * drives must be failed - so we are finished > */ > sector_t rv = max_sector - sector_nr; > aborts the resync. Oops. > > Is this the intended behavior? Because it looks like this bit does not > take into account the bad-blocks existence. > I am not sure about which logic would solve it, but something like "If > we did not select a drive for reading because of bad blocks, and as a > result we have only one read_target, and MD_RECOVERY_REQUESTED, then > let's report only the bad block as skipped and not the whole drive". > Something like that, but there are probably many cases I am not > thinking about. > > What do you think? I don't see that MD_RECOVERY_REQUESTED is really an issue is it? We the wrong thing happen in any case where there just one device with no bad block, and at least one device with a bad block. In that case we want to skip forward by the number of bad blocks, not skip to the end of the array. So this? diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 240ff31..c443f93 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -2422,7 +2422,10 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp /* There is nowhere to write, so all non-sync * drives must be failed - so we are finished */ - sector_t rv = max_sector - sector_nr; + sector_t rv; + if (min_bad > 0) + max_sector = sector_nr + min_bad; + rv = max_sector - sector_nr; *skipped = 1; put_buf(r1_bio); return rv; Thanks, NeilBrown
Attachment:
signature.asc
Description: PGP signature