I hate to nag... but looking for feedback on this change, which addresses what
seems to me to be a serious bug.
Thanks,
Nate
On 07/29/2015 04:46 PM, Joe Lawrence wrote:
On 07/28/2015 03:28 PM, Nate Dailey wrote:
If a bitmap recovery is interrupted and later restarted, then
sectors below the recovery offset, written between interruption
and resumption, will not be copied. This results in corruption.
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=777511
for a script that can be used to repro this.
Seems like ignoring the recovery_offset if a bitmap exists is
the way to go.
Signed-off-by: Nate Dailey <nate.dailey@xxxxxxxxxxx>
---
drivers/md/md.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 0c2a4e8..79c6285 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7738,16 +7738,18 @@ void md_do_sync(struct md_thread *thread)
else {
/* recovery follows the physical size of devices */
max_sectors = mddev->dev_sectors;
- j = MaxSector;
- rcu_read_lock();
- rdev_for_each_rcu(rdev, mddev)
- if (rdev->raid_disk >= 0 &&
- !test_bit(Faulty, &rdev->flags) &&
- !test_bit(In_sync, &rdev->flags) &&
- rdev->recovery_offset < j)
- j = rdev->recovery_offset;
- rcu_read_unlock();
-
+ /* we don't use the offset if there's a bitmap */
+ if (!mddev->bitmap) {
+ j = MaxSector;
+ rcu_read_lock();
+ rdev_for_each_rcu(rdev, mddev)
+ if (rdev->raid_disk >= 0 &&
+ !test_bit(Faulty, &rdev->flags) &&
+ !test_bit(In_sync, &rdev->flags) &&
+ rdev->recovery_offset < j)
+ j = rdev->recovery_offset;
+ rcu_read_unlock();
+ }
/* If there is a bitmap, we need to make sure all
* writes that started before we added a spare
* complete before we start doing a recovery.
@@ -7756,7 +7758,7 @@ void md_do_sync(struct md_thread *thread)
* recovery has checked that bit and skipped that
* region.
*/
- if (mddev->bitmap) {
+ else {
mddev->pers->quiesce(mddev, 1);
mddev->pers->quiesce(mddev, 0);
}
[+cc Ben & Cyril from the Debian bug report]
-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html