On Mon, 2018-05-28 at 12:00 +0200, Greg Kroah-Hartman wrote: > 4.4-stable review patch. If anyone has any objections, please let me know. > > ------------------ > > From: Liu Bo <bo.li.liu@xxxxxxxxxx> > > [ Upstream commit 762221f095e3932669093466aaf4b85ed9ad2ac1 ] The diff here is actually from commit 8810f7517a3b ("Btrfs: make raid6 rebuild retry more", mentioned in this commit message). (Sasha, please try to work out why commit messages and descriptions are getting mixed up in your auto-selections.) Maybe stable branches should get the real commit 762221f095e3 as well? Ben. > The raid6 corruption is that, > suppose that all disks can be read without problems and if the content > that was read out doesn't match its checksum, currently for raid6 > btrfs at most retries twice, > > - the 1st retry is to rebuild with all other stripes, it'll eventually > be a raid5 xor rebuild, > - if the 1st fails, the 2nd retry will deliberately fail parity p so > that it will do raid6 style rebuild, > > however, the chances are that another non-parity stripe content also > has something corrupted, so that the above retries are not able to > return correct content. > > We've fixed normal reads to rebuild raid6 correctly with more retries > in Patch "Btrfs: make raid6 rebuild retry more"[1], this is to fix > scrub to do the exactly same rebuild process. > > [1]: https://patchwork.kernel.org/patch/10091755/ > > Signed-off-by: Liu Bo <bo.li.liu@xxxxxxxxxx> > Signed-off-by: David Sterba <dsterba@xxxxxxxx> > Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> > --- > fs/btrfs/raid56.c | 18 ++++++++++++++---- > fs/btrfs/volumes.c | 9 ++++++++- > 2 files changed, 22 insertions(+), 5 deletions(-) > > --- a/fs/btrfs/raid56.c > +++ b/fs/btrfs/raid56.c > @@ -2160,11 +2160,21 @@ int raid56_parity_recover(struct btrfs_r > } > > /* > - * reconstruct from the q stripe if they are > - * asking for mirror 3 > + * Loop retry: > + * for 'mirror == 2', reconstruct from all other stripes. > + * for 'mirror_num > 2', select a stripe to fail on every retry. > */ > - if (mirror_num == 3) > - rbio->failb = rbio->real_stripes - 2; > + if (mirror_num > 2) { > + /* > + * 'mirror == 3' is to fail the p stripe and > + * reconstruct from the q stripe. 'mirror > 3' is to > + * fail a data stripe and reconstruct from p+q stripe. > + */ > + rbio->failb = rbio->real_stripes - (mirror_num - 1); > + ASSERT(rbio->failb > 0); > + if (rbio->failb <= rbio->faila) > + rbio->failb--; > + } > > ret = lock_stripe_add(rbio); > > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -5056,7 +5056,14 @@ int btrfs_num_copies(struct btrfs_fs_inf > else if (map->type & BTRFS_BLOCK_GROUP_RAID5) > ret = 2; > else if (map->type & BTRFS_BLOCK_GROUP_RAID6) > - ret = 3; > + /* > + * There could be two corrupted data stripes, we need > + * to loop retry in order to rebuild the correct data. > + * > + * Fail a stripe at a time on every retry except the > + * stripe under reconstruction. > + */ > + ret = map->num_stripes; > else > ret = 1; > free_extent_map(em); -- Ben Hutchings, Software Developer Codethink Ltd https://www.codethink.co.uk/ Dale House, 35 Dale Street Manchester, M1 2HF, United Kingdom