Re: BUG: RAID6 recovery broken by commit 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef (Linux 5.1.3)

Hi Thorsten,

On Wed, May 22, 2019 at 9:19 AM Song Liu <liu.song.a23@xxxxxxxxx> wrote:
>
> Hi Thorsten,
>
> Thanks for the report. I will follow up with stable@ to fix them.
>
> Best regards,
> Song

Could you please confirm that the following patches fix the issue?

commit a25d8c327bb4 ("Revert "Don't jump to compute_result state from
check_result state"")
commit b2176a1dfb51 ("md/raid: raid5 preserve the writeback action
after the parity check")

Thanks,
Song


>
> On Wed, May 22, 2019 at 5:26 AM Thorsten Knabe <linux@xxxxxxxxxxxxxxxxx> wrote:
> >
> > Hello.
> >
> > BUG: RAID6 recovery broken by commit
> > 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef (Linux 5.1.3+)
> >
> > Replacing a failed disk of a MD RAID6 array causes file system
> > corruption and data loss on kernels containing commit
> > 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef.
> >
> > Affected kernels: 5.1.3, 5.1.4, possibly others.
> > Unaffected kernels: 5.1.2
> >
> > OS: Debian stretch amd64
> >
> > Steps to reproduce the BUG:
> >
> > 1. Create a new 4-disk RAID6 array, create a file system and mount it:
> >    mdadm /dev/md0 --create -l 6 -n 4 /dev/sd[bcde]
> >    mkfs.ext4 /dev/md0
> >    mount /dev/md0 /mnt
> > 2. Store some data (a few GB should be fine) on the RAID6 array's file
> > system:
> >    cp -r whatever /mnt
> > 3. Fail a disk of the RAID6 array and remove it from the array:
> >    mdadm /dev/md0 --fail /dev/sdd
> >    mdadm /dev/md0 --remove /dev/sdd
> > 4. Drop caches:
> >    echo "3" > /proc/sys/vm/drop_caches
> > 5. Compare data copied to the RAID6 array in step 2 with its source:
> >    diff -r whatever /mnt/whatever
> >    There should be no differences and no file system errors.
> > 6. Add a new empty disk to the RAID6 array:
> >    mdadm /dev/md0 --add /dev/sdf
> > 7. RAID6 recovery should start now. Wait for the recovery to finish.
> > 8. Drop caches again:
> >    echo "3" > /proc/sys/vm/drop_caches
> > 9. Compare data copied to the RAID6 array in step 2 with its source again:
> >    diff -r whatever /mnt/whatever
> >    diff now reports a lot of differences and the kernel log gets filled
> > with file system errors. For example:
> >    EXT4-fs warning (device md0): ext4_dirent_csum_verify:355: inode
> > #918549: comm diff: No space for directory leaf checksum. Please run
> > e2fsck -D.
> >
> > Reverting commit 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef from kernel
> > 5.1.4 resolves the issues described above.
> >
> > Kind regards
> > Thorsten
> >
> >
> > --
> > ___
> >  |        | /                 E-Mail: linux@xxxxxxxxxxxxxxxxx
> >  |horsten |/\nabe                WWW: http://linux.thorsten-knabe.de
> >
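
For retesting, the quoted steps can be collected into one script. This is
only a minimal sketch, assuming the device names, mount point and source
directory from the report. The run helper and the DRY_RUN variable are
illustrative and not part of the report, and "mdadm --wait" is used here
as one way to wait for recovery in step 7.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report.
# WARNING: with DRY_RUN=0 this overwrites the disks listed below.
DRY_RUN="${DRY_RUN:-1}"   # assumption: by default only print the commands
MD=/dev/md0
DISKS="/dev/sdb /dev/sdc /dev/sdd /dev/sde"
FAILED=/dev/sdd
SPARE=/dev/sdf
SRC=whatever              # placeholder source directory, as in the report
MNT=/mnt

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

drop_caches() { run sh -c 'echo 3 > /proc/sys/vm/drop_caches'; }
compare()     { run diff -r "$SRC" "$MNT/$SRC"; }

reproduce() {
    run mdadm "$MD" --create -l 6 -n 4 $DISKS   # step 1
    run mkfs.ext4 "$MD"
    run mount "$MD" "$MNT"
    run cp -r "$SRC" "$MNT"                     # step 2
    run mdadm "$MD" --fail "$FAILED"            # step 3
    run mdadm "$MD" --remove "$FAILED"
    drop_caches                                 # step 4
    compare                                     # step 5: expect no differences
    run mdadm "$MD" --add "$SPARE"              # step 6
    # step 7: may return early if recovery has not started yet;
    # check /proc/mdstat before continuing.
    run mdadm --wait "$MD"
    drop_caches                                 # step 8
    compare                                     # step 9: differences indicate the bug
}

reproduce
```

Run as-is, the script only prints the twelve commands. Setting DRY_RUN=0
executes them as root and destroys any data on the listed disks.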


