Re: [PATCH STABLE 5.10 5.15 0/2] btrfs: raid56 backports to reduce destructive RMW

On 2022/8/4 19:26, Qu Wenruo wrote:


On 2022/8/4 18:25, Wang Yugui wrote:
Hi,

xfstests btrfs/158 triggered a panic after these 2 patches were applied.

btrfs-158-dmesg.txt
    dmesg output when panic
btrfs-158-dmesg-decoded.txt
    dmesg output decoded by decode_stacktrace.sh
    and some source code is added too.

Reproduction rate:
    not 100%, but it happened 2 times here.

Running './check -g scrub' seems to reproduce this problem at a higher
rate than './check test/btrfs/158'.

Also reproduced here running that in a loop.


linux kernel: 5.15.59 with some local backport patches too.

Got the reason pinned down, missing one dependency.

The code triggering the crash is "const u32 sectorsize =
fs_info->sectorsize", and @fs_info is from bioc.

But bioc initialization doesn't ensure every bioc has its fs_info
initialized.

That is only ensured by commit 731ccf15c952 ("btrfs: make sure
btrfs_io_context::fs_info is always initialized").

Wait, it can be done without that dependency, just use old
btrfs_raid_bio::fs_info.
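
To illustrate the two lookup paths described above, here is a minimal user-space sketch. It is not the kernel code: the structs are simplified stand-ins, and bioc_has_fs_info() and rbio_sectorsize() are hypothetical helper names invented for this example. It assumes the rbio carries its own fs_info pointer, as the note about btrfs_raid_bio::fs_info suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins; only the fields relevant to the crash are modelled. */
struct fs_info {
	uint32_t sectorsize;
};

struct io_context {
	/* May be left NULL on kernels lacking commit 731ccf15c952. */
	struct fs_info *fs_info;
};

struct raid_bio {
	struct fs_info *fs_info;	/* assumed valid for every rbio */
	struct io_context *bioc;
};

/* Hypothetical helper: nonzero if going through bioc would be safe. */
int bioc_has_fs_info(const struct raid_bio *rbio)
{
	return rbio->bioc != NULL && rbio->bioc->fs_info != NULL;
}

/* Dependency-free lookup: read the sector size through rbio->fs_info. */
uint32_t rbio_sectorsize(const struct raid_bio *rbio)
{
	assert(rbio->fs_info != NULL);
	return rbio->fs_info->sectorsize;
}
```

With a bioc whose fs_info was never set, bioc_has_fs_info() reports 0, while rbio_sectorsize() still returns the correct value because it never touches the bioc.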

Thanks,
Qu


So I also need to backport that patch.

Weirdly, I ran my tests with "-g raid -g replace -g scrub" but didn't
trigger this on even older branches.

I'll do more tests to make sure it doesn't cause problems.

Thanks,
Qu



Best Regards
Wang Yugui (wangyugui@xxxxxxxxxxxx)
2022/08/04

Hi Greg and Sasha,

These two patches are backports for the v5.15 and v5.10 stable branches
(for v5.10, the conflicts can be auto-resolved).

(For older branches from v4.9 to v5.4, due to some naming change,
although the patches can be applied with auto-resolve, they won't
compile).

These two patches reduce the chance of a destructive RMW cycle, where
btrfs can use corrupted data to generate new P/Q, thus making some
repairable data unrepairable.
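
As a rough illustration of why such a cycle is destructive, the toy model below uses single bytes as sectors and plain XOR as the parity (RAID5-style P only, no Q). All function names are hypothetical and chosen for this sketch; none of this is the raid56.c code.

```c
#include <assert.h>
#include <stdint.h>

/* RAID5-style parity over two data sectors, modelled as single bytes. */
uint8_t xor_parity(uint8_t d0, uint8_t d1)
{
	return d0 ^ d1;
}

/* Rebuild D0 from the other data sector and the parity. */
uint8_t rebuild_d0(uint8_t d1, uint8_t p)
{
	return d1 ^ p;
}

/*
 * Sub-stripe write to D1 done as read-modify-write: whatever D0 is read
 * from disk is trusted, and the parity is regenerated from it.
 */
uint8_t rmw_write_d1(uint8_t d0_on_disk, uint8_t new_d1)
{
	return xor_parity(d0_on_disk, new_d1);
}

/* Walk through the scenario; returns what a post-RMW rebuild of D0 yields. */
uint8_t rebuild_after_destructive_rmw(uint8_t good_d0, uint8_t bad_d0,
				      uint8_t old_d1, uint8_t new_d1)
{
	uint8_t p = xor_parity(good_d0, old_d1);

	/* Before the write, the intact parity can still repair D0. */
	assert(rebuild_d0(old_d1, p) == good_d0);

	/* The RMW reads the corrupted D0 and bakes it into the new parity. */
	p = rmw_write_d1(bad_d0, new_d1);
	return rebuild_d0(new_d1, p);
}
```

In this model, once the parity has been regenerated from the corrupted D0, a rebuild returns the corrupted value rather than the original one, so the good copy of D0 can no longer be recovered.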

These patches are more important than I initially thought, so
unfortunately the original commits were not CCed to stable.

Furthermore, due to recent refactors/renames, quite a few members touched
by those patches have changed, so they have to be backported manually.


One of the fastest ways to verify the behavior is the existing btrfs/125
test case from fstests (not in the auto group, AFAIK).

Qu Wenruo (2):
   btrfs: only write the sectors in the vertical stripe which has data
     stripes
   btrfs: raid56: don't trust any cached sector in
     __raid56_parity_recover()

  fs/btrfs/raid56.c | 74 ++++++++++++++++++++++++++++++++++++-----------
  1 file changed, 57 insertions(+), 17 deletions(-)

--
2.37.0




