Re: Fedora 33 System-Wide Change proposal: Make btrfs the default file system for desktop variants

On 7/1/20 12:50 PM, Chris Murphy wrote:

...

> Integrity checking is highly valued by some and less by others.
> Considering that we know hardware isn't 100% reliable, and doesn't
> always report its own failures as expected, and hence why most file
> systems now at least checksum metadata, it's not persuasive to me that
> the data should be left unchecked, and corruption ought to be handled
> by user space somehow.

There's a flip side to this coin - in my experience, if the right btrfs
metadata blocks experience this disk corruption, there can be
a complete inability to recover the btrfs filesystem from that error -
i.e. it won't mount, and btrfsck --repair won't get it to a mountable
state.

So if we're saying disk corruption happens often enough that data
checksumming is critical, then it happens often enough that metadata
recovery is at least as critical.

I've been trying to quantify this and have not come up with a particularly
compelling test scenario, because it involves purposefully (though at random)
corrupting enough blocks on a filesystem image that a critical block gets
hit, so it looks synthetic.  But the net result is frequently a filesystem
where btrfsck and/or mount fails, and at first blush this type of failure
happens much more often than on other filesystems.[1]

I think Josef has alluded to this situation as well.  To me, that's a big
concern.  Not trying to be a wet blanket here but I think this needs to be
carefully investigated and evaluated to understand what impact it may have
on Fedora btrfs users and their ability to recover their data in the face
of metadata corruption, because it looks to me like a definite btrfs weak
spot.

-Eric

[1] some details - I used the mangle.c fuzzer from fsfuzzer, and modified
it so that it corrupts 8192 bytes of an image, which in fs terms
can be up to 8192 filesystem blocks.  I also avoided the first 4k so that
any filesystem signature was not damaged.
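For illustration, the corruption step described above could be sketched roughly like this in Python. This is not the modified mangle.c from fsfuzzer; the function name, the seed parameter, and the choice to overwrite each byte with a random value are assumptions made for the sketch. Only the 8192-byte count and the 4k skip come from the description.

```python
import os
import random

SKIP_BYTES = 4096      # leave the first 4k alone so the fs signature survives
CORRUPT_BYTES = 8192   # number of single-byte overwrites per image


def mangle_image(path, count=CORRUPT_BYTES, skip=SKIP_BYTES, seed=None):
    """Overwrite `count` randomly chosen bytes of `path` past offset `skip`.

    Hypothetical helper: offsets may repeat, and a random byte may equal
    the byte it replaces, so at most `count` bytes actually change.
    """
    rng = random.Random(seed)
    size = os.path.getsize(path)
    if size <= skip:
        raise ValueError("image is too small to corrupt past the skipped region")
    with open(path, "r+b") as img:
        for _ in range(count):
            offset = rng.randrange(skip, size)  # never touches [0, skip)
            img.seek(offset)
            img.write(bytes([rng.randrange(256)]))
```

Because each overwrite lands at an independent random offset, every corrupted byte can fall in a different block, which is where the "up to 8192 filesystem blocks" upper bound comes from.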

I then ran a loop where I created a 1G base image, populated it, fuzzed it
in this way (so up to 3% of blocks were damaged), ran the filesystem's
fsck utility (in btrfs' case, btrfsck --repair), and then tried to mount it
(in btrfs' case, with a bare mount, then with -o usebackuproot if that failed).
If it mounted, I used "find | wc" to see how many files were reachable vs
the original image.
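The "up to 3%" figure can be checked with a little arithmetic. The sketch below assumes a 4 KiB filesystem block size, which the description does not state; with a different block size the percentage changes accordingly.

```python
IMAGE_BYTES = 1 << 30        # 1G base image, as in the test loop
BLOCK_BYTES = 4096           # assumed block size, not stated in the post
CORRUPTED_BYTES = 8192       # bytes overwritten per fuzzing pass

total_blocks = IMAGE_BYTES // BLOCK_BYTES
# Worst case: every corrupted byte lands in a different block.
worst_case_fraction = min(CORRUPTED_BYTES, total_blocks) / total_blocks
print(f"{total_blocks} blocks, at most {worst_case_fraction:.3%} damaged")
# prints "262144 blocks, at most 3.125% damaged"
```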

If either fsck or mount reported an exit code that reflects failure to
complete properly, I recorded that.
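As a rough sketch of that bookkeeping, a helper along these lines could run a command and note whether its exit code counts as a failure. The helper name and the `failure_codes` parameter are hypothetical, and which codes mean "failed to complete" is deliberately left to the caller, since tools differ (e2fsck, for instance, returns a nonzero code when it has corrected errors). The actual bash script may do this quite differently.

```python
import subprocess


def run_and_record(cmd, failure_codes=None):
    """Run `cmd` (a list of arguments) and return (exit_code, failed).

    failure_codes: set of exit codes to treat as failure. If None, any
    nonzero exit code is treated as failure (an assumption; adjust per tool).
    """
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    code = result.returncode
    if failure_codes is None:
        failed = code != 0
    else:
        failed = code in failure_codes
    return code, failed
```

A call such as `run_and_record(["btrfsck", "--repair", "image.img"])` would need a real image path and suitable privileges; the image name here is a placeholder.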

It was a quick hack, and it's not beautiful, so there are probably holes
to be poked in it; if you want to look, I threw the bash script and the C
source up at https://people.redhat.com/esandeen/fsckfuzzer/

Running 10 loops on each of btrfs, ext4, and xfs, I got results that look
like this (ext4 always creates an empty lost+found, so it will always find at
least 1 file there):

btrfs

fsck failed
0 files in lost+found, 628 files gone/unreachable
0 files in lost+found, 0 files gone/unreachable
526 files in lost+found, 9 files gone/unreachable
595 files in lost+found, 55 files gone/unreachable
53 files in lost+found, 8 files gone/unreachable
57 files in lost+found, 44 files gone/unreachable
fsck failed
7 files in lost+found, 1491 files gone/unreachable
fsck failed, mount failed
fsck failed, mount failed
88 files in lost+found, 40 files gone/unreachable
== 4 fsck failures, 2 mount failures

ext4

1 files in lost+found, 0 files gone/unreachable
1 files in lost+found, 0 files gone/unreachable
164 files in lost+found, 2 files gone/unreachable
1 files in lost+found, 0 files gone/unreachable
1 files in lost+found, 0 files gone/unreachable
1 files in lost+found, 1 files gone/unreachable
1 files in lost+found, 0 files gone/unreachable
9 files in lost+found, 1 files gone/unreachable
1 files in lost+found, 0 files gone/unreachable
1 files in lost+found, 0 files gone/unreachable
== 0 fsck failures, 0 mount failures

xfs

0 files in lost+found, 1 files gone/unreachable
0 files in lost+found, 0 files gone/unreachable
958 files in lost+found, 629 files gone/unreachable
0 files in lost+found, 0 files gone/unreachable
2 files in lost+found, 0 files gone/unreachable
0 files in lost+found, 1 files gone/unreachable
0 files in lost+found, 0 files gone/unreachable
0 files in lost+found, 0 files gone/unreachable
8 files in lost+found, 1 files gone/unreachable
3 files in lost+found, -1 files gone/unreachable
== 0 fsck failures, 0 mount failures
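The "==" summary lines can be reproduced from the per-run lines by counting how many of them mention each kind of failure. Here is a minimal sketch, using the btrfs output above as input; the function name is made up for illustration and is not part of the posted script.

```python
def summarize(lines):
    """Count per-run lines that report an fsck or a mount failure."""
    fsck_failures = sum("fsck failed" in line for line in lines)
    mount_failures = sum("mount failed" in line for line in lines)
    return f"== {fsck_failures} fsck failures, {mount_failures} mount failures"


btrfs_lines = [
    "fsck failed",
    "0 files in lost+found, 628 files gone/unreachable",
    "0 files in lost+found, 0 files gone/unreachable",
    "526 files in lost+found, 9 files gone/unreachable",
    "595 files in lost+found, 55 files gone/unreachable",
    "53 files in lost+found, 8 files gone/unreachable",
    "57 files in lost+found, 44 files gone/unreachable",
    "fsck failed",
    "7 files in lost+found, 1491 files gone/unreachable",
    "fsck failed, mount failed",
    "fsck failed, mount failed",
    "88 files in lost+found, 40 files gone/unreachable",
]

print(summarize(btrfs_lines))
# prints "== 4 fsck failures, 2 mount failures"
```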





