Re: [PATCH] generic: skip dm-log-writes tests on XFS v5 superblock filesystems

On 2019/2/27 1:19 PM, Amir Goldstein wrote:
> On Wed, Feb 27, 2019 at 7:01 AM Qu Wenruo <wqu@xxxxxxx> wrote:
>>
>>
>>
>> On 2019/2/27 12:49 PM, Amir Goldstein wrote:
>> [snip]
>>>>> Indeed.
>>>>
>>>> May I ask a stupid question?
>>>>
>>>> How does it matter whether the device is clean or not?
>>>> Shouldn't the journal/metadata or whatever be self-contained?
>>>>
>>>
>>> Yes and no.
>>>
>>> The simplest example (not limited to xfs, and I'm not sure xfs works this way)
>>> is how you find the last valid journal commit entry. It should have a correct CRC
>>> and the largest LSN. But if you replay IO on top of an existing journal without
>>> wiping it first, then journal recovery will continue past the point you meant to
>>> replay, or worse.
>>
>> Doesn't the journal have some superblock-like structure that tells where to
>> stop replaying?
>>
> 
> Yes, it does. My statement was inaccurate.
> The point is: how can the filesystem trust that the journal superblock is not
> corrupted? If the filesystem observes data from the future that is not expected
> to be there, in the journal or in any checksummed metadata on the device, it may
> fail the sanity checks that verify the superblock or other metadata are not
> corrupted.
> 
> Suppose the filesystem has a bug that overwrites new metadata from time N
> with an old stale metadata buffer from time N-1. That would not look much
> different from a log-writes replay to time N-1 over a non-clean device from time N.
> 
>> If not, then indeed we need to discard the journal before writing a new one.
>>
> ...
> 
>>>> Am I missing something? Or have I been too poisoned by btrfs CoW?
>>>>
>>>
>>> I'd be very surprised if btrfs cannot be flipped by seeing stale data "from
>>> the future" in the block device. Seems to me like the entire concept of
>>> CoW and metadata checksums is completely subverted by the existence
>>> of correct checksums on "stale metadata from the future".
>>
>> It seems that metadata CoW makes it impossible to see future data.
>>
>> All btrees are updated via CoW, so no metadata is overwritten
>> during a transaction.
>> Only the superblock is overwritten, and normally the superblock is updated
>> atomically.
>>
>> So either the old superblock is still there and all we can see is the old
>> tree pointers, or the new superblock is there and all we can see is the new
>> tree pointers. And since new metadata is never written over old metadata,
>> there is no way to see future metadata.
>>
> 
> Those assumptions could fail if you have unreliable hardware that
> reorders IO across FUA or just drops IO on the floor, or if there is a bug
> in the filesystem or block layer.

Well, if the hardware has a problem, we can't really do anything to help.

Maybe that's the reason why there are more corruption reports for btrfs:
there is more problematic hardware than we thought?

> 
> Existence of metadata from the future could look like any of
> the above has happened.

Btrfs has an extra layer to prevent such problems from happening: each
metadata pointer records the generation it expects of the block it points to.

If a metadata block's generation does not match what its parent expects,
then the kernel will know something went wrong.

And in fact, that's the most common failure mode for btrfs (although
in most cases it is seeing metadata that is too old), and we're looking
into the problem (but without much progress yet).
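
To make the generation check concrete, here is a minimal Python sketch.
It is not the kernel code: the names BlockPtr, TreeBlock and read_child
are made up for illustration, and it assumes the block's checksum has
already been verified, so only the generation comparison is shown.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BlockPtr:
    """Hypothetical parent-side pointer: child location plus expected generation."""
    bytenr: int
    generation: int


@dataclass(frozen=True)
class TreeBlock:
    """Hypothetical tree block header as found on disk (checksum assumed valid)."""
    bytenr: int
    generation: int


class GenerationMismatch(Exception):
    """Raised when a child block's generation differs from what its parent expects."""


def read_child(disk: Dict[int, TreeBlock], ptr: BlockPtr) -> TreeBlock:
    """Follow a parent pointer and reject a child whose generation does not match."""
    block = disk[ptr.bytenr]
    if block.generation != ptr.generation:
        if block.generation < ptr.generation:
            kind = "stale (too old)"
        else:
            kind = "from the future"
        raise GenerationMismatch(
            f"block {ptr.bytenr}: wanted generation {ptr.generation}, "
            f"found {block.generation} ({kind})"
        )
    return block


if __name__ == "__main__":
    # Device left over from time N: the block at 4096 carries generation 11.
    disk = {4096: TreeBlock(bytenr=4096, generation=11)}
    # A parent replayed to time N-1 still expects generation 10 there.
    try:
        read_child(disk, BlockPtr(bytenr=4096, generation=10))
    except GenerationMismatch as err:
        print(err)
```

In this sketch a block with a perfectly valid checksum is still rejected,
in both directions: too old, or "from the future" as in the non-clean
replay case discussed above.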

Thanks,
Qu

> 
> Thanks,
> Amir.
> 


