[TOPIC] Extending the filesystem crash recovery guarantees contract

Suggestion for another filesystems track topic.

Some of you may remember the emotional(?) discussions that ensued
when the CrashMonkey developers embarked on a mission to document
and verify filesystem crash recovery guarantees:

https://lore.kernel.org/linux-fsdevel/CAOQ4uxj8YpYPPdEvAvKPKXO7wdBg6T1O3osd6fSPFKH9j=i2Yg@xxxxxxxxxxxxxx/

There are two camps among filesystem developers, and each camp has
good arguments: one for wanting to document existing behavior, the
other for not wanting to document anything beyond "use fsync if you
want any guarantee".

I would like to take a suggestion proposed by Jan in a related discussion:
https://lore.kernel.org/linux-fsdevel/CAOQ4uxjQx+TO3Dt7TA3ocXnNxbr3+oVyJLYUSpv4QCt_Texdvw@xxxxxxxxxxxxxx/

and make a proposal that may be able to meet the concerns of
both camps.

The proposal is to add new APIs which communicate
crash consistency requirements of the application to the filesystem.

An example API could look like this:
renameat2(..., RENAME_METADATA_BARRIER | RENAME_DATA_BARRIER)
It's just an example. The API could take another form and may need
more barrier types (I proposed using new sync_file_range() flags).
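
To make the example a bit more concrete, here is a minimal sketch of
how an application might use such flags to replace a file. The flag
values and the helper name replace_file() are hypothetical: no kernel
defines RENAME_METADATA_BARRIER or RENAME_DATA_BARRIER today, so on
current kernels renameat2() is expected to reject them with EINVAL and
the sketch falls back to the usual fsync-then-rename pattern. Note that
the fallback is stronger than a barrier, because fsync forces
writeback immediately instead of only ordering it before the rename.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* HYPOTHETICAL flag values, chosen only for illustration.
 * They are not part of any kernel UAPI. */
#define RENAME_METADATA_BARRIER (1U << 30)
#define RENAME_DATA_BARRIER     (1U << 29)

/*
 * Replace 'dst' with 'tmp'. Returns 0 on success, -1 with errno set.
 * Intended outcome after a crash: if the rename is observed, the
 * data and metadata of the renamed file are observed as well.
 */
int replace_file(const char *tmp, const char *dst)
{
    assert(tmp != NULL && dst != NULL);

    /* Proposed usage: pass the crash consistency requirement along
     * with the rename itself. */
    long ret = syscall(SYS_renameat2, AT_FDCWD, tmp, AT_FDCWD, dst,
                       RENAME_METADATA_BARRIER | RENAME_DATA_BARRIER);
    if (ret == 0)
        return 0;
    if (errno != EINVAL && errno != ENOSYS)
        return -1;

    /* Fallback for kernels that do not know the flags:
     * fsync the file, then do a plain rename. */
    int fd = open(tmp, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fsync(fd) < 0) {
        int saved_errno = errno;
        close(fd);
        errno = saved_errno;
        return -1;
    }
    close(fd);
    return rename(tmp, dst);
}
```

An application would write the new contents to a temporary file in the
same directory and then call replace_file("foo.tmp", "foo").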

The idea is simple, though.
METADATA_BARRIER means that all the inode metadata will be observed
after a crash if the rename is observed after the crash.
DATA_BARRIER means the same for file data.
We may also want an "ALL_METADATA_BARRIER" and/or a
"METADATA_DEPENDENCY_BARRIER" to more accurately describe what
SOMC (strictly ordered metadata consistency) guarantees actually
provide today.

The implementation is also simple. Filesystems that currently
have SOMC behavior don't need to do anything to respect
METADATA_BARRIER and only need to call
filemap_write_and_wait_range() to respect DATA_BARRIER.
Filesystem developers are thus not tying their hands w.r.t. future
performance optimizations for operations that do not explicitly
request a barrier.
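
For illustration, the closest thing an application can do from
userspace today is to start writeback of the file's dirty pages, wait
for it, and only then rename. The sketch below does that with
sync_file_range(), falling back to fdatasync() if that call fails. It
is an approximation under stated assumptions, not the proposed
implementation: sync_file_range() does not flush the device write
cache and does not write out metadata, and whether the rename is
ordered after the data still depends on the filesystem. The helper
name rename_after_data_writeback() is made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write back the dirty pages of 'tmp', wait for completion, then
 * rename it over 'dst'. Returns 0 on success, -1 with errno set.
 */
int rename_after_data_writeback(const char *tmp, const char *dst)
{
    assert(tmp != NULL && dst != NULL);

    int fd = open(tmp, O_RDONLY);
    if (fd < 0)
        return -1;

    /* offset 0, nbytes 0: the whole file, through end of file. */
    int ret = sync_file_range(fd, 0, 0,
                              SYNC_FILE_RANGE_WAIT_BEFORE |
                              SYNC_FILE_RANGE_WRITE |
                              SYNC_FILE_RANGE_WAIT_AFTER);
    if (ret < 0)
        ret = fdatasync(fd);  /* stronger than needed, but widely available */

    int saved_errno = errno;
    close(fd);
    if (ret < 0) {
        errno = saved_errno;
        return -1;
    }
    return rename(tmp, dst);
}
```

The point of a DATA_BARRIER flag would be to let the filesystem do the
equivalent internally, only for callers that ask for it.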

Thanks,
Amir.


