Re: Documenting the crash consistency guarantees of file systems

On Wed, Feb 13, 2019 at 12:22 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Wed, Feb 13, 2019 at 7:06 PM Jayashree Mohan <jaya@xxxxxxxxxxxxx> wrote:
> >
> > Hi Amir!
> >
> > Thanks for putting across your thoughts on this. Your suggestions
> > definitely make sense, and we'll compile this information and submit
> > a patch for review.
> >
> > When it comes to strictly ordered metadata consistency, to the best of
> > our knowledge only xfs claims to provide it explicitly. In ext4,
> > delayed allocation and fsync of a file not persisting all its hard
> > links[1] are examples of violations of strictly ordered metadata
> > consistency, right?
>
> No, I don't think they are.
> At least that is not how I understand what Ted wrote.
>
> > And for btrfs, they don't seem to be explicit about
> > providing such semantics. Look at this thread[2] for example, owing to
> > the lack of specification, btrfs does not commit to providing such
> > guarantees.
>
> The discussion is not about ordered metadata, it is about what
> fsync(file) should do. They are related if we decide that fsync(file)
> should persist nlink, but I think all fs maintainers are in agreement
> that it doesn't matter and btrfs's choice is as valid as the ext4/xfs choice.
>
> That said, I don't know if btrfs does strictly ordered metadata or not.
> Ordered metadata means that if a user does op A then op B, you should
> not be able to see the consequence of op B after a crash without
> seeing the consequence of op A.
>
> Can you give a counterexample for btrfs? For ext4?

My understanding of strictly ordered metadata is that if op A precedes
op B in program order (in-memory execution), then op A should precede
op B in persistence order. As you say, one should not observe op B on
storage without op A. Note that we don't say anything about whether
fsync was called on op A or op B.
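
To make that definition concrete, here is a minimal sketch of it as a
check over a post-crash state. The function name and the
representation of operations as plain labels are my own assumptions
for illustration, not anything a file system or test tool provides:

```python
def ordering_violations(program_order, persisted):
    """Return (missing_op, visible_op) pairs that break strict ordering.

    program_order: operations in the order the program issued them.
    persisted: operations whose effects are visible after the crash.
    A pair is reported when a later op is visible but an earlier one is not.
    """
    persisted = set(persisted)
    violations = []
    for i, earlier in enumerate(program_order):
        if earlier in persisted:
            continue
        # 'earlier' was lost; any later op that survived is a violation.
        for later in program_order[i + 1:]:
            if later in persisted:
                violations.append((earlier, later))
    return violations
```

Under this reading, the only acceptable post-crash states are prefixes
of the program order, regardless of where fsync was called.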

I remember this old conversation from our ALICE work that btrfs does
not persist things in order:
https://www.spinics.net/lists/linux-btrfs/msg32215.html

If you do the following:

create file foo
write to file foo
rename bar to baz
CRASH

and then you see baz but not foo on storage, that is a violation of
strictly ordered semantics. ext4 violates this due to delayed
allocation. So it does not provide strictly ordered metadata?
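
For anyone who wants to try this, the sequence above could be sketched
roughly as follows. This is only an illustration: the helper name is
made up, the script assumes bar has to be created and persisted
beforehand so that the rename has something to act on, and the crash
itself has to be injected externally (the script cannot simulate it).

```python
import os
import tempfile


def run_workload(directory):
    """Issue the create/write/rename sequence with no fsync on foo."""
    foo = os.path.join(directory, "foo")
    bar = os.path.join(directory, "bar")
    baz = os.path.join(directory, "baz")

    # Setup (assumption, not part of the sequence): bar exists and is on disk.
    with open(bar, "w") as f:
        f.write("old contents")
        f.flush()
        os.fsync(f.fileno())

    # Op A: create file foo and write to it. Deliberately no fsync.
    fd = os.open(foo, os.O_CREAT | os.O_WRONLY, 0o644)
    os.write(fd, b"hello")
    os.close(fd)

    # Op B: rename bar to baz. Deliberately no fsync of the directory.
    os.rename(bar, baz)

    # A crash would be injected here. What we return is only the in-memory
    # view, which always reflects both operations.
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(run_workload(d))
```

After recovery, seeing baz without foo (or without foo's data) would be
the out-of-order state described above.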

AFAIK, any file system which persists things out of order to increase
performance does not provide strictly ordered metadata semantics.
These semantics seem to indicate a total ordering among all
operations, and an fsync should persist all previous operations (as
ext3 used to do).

Note that Jayashree and I aren't arguing file systems should provide
these semantics, merely that ext4 and btrfs violate them at certain
points.


