Re: Symlink not persisted even after fsync

On Mon, Apr 16, 2018 at 7:07 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Sun, Apr 15, 2018 at 07:10:52PM -0500, Vijay Chidambaram wrote:
>> Thanks! As I mentioned before, this is useful. I have a follow-up
>> question. Consider the following workload:
>>
>>  creat foo
>>  link (foo, A/bar)
>>  fsync(foo)
>>  crash
>>
>> In this case, after the file system recovers, do we expect foo's link
>> count to be 2 or 1?
>
> So, strictly ordered behaviour:
>
> create foo:
>         - creates dirent in inode B and new inode A in an atomic
>           transaction sequence #1
>
> link foo -> A/bar
>         - creates dirent in inode C and bumps inode A link count in
>           an atomic transaction sequence #2.
>
> fsync foo
>         - looks at inode A, sees its "last modification" sequence
>           counter as #2
>         - flushes all transactions up to and including #2 to the
>           journal.
>
> See the dependency chain? Both the inodes and dirents in the create
> operation and the link operation are chained to the inode foo via
> the atomic transactions. Hence when we flush foo, we also flush the
> dependent changes because of the change atomicity requirements....
>
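
The dependency chain described above can be sketched as a toy model. This
is only an illustration under stated assumptions, not how any real journal
is implemented: the Journal class, its commit() and fsync() methods, and
the third "unrelated change" transaction are all hypothetical names made up
for the example. The object labels (inode A, inode B, inode C) follow the
quoted text.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy model of strictly ordered metadata journaling (hypothetical)."""
    next_seq: int = 1
    pending: list = field(default_factory=list)   # (seq, description), in memory only
    durable: list = field(default_factory=list)   # (seq, description), flushed
    last_mod: dict = field(default_factory=dict)  # object -> last seq that touched it

    def commit(self, description, touched):
        """Record one atomic transaction that modifies every object in `touched`."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, description))
        for obj in touched:
            self.last_mod[obj] = seq
        return seq

    def fsync(self, obj):
        """Flush every pending transaction up to the object's last modification."""
        target = self.last_mod.get(obj, 0)
        self.durable.extend(t for t in self.pending if t[0] <= target)
        self.pending = [t for t in self.pending if t[0] > target]
        return target


journal = Journal()
journal.commit("create foo: dirent in inode B, new inode A", ["inode B", "inode A"])
journal.commit("link foo -> A/bar: dirent in inode C, nlink bump on inode A",
               ["inode C", "inode A"])
journal.commit("unrelated change (assumed, for contrast)", ["inode Z"])

flushed_up_to = journal.fsync("inode A")
print(flushed_up_to, [seq for seq, _ in journal.durable])
```

In this model, fsync on inode A makes transactions #1 and #2 durable
together, so the dirent for A/bar and the link count change are either both
present or both absent, while the later unrelated transaction stays pending.
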
>> I would say 2,
>
> Correct, for strict ordering. But....
>
>> but POSIX is silent on this,
>
> Well, it's not silent, POSIX explicitly allows for fsync() to do
> nothing and report success. Hence we can't really look to POSIX to
> define how fsync() should behave.
>
>> so
>> thought I would confirm. The tricky part here is we are not calling
>> fsync() on directory A.
>
> Right. But directory A has a dependent change linked to foo. If we
> fsync() foo, we are persisting the link count change in that file,
> and hence all the other changes related to that link count change
> must also be flushed. Similarly, all the changes related to the
> creation of foo must be flushed, too.
>
>> In this case, it's not a symlink; it's a hard link, so I would say the
>> link count for foo should be 2.
>
> Right - that's the "reference counted object dependency" I referred
> to. i.e. it's a bidirectional atomic dependency - either we show both
> the new dirent and the link count change, or we show neither of
> them.  Hence fsync on one object implies that we are also persisting
> the related changes in the other object, too.
>
>> But btrfs and F2FS show link count of
>> 1 after a crash.
>
> That may be valid if the dirent A/bar does not exist after recovery,
> but it also means fsync() hasn't actually guaranteed inode changes
> made prior to the fsync to be persistent on disk. i.e. that's a
> violation of ordered metadata semantics and probably a bug.

Great, this matches our understanding perfectly. We have separately
posted to the btrfs mailing list to confirm it is a bug. Thanks!
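
For reference, the workload discussed above can be reproduced roughly as
follows. This is a minimal sketch, not the actual test we ran: the
run_workload() helper is a hypothetical name, it assumes directory A has to
be created first (the workload above takes it as given), and it cannot
simulate the crash itself. It only issues the same system calls in the same
order and reports the link count seen before any crash.

```python
import os
import tempfile


def run_workload(root):
    """creat foo; link(foo, A/bar); fsync(foo). Returns foo's link count."""
    foo = os.path.join(root, "foo")
    dir_a = os.path.join(root, "A")
    os.mkdir(dir_a)                                  # assumed setup step
    fd = os.open(foo, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        os.link(foo, os.path.join(dir_a, "bar"))     # hard link, not a symlink
        os.fsync(fd)                                 # fsync foo only, never directory A
        return os.fstat(fd).st_nlink
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print("link count before crash:", run_workload(root))
```

To check the post-crash state, the crash would have to be injected right
after the fsync() by some external means (for example a power-cut or
block-layer replay tool), and the link count of foo and the existence of
A/bar inspected after recovery.
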