Re: ext4 file replace guarantees

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jun 21, 2013 at 08:43:16AM -0400, Ryan Lortie wrote:
> 
> On Thu, Jun 20, 2013, at 20:59, Theodore Ts'o wrote:
> > It's not _guaranteed_ safe.  It significantly reduces the chances of
> > data loss in case of a crash, but it's possible for the transaction
> > containing the rename to close before the blocks are written back.
> 
> I think you need to update the documentation.  Specifically, this:
> 
> "avoids the                     "zero-length" problem that can happen
> when a system crashes before the delayed allocation                     
>  blocks are forced to disk"
> 
> makes it sound like the replace-by-rename-without-fsync problem has been
> solved.  This is the statement that caused me to remove the extra
> fsync() we had in GLib.

I agree it can be read that way, although we were very careful to
avoid the word "guarantee".

> Note that "significantly reduces the chances of" is not good enough to
> prevent about a dozen reports of lost data that I alone have heard about
> in the past few days...
> 
> https://bugzilla.gnome.org/show_bug.cgi?id=701560#c30

So in at least a few of these bugs, people are failing to fsync()
after creatinig a new file.  The implied flush *only* happens in two
cases:

#1) If an existing file was opened using O_TRUNC, an implied flush is
    sent after the file is closed.

#2) If an existing file is removed via a rename, if there are any
    delayed allocation blocks in the new file, they will be flushed
    out.

One of the test cases created lots of new files, and that's not one of
the two cases shown above.

> It would be great if the docs would just said "If you want safety with
> ext4, it's going to be slow.  Please always call fsync()." instead of
> making it sound like I'll probably be mostly OKish if I don't.

This is going to be true for all file systems.  If application writers
are trying to surf the boundaries of what is safe or not, inevitably
they will eventually run into problems.

Also, although fsync() is not free, it's not the performance disaster
it was in ext3.  Precisely because of delayed allocation, we don't
have to flush out pending writes of unrelated files when you do a
fsync().

Finally, I strongly encourage you to think very carefully about your
strategy for storing these sorts of registry data.  Even if it is
"safe" for btrfs, if the desktop applications are constantly writing
back files for no good reason, it's going to burn battery, SSD write
cycles, and etc.  And if said registry files are large XML files that
have to be complete rewritten every single time an applicatoin
modifies an element or two (and said application is doing this several
times a seocnd), the user is going to have a bad time --- in shortened
battery and SSD life if nothing else.  And this is not a problem a
file system can protect you against, since while we can try to be more
clever about how we manage the metadata, the data blocks still have to
be written to disk.

The fact that you are trying to optimize out the fsync() makes me
wonder if there is something fundamentally flawed in the design of
either the application or its underlying libraries....

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux