Re: [Lsf-pc] [LSF/MM TOPIC] Working towards better power fail testing

Dave Chinner <david@xxxxxxxxxxxxx> · Tue, 6 Jan 2015 10:27:43 +1100

On Mon, Jan 05, 2015 at 02:26:30PM -0800, Sage Weil wrote:
> On Tue, 6 Jan 2015, Dave Chinner wrote:
> > Again, this is probably more a misunderstanding of FIEMAP than
> > anything. FIEMAP is *advisory* and gives no output accuracy
> > guarantees as userspace cannot prevent the extent maps from changing
> > at any time. As an example, see the aborted attempt by the 'cp'
> > utility to use FIEMAP to detect holes when copying sparse files....
> 
> Where did the cp vs FIEMAP discussion play out?  I missed that one.

Oh, there were several issues - different filesystems exposed
different issues, but the main one is that extent maps don't reflect
newly written cached data that does not yet have extents allocated
for it, hence the need for SEEK_DATA/SEEK_HOLE for optimal sparse
file traversal:

http://lwn.net/Articles/429345/
http://lwn.net/Articles/440255/
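
For illustration only, a minimal sketch of the SEEK_DATA/SEEK_HOLE
traversal referred to above, written in Python on top of os.lseek().
The helper name data_segments() is hypothetical, and the sketch
assumes a Linux system where os.SEEK_DATA and os.SEEK_HOLE are
available. On a filesystem without real hole support the whole file
may simply be reported as a single data segment.

```python
import errno
import os


def data_segments(fd):
    """Yield (start, end) byte ranges that lseek() reports as data.

    Hypothetical helper for illustration. Note that it moves the
    file offset of fd as a side effect.
    """
    size = os.fstat(fd).st_size
    offset = 0
    while offset < size:
        try:
            # Find the next offset >= offset that contains data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                break  # only a hole remains between offset and EOF
            raise
        # Find the end of this data run; EOF counts as an implicit hole.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, end
        offset = end
```

A sparse-aware copy would read and write only the ranges yielded
here and skip the gaps between them. Unlike an extent map walk,
this does not depend on whether the data has had extents allocated
yet.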

Not to mention that race conditions between extent walking and
background writeback started to be noticed:

http://lists.openwall.net/linux-ext4/2012/11/13/8

But then there were also corruption bugs in the cp FIEMAP code as
well:

http://gnu-coreutils.7620.n7.nabble.com/bug-12656-cp-since-8-11-corrupts-files-td20710.html

> We only use fiemap to determine which file regions are holes, only after 
> fsync, and only when there are no other processes or threads accessing the 
> same file (and only when explicitly enabled by the admin since many users 
> still have buggy implementations deployed).  Under those circumstances I 
> thought it should be reliable...

And when the filesystem does background defragmentation or block
trimming or some other re-organisation of recently accessed files?

> In retrospect the SEEK_HOLE/SEEK_DATA interface is simpler and better 
> suited, but I'm hesitant to fall into the same trap.

SEEK_HOLE/DATA is independent of the underlying file layout, hence
its behaviour is not affected by the filesystem changing the extent
layout of the file in a manner that userspace is not aware of and
cannot control.

> > Write tests for the regression test suite that filesystem developers
> > run all the time. ;)
> 
> Yes (and I assume that you specifically mean xfstests here).

*nod*

> I hope we can get some consensus on what that testing approach
> will be for power failure.  I don't much care whether it's an
> ioctl each fs implements or a dm layer that does about the same
> thing; I see advantages to both approaches.  As long as there is
> some convergence...

Yes, I see advantages to both, too, but there's no point creating
esoteric device error conditions if the filesystem can't correctly
handle and recover from simple shutdown situations....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx