Re: xfstests: generic/342 run failed in f2fs

Jaegeuk Kim <jaegeuk@xxxxxxxxxx> · Thu, 28 Dec 2017 08:29:57 -0800

On 12/28, Dave Chinner wrote:
> On Wed, Dec 27, 2017 at 11:11:30AM -0800, Jaegeuk Kim wrote:
> > On 12/25, Theodore Ts'o wrote:
> > > On Sun, Dec 24, 2017 at 11:47:20PM -0800, Eric Sandeen wrote:
> > > > Filesystems are free to do /more/ than the minimum required by posix -
> > > > see ext4_sync_parent for example.  Or xfs_finish_rename, for synchronous
> > > > mounts:
> > > > 
> > > >          * If this is a synchronous mount, make sure that the rename transaction
> > > >          * goes to disk before returning to the user.
> > > >          */
> > > >         if (tp->t_mountp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC))
> > > >                 xfs_trans_set_sync(tp);
> > > > 
> > > > so behavior can be fs-dependent, or mount option dependent, etc.
> > > > 
> > > > But to be portable, if an app wants directory changes to be persistent
> > > > before proceeding, it must fsync the directory after making changes.
> > > > 
> > > > I don't know about f2fs's design intent, whether it guarantees more
> > > > than posix requires, but to guarantee that this test works on any posix
> > > > fs, I think that directory fsync is needed.
> > > 
> > > Agreed that this is a test bug, and we should add the fsync to the
> > > parent directory.
> > 
> > Agreed too. Or, how about using "-o dirsync"?
> > 
> > > 
> > > It might also be a good idea for f2fs to do more, given that fsync is
> > > a slow enough operation that so long as you can make sure the fsync of
> > > the parent directory happens within the same atomic update as the
> > > child inode, you might as well give the more expansive guarantee.  But
> > > obviously that's up to the f2fs developers to decide whether they want
> > > to do that work.
> > 
> > Indeed. Actually, since one of our goals was to reduce fsync latencies for
> > Android, we decided to support posix in a minimum way. In order to avoid
> > complex directory updates recursively, however, we allowed the fsync on
> > directories to trigger checkpoint requiring many IO operations.
> 
> So what you are really saying is that f2fs is not strictly ordered
> w.r.t metadata crash consistency after fsync()? Wasn't that
> considered a bug in btrfs that had to be fixed (and did get fixed)?
> 
> Oh, yeah, it's right there in the test commit history:
> 
> commit f02fe949113f35ae221ec1ab5c9959912f594bf4
> Author: Filipe Manana <fdmanana@xxxxxxxx>
> Date:   Tue Apr 5 11:47:55 2016 +1000
> 
>     generic: add test for fsync after renaming file
>     
>     Test that if we rename a file, create a new file that has the old name
>     of the other file and is a child of the same parent directory, fsync the
>     new inode, power fail and mount the filesystem, we do not lose the first
>     file and that file has the name it was renamed to.
>     
>     This test is motivated by an issue found in btrfs which is fixed by the
>     following patch for the linux kernel:
>     
>       "Btrfs: fix file loss caused by fsync after rename and new inode"
>     
>     Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
>     Reviewed-by: Eryu Guan <eguan@xxxxxxxxxx>
>     Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>
> 
> There's a whole lot more detail in the kernel commit 2be63d5ce929
> ("Btrfs: fix file loss on log replay after renaming a file and
> fsync") but my point is that we considered this a btrfs filesystem
> bug and so changing the test defeats it's purpose as a regression
> test for the btrfs bug.
> 
> So IMO the test should not be changed. And I think we should be
> consistent and consider this f2fs failure as a f2fs bug that needs
> fixing to bring it's behaviour in line with xfs, ext4, and btrfs.
> 
> Remember this when quoting POSIX about fsync behaviour: Posix is a
> terrible standard when it comes to data integrity. We go way, way
> beyond what POSIX specifies as a valid fsync implementation (i.e.
> posix allows "do nothing and return success" as a conformant
> implementation). Ext4, XFS and btrfs all have strictly ordered
> metadata crash recovery semantics and all of the crash recovery
> tests expect this behaviour from the filesytem being tested. The
> underlying intention is that by encoding it into these tests, all
> widely used and future linux filesystems meet or exceed this crash
> integrity requirement.

I just noticed that this test is not applied for generic filesystems,
but for ones that support metadata_journaling. So, IMHO, we don't have
to rely on POSIX too much here. Back to metadata_journalling, we indeed
have to treat this as a bug in f2fs, since this recovery behavior sounds
like defacto across the major filesystems.

I submitted a patch to address this, based on the hint by Ted.
https://patchwork.kernel.org/patch/10135085/

Thanks,

> 
> IOWs, changing the test is the wrong thing to do on many levels....
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html