On 12/28, Dave Chinner wrote: > On Wed, Dec 27, 2017 at 11:11:30AM -0800, Jaegeuk Kim wrote: > > On 12/25, Theodore Ts'o wrote: > > > On Sun, Dec 24, 2017 at 11:47:20PM -0800, Eric Sandeen wrote: > > > > Filesystems are free to do /more/ than the minimum required by posix - > > > > see ext4_sync_parent for example. Or xfs_finish_rename, for synchronous > > > > mounts: > > > > > > > > * If this is a synchronous mount, make sure that the rename transaction > > > > * goes to disk before returning to the user. > > > > */ > > > > if (tp->t_mountp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC)) > > > > xfs_trans_set_sync(tp); > > > > > > > > so behavior can be fs-dependent, or mount option dependent, etc. > > > > > > > > But to be portable, if an app wants directory changes to be persistent > > > > before proceeding, it must fsync the directory after making changes. > > > > > > > > I don't know about f2fs's design intent, whether it guarantees more > > > > than posix requires, but to guarantee that this test works on any posix > > > > fs, I think that directory fsync is needed. > > > > > > Agreed that this is a test bug, and we should add the fsync to the > > > parent directory. > > > > Agreed too. Or, how about using "-o dirsync"? > > > > > > > > It might also be a good idea for f2fs to do more, given that fsync is > > > a slow enough operation that so long as you can make sure the fsync of > > > the parent directory happens within the same atomic update as the > > > child inode, you might as well give the more expansive guarantee. But > > > obviously that's up to the f2fs developers to decide whether they want > > > to do that work. > > > > Indeed. Actually, since one of our goals was to reduce fsync latencies for > > Android, we decided to support posix in a minimum way. In order to avoid > > complex directory updates recursively, however, we allowed the fsync on > > directories to trigger checkpoint requiring many IO operations. > > So what you are really saying is that f2fs is not strictly ordered > w.r.t metadata crash consistency after fsync()? Wasn't that > considered a bug in btrfs that had to be fixed (and did get fixed)? > > Oh, yeah, it's right there in the test commit history: > > commit f02fe949113f35ae221ec1ab5c9959912f594bf4 > Author: Filipe Manana <fdmanana@xxxxxxxx> > Date: Tue Apr 5 11:47:55 2016 +1000 > > generic: add test for fsync after renaming file > > Test that if we rename a file, create a new file that has the old name > of the other file and is a child of the same parent directory, fsync the > new inode, power fail and mount the filesystem, we do not lose the first > file and that file has the name it was renamed to. > > This test is motivated by an issue found in btrfs which is fixed by the > following patch for the linux kernel: > > "Btrfs: fix file loss caused by fsync after rename and new inode" > > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx> > Reviewed-by: Eryu Guan <eguan@xxxxxxxxxx> > Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx> > > There's a whole lot more detail in the kernel commit 2be63d5ce929 > ("Btrfs: fix file loss on log replay after renaming a file and > fsync") but my point is that we considered this a btrfs filesystem > bug and so changing the test defeats it's purpose as a regression > test for the btrfs bug. > > So IMO the test should not be changed. And I think we should be > consistent and consider this f2fs failure as a f2fs bug that needs > fixing to bring it's behaviour in line with xfs, ext4, and btrfs. > > Remember this when quoting POSIX about fsync behaviour: Posix is a > terrible standard when it comes to data integrity. We go way, way > beyond what POSIX specifies as a valid fsync implementation (i.e. > posix allows "do nothing and return success" as a conformant > implementation). Ext4, XFS and btrfs all have strictly ordered > metadata crash recovery semantics and all of the crash recovery > tests expect this behaviour from the filesytem being tested. The > underlying intention is that by encoding it into these tests, all > widely used and future linux filesystems meet or exceed this crash > integrity requirement. I just noticed that this test is not applied for generic filesystems, but for ones that support metadata_journaling. So, IMHO, we don't have to rely on POSIX too much here. Back to metadata_journalling, we indeed have to treat this as a bug in f2fs, since this recovery behavior sounds like defacto across the major filesystems. I submitted a patch to address this, based on the hint by Ted. https://patchwork.kernel.org/patch/10135085/ Thanks, > > IOWs, changing the test is the wrong thing to do on many levels.... > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx > -- > To unsubscribe from this list: send the line "unsubscribe fstests" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html