On Thu, Dec 28, 2017 at 08:59:18AM -0800, Eric Sandeen wrote: > On 12/28/17 1:09 AM, Dave Chinner wrote: > ... > > > There's a whole lot more detail in the kernel commit 2be63d5ce929 > > ("Btrfs: fix file loss on log replay after renaming a file and > > fsync") but my point is that we considered this a btrfs filesystem > > bug and so changing the test defeats it's purpose as a regression > > test for the btrfs bug. > > > > So IMO the test should not be changed. And I think we should be > > consistent and consider this f2fs failure as a f2fs bug that needs > > fixing to bring it's behaviour in line with xfs, ext4, and btrfs. > > > > Remember this when quoting POSIX about fsync behaviour: Posix is a > > terrible standard when it comes to data integrity. We go way, way > > beyond what POSIX specifies as a valid fsync implementation (i.e. > > posix allows "do nothing and return success" as a conformant > > implementation). Ext4, XFS and btrfs all have strictly ordered > > metadata crash recovery semantics and all of the crash recovery > > tests expect this behaviour from the filesytem being tested. The > > underlying intention is that by encoding it into these tests, all > > widely used and future linux filesystems meet or exceed this crash > > integrity requirement. > > > > IOWs, changing the test is the wrong thing to do on many levels.... > > > > Cheers, > > > > Dave. > > > > Thanks for digging into the btrfs commit which changed the behavior > tested by this testcase, I had meant to do that. > > Yeah, that's all fair, for sure. But this shouldn't be implicit in the > testcase; it should explicitly document that it is testing a de-facto > new standard adhered to by several filesystems, what that standard is, > etc, and not leave it to the reader. ;) Well, the point of the test case is to catch recovery behaviour that is not properly ordered in this complex corner case. It's not an obvious thing for developers to test, as we can see by both btrfs and now f2fs failing this test. What that failure actually means and implies is completely separate to the test itself. i.e. the test is just a canary that tells us a filesystem is not strictly ordered in it's crash recovery behaviour. If we don't have tests somewhere that exercise the behaviour we want filesystems to have, we'll never know if a filesystem does or doesn't behave as we want them to. The fact we'd like all filesystems to have strictly ordered crash recovery semantics has nothing to do with the test. It's based on the fact that XFS and ext3/4 have had this behaviour since forever, and apps written on these filesystems are quite likely to skip the directory fsync() because it's not necessary. That's seen by btrfs users reporting lost files where other filesystems are fine. We want all linux filesystems to behave the same way in these circumstances, but we can't enforce it because.... > If a filesystem explicitly chooses not to adhere to that standard they > /could/ opt out of the test. ... that is a option the filesystem developers can take. > It's makes a lot of sense for linux filesystems to behave in consistent > ways, but if we're going to start enforcing that through xfstests we > need to be very clear about it, IMHO. xfstests is not the way we enforce it - it's just tells us something is inconsistent. What the fs developers do from there (fix it or skip the test) is up to them. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html