On 11/02/2017 02:13 PM, Eryu Guan wrote: > On Wed, Nov 01, 2017 at 05:03:21PM +0200, Omer Zilberberg wrote: >> >> On 11/01/2017 02:52 PM, Eryu Guan wrote: >>> On Wed, Nov 01, 2017 at 02:06:27PM +0200, Omer Zilberberg wrote: >>>> On 10/31/2017 01:34 PM, Eryu Guan wrote: >>>>> On Tue, Oct 31, 2017 at 12:25:51PM +0200, Omer Zilberberg wrote: >>>>>> On 10/31/2017 06:37 AM, Eryu Guan wrote: >>>>>>> On Tue, Oct 31, 2017 at 07:36:58AM +1100, Dave Chinner wrote: >>>>>>>> On Mon, Oct 30, 2017 at 10:08:31AM +0200, Omer Zilberberg wrote: >>>>>>>>> These tests locally change the TEST_FS_MOUNT_OPTS/MOUNT_OPTIONS >>>>>>>>> environment variables, and run _test_cycle_mount. As a result, following >>>>>>>>> tests using the TEST mount point may start with different mount options, >>>>>>>>> depending on run order. >>>>>>>> I don't think that's the case. The change of the environment >>>>>>>> variable should only affect the current test process and it's >>>>>>>> children. When the test exits, we go back to the environment of the >>>>>>>> check process, where the TEST_FS_MOUNT_OPTS environment variable is >>>>>>>> still correctly set, and all future tests inherit from that. i.e.: >>>>>>>> >>>>>>>> $ export FOO=foo >>>>>>>> $ echo $FOO >>>>>>>> foo >>>>>>>> $ bash >>>>>>>> $ echo $FOO >>>>>>>> foo >>>>>>>> $ export FOO=bar >>>>>>>> $ echo $FOO >>>>>>>> bar >>>>>>>> $ exit >>>>>>>> $ echo $FOO >>>>>>>> foo >>>>>>>> $ >>>>>>>> >>>>>>>> And after each test, check runs _check_filesystems(), which cycles >>>>>>>> the test mount, so for each new test process that is run they should >>>>>>>> already start in the correct state... >>>>>>> I agreed, the changing of variables in a sub-shell won't affect the >>>>>>> parent's copy, and check will restore the mounts with the untouched >>>>>>> options. >>>>>>> >>>>>>> But the problem is that _check_test_fs() will cycle mount TEST_DEV with >>>>>>> MOUNT_OPTIONS not TEST_FS_MOUNT_OPTS, so if you have different mount >>>>>>> options set for TEST_DEV and SCRATCH_DEV, you'll see mount options >>>>>>> changed for TEST_DEV. e.g. >>>>>>> >>>>>>> MOUNT_OPTIONS="-o dax" TEST_FS_MOUNT_OPTS="" ./check generic/413 generic/445 >>>>>>> generic/445 mount TEST_DEV with "-o dax" too >>>>>>> >>>>>>> MOUNT_OPTIONS="" TEST_FS_MOUNT_OPTS="-o dax" ./check generic/413 generic/445 >>>>>>> generic/445 mount TEST_DEV without "-o dax" >>>>>>> >>>>>>> MOUNT_OPTIONS="-o dax" TEST_FS_MOUNT_OPTS="-o dax" ./check generic/413 generic/445 >>>>>>> both tests and both devices mount with "-o dax" >>>>>>> >>>>>>> That's been discussed in this thread: >>>>>>> https://patchwork.kernel.org/patch/9742039/ >>>>>>> >>>>>>> Omer, can you please confirm if you're hitting this issue? >>>>>> I'm not 100% that's the case, so I better describe my settings more clearly: >>>>>> I have a debug mount option on my system to recover the FS from a backup. >>>>>> When that flag is set, umount writes everything to the backup. >>>>>> Mount restores from it, overwriting everything. >>>>> If you're testing with setting your debug mount option to both >>>>> TEST_FS_MOUNT_OPTS and MOUNT_OPTIONS, and you still see the failure you >>>>> were seeing, then that's a different problem. >>>> Yeah, that's what I'm doing, setting both with that flag. >>>>>> As long as generic/413 is not involved, everything works well. >>>>>> All _test_cycle_mount() calls first back everything up on umount, >>>>>> then restore upon mount. So I get the same FS contents. >>>>>> >>>>>> But, consider generic/118 running after generic/413: >>>>>> - generic/413 finishes with a mount point with no mount options >>>>>> - generic/118 begins with restored TEST_FS_MOUNT_OPTS, as you've pointed out. >>>>>> - some writes are performed to the FS >>>>>> - next _test_cycle_mount: >>>>>> calls umount w/o backing up (debug flag previously unset by generic/413). >>>>> Does this clear the backup too? If so, I suspect TEST_DEV got cleared on >>>>> first mount with the debug option in generic/118, because the backup has >>>>> been cleared in the _test_cycle_mount call in generic/413. >>>> Yeah the backup is cleared, which is normal behavior when the debug flag is off. >>>> And exactly, it's generic/413 clearing the flag from the mount point, >>>> that's caused this. >>> IMHO, in this case fstests doesn't do anything wrong, all the mount >>> options are restored when running the next test, it's just not >>> compatible with your debug option and workflow. >> But the mount options are not restored when running the next test... >> Only the first _test_cycle_mount restores them. >> In case I haven't made myself clear, here's a final demonstration. >> If you still think this is valid, I'll back down :) >> >> I've edited generic/118 locally, and after _require_test_reflink, I've added: >> mount | grep $TEST_DIR >> Just to see the mount options the TEST FS has. >> >> I've also put aside my debug option, and used -o strictatime instead. >> If I mount w/o special options, I get the "relatime" mount option as default. >> If I mount with -o strictatime, I do not see "relatime". Now back to fstests: > I assume you still set "-o strictatime" to both TEST_FS_MOUNT_OPTS and > MOUNT_OPTIONS, so all went well so far. > >> If the order of run is generic/118 followed by generic/413, >> Then generic/118 does NOT print "relatime" in its options, because strictatime >> is active. > Yeah, I saw the same result, which was also expected. > >> However, if the order of run is reversed, then "relatime" IS printed, because >> the TEST FS' was left with default mount options by generic/413. > I saw different result here, generic/118 didn't print relatime for me. > > [root@ibm-x3550m3-05 xfstests]# TEST_FS_MOUNT_OPTS="-o strictatime" MOUNT_OPTIONS="-o strictatime" ./check generic/413 generic/118 > FSTYP -- xfs (debug) > PLATFORM -- Linux/x86_64 ibm-x3550m3-05 4.14.0-rc7 > MKFS_OPTIONS -- -f -bsize=4096 /dev/sdc2 > MOUNT_OPTIONS -- -o strictatime /dev/sdc2 /mnt/testarea/scratch > > generic/413 1s ... - output mismatch (see /root/xfstests/results//generic/413.out.bad) > --- tests/generic/413.out 2017-10-11 18:15:46.720424655 +0800 > +++ /root/xfstests/results//generic/413.out.bad 2017-11-02 20:02:39.546307209 +0800 > @@ -1,2 +1,4 @@ > QA output created by 413 > +/dev/sdc1 on /mnt/testarea/test type xfs (rw,seclabel,attr2,inode64,noquota) # right after _require_test > +/dev/sdc1 on /mnt/testarea/test type xfs (rw,relatime,attr2,inode64,noquota) # after _cycle_test_mount > Silence is golden > ... > (Run 'diff -u tests/generic/413.out /root/xfstests/results//generic/413.out.bad' to see the entire diff) > generic/118 0s ... - output mismatch (see /root/xfstests/results//generic/118.out.bad) > --- tests/generic/118.out 2017-10-11 18:15:46.644423183 +0800 > +++ /root/xfstests/results//generic/118.out.bad 2017-11-02 20:02:41.080336481 +0800 > @@ -1,5 +1,7 @@ > QA output created by 118 > +/dev/sdc1 on /mnt/testarea/test type xfs (rw,seclabel,attr2,inode64,noquota) # right after _require_test_reflink > Create the original files > +/dev/sdc1 on /mnt/testarea/test type xfs (rw,attr2,inode64,noquota) # after _cycle_test_mount > > Note that I hacked 'check' too to not sort the tests by seq number. > > Did I miss anything? It took me a while to figure this one out... You haven't missed anything. The mechanism that revives the mount options is _check_generic_filesystem, or _check_xfs_filesystem in this case, and it does this by unmounting and later mounting again. In my code path, _check_generic_filesystem is skipped, replaced by NOP, because I didn't need fsck, etc., but I also lost the inherent cycle mount that's included in that method. Sorry about that. > > Thanks, > Eryu > >> So, generic/118's start conditions depend on its predecessor. >> Is that valid for TEST fs mount options? >> >>>>>> calls mount WITH the debug flag, and recovers from an empty backup, >>>>>> deleting the earlier writes. >>>>>> - subsequent md5sum fails on "No such file or directory", as FS is now empty. >>>>>> >>>>>>> I think fixing _check_<fs>_filesystem() is the correct way. And I guess >>>>>>> we can refactor out a common function and call it in >>>>>>> _check_[xfs|btrfs|generic]_filesystem, pass the correct mount options >>>>>>> based on what device we're working on. >>>>>> If indeed we're talking about the same problem, >>>>>> please let me know if you'd like me to prepare a different patch. >>>>> Sure, really appreciated if you can prepare a different patch, even if >>>>> it's not the same problem :) >>>> Ok. >>>> But are we in agreement that there are 2 different issues here? >>> Yes, the problem you hit is different with the issue I described above. >>> >>>> If so, please let me know what you think of this patch, >>>> which does resolve that issue I had originally (at least locally for me). >>> I prefer not merging your patch, sorry. Because in this specific case, >>> from fstests' point of view, it's all doing well and everything works as >>> expected. >>> >>>> And I'll explore the issue with check_test_fs and the different mount options, >>>> based on what you've both written here and the thread you've pointed to. >>>> I'll send another patch to address that later. >>> Thanks a lot! >>> >>> Eryu >>> -- >>> To unsubscribe from this list: send the line "unsubscribe fstests" in >>> the body of a message to majordomo@xxxxxxxxxxxxxxx >>> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html