On Tue, May 24, 2022 at 3:28 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > On Tue, May 24, 2022 at 03:14:39PM +0300, Amir Goldstein wrote: > > On Tue, May 24, 2022 at 1:13 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > > > > > On Tue, May 24, 2022 at 01:01:57PM +0300, Amir Goldstein wrote: > > > > On Tue, May 24, 2022 at 12:57 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > > > > > > > > > On Tue, May 24, 2022 at 11:29:17AM +0300, Amir Goldstein wrote: > > > > > > On Tue, May 24, 2022 at 11:01 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > > > > > > > > > > > > > Hi folks, > > > > > > > > > > > > > > I pulled on a string a couple of days ago, and it got out of > > > > > > > control. It all started when I went to kill a test with ctrl-c and > > > > > > > it, once again, left background processes running that I had to hunt > > > > > > > down and kill manually. > > > > > > > > > > > > > > I then started looking a why this keeps happening, and realised that > > > > > > > the way we clean up on test completion is messy, inconsistent and > > > > > > > frequently buggy. So I started cleaning it all up, starting with the > > > > > > > tests/xfs directory because I saw a lot of low hanging fruit there. > > > > > > > > > > > > > > Essentially, we use _cleanup() functions as a way of overriding the > > > > > > > default trap handler we install in _begin_fstest(). Rather than > > > > > > > register a new handler, we just redefine the common cleanup function > > > > > > > and re-implement it (poorly) in every test that does an override. > > > > > > > Often these overrides are completely unnecessary - I think I reduced > > > > > > > the total number of overrides in tests/xfs by ~30% (~190 -> ~125), > > > > > > > and I reudced the number of *unique overrides by a lot more than > > > > > > > that. > > > > > > > > > > > > > > > > > > > That looks like an awesome improvement! > > > > > > > > > > > > > The method for overriding changes to be "stacked cleanups" rather > > > > > > > than "duplicated cleanups". That is, tests no longer open code: > > > > > > > > > > > > > > cd / > > > > > > > rm -rf $tmp.* > > > > > > > > > > > > > > THis is what common/preamble::_cleanup() does. We should call that > > > > > > > function to do this. Hence if we have a local cleanup that we need > > > > > > > to do, it becomes: > > > > > > > > > > > > > > local_cleanup() > > > > > > > { > > > > > > > rm -f $testfile > > > > > > > _cleanup > > > > > > > } > > > > > > > _register_cleanup local_cleanup > > > > > > > > > > > > While removing boilerplate code, we had better not create another boilerplate. > > > > > > Instead of expecting test writers to always call _cleanup > > > > > > if we always want _cleanup to be called we can always implicitly > > > > > > chain it in _register_cleanup(): > > > > > > > > > > > > --- a/common/preamble > > > > > > +++ b/common/preamble > > > > > > @@ -20,7 +20,7 @@ _register_cleanup() > > > > > > shift > > > > > > > > > > > > test -n "$cleanup" && cleanup="${cleanup}; " > > > > > > - trap "${cleanup}exit \$status" EXIT HUP INT QUIT TERM $* > > > > > > + trap "${cleanup}_cleanup; exit \$status" EXIT HUP INT QUIT TERM $* > > > > > > } > > > > > > > > > > I considered that, but then I found the _no_cleanup cases. IOWs, > > > > > this doesn't work for the cases where we want to prevent the generic > > > > > _cleanup function from being run on failure/test exit. Hence the > > > > > cleanup function stacking behaviour rather than unconditional > > > > > calling of _cleanup as per above. > > > > > > > > > > > > > I didn't know about those. > > > > Since you went to all this trouble to find them, can you provide a reference. > > > > I wonder, what could ever be the reason not to want to rm $tmp.*? > > > > > > [PATCH 6/8] fstests: consolidate no cleanup test setup > > > > > > > Ah I see. > > It might have been better to explicitly opt-out of cleanup > > only for those tests via _register_no_cleanup or _unregister_cleanup > > Premature optimisation. > It's not an optimization at all. The purpose of doing it this way would be to eliminate the human mistakes (from which you found so many in this work) and create a situation where _cleanup() is called for all tests, unless the test author really wants to avoid it. But anyway, no need to fix everything at one go. There will still be plenty of ways that a test author can write a test that does not clean after itself. Thanks, Amir.