Re: [RFC PATCH 0/8] fstests: _cleanup() overrides are a mess

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Tue, May 24, 2022 at 11:01 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> Hi folks,
>
> I pulled on a string a couple of days ago, and it got out of
> control. It all started when I went to kill a test with ctrl-c and
> it, once again, left background processes running that I had to hunt
> down and kill manually.
>
> I then started looking a why this keeps happening, and realised that
> the way we clean up on test completion is messy, inconsistent and
> frequently buggy. So I started cleaning it all up, starting with the
> tests/xfs directory because I saw a lot of low hanging fruit there.
>
> Essentially, we use _cleanup() functions as a way of overriding the
> default trap handler we install in _begin_fstest(). Rather than
> register a new handler, we just redefine the common cleanup function
> and re-implement it (poorly) in every test that does an override.
> Often these overrides are completely unnecessary - I think I reduced
> the total number of overrides in tests/xfs by ~30% (~190 -> ~125),
> and I reudced the number of *unique overrides by a lot more than
> that.
>

That looks like an awesome improvement!

> The method for overriding changes to be "stacked cleanups" rather
> than "duplicated cleanups". That is, tests no longer open code:
>
>         cd /
>         rm -rf $tmp.*
>
> THis is what common/preamble::_cleanup() does. We should call that
> function to do this. Hence if we have a local cleanup that we need
> to do, it becomes:
>
> local_cleanup()
> {
>         rm -f $testfile
>         _cleanup
> }
> _register_cleanup local_cleanup

While removing boilerplate code, we had better not create another boilerplate.
Instead of expecting test writers to always call _cleanup
if we always want _cleanup to be called we can always implicitly
chain it in _register_cleanup():

--- a/common/preamble
+++ b/common/preamble
@@ -20,7 +20,7 @@ _register_cleanup()
        shift

        test -n "$cleanup" && cleanup="${cleanup}; "
-       trap "${cleanup}exit \$status" EXIT HUP INT QUIT TERM $*
+       trap "${cleanup}_cleanup; exit \$status" EXIT HUP INT QUIT TERM $*
 }

 # Prepare to run a fstest by initializing the required global variables to
@@ -46,7 +46,7 @@ _begin_fstest()
        tmp=/tmp/$$
        status=1        # failure is the default!

-       _register_cleanup _cleanup
+       _register_cleanup

        . ./common/rc


It could be a bit weird for tests that call
_register_cleanup _cleanup

But you removed most of those in the SIGBUS patch
and even if they do exists, _cleanup() is re-entrant.


>
> While this looks more verbose, it means we can actually reuse the
> same cleanup function across lots of tests.
>
> A large number of xfsdump tests were all using the same override
> cleanup function to call _cleanup_dump. These are all changed to:
>
> . ./common/dump
> _register_cleanup _cleanup_dump
>
> and _cleanup_dump stacks like this:
>
> _cleanup_dump()
> {
>         #do xfsdump cleanup stuff
>
>         _cleanup
> }
>
> and we don't need to do anything else. There is one xfsdump test
> that needs it's own cleanup. It stacks like this:
>
> local_cleanup()
> {
>         rm -f $testfile
>         _cleanup_dump
> }
> _register_cleanup local_cleanup
>
> All the tests that run fsstress in the background now have a common
> cleanup function that kills fsstress processes defined in
> common/preamble. They just do:
>
> _register_cleanup _cleanup_fsstress
>
> And now every test that puts fsstress in the background behaves
> correctly and kills all the background fsstree processes when
> interrupted.
>
> The conversion is by no means complete. I've named the local cleanup
> functions by what they do so we can go back and derive commonality
> between them. The number of different variations on tearing down
> loops devices is crazy, and half of them are buggy. I haven't worked
> through these yet, so you'll see lots of tests with:
>
> _loop_cleanup()
> {
>         ......
>         _cleanup
> }
> _register_cleanup _loop_cleanup
>
> That have similar but different ways of cleaning up loop devices.
>
> I also added a _no_cleanup() function, as there are a large number
> of xfs fuzzer tests that want to leave a warm corpse behind so that
> debugging what just happened is easy.
>
> I also added BUS to the default signal trap set - well over a 100
> tests in tests/xfs had a line like:
>
> _register_cleanup "_cleanup" BUS
>
> just to add BUS signals to the set that would cause the cleanup
> function to run. Just make it the default!
>
> Overall, this significantly reduces the amount of boiler plate in
> tests, and sets us down the path of common cleanup functions that
> tests may not even need to define. e.g. just including
> ./common/dmflakey registers the _cleanup_dmflakey() trap that will
> do all the necessary cleanup when the test exists. This makes the
> tests simpler, more robust and reduces the maintenance burden of
> over 1700 individual tests....
>
> I won't put the full diffstat in this mail, but the benefits should
> be clean from the summary:
>
> 360 files changed, 420 insertions(+), 1781 deletions(-)
>
> I've lost count of the number of test bugs I killed in removing
> all this code, and that's largely just in the tests/xfs directory.
> So before I go spend another couple of days on converting the rest
> of fstests, I figured I better make sure everyone is OK with these
> changes.
>
> Thoughts, comments?
>

Thank you for taking this on!

Amir.



[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux