On Wed, Apr 13, 2022 at 10:13:35AM +0300, Amir Goldstein wrote:
> On Wed, Apr 13, 2022 at 4:53 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Apr 12, 2022 at 10:25:00PM +0800, Zorro Lang wrote:
> > > On Tue, Apr 12, 2022 at 02:59:42PM +0200, David Disseldorp wrote:
> > > > On Mon, 11 Apr 2022 15:48:33 +1000, Dave Chinner wrote:
> > > >
> > > > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > > >
> > > > > If you ctrl-c generic/019, it leaves fsstress processes running.
> > > > > Kill them in the cleanup function so that they don't have to be
> > > > > manually killed after interrupting the test.
> > > > >
> > > > > While touching the _cleanup() function, make it do everything that
> > > > > the generic _cleanup function it overrides does and fix the
> > > > > indenting.
> > > > >
> > > > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > > > > ---
> > > > > tests/generic/019 | 6 ++++--
> > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/tests/generic/019 b/tests/generic/019
> > > > > index db56dac1..cda107f4 100755
> > > > > --- a/tests/generic/019
> > > > > +++ b/tests/generic/019
> > > > > @@ -53,8 +53,10 @@ stop_fail_scratch_dev()
> > > > > # Override the default cleanup function.
> > > > > _cleanup()
> > > > > {
> > > > > - disallow_fail_make_request
> > > > > - rm -f $tmp.*
> > > > > + kill $fs_pid $fio_pid &> /dev/null
> > > > > + disallow_fail_make_request
> > > > > + cd /
> > > > > + rm -r -f $tmp.*
> > > > > }
> > > > >
> > > > > RUN_TIME=$((20+10*$TIME_FACTOR))
> > > >
> > > > Might be worth unset'ing the "fs_pid" and "fio_pid" variables after the
> > > > wait, but should be fine as-is:
> > >
> > > I agree. Better to avoid killing other system processes.
> > > Or how about this place
> > > does (avoid killing system useful processes):
> > > $KILLALL_PROG -q $FSSTRESS_PROG
> > > $KILLALL_PROG -q $FIO_PROG
> > >
> > > Another picky question is, do we need to use a while loop checking, until the
> > > processes really get killed? :)
> >
> > Do we really need to paint the bikeshed over how best to kill a
> > process? I don't have time to do that, this is just a drive-by fix
> > that works for me....
> >
>
> This is not a kind response to reviewers.
> Does a "drive-by fix" get exempt from the review process?
> The review comments are legit even if they could be dismissed
> on technical grounds, because the risk of pid wraparound is quite low.
>
> I don't think this is about "bikeshed over how best to kill a process"
> I think this is about how to have better test cleanup practices.

I agree, but this is a broad treewide cleanup, which itself is a separate
project that shouldn't hold up this quick cleanup...

> It would have been nice to have better isolation by having fstests
> run a test within a control group and cleanup the control group
> processes after the test if someone wants to take on this task.

...because there are quite a few places (particularly anything that runs
fsx/fsstress/iogen for fun) where we kick off a group of background
processes and later require a reliable way to shoot them all down.
Fixing all that in a consistent way is a *much* bigger task than what
Dave is trying to accomplish here.

The current "scheme" is that ./check will run each test in its own
systemd scope (if available) to try to improve the reliability of test
program cleanup if the _cleanup method itself fails to kill all the
child tasks. This isn't foolproof because some people refuse to use
systemd, and the systemd tools themselves can't do a whole lot about
processes stuck in D state.
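
For the narrower per-test case raised upthread (kill the background jobs,
make sure they are really gone, then forget the PIDs), a minimal sketch
might look like the following. It is an illustration only, not what the
patch does: _kill_bg_pids is a hypothetical helper name, and the sleep
commands merely stand in for the fsstress and fio jobs. It relies on the
jobs being children of the test shell, so that "wait" can reap them
without a polling loop.

```shell
#!/bin/bash
# Hypothetical helper, not part of fstests: kill each given background
# PID and wait until it has been reaped.
_kill_bg_pids()
{
	local pid
	for pid in "$@"; do
		[ -n "$pid" ] || continue
		kill "$pid" 2>/dev/null
		# wait only works for children of this shell; it returns
		# once the child has exited, so no polling loop is needed.
		wait "$pid" 2>/dev/null
	done
}

# Stand-ins for the fsstress and fio background jobs.
sleep 300 &
fs_pid=$!
sleep 300 &
fio_pid=$!

_kill_bg_pids "$fs_pid" "$fio_pid"

# Forget the PIDs so that a second cleanup call cannot signal an
# unrelated process that happens to reuse one of them.
unset fs_pid fio_pid
```

Waiting on the exact PIDs keeps the kill scoped to this test's own
children, which a killall by program name would not.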
In the ideal world, whoever takes on cleaning up process cleanup
probably ought to figure out a more general solution, or at least
investigate it more thoroughly than I did to decide if it's worth
reimplementing process control group control via bash script for people
who do not use systemd.

Does anyone want to take on this task?

> I personally prefer the pattern of dedicated cleanup trap for aborting the test
> like generic/251 that leaves the generic _cleanup on EXIT instead of
> duplicating _cleanup (which generic/251 also duplicate incorrectly),
> but no strong feeling about that, so as a "drive-by fix" you may add:
>
> Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx>

For this patch,
Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>

--D

>
> Thanks,
> Amir.
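
The "dedicated trap for aborting the test" pattern mentioned above could
be sketched roughly as below. This is not copied from generic/251 or any
other test; _abort, run_test_body and bg_pid are hypothetical names, sleep
stands in for a real background workload, and the test body runs in a
subshell only so that the example can interrupt itself. The idea is that
the abort handler deals solely with the background job and then exits,
which lets the generic _cleanup on EXIT run unmodified.

```shell
#!/bin/bash
# Sketch only: _abort, run_test_body and bg_pid are hypothetical names.
tmp=/tmp/trap-sketch.$$

run_test_body()
(
	# Generic cleanup: runs on every exit path via the EXIT trap.
	_cleanup()
	{
		cd /
		rm -r -f $tmp.*
	}

	# Abort-only handler: stop the background job, then exit so that
	# the EXIT trap still performs the generic cleanup.
	_abort()
	{
		kill "$bg_pid" 2>/dev/null
		wait "$bg_pid" 2>/dev/null
		exit 1
	}

	trap _cleanup EXIT
	trap _abort INT TERM

	# Stand-in for a background workload such as fsstress.
	sleep 300 >/dev/null 2>&1 &
	bg_pid=$!
	echo "$bg_pid"
	touch "$tmp.out"

	# Simulate the user interrupting the test.
	kill -TERM "$BASHPID"
	echo "not reached"
)

bg=$(run_test_body)
status=$?
echo "test body exited with status $status"
```

Because _abort ends with "exit", the cleanup logic lives in one place and
the abort path cannot drift out of sync with it.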