Re: [PATCH 02/40] fstests: cleanup fsstress process management

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Thu, Dec 05, 2024 at 02:04:35AM +0800, Zorro Lang wrote:
> On Wed, Nov 27, 2024 at 03:51:32PM +1100, Dave Chinner wrote:
> > diff --git a/tests/generic/561 b/tests/generic/561
> > index 39e5977a3..3e931b1a7 100755
> > --- a/tests/generic/561
> > +++ b/tests/generic/561
> > @@ -13,9 +13,9 @@ _begin_fstest auto stress dedupe
> >  # Override the default cleanup function.
> >  _cleanup()
> >  {
> > +	end_test
> >  	cd /
> >  	rm -f $tmp.*
> > -	end_test
> >  }
> >  
> >  # Import common functions.
> > @@ -23,28 +23,20 @@ _cleanup()
> >  . ./common/reflink
> >  
> >  _require_scratch_duperemove
> > -_require_command "$KILLALL_PROG" killall
> >  
> >  _scratch_mkfs > $seqres.full 2>&1
> >  _scratch_mount >> $seqres.full 2>&1
> >  
> >  function end_test()
> >  {
> > -	local f=1
> > +	_kill_fsstress
> >  
> >  	# stop duperemove running
> >  	if [ -e $dupe_run ]; then
> >  		rm -f $dupe_run
> > -		$KILLALL_PROG -q $DUPEREMOVE_PROG > /dev/null 2>&1
> > +		kill $dedup_pids
> 
> Hi Dave,
> 
> The $dedup_pids is a "while loop" bash process, it isn't the $DUPEREMOVE_PROG
> process itself. From my testing, this change might cause g/561 keep waiting
> $DUPEREMOVE_PROG processes forever, as $DUPEREMOVE_PROG not always be killed
> properly.

I have another patch that I didn't send that reworks duperemove
killing. I didn't sent it because I ended up simply turning the test
off via adding it to the unreliable_in_parallel group because it had
a 75% failure rate due to duperemove bugs (e.g. getting stuck in
fiemap loops that never terminate)...

> I'm wondering if you hope to do "pkill $DUPEREMOVE_PROG" directly? Or you'd
> like to copy $DUPEREMOVE_PROG to $TEST_DIR/${othername}_duperemove, then
> pkill ${othername}_duperemove ?

I came up with the "copy to $seq.binary then pkill $seq.binary" idea
after I gave up on this test as fundamentally broken. I'll revisit
it now that we have a reliable way of killing only the processes
belonging to the specific test. That might make this test
reliable enough to run concurrently, but I have my doubts...

I'll have another look at it.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx




[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux