Re: [PATCH 02/40] fstests: cleanup fsstress process management

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Thu, Dec 05, 2024 at 03:55:49PM +1100, Dave Chinner wrote:
> On Thu, Dec 05, 2024 at 02:04:35AM +0800, Zorro Lang wrote:
> > On Wed, Nov 27, 2024 at 03:51:32PM +1100, Dave Chinner wrote:
> > > diff --git a/tests/generic/561 b/tests/generic/561
> > > index 39e5977a3..3e931b1a7 100755
> > > --- a/tests/generic/561
> > > +++ b/tests/generic/561
> > > @@ -13,9 +13,9 @@ _begin_fstest auto stress dedupe
> > >  # Override the default cleanup function.
> > >  _cleanup()
> > >  {
> > > +	end_test
> > >  	cd /
> > >  	rm -f $tmp.*
> > > -	end_test
> > >  }
> > >  
> > >  # Import common functions.
> > > @@ -23,28 +23,20 @@ _cleanup()
> > >  . ./common/reflink
> > >  
> > >  _require_scratch_duperemove
> > > -_require_command "$KILLALL_PROG" killall
> > >  
> > >  _scratch_mkfs > $seqres.full 2>&1
> > >  _scratch_mount >> $seqres.full 2>&1
> > >  
> > >  function end_test()
> > >  {
> > > -	local f=1
> > > +	_kill_fsstress
> > >  
> > >  	# stop duperemove running
> > >  	if [ -e $dupe_run ]; then
> > >  		rm -f $dupe_run
> > > -		$KILLALL_PROG -q $DUPEREMOVE_PROG > /dev/null 2>&1
> > > +		kill $dedup_pids
> > 
> > Hi Dave,
> > 
> > The $dedup_pids is a "while loop" bash process, it isn't the $DUPEREMOVE_PROG
> > process itself. From my testing, this change might cause g/561 keep waiting
> > $DUPEREMOVE_PROG processes forever, as $DUPEREMOVE_PROG not always be killed
> > properly.
> 
> I have another patch that I didn't send that reworks duperemove
> killing. I didn't sent it because I ended up simply turning the test
> off via adding it to the unreliable_in_parallel group because it had
> a 75% failure rate due to duperemove bugs (e.g. getting stuck in
> fiemap loops that never terminate)...

Great to know you've noticed that :)

> 
> > I'm wondering if you hope to do "pkill $DUPEREMOVE_PROG" directly? Or you'd
> > like to copy $DUPEREMOVE_PROG to $TEST_DIR/${othername}_duperemove, then
> > pkill ${othername}_duperemove ?
> 
> I came up with the "copy to $seq.binary then pkill $seq.binary" idea
> after I gave up on this test as fundamentally broken. I'll revisit
> it now that we have a reliable way of killing only the processes
> belonging to the specific test. That might make this test
> reliable enough to run concurrently, but I have my doubts...
> 
> I'll have another look at it.

Currently only generic/561 runs duperemove in background bash, if you don't have
time to change it this week, I can help to change it to use the "copy to
$seq.binary then pkill $seq.binary" way temporarily, to avoid a test blocker.
Due to I want to merge your patchset ASAP in this week. Then we can move on.
I don't want to let you or others do much rebase jobs :)

Thanks,
Zorro

> 
> -Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 





[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux