On Wed, Nov 20, 2024 at 08:04:43AM +1100, Dave Chinner wrote:
> On Tue, Nov 19, 2024 at 07:45:20AM -0800, Darrick J. Wong wrote:
> > On Mon, Nov 18, 2024 at 10:13:23PM -0800, Christoph Hellwig wrote:
> > > On Tue, Nov 19, 2024 at 12:45:05PM +1100, Dave Chinner wrote:
> > > > Question for you: Does your $here directory contain a .git subdir?
> > > >
> > > > One of the causes of long runtime for me has been that $here might
> > > > only contain 30MB of files, but the .git subdir balloons to several
> > > > hundred MB over time, resulting in really long runtimes because it's
> > > > copying GBs of data from the .git subdir.
> > >
> > > Or the results/ directory when run in a persistent test VM like the
> > > one for quick runs on my laptop. I currently need to persistently
> > > purge that for just this test.
>
> Yeah, I use persistent VMs and that's why the .git dir grows...
>
> > > > --- a/tests/generic/251
> > > > +++ b/tests/generic/251
> > > > @@ -175,9 +175,12 @@ nproc=20
> > > >  # Copy $here to the scratch fs and make coipes of the replica. The fstests
> > > >  # output (and hence $seqres.full) could be in $here, so we need to snapshot
> > > >  # $here before computing file checksums.
> > > > +#
> > > > +# $here/* as the files to copy so we avoid any .git directory that might be
> > > > +# much, much larger than the rest of the fstests source tree we are copying.
> > > >  content=$SCRATCH_MNT/orig
> > > >  mkdir -p $content
> > > > -cp -axT $here/ $content/
> > > > +cp -ax $here/* $content/
> > >
> > > Maybe we just need a way to generate more predictable file system
> > > content?
> >
> > How about running fsstress for ~50000 ops or so, to generate some test
> > files and a directory tree?
>
> Do we even need to do that? It's a set of small files distributed
> over a few directories. There are few large files in the mix, so we
> could just create a heap of 1-4 block files across a dozen or so
> directories and get the same sort of data set to copy.
>
> And given this observation, if we are generating the data set in the
> first place, why use cp to copy it every time? Why not just have
> each thread generate the data set on the fly?

run_process compares the copies to the original to try to discover
places where written blocks got discarded, so they actually do need to
be copies.

/me suspects that this test is kinda bogus if the block device doesn't
set discard_zeroes_data, because it won't trip on discard errors for
crappy sata ssds that don't actually clear the remapping tables until
minutes later.

--D

> # create a directory structure with numdirs directories and numfiles
> # files per directory. Files are 0-3 blocks in length, space is
> # allocated by fallocate to avoid needing to write data. Files are
> # created concurrently across directories to create the data set as
> # fast as possible.
> create_files()
> {
>         local numdirs=$1
>         local numfiles=$2
>         local basedir=$3
>
>         for ((i=0; i<$numdirs; i++)); do
>                 mkdir -p $basedir/$i
>                 for ((j=0; j<$numfiles; j++)); do
>                         local len=$((RANDOM % 4))
>                         $XFS_IO_PROG -fc "falloc 0 ${len}b" $basedir/$i/$j
>                 done &
>         done
>         wait
> }
>
> -Dave
>
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
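
A minimal sketch, assuming fstests conventions ($SCRATCH_MNT, $tmp) and the
create_files() helper quoted above, of how a generated data set could slot
into the existing copy-and-compare flow; the directory/file counts and the
checksum step are illustrative guesses, not the actual generic/251 code:

# Generate a predictable source tree instead of copying $here, then
# record its checksums so run_process can still compare the copies
# against the original. The counts below are arbitrary.
content=$SCRATCH_MNT/orig
mkdir -p $content
create_files 12 200 $content
find $content -type f | xargs md5sum > $tmp.checksums

Each run_process thread would still cp this tree, which keeps the
copies-vs-original discard comparison described above intact.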