Re: [PATCH] btrfs/179: optimize remove file selection

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Sat, Aug 19, 2023 at 03:35:33AM +0800, Zorro Lang wrote:
> On Thu, Aug 17, 2023 at 02:13:17PM +0900, Naohiro Aota wrote:
> > Currently, we use "ls ... | sort -R | head -n1" to choose a removing
> > victim. It sorts the files with "ls", sort it randomly and pick the first
> > line, which wastes the "ls" sort.
> > 
> > Also, using "sort -R | head -n1" is inefficient. For example, in a
> > directory with 1000000 files, it takes more than 15 seconds to pick a file.
> > 
> >   $ time bash -c "ls -U | sort -R | head -n 1 >/dev/null"
> >   bash -c "ls -U | sort -R | head -n 1 >/dev/null"  15.38s user 0.14s system 99% cpu 15.536 total
> > 
> >   $ time bash -c "ls -U | shuf -n 1 >/dev/null"
> >   bash -c "ls -U | shuf -n 1 >/dev/null"  0.30s user 0.12s system 138% cpu 0.306 total
> > 
> > So, just use "ls -U" and "shuf -n 1" to choose a victim.
> > 
> > Signed-off-by: Naohiro Aota <naohiro.aota@xxxxxxx>
> > ---
> >  tests/btrfs/179 | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tests/btrfs/179 b/tests/btrfs/179
> > index 2f17c9f9fb4a..0fbd875cf01b 100755
> > --- a/tests/btrfs/179
> > +++ b/tests/btrfs/179
> > @@ -45,7 +45,7 @@ fill_workload()
> >  
> >  		# Randomly remove some files for every 5 loop
> >  		if [ $(( $i % 5 )) -eq 0 ]; then
> > -			victim=$(ls "$SCRATCH_MNT/src" | sort -R | head -n1)
> > +			victim=$(ls -U "$SCRATCH_MNT/src" | shuf -n 1)
> 
> Thanks for this improvement. This case has two lines have this similar logic,
> Why not change them both?

Indeed. I forgot to pick it.

> And btrfs/192 has a similar line too:
> 
> $ grep -rsn -- "sort -R" tests
> tests/btrfs/179:48:                     victim=$(ls "$SCRATCH_MNT/src" | sort -R | head -n1)
> tests/btrfs/179:72:             victim=$(ls "$SCRATCH_MNT/snapshots" | sort -R | head -n1)
> tests/btrfs/192:75:     echo "$basedir/$(ls $basedir | sort -R | tail -1)"
> tests/btrfs/004:204:    for file in `find $dir -name f\* -size +0 | sort -R`; do

Yeah, we can use "shuf" there too. I found even the simple "sort -R"
(without head or tail) is really slower than "shuf". I'll address them too.

> 
> Do we need to change that too? And a common helper might help, if more cases
> would like to have this helper?

Yeah, maybe, we can define _random_file() as named in btrfs/192.

> Thanks,
> Zorro
> 
> >  			rm "$SCRATCH_MNT/src/$victim"
> >  		fi
> >  		i=$((i + 1))
> > -- 
> > 2.41.0
> > 
> 



[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux