Re: [PATCH v30.2 1/1] xfs: fix severe performance problems when fstrimming a subset of an AG

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 02, 2024 at 10:07:18PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> On a 10TB filesystem where the free space in each AG is heavily
> fragmented, I noticed some very high runtimes on a FITRIM call for the
> entire filesystem.  xfs_scrub likes to report progress information on
> each phase of the scrub, which means that a strace for the entire
> filesystem:
> 
> ioctl(3, FITRIM, {start=0x0, len=10995116277760, minlen=0}) = 0 <686.209839>
> 
> shows that scrub is uncommunicative for the entire duration.  Reducing
> the size of the FITRIM requests to a single AG at a time produces lower
> times for each individual call, but even this isn't quite acceptable,
> because the time between progress reports are still very high:
> 
> Strace for the first 4x 1TB AGs looks like (2):
> ioctl(3, FITRIM, {start=0x0, len=1099511627776, minlen=0}) = 0 <68.352033>
> ioctl(3, FITRIM, {start=0x10000000000, len=1099511627776, minlen=0}) = 0 <68.760323>
> ioctl(3, FITRIM, {start=0x20000000000, len=1099511627776, minlen=0}) = 0 <67.235226>
> ioctl(3, FITRIM, {start=0x30000000000, len=1099511627776, minlen=0}) = 0 <69.465744>
> 
> I then had the idea to limit the length parameter of each call to a
> smallish amount (~11GB) so that we could report progress relatively
> quickly, but much to my surprise, each FITRIM call still took ~68
> seconds!
> 
> Unfortunately, the by-length fstrim implementation handles this poorly
> because it walks the entire free space by length index (cntbt), which is
> a very inefficient way to walk a subset of the blocks of an AG.
> 
> Therefore, create a second implementation that will walk the bnobt and
> perform the trims in block number order.  This implementation avoids the
> worst problems of the original code, though it lacks the desirable
> attribute of freeing the biggest chunks first.
> 
> On the other hand, this second implementation will be much easier to
> constrain the system call latency, and makes it much easier to report
> fstrim progress to anyone who's running xfs_scrub.
> 
> Inspired-by: Dave Chinner <dchinner@xxxxxxxxxx>
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_discard.c |  153 ++++++++++++++++++++++++++++++--------------------
>  1 file changed, 93 insertions(+), 60 deletions(-)

Looks OK.

Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>

-- 
Dave Chinner
david@xxxxxxxxxxxxx




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux