Re: xfs_extent_busy_flush vs. aio

Dave Chinner <david@xxxxxxxxxxxxx> · Mon, 29 Jan 2018 22:35:56 +1100

On Mon, Jan 29, 2018 at 11:40:27AM +0200, Avi Kivity wrote:
> 
> 
> On 01/25/2018 03:08 PM, Brian Foster wrote:
> > On Thu, Jan 25, 2018 at 10:50:40AM +0200, Avi Kivity wrote:
> > > On 01/23/2018 07:39 PM, Brian Foster wrote:
> > > > Yeah, could be.. perhaps the issue is that despite the large amount of
> > > > total free space, the free space is too fragmented to satisfy a
> > > > particular allocation request..?
> > >     from      to extents  blocks    pct
> > >        1       1    2702    2702   0.00
> > >        2       3     690    1547   0.00
> > >        4       7     115     568   0.00
> > >        8      15      60     634   0.00
> > >       16      31      63    1457   0.00
> > >       32      63     102    4751   0.00
> > >       64     127    7940  895365   0.19
> > >      128     255   49680 12422100   2.67
> > >      256     511    1025  417078   0.09
> > >      512    1023    4170 3660771   0.79
> > >     1024    2047    2168 3503054   0.75
> > >     2048    4095    2567 7729442   1.66
> > >     4096    8191    8688 59394413  12.76
> > >     8192   16383     310 3100186   0.67
> > >    16384   32767     112 2339935   0.50
> > >    32768   65535      35 1381122   0.30
> > >    65536  131071       8  651391   0.14
> > >   131072  262143       2  344196   0.07
> > >   524288 1048575       4 2909925   0.62
> > > 1048576 2097151       3 3550680   0.76
> > > 4194304 8388607      10 82497658  17.72
> > > 8388608 16777215      10 158022653  33.94
> > > 16777216 24567552       5 122778062  26.37
> > > total free extents 80469
> > > total free blocks 465609690
> > > average free extent size 5786.2
> > > 
> > > Looks like plenty of free large extents, with most of the free space
> > > completely unfragmented.
> > > 
> > Indeed..

You need to look at each AG, not the overall summary. You could have
a suboptimal AG hidden in amongst that (e.g. near ENOSPC) and it's
that one AG that is causing all your problems.
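As a rough illustration of that per-AG check, here is a minimal Python
sketch that parses histogram text in the "from to extents blocks pct"
layout quoted above and flags AGs with no free extent bucket at or above
a requested size. The function names, the dict of per-AG text and the
threshold are assumptions made for the example; they are not part of any
XFS tool.

```python
def parse_freesp(text):
    """Parse 'from to extents blocks pct' histogram rows into dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly five fields, the first four being integers.
        # This skips the column header and the "total ..." summary lines.
        if len(fields) != 5 or not all(f.isdigit() for f in fields[:4]):
            continue
        lo, hi, extents, blocks = map(int, fields[:4])
        rows.append({"from": lo, "to": hi, "extents": extents, "blocks": blocks})
    return rows


def extents_at_least(rows, min_blocks):
    """Count free extents in buckets whose lower bound is >= min_blocks."""
    return sum(r["extents"] for r in rows if r["from"] >= min_blocks)


def suspect_ags(per_ag_text, min_blocks):
    """Return AG numbers with no free extent bucket at or above min_blocks.

    per_ag_text is a hypothetical mapping of AG number -> histogram text
    captured separately for each AG.
    """
    return [ag for ag, text in sorted(per_ag_text.items())
            if extents_at_least(parse_freesp(text), min_blocks) == 0]
```

With a 32MB allocation hint and, as an assumption, a 4 KiB block size,
min_blocks would be 8192. An AG showing up in suspect_ags() is only a
candidate for closer inspection; the histogram alone says nothing about
which extents are busy.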

There are many reasons this can happen, but the most common is that
the working files in a directory (or a subset of directories in the
same AG) have a combined space usage larger than an AG.

> > > Lots of 16MB-32MB extents, too. 32MB is our allocation hint size, could have
> > > something to do with it.
> > > 
> > Most likely. Based on this, it's hard to say for certain why you'd be
> > running into allocation latency caused by busy extents.

One of only two reasons:

	1. The AG has large enough free extents, but they are all
	marked busy (i.e. they have only just been freed), or

	2. The extent selected has had the busy range trimmed out of
	it and it's now less than the minimum extent length
	requested.

Both cases imply that we're allocating extents that have been very
recently freed, and that implies there is no other suitable non-busy
free space in the AG. Hence the need to look at the per-AG freespace
pattern rather than the global summary.
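The two cases can be illustrated with a toy model. The Python below is a
simplified sketch, not the kernel's busy extent code: free extents and
busy ranges are plain (start, length) tuples, and all names are made up
for the example. It trims busy ranges out of a free extent, keeps the
largest remaining piece and compares it against the minimum length.

```python
def trim_busy(start, length, busy):
    """Return the largest non-busy piece (start, length) of a free extent.

    busy is a list of (start, length) ranges. Toy model only.
    """
    end = start + length
    best = (start, 0)
    cursor = start
    for b_start, b_len in sorted(busy):
        b_end = b_start + b_len
        if b_end <= cursor or b_start >= end:
            continue  # busy range does not overlap what is left
        if b_start > cursor and b_start - cursor > best[1]:
            best = (cursor, b_start - cursor)  # gap before this busy range
        cursor = max(cursor, b_end)
    if cursor < end and end - cursor > best[1]:
        best = (cursor, end - cursor)  # tail after the last busy range
    return best


def outcome(start, length, busy, minlen):
    """Classify one free extent against a minimum allocation length."""
    if length < minlen:
        return "too-small"          # never a candidate, busy or not
    _, avail = trim_busy(start, length, busy)
    if avail == 0:
        return "all-busy"           # case 1: large enough, entirely busy
    if avail < minlen:
        return "trimmed-too-short"  # case 2: trimmed below minlen
    return "usable"


def no_usable_extent(free_extents, busy, minlen):
    """True when no free extent in the (modelled) AG can satisfy minlen."""
    return all(outcome(s, l, busy, minlen) != "usable" for s, l in free_extents)
```

For example, outcome(0, 100, [(40, 20)], 50) gives "trimmed-too-short":
the extent is 100 blocks long, but the busy range in the middle leaves
pieces of only 40 blocks each.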

Also, it's worth dumping the freespace via xfs_spaceman, as it walks
the in-memory trees rather than the on-disk trees and so is properly
coherent with operations in progress (i.e. xfs_spaceman -c "freesp
..." /mntpt).
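To get one dump per AG, something like the following Python sketch can
generate the command lines. It assumes the freesp command's -a (restrict
to one AG) and -s (print a summary) options, and it takes the AG count
as a parameter (xfs_info reports it as agcount); the helper names and
the /mntpt path are placeholders. Check the xfs_spaceman man page on
your system before relying on the exact flags.

```python
import shlex


def freesp_commands(mountpoint, agcount, tool="xfs_spaceman"):
    """Build one argv list per AG; nothing is executed here."""
    cmds = []
    for agno in range(agcount):
        # "-s -a N" is an assumed option set, see the note above.
        cmds.append([tool, "-c", f"freesp -s -a {agno}", mountpoint])
    return cmds


def as_shell(argv):
    """Render an argv list as a copy-and-paste shell command line."""
    return shlex.join(argv)
```

Each argv list can be passed to subprocess.run() as is, or printed with
as_shell() and run by hand.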

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx