Re: Pathological allocation pattern with direct IO

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 7 Mar 2013 16:03:25 +1100

On Wed, Mar 06, 2013 at 09:22:10PM +0100, Jan Kara wrote:
>   Hello,
> 
>   one of our customers has an application that writes large (tens of GB) files
> using direct IO done in 16 MB chunks. They keep the fs around 80% full,
> deleting the oldest files when they need to store new ones. Usually a file
> can be stored in under 10 extents, but from time to time a pathological case
> is triggered and the file ends up with a few thousand extents (which naturally
> hurts performance). The customer actually uses a 2.6.32-based kernel, but
> I reproduced the issue with a 3.8.2 kernel as well.
> 
> I was analyzing why this happens and the filefrag for the file looks like:
> Filesystem type is: 58465342
> File size of /raw_data/ex.20130302T121135/ov.s1a1.wb is 186294206464
> (45481984 blocks, blocksize 4096)
>   ext   logical  physical  expected   length flags
>     0         0        13            4550656
>     1   4550656 188136807   4550668 12562432
>     2  17113088 200699240 200699238   622592
>     3  17735680 182046055 201321831     4096
>     4  17739776 182041959 182050150     4096
>     5  17743872 182037863 182046054     4096
>     6  17747968 182033767 182041958     4096
>     7  17752064 182029671 182037862     4096
> ...
>  6757  45400064 154381644 154389835     4096
>  6758  45404160 154377548 154385739     4096
>  6759  45408256 252951571 154381643    73728 eof
> /raw_data/ex.20130302T121135/ov.s1a1.wb: 6760 extents found
> 
> So we see that at one moment, the allocator starts giving us 16 MB chunks
> backwards. This seems to be caused by XFS_ALLOCTYPE_NEAR_BNO allocation. For
> two cases I was able to track down the logic:
> 
> 1) We start allocating blocks for the file. We want to allocate in the same
> AG as the inode. First we try an exact allocation, which fails, so we try an
> XFS_ALLOCTYPE_NEAR_BNO allocation, which finds a large enough free extent
> before the inode. So we start allocating 16 MB chunks from the end of that
> free extent. From this moment on we are basically bound to continue
> allocating backwards using XFS_ALLOCTYPE_NEAR_BNO allocation until we
> exhaust the whole free extent.
> 
> 2) A similar situation happens when we cannot grow the current extent any
> further but there is a large free space somewhere before this extent in the AG.
> 
> So I was wondering: is this known? Is XFS_ALLOCTYPE_NEAR_BNO so beneficial
> that it outweighs pathological cases like the above? Or should it maybe be
> disabled for larger files or for direct IO?
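
The backwards pattern in the filefrag output quoted above can be checked with
a little arithmetic. The sketch below copies extents 3 to 7 from that listing
and shows that each 4096-block extent (16 MB at the 4096-byte block size) sits
exactly 4096 blocks before the previous one on disk while the logical offset
moves forward. The helper name physical_steps is hypothetical, chosen only for
this illustration.

```python
BLOCK_SIZE = 4096  # bytes, from the "blocksize 4096" line of the filefrag output

# (logical, physical, length) in blocks, copied from extents 3..7 above
extents = [
    (17735680, 182046055, 4096),
    (17739776, 182041959, 4096),
    (17743872, 182037863, 4096),
    (17747968, 182033767, 4096),
    (17752064, 182029671, 4096),
]


def physical_steps(exts):
    """Difference in physical start block between consecutive extents."""
    return [b[1] - a[1] for a, b in zip(exts, exts[1:])]


def logical_steps(exts):
    """Difference in logical start block between consecutive extents."""
    return [b[0] - a[0] for a, b in zip(exts, exts[1:])]


chunk_bytes = extents[0][2] * BLOCK_SIZE  # size of one extent in bytes

print("extent size in MiB:", chunk_bytes // (1024 * 1024))
print("logical steps (blocks):", logical_steps(extents))
print("physical steps (blocks):", physical_steps(extents))
```

Every physical step is the negative of the extent length, so physically
adjacent chunks never merge into one extent: the file grows forwards while the
allocations walk backwards through the free extent.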

Well known issue, first diagnosed about 15 years ago, IIRC. Simple
solution: use extent size hints.
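
For illustration, here is a minimal sketch of setting an extent size hint from
a program, assuming a Linux system and the FS_IOC_FSGETXATTR /
FS_IOC_FSSETXATTR ioctls. The numeric constants and the struct fsxattr layout
are restated from <linux/fs.h> as assumptions of this sketch, and the function
names with_extsize_hint and set_extsize_hint are hypothetical. It is not a
tested recipe for the customer setup described above.

```python
import fcntl
import os
import struct

# Assumed values from <linux/fs.h>; verify against your kernel headers.
FS_IOC_FSGETXATTR = 0x801C581F
FS_IOC_FSSETXATTR = 0x401C5820
FS_XFLAG_EXTSIZE = 0x00000800        # hint applies to this regular file
FS_XFLAG_EXTSZINHERIT = 0x00001000   # new files in this directory inherit it

# struct fsxattr: xflags, extsize, nextents, projid, cowextsize, 8 pad bytes
FSXATTR = struct.Struct("=5I8x")


def with_extsize_hint(raw, extsize_bytes, is_dir):
    """Return a packed fsxattr with the extent size hint applied."""
    xflags, _old_extsize, nextents, projid, cowextsize = FSXATTR.unpack(raw)
    xflags |= FS_XFLAG_EXTSZINHERIT if is_dir else FS_XFLAG_EXTSIZE
    return FSXATTR.pack(xflags, extsize_bytes, nextents, projid, cowextsize)


def set_extsize_hint(path, extsize_bytes):
    """Read the current attributes of path, then write them back with the hint."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray(FSXATTR.size)
        fcntl.ioctl(fd, FS_IOC_FSGETXATTR, buf)  # fills buf in place
        new = with_extsize_hint(bytes(buf), extsize_bytes, os.path.isdir(path))
        fcntl.ioctl(fd, FS_IOC_FSSETXATTR, new)
    finally:
        os.close(fd)
```

A call such as set_extsize_hint("/raw_data", 16 * 1024 * 1024) would be one way
to make new files under that directory inherit a 16 MB hint. The hint has to
be a multiple of the filesystem block size, and on a regular file it generally
needs to be set before any blocks are allocated. From the shell, the extsize
command of xfs_io serves the same purpose.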

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
