Re: ENOSPC on a 10% used disk

On Wed, Oct 17, 2018 at 10:52:48AM +0300, Avi Kivity wrote:
> I have a user running a 1.7TB filesystem with ~10% usage (as shown
> by df), getting sporadic ENOSPC errors. The disk is mounted with
> inode64 and has a relatively small number of large files. The disk
> is a single-member RAID0 array, with 1MB chunk size. There are 32
> AGs. Running Linux 4.9.17.

ENOSPC on what operation? write? open(O_CREAT)? something else?

What's the filesystem config (xfs_info output)?

> The write load consists of AIO/DIO writes, followed by unlinks of
> these files. The writes are non-size-changing (we truncate ahead)
> and we use XFS_IOC_FSSETXATTR/XFS_FLAG_EXTSIZE with a hint size of
> 32MB. The errors happen on commit logs, which have a target size of
> 32MB (but may exceed it a little).
> 
> 
> The errors are sporadic and after restarting the workload they go
> away for a few hours to a few days, but then return. During one of
> the crashes I used xfs_db to look at fragmentation and saw that most
> AGs had free extents of size categories up to 128-255, but a few had
> more. I tried xfs_fsr but it did not help.

32MB extents are 8192 blocks, assuming 4k filesystem blocks. The
128-255 bucket records free extents between 512k and 1MB in size, so
it sounds like free space has been fragmented to death. Has xfs_fsr
been run on this filesystem regularly?
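
For reference, the arithmetic behind those numbers can be sketched
as below. This is only a back-of-the-envelope check and it assumes a
4k filesystem block size, which the xfs_info output would confirm.
The helper names are made up for illustration.

```python
# Assumption: 4 KiB filesystem blocks (xfs_info would confirm this).
BLOCK_SIZE = 4096


def blocks_for(nbytes, block_size=BLOCK_SIZE):
    """Number of filesystem blocks needed to cover nbytes, rounded up."""
    return -(-nbytes // block_size)


def bucket_bytes(lo_blocks, hi_blocks, block_size=BLOCK_SIZE):
    """Byte sizes of the smallest and largest extent in a histogram bucket."""
    return lo_blocks * block_size, hi_blocks * block_size


# A 32MB extent size hint expressed in filesystem blocks.
hint_blocks = blocks_for(32 * 1024 * 1024)

# The 128-255 block bucket expressed in bytes.
lo, hi = bucket_bytes(128, 255)

print(hint_blocks)              # 8192
print(lo // 1024, hi // 1024)   # 512 1020 (KiB)
```

So the largest extent in that bucket is roughly 1/32 of what a
single 32MB hint-sized allocation would want.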

If the ENOSPC errors only come from files with a 32MB extent size
hint on them, then it may be that there isn't sufficient contiguous
free space to allocate an entire 32MB extent. I'm not sure what the
allocator behaviour here is (the code is a maze of twisty passages),
so I'll have to look more into this.
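
To make the hypothesis concrete, here is a rough sketch of how one
might read a per-AG free space histogram and ask which AGs could
even in principle hold one contiguous 8192-block extent. This is not
how the XFS allocator makes its decisions; it is just a way of
looking at the data. The histogram layout (from, to, count) and the
numbers in it are made up for illustration and are not taken from
the reported filesystem.

```python
def ags_with_room(hist_by_ag, want_blocks):
    """Return AG numbers that might hold a free extent of want_blocks.

    hist_by_ag maps agno -> list of (from_blocks, to_blocks, extent_count).
    Only bucket bounds are known, so this test is optimistic: an AG is
    listed if any populated bucket's upper bound reaches want_blocks.
    """
    ok = []
    for agno, buckets in sorted(hist_by_ag.items()):
        if any(count > 0 and to >= want_blocks for _frm, to, count in buckets):
            ok.append(agno)
    return ok


# Made-up numbers, purely illustrative.
example = {
    0: [(64, 127, 900), (128, 255, 4000)],
    1: [(128, 255, 3500), (8192, 16383, 2)],
}

print(ags_with_room(example, 8192))   # [1]
```

If most AGs fall out of that list, a 32MB hinted allocation has very
few places it could land as a single extent.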

In the meantime, can you post the output of the xfs_db freesp command
(both global and per-ag) so we can see just how much free space
there is and how badly fragmented it has become? I might be able to
reproduce the behaviour if I know the conditions under which it is
occurring.
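
Something along these lines would collect both the global and the
per-AG histograms. It is a minimal sketch: it only builds and prints
the xfs_db invocations (read-only mode, freesp with a summary, then
one run per AG via -a). The device path /dev/md0 is a placeholder,
and the AG count of 32 comes from the report above.

```python
import shlex


def freesp_commands(device, ag_count):
    """Build xfs_db command lines for global and per-AG free space histograms."""
    # Global histogram across all AGs, opened read-only (-r).
    cmds = [["xfs_db", "-r", "-c", "freesp -s", device]]
    # One histogram per allocation group.
    for agno in range(ag_count):
        cmds.append(["xfs_db", "-r", "-c", f"freesp -s -a {agno}", device])
    return cmds


# Placeholder device; substitute the real block device.
commands = freesp_commands("/dev/md0", 32)
for argv in commands:
    print(shlex.join(argv))
```

Each printed line can be pasted into a shell, or the argv lists can
be handed to subprocess.run() to capture the output into a file.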

> Is this a known issue? Would upgrading the kernel help?

Not that I know of. If it's an extszhint vs free space fragmentation
issue, then a kernel upgrade is unlikely to fix it.

Cheers,

Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx


