Re: ENOSPC on a 10% used disk

Hi Avi,

On Wed, Oct 17, 2018 at 10:52:48AM +0300, Avi Kivity wrote:
> I have a user running a 1.7TB filesystem with ~10% usage (as shown
> by df), getting sporadic ENOSPC errors. The disk is mounted with
> inode64 and has a relatively small number of large files. The disk
> is a single-member RAID0 array, with 1MB chunk size. There are 32
> AGs. Running Linux 4.9.17.
> 
> 
> The write load consists of AIO/DIO writes, followed by unlinks of
> these files. The writes are non-size-changing (we truncate ahead)
> and we use XFS_IOC_FSSETXATTR/XFS_FLAG_EXTSIZE with a hint size of
> 32MB. The errors happen on commit logs, which have a target size of
> 32MB (but may exceed it a little).
> 
> 
> The errors are sporadic and after restarting the workload they go
> away for a few hours to a few days, but then return. During one of
> the crashes I used xfs_db to look at fragmentation and saw that most
> AGs had free extents of size categories up to 128-255, but a few had
> more. I tried xfs_fsr but it did not help.
> 
> 
> Is this a known issue? Would upgrading the kernel help?

It has been a long time, I know, but Brian has just made me aware of
a commit from early 2018, merged in 4.16, that might be relevant, so
I thought it best to close the loop:

commit 6d8a45ce29c7d67cc4fc3016dc2a07660c62482a
Author: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
Date:   Fri Jan 19 17:47:36 2018 -0800

    xfs: don't screw up direct writes when freesp is fragmented
    
    xfs_bmap_btalloc is given a range of file offset blocks that must be
    allocated to some data/attr/cow fork.  If the fork has an extent size
    hint associated with it, the request will be enlarged on both ends to
    try to satisfy the alignment hint.  If free space is fragmentated,
    sometimes we can allocate some blocks but not enough to fulfill any of
    the requested range.  Since bmapi_allocate always trims the new extent
    mapping to match the originally requested range, this results in
    bmapi_write returning zero and no mapping.
    
    The consequences of this vary -- buffered writes will simply re-call
    bmapi_write until it can satisfy at least one block from the original
    request.  Direct IO overwrites notice nmaps == 0 and return -ENOSPC
    through the dio mechanism out to userspace with the weird result that
    writes fail even when we have enough space because the ENOSPC return
    overrides any partial write status.  For direct CoW writes the situation
    was disastrous because nobody notices us returning an invalid zero-length
    wrong-offset mapping to iomap and the write goes off into space.
    
    Therefore, if free space is so fragmented that we managed to allocate
    some space but not enough to map into even a single block of the
    original allocation request range, we should break the alignment hint in
    order to guarantee at least some forward progress for the direct write.
    If we return a short allocation to iomap_apply it'll call back about the
    remaining blocks.
    
    Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

The spurious ENOSPC symptoms seem to match what you are seeing here
on your customer's 4.9 kernel, so it may be that this is the fix for
the ENOSPC problem that was reported. If this comes up again, then
perhaps it would be worth either upgrading the kernel to 4.16+ or
backporting this commit to see if it fixes the problem.
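
To make the failure mode in the commit message concrete, here is a toy
model of it. This is only a sketch, not the kernel code: the function
names are made up for illustration, the allocator is reduced to "place
the largest free extent at the start of the aligned range", and the
numbers assume 4 KiB filesystem blocks (the block size was not stated
in the report), so the 32MB extent size hint becomes 8192 blocks and
the largest free extent in a fragmented AG is taken to be 255 blocks.

```python
def align_request(off, length, extsz):
    """Enlarge [off, off+length) on both ends to extsz boundaries."""
    start = (off // extsz) * extsz
    end = -(-(off + length) // extsz) * extsz  # round the end up
    return start, end - start


def trim(alloc_off, alloc_len, off, length):
    """Intersect the allocated range with the originally requested range."""
    lo = max(alloc_off, off)
    hi = min(alloc_off + alloc_len, off + length)
    return (lo, hi - lo) if hi > lo else None


def map_blocks(off, length, extsz, largest_free, break_hint):
    """Return (offset, length) of the mapping handed back, or None.

    largest_free models fragmented free space: the allocator can hand
    out at most this many contiguous blocks. break_hint=False models
    the old behaviour, break_hint=True the behaviour the commit
    describes (give up on alignment to make forward progress).
    """
    a_off, a_len = align_request(off, length, extsz)
    got = min(a_len, largest_free)
    if got == 0:
        return None  # genuinely out of space
    mapping = trim(a_off, got, off, length)
    if mapping is None and break_hint:
        # Allocated blocks do not overlap the original request at all:
        # map them at the requested offset instead of the aligned one.
        mapping = (off, min(length, got))
    return mapping


# 16-block write at file offset 1000, 8192-block hint, 255-block free extents.
old = map_blocks(1000, 16, 8192, 255, break_hint=False)
new = map_blocks(1000, 16, 8192, 255, break_hint=True)
print("without fix:", old)
print("with fix:   ", new)
```

In this model the old behaviour allocates 255 blocks at aligned offset
0, which does not overlap the write at offset 1000, so the trimmed
mapping is empty. That is the "zero and no mapping" case that direct
IO turns into ENOSPC even though plenty of space is free. A write near
the start of the aligned range (say offset 100) would still succeed,
which would be consistent with the errors being sporadic.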

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


