Re: [PATCH 5/9] xfs: force inode garbage collection before fallocate when space is low

On Mon, Jun 07, 2021 at 03:25:21PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> Generally speaking, when a user calls fallocate, they're looking to
> preallocate space in a file in the largest contiguous chunks possible.
> If free space is low, it's possible that the free space will look
> unnecessarily fragmented because there are unlinked inodes that are
> holding on to space that we could allocate.  When this happens,
> fallocate makes suboptimal allocation decisions for the sake of deleted
> files, which doesn't make much sense, so scan the filesystem for dead
> items to delete to try to avoid this.
> 
> Note that there are a handful of fstests that fill a filesystem, delete
> just enough files to allow a single large allocation, and check that
> fallocate actually gets the allocation.  These tests regress because the
> test runs fallocate before the inode gc has a chance to run, so add this
> behavior to maintain as much of the old behavior as possible.

I don't think this is a good justification for the change. Just
because the unit tests exploit an undefined behaviour that no
filesystem actually guarantees to achieve a specific layout, it
doesn't mean we always have to behave that way.

For example, many tests used to use reverse sequential writes to
exploit deficiencies in the allocation algorithms to generate
fragmented files. We fixed that problem and the tests broke because
they couldn't fragment files any more.

Did we reject those changes because the tests broke? No, we didn't
because the tests were exploiting an observed behaviour rather than
a guaranteed behaviour.

So, yeah, "test does X to make Y happen" doesn't mean "X will always
make Y happen". It just means the test needs to be made more robust,
or we have to provide a way for the test to trigger the behaviour it
needs.

Indeed, I think that the way to fix these sorts of issues is to have
the tests issue a syncfs(2) after they've deleted the inodes and have
the filesystem run an inodegc flush as part of the sync mechanism.
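
As a rough sketch of what such a test change could look like (the helper
name delete_then_prealloc() is hypothetical, and syncfs() triggering an
inodegc flush is the behaviour being proposed here, not something a
kernel is guaranteed to do today):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical test helper: unlink a filler file, ask the filesystem to
 * sync, and only then attempt the large preallocation.
 *
 * Assumption: the filesystem flushes queued inode gc work as part of
 * syncfs(), so the space held by the unlinked inode is free again by
 * the time fallocate() runs.
 */
static int
delete_then_prealloc(int dirfd, const char *victim, int fd, off_t len)
{
	/* Drop the file whose space the test wants to reuse. */
	if (unlinkat(dirfd, victim, 0) < 0)
		return -1;

	/* Any open fd on the filesystem will do; here, the directory. */
	if (syncfs(dirfd) < 0)
		return -1;

	/* Mode 0: plain preallocation of [0, len). */
	return fallocate(fd, 0, 0, len);
}
```

On kernels without deferred inode gc the extra syncfs() just costs the
test a little time, so the same test can run unmodified on old and new
kernels.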

Then we don't need to do.....

> +/*
> + * If the target device (or some part of it) is full enough that it won't to be
> + * able to satisfy the entire request, try to free inactive files to free up
> + * space.  While it's perfectly fine to fill a preallocation request with a
> + * bunch of short extents, we prefer to slow down preallocation requests to
> + * combat long term fragmentation in new file data.
> + */
> +static int
> +xfs_alloc_consolidate_freespace(
> +	struct xfs_inode	*ip,
> +	xfs_filblks_t		wanted)
> +{
> +	struct xfs_mount	*mp = ip->i_mount;
> +	struct xfs_perag	*pag;
> +	struct xfs_sb		*sbp = &mp->m_sb;
> +	xfs_agnumber_t		agno;
> +
> +	if (!xfs_has_inodegc_work(mp))
> +		return 0;
> +
> +	if (XFS_IS_REALTIME_INODE(ip)) {
> +		if (sbp->sb_frextents * sbp->sb_rextsize >= wanted)
> +			return 0;
> +		goto free_space;
> +	}
> +
> +	for_each_perag(mp, agno, pag) {
> +		if (pag->pagf_freeblks >= wanted) {
> +			xfs_perag_put(pag);
> +			return 0;
> +		}
> +	}

... really hurty things (e.g. on high AG count fs) on every fallocate()
call, and we have a simple modification to the tests that allows them
to work as they want to on both old and new kernels....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx