On Mon, Jan 11, 2021 at 06:18:37PM -0800, Darrick J. Wong wrote: > On Tue, Jan 12, 2021 at 12:40:45PM +1100, Dave Chinner wrote: > > On Mon, Jan 11, 2021 at 05:31:26PM -0800, Darrick J. Wong wrote: > > > On Tue, Jan 12, 2021 at 12:22:49PM +1100, Dave Chinner wrote: > > > > On Mon, Jan 11, 2021 at 03:23:05PM -0800, Darrick J. Wong wrote: > > > > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > > > > > > > If a fs modification (creation, file write, reflink, etc.) is unable to > > > > > reserve enough quota to handle the modification, try clearing whatever > > > > > space the filesystem might have been hanging onto in the hopes of > > > > > speeding up the filesystem. The flushing behavior will become > > > > > particularly important when we add deferred inode inactivation because > > > > > that will increase the amount of space that isn't actively tied to user > > > > > data. > > > > > > > > > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > > --- > > > > > fs/xfs/xfs_bmap_util.c | 16 ++++++++++++++++ > > > > > fs/xfs/xfs_file.c | 2 +- > > > > > fs/xfs/xfs_icache.c | 9 +++++++-- > > > > > fs/xfs/xfs_icache.h | 2 +- > > > > > fs/xfs/xfs_inode.c | 17 +++++++++++++++++ > > > > > fs/xfs/xfs_ioctl.c | 2 ++ > > > > > fs/xfs/xfs_iomap.c | 20 +++++++++++++++++++- > > > > > fs/xfs/xfs_reflink.c | 40 +++++++++++++++++++++++++++++++++++++--- > > > > > fs/xfs/xfs_trace.c | 1 + > > > > > fs/xfs/xfs_trace.h | 40 ++++++++++++++++++++++++++++++++++++++++ > > > > > 10 files changed, 141 insertions(+), 8 deletions(-) > > > > > > > > > > > > > > > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > > > > > index 7371a7f7c652..437fdc8a8fbd 100644 > > > > > --- a/fs/xfs/xfs_bmap_util.c > > > > > +++ b/fs/xfs/xfs_bmap_util.c > > > > > @@ -761,6 +761,7 @@ xfs_alloc_file_space( > > > > > */ > > > > > while (allocatesize_fsb && !error) { > > > > > xfs_fileoff_t s, e; > > > > > + bool cleared_space = false; > > > > > > > > > > /* > > > > > * Determine space reservations for data/realtime. > > > > > @@ -803,6 +804,7 @@ xfs_alloc_file_space( > > > > > /* > > > > > * Allocate and setup the transaction. > > > > > */ > > > > > +retry: > > > > > error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, > > > > > resrtextents, 0, &tp); > > > > > > > > > > @@ -819,6 +821,20 @@ xfs_alloc_file_space( > > > > > xfs_ilock(ip, XFS_ILOCK_EXCL); > > > > > error = xfs_trans_reserve_quota_nblks(tp, ip, qblocks, > > > > > 0, quota_flag); > > > > > + /* > > > > > + * We weren't able to reserve enough quota to handle fallocate. > > > > > + * Flush any disk space that was being held in the hopes of > > > > > + * speeding up the filesystem. We hold the IOLOCK so we cannot > > > > > + * do a synchronous scan. > > > > > + */ > > > > > + if ((error == -ENOSPC || error == -EDQUOT) && !cleared_space) { > > > > > + xfs_trans_cancel(tp); > > > > > + xfs_iunlock(ip, XFS_ILOCK_EXCL); > > > > > + cleared_space = xfs_inode_free_quota_blocks(ip, false); > > > > > + if (cleared_space) > > > > > + goto retry; > > > > > + return error; > > > > > > > > Can't say I'm a fan of repeating this everywhere. Can we move this > > > > into xfs_trans_reserve_quota_nblks() with a "retry" flag such that > > > > we do: > > > > > > > > error = xfs_trans_reserve_quota_nblks(tp, ip, qblocks, 0, > > > > quota_flag, &retry); > > > > if (error) { > > > > /* tp already cancelled, inode unlocked */ > > > > return error; > > > > } > > > > if (retry) { > > > > /* tp already cancelled, inode unlocked */ > > > > goto retry; > > > > } > > > > > > Assuming you'd be interested in the same kind of change being applied to > > > the next patch (kick the inode reclaim from xfs_trans_reserve on ENOSPC) > > > then yes, that seems like a nice cleanup. > > > > *nod* > > > > It definitely seems to me to be cleaner to put all this GC stuff > > into the transaction setup code that does the actual space > > reservation, and then simply have the code that is setting up the > > transactions handle failures appropriately. > > I tried converting this and it wasn't as easy as I thought. > > The downside is that now we have a function that sometimes consumes the > transaction and the ILOCK, and there's still the question of what to do > if xfs_inode_free_quota_blocks doesn't find any space to free. > > By leaving all the ugliness in the call sites, we maintain the property > that the function that allocates the transaction and ilocks the inode > also gets to iunlock and free the tp, and we can also skip the retry if > the eofblocks flush doesn't clear anything. > > It might be easier to rework this as a xfs_trans_alloc_quota function > wherein you feed it a tres, the inode, the number of blocks you want, > and whether or not this is an rt extent; and either it reserves and > locks everything for you, or returns failure. The downside is that > doesn't work for reflink since it doesn't always require quota > reservation to remap an extent. > > But it's pretty late in the day and my brain is scrambled eggs so I'll > defer until tomorrow. ...until three days later. You were right, it wasn't so difficult to make xfs_trans_reserve_quota_nblks cancel the transaction on all errors, and run the scan if it hit edquot/enospc. --D > --D > > > Cheers, > > > > Dave. > > -- > > Dave Chinner > > david@xxxxxxxxxxxxx