On Wed, Mar 09, 2022 at 09:18:55AM +1100, Dave Chinner wrote: > On Mon, Feb 28, 2022 at 06:51:18PM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > The XFS implementation of the linkat call does not reserve quota for the > > potential directory expansion. This means that we don't reject the > > expansion with EDQUOT when we're at or near a hard limit, which means > > that one can use linkat() to exceed quota. Fix this by adding a quota > > reservation. > > > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > --- > > fs/xfs/xfs_inode.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c > > index 04bf467b1090..6e556c9069e8 100644 > > --- a/fs/xfs/xfs_inode.c > > +++ b/fs/xfs/xfs_inode.c > > @@ -1249,6 +1249,10 @@ xfs_link( > > xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL); > > xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL); > > > > + error = xfs_trans_reserve_quota_nblks(tp, tdp, resblks, 0, false); > > + if (error) > > + goto error_return; > > + > > error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK, > > XFS_IEXT_DIR_MANIP_CNT(mp)); > > if (error) > > Yup, ok, but doesn't xfs_remove have exactly the same problem? i.e. Yes, it does, however, the reason I don't have a fix for that ready is that... > removing a directory entry can punch a hole in the bmbt and require > new allocations for a BMBT split, thereby increasing the number of ...rejecting a directory unlink with EDQUOT creates the situation where a user who's gone over the soft limit cannot rm a file to get themselves back under quota because the removal asked for enough bmbt-expansion quota reservation to push the quota over the hard limit... > blocks allocated to the directory? e.g. remove a single data block, > need to then allocate half a dozen BMBT blocks for the shape change. ...and while the next thing that occurred to me was to retry the quota reservation with FORCE_RES, having such a path means that one can still overrun the hard limit (albeit slowly) by creating a fragmented directory and selectively removing entries to cause bmbt splits. I /think/ I'm ok with the "retry with FORCE_QUOTA" solution for xfs_remove, but I'm hanging onto it for now for further consideration and QA testing. > If so, then both xfs_link() and xfs_remove() have exactly the same > dquot, inode locking and transaction setup code and requirements, > and probably should be factored out into xfs_trans_alloc_dir() (i.e. > equivalent of xfs_trans_alloc_icreate() used by all the inode create > functions). That way we only have one copy of this preamble and > only need to fix the bug in one place? They're not the same problem -- adding hardlinks is not a known strategy for reducing quota usage below the limits, whereas unlinking files is. > Alternatively, fix the bug in both places first and add a followup > patch that factors out this code as per above. I sent a patch for the link situation because I thought it looked like an obvious fix, and left the unlink() problem until a full solution is presented or proved impossible. > Hmmm - looking further a callers of xfs_lock_two_inodes(), it would > appear that xfs_swap_extents() needs the same quota reservation > and also largely has the same transaction setup and inode locking > preamble as link and remove... Yes, I know about that problem. I've *solved* that problem with the atomic extent swap rewrite that's been hanging out in djwong-dev since late 2019 as part of the online fsck series. Perhaps I will have time to send that in late 2022. --D > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx