On Mon, Jan 08, 2024 at 10:14:42PM -0800, Darrick J. Wong wrote:
> On Mon, Jan 08, 2024 at 11:35:17AM +1100, Dave Chinner wrote:
> > On Thu, Jan 04, 2024 at 02:22:48PM +0800, Jian Wen wrote:
> > > From: Jian Wen <wenjianhn@xxxxxxxxx>
> > >
> > > Currently, xfs_trans_dqresv() returns -ENOSPC when the project quota
> > > limit is reached. As a result, xfs_file_buffered_write() will flush
> > > the whole filesystem instead of the project quota.
> > >
> > > Fix the issue by making xfs_trans_dqresv() return -EDQUOT rather than
> > > -ENOSPC. Add a helper, xfs_blockgc_nospace_flush(), to make flushing
> > > for both EDQUOT and ENOSPC consistent.
> > >
> > > Changes since v3:
> > > - rename xfs_dquot_is_enospc to xfs_dquot_hardlimit_exceeded
> > > - acquire the dquot lock before checking the free space
> > >
> > > Changes since v2:
> > > - completely rewrote based on the suggestions from Dave
> > >
> > > Suggested-by: Dave Chinner <david@xxxxxxxxxxxxx>
> > > Signed-off-by: Jian Wen <wenjian1@xxxxxxxxxx>
> >
> > Please send new patch versions as a new thread, not as a reply to
> > a random email in the middle of the review thread for a previous
> > version.
> >
> > > ---
> > >  fs/xfs/xfs_dquot.h       | 22 +++++++++++++++---
> > >  fs/xfs/xfs_file.c        | 41 ++++++++++++--------------------
> > >  fs/xfs/xfs_icache.c      | 50 +++++++++++++++++++++++++++++-----------
> > >  fs/xfs/xfs_icache.h      |  7 +++---
> > >  fs/xfs/xfs_inode.c       | 19 ++++++++-------
> > >  fs/xfs/xfs_reflink.c     |  5 ++++
> > >  fs/xfs/xfs_trans.c       | 41 ++++++++++++++++++++++++--------
> > >  fs/xfs/xfs_trans_dquot.c |  3 ---
> > >  8 files changed, 121 insertions(+), 67 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h
> > > index 80c8f851a2f3..d28dce0ed61a 100644
> > > --- a/fs/xfs/xfs_dquot.h
> > > +++ b/fs/xfs/xfs_dquot.h
> > > @@ -183,6 +183,22 @@ xfs_dquot_is_enforced(
> > >  	return false;
> > >  }
> > >
> > > +static inline bool
> > > +xfs_dquot_hardlimit_exceeded(
> > > +	struct xfs_dquot	*dqp)
> > > +{
> > > +	int64_t			freesp;
> > > +
> > > +	if (!dqp)
> > > +		return false;
> > > +	if (!xfs_dquot_is_enforced(dqp))
> > > +		return false;
> > > +	xfs_dqlock(dqp);
> > > +	freesp = dqp->q_blk.hardlimit - dqp->q_blk.reserved;
> > > +	xfs_dqunlock(dqp);
> > > +	return freesp < 0;
> > > +}
> >
> > Ok, what about if the project quota EDQUOT has come about because we
> > are over the inode count limit or the realtime block limit? Both of
> > those need to be converted to ENOSPC, too.
> >
> > i.e. all the inode creation operations need to be checked against
> > both the data device block space and the inode count space, whilst
> > data writes need to be checked against data space for normal IO
> > and both data space and real time space for inodes that are writing
> > to real time devices.
>
> (Yeah.)
>
> > Also, why do we care about locking here? If something is modifying
> > dqp->q_blk.reserved concurrently, holding the lock here does nothing
> > to protect this code from races. All it means is that we'll block
> > waiting for the transaction that holds the dquot locked to complete
> > and we'll either get the same random failure or success as if we
> > didn't hold the lock during this calculation...
>
> I thought we had to hold the dquot lock before accessing its fields.

Only if we care about avoiding races with ongoing modifications or
we want to serialise against new references (e.g. because we are
about to reclaim the dquot).
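To put that in concrete terms, a lockless check that also covers the
inode count and realtime block limits might look something like the
sketch below. Completely untested, and the helper name is made up; it
just illustrates sampling all three resource counters without taking
the dqlock:

static inline bool
xfs_dquot_any_hardlimit_exceeded(
	struct xfs_dquot	*dqp)
{
	if (!dqp)
		return false;
	if (!xfs_dquot_is_enforced(dqp))
		return false;

	/* No dqlock - this is an inherently racy sample either way. */
	if (dqp->q_blk.hardlimit &&
	    dqp->q_blk.reserved > dqp->q_blk.hardlimit)
		return true;
	if (dqp->q_ino.hardlimit &&
	    dqp->q_ino.reserved > dqp->q_ino.hardlimit)
		return true;
	if (dqp->q_rtb.hardlimit &&
	    dqp->q_rtb.reserved > dqp->q_rtb.hardlimit)
		return true;
	return false;
}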
The inode holds a reference to the dquot at this point (because of
xfs_qm_dqattach()), so we really don't need to hold a lock just to
sample the contents of the attached dquot.

> Or are you really saying that it's silly to take the dquot lock *again*
> having already decided (under dqlock elsewhere) that we were over a
> quota?

No, I'm saying that we really don't have to hold the dqlock to
determine if the dquot is over its quota limits. It's either going
to be over or under, and holding the dqlock while sampling it really
doesn't change the fact that the dquot accounting can change between
the initial check under the dqlock and a subsequent check on the
second failure under a different hold of the dqlock.

It's an inherently racy check, and holding the dqlock does nothing
to make it less racy or more accurate.

> In that case, perhaps it makes more sense to have
> xfs_trans_dqresv return an unusual errno for "project quota over limits"
> so that callers can trap that magic value and translate it into ENOSPC?

Sure, that's another option, but it means we have to trap EDQUOT,
ENOSPC and the new special EDQUOT-but-really-means-ENOSPC return
errors. I'm not sure it will improve the code a great deal, but
if there's a clean way to implement such error handling it may
make more sense. Have you prototyped how such error handling would
look in these cases?

Which also makes me wonder if we should actually be returning
which quota limit failed, not just EDQUOT. To take the correct flush
action, we really need to know if we failed on data blocks, inode
count or rt extents. e.g. flushing data won't help alleviate an
inode count overrun... (A rough sketch of what I mean is below the
sig.)

-Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx
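Purely to illustrate that last point - an utterly untested sketch in
which every name is made up - per-limit failure reporting and the
caller-side flush decision could look something like:

/* Which quota resource tripped the reservation failure. */
enum xfs_quota_fail {
	XFS_QUOTA_FAIL_NONE = 0,
	XFS_QUOTA_FAIL_BLKS,	/* data device blocks */
	XFS_QUOTA_FAIL_INOS,	/* inode count */
	XFS_QUOTA_FAIL_RTBLKS,	/* realtime extents */
};

/*
 * Hypothetical caller-side policy: decide whether a blockgc scan is
 * worth retrying after, based on which resource actually ran out.
 */
static inline bool
xfs_quota_fail_want_blockgc(
	enum xfs_quota_fail	why)
{
	switch (why) {
	case XFS_QUOTA_FAIL_BLKS:
	case XFS_QUOTA_FAIL_RTBLKS:
		/* freeing speculative preallocations might get blocks back */
		return true;
	case XFS_QUOTA_FAIL_INOS:
	default:
		/* flushing data blocks won't free up inodes */
		return false;
	}
}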