On Tue, Oct 23, 2018 at 12:18:08PM +0530, Chandan Rajendra wrote: > generic/305 fails on a 64k block sized filesystem due to the following > interaction, > > 1. We are writing 8 blocks (i.e. [0, 512k-1]) of data to a 1 MiB file. > 2. XFS reserves 32 blocks of space in the CoW fork. > xfs_bmap_extsize_align() calculates XFS_DEFAULT_COWEXTSZ_HINT (32 > blocks) as the number of blocks to be reserved. > 3. The reserved space in the range [1M(i.e. i_size), 1M + 16 > blocks] is freed by __fput(). This corresponds to freeing "eof > blocks" i.e. space reserved beyond EOF of a file. > > The reserved space to which data was never written i.e. [9th block, > 1M(EOF)], remains reserved in the CoW fork until either the CoW block > reservation trimming worker gets invoked or the filesystem is > unmounted. > > This commit fixes the issue by freeing unused CoW block reservations > whenever quota numbers are requested by userspace application. > > Signed-off-by: Chandan Rajendra <chandan@xxxxxxxxxxxxxxxxxx> > --- > > PS: With the above patch, the tests xfs/214 & xfs/440 fail because the > value passed to xfs_io's cowextsize does not have any effect when CoW > fork reservations are flushed before querying for quota usage numbers. Hmmm. I restarted looking into all the weird quota count mismatches in xfstests and noticed (with a generous amount of trace_printks) that most of the discrepancies can be traced to speculative preallocations in the cow fork that don't get cleaned out. So we're on the same page. :) I thought about enhancing the XFS_IOC_FREE_EOFBLOCKS ioctl with a new mode to clean out CoW stuff too, but then I started thinking about what _check_quota_usage is actually looking for, and realized that (for xfs anyway) it compares an aged quota report (reflective of thousands of individual fs ops) against a freshly quotacheck'd quota report to look for accounting leaks. Then I tried replacing the $XFS_SPACEMAN_PROG -c 'prealloc -s' call in _check_quota_usage with a umount/mount cycle so that we know we've cleaned out all the reservations and *poof* the discrepancies all went away. The test is still useful since we're comparing the accumulated quota counts against freshly computed counts, but now we know that we've cleaned out any speculative preallocations that xfs might have decided to try (assuming xfs never changes behavior to speculate on a fresh mount). It's awfully tempting to just leave it that way... but what do you think? I think it's a better solution than forcing /every/ quota report to iterate the in-core inodes looking for cow blocks to dump. Granted maybe we still want the ioctl to do it for us? Though that could get tricky since written extents in the cow fork represent writes in progress and can't ever be removed except by xfs_inactive. --D > fs/xfs/xfs_quotaops.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/fs/xfs/xfs_quotaops.c b/fs/xfs/xfs_quotaops.c > index a7c0c65..9236a38 100644 > --- a/fs/xfs/xfs_quotaops.c > +++ b/fs/xfs/xfs_quotaops.c > @@ -218,14 +218,21 @@ xfs_fs_get_dqblk( > struct kqid qid, > struct qc_dqblk *qdq) > { > + int ret; > struct xfs_mount *mp = XFS_M(sb); > xfs_dqid_t id; > + struct xfs_eofblocks eofb = { 0 }; > > if (!XFS_IS_QUOTA_RUNNING(mp)) > return -ENOSYS; > if (!XFS_IS_QUOTA_ON(mp)) > return -ESRCH; > > + eofb.eof_flags = XFS_EOF_FLAGS_SYNC; > + ret = xfs_icache_free_cowblocks(mp, &eofb); > + if (ret) > + return ret; > + > id = from_kqid(&init_user_ns, qid); > return xfs_qm_scall_getquota(mp, id, xfs_quota_type(qid.type), qdq); > } > @@ -240,12 +247,18 @@ xfs_fs_get_nextdqblk( > int ret; > struct xfs_mount *mp = XFS_M(sb); > xfs_dqid_t id; > + struct xfs_eofblocks eofb = { 0 }; > > if (!XFS_IS_QUOTA_RUNNING(mp)) > return -ENOSYS; > if (!XFS_IS_QUOTA_ON(mp)) > return -ESRCH; > > + eofb.eof_flags = XFS_EOF_FLAGS_SYNC; > + ret = xfs_icache_free_cowblocks(mp, &eofb); > + if (ret) > + return ret; > + > id = from_kqid(&init_user_ns, *qid); > ret = xfs_qm_scall_getquota_next(mp, &id, xfs_quota_type(qid->type), > qdq); > -- > 2.9.5 >