On 04 Mar 2022 at 13:21, Dave Chinner wrote: > On Tue, Mar 01, 2022 at 04:09:35PM +0530, Chandan Babu R wrote: >> This commit upgrades inodes to use 64-bit extent counters when they are read >> from disk. Inodes are upgraded only when the filesystem instance has >> XFS_SB_FEAT_INCOMPAT_NREXT64 incompat flag set. >> >> Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> >> Signed-off-by: Chandan Babu R <chandan.babu@xxxxxxxxxx> >> --- >> fs/xfs/libxfs/xfs_attr.c | 3 ++- >> fs/xfs/libxfs/xfs_bmap.c | 5 ++--- >> fs/xfs/libxfs/xfs_inode_fork.c | 37 ++++++++++++++++++++++++++++++++++ >> fs/xfs/libxfs/xfs_inode_fork.h | 2 ++ >> fs/xfs/xfs_bmap_item.c | 3 ++- >> fs/xfs/xfs_bmap_util.c | 10 ++++----- >> fs/xfs/xfs_dquot.c | 2 +- >> fs/xfs/xfs_iomap.c | 5 +++-- >> fs/xfs/xfs_reflink.c | 5 +++-- >> fs/xfs/xfs_rtalloc.c | 2 +- >> 10 files changed, 58 insertions(+), 16 deletions(-) >> >> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c >> index 23523b802539..03a358930d74 100644 >> --- a/fs/xfs/libxfs/xfs_attr.c >> +++ b/fs/xfs/libxfs/xfs_attr.c >> @@ -774,7 +774,8 @@ xfs_attr_set( >> return error; >> >> if (args->value || xfs_inode_hasattr(dp)) { >> - error = xfs_iext_count_may_overflow(dp, XFS_ATTR_FORK, >> + error = xfs_trans_inode_ensure_nextents(&args->trans, dp, >> + XFS_ATTR_FORK, >> XFS_IEXT_ATTR_MANIP_CNT(rmt_blks)); > > hmmmm. > >> if (error) >> goto out_trans_cancel; >> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c >> index be7f8ebe3cd5..3a3c99ef7f13 100644 >> --- a/fs/xfs/libxfs/xfs_bmap.c >> +++ b/fs/xfs/libxfs/xfs_bmap.c >> @@ -4523,14 +4523,13 @@ xfs_bmapi_convert_delalloc( >> return error; >> >> xfs_ilock(ip, XFS_ILOCK_EXCL); >> + xfs_trans_ijoin(tp, ip, 0); >> >> - error = xfs_iext_count_may_overflow(ip, whichfork, >> + error = xfs_trans_inode_ensure_nextents(&tp, ip, whichfork, >> XFS_IEXT_ADD_NOSPLIT_CNT); >> if (error) >> goto out_trans_cancel; >> >> - xfs_trans_ijoin(tp, ip, 0); >> - >> if (!xfs_iext_lookup_extent(ip, ifp, offset_fsb, &bma.icur, &bma.got) || >> bma.got.br_startoff > offset_fsb) { >> /* >> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c >> index a3a3b54f9c55..d1d065abeac3 100644 >> --- a/fs/xfs/libxfs/xfs_inode_fork.c >> +++ b/fs/xfs/libxfs/xfs_inode_fork.c >> @@ -757,3 +757,40 @@ xfs_iext_count_may_overflow( >> >> return 0; >> } >> + >> +/* >> + * Ensure that the inode has the ability to add the specified number of >> + * extents. Caller must hold ILOCK_EXCL and have joined the inode to >> + * the transaction. Upon return, the inode will still be in this state >> + * upon return and the transaction will be clean. >> + */ >> +int >> +xfs_trans_inode_ensure_nextents( >> + struct xfs_trans **tpp, >> + struct xfs_inode *ip, >> + int whichfork, >> + int nr_to_add) > > Ok, xfs_trans_inode* is a namespace that belongs to > fs/xfs/xfs_trans_inode.c, not fs/xfs/libxfs/xfs_inode_fork.c. So my > second observation is that the function needs either be renamed or > moved. > > My first observation was that the function name didn't really make > any sense to me when read in context. xfs_iext_count_may_overflow() > makes sense because it's telling me that it's checking that the > extent count hasn't overflowed. xfs_trans_inode_ensure_nextents() > conveys none of that certainty. > > What does it ensure? "ensure" doesn't imply we are goign to change > anything - it could just mean "check and abort if wrong" when read > as "ensure we haven't overflowed". And if we already have nrext64 > and we've overflowed that then it will still fail, meaning we > haven't "ensured" anything. > > This would make much more sense if written as: > > error = xfs_iext_count_may_overflow(); > if (error && error != -EOVERFLOW) > goto out_trans_cancel; > > if (error == -EOVERFLOW) { > error = xfs_inode_upgrade_extent_counts(); > if (error) > goto out_trans_cancel; > } > > Because it splits the logic into a "do we need to do something" > part and a "do an explicit modification" part. > Ok. The above logic is much better than xfs_trans_inode_ensure_nextents(). Also, I will define xfs_inode_upgrade_extent_counts() in libxfs/xfs_inode_fork.c since the function is supposed to operate on inode extent counts. >> +{ >> + int error; >> + >> + error = xfs_iext_count_may_overflow(ip, whichfork, nr_to_add); >> + if (!error) >> + return 0; >> + >> + /* >> + * Try to upgrade if the extent count fields aren't large >> + * enough. >> + */ >> + if (!xfs_has_nrext64(ip->i_mount) || >> + (ip->i_diflags2 & XFS_DIFLAG2_NREXT64)) >> + return error; > > Oh, that's tricky, too. The first check returns if there's no error, > the second check returns the error of the first function. Keeping > the initial overflow check in the caller gets rid of this, too. > >> + >> + ip->i_diflags2 |= XFS_DIFLAG2_NREXT64; >> + xfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE); >> + >> + error = xfs_trans_roll(tpp); >> + if (error) >> + return error; > > Why does this need to roll the transaction? We can just log the > inode core and return to the caller which will then commit the > change. Transaction was rolled in order to make sure that we don't overflow log reservations (computed in libxfs/xfs_trans_resv.c). But now I see that any transaction which causes inode's extent count to change would have considered the space required to log an inode in its reservation calculation. Hence, I will remove the above call to xfs_trans_roll(). >> + return xfs_iext_count_may_overflow(ip, whichfork, nr_to_add); > > If the answer is so we don't cancel a dirty transaction here, then > I think this check needs to be more explicit - don't even try to do > the upgrade if the number of extents we are adding will cause an > overflow anyway. > > As it is, wouldn't adding 2^47 - 2^31 extents in a single hit be > indicative of a bug? We can only modify the extent count by a > handful of extents (10, maybe 20?) at most in a single transaction, > so why do we even need this check? Yes, the above call to xfs_iext_count_may_overflow() is not correct. The value of nr_to_add has to be larger than 2^17 (2^32 - 2^15 for attr fork and 2^48 - 2^31 for data fork) for extent count to overflow. Hence, I will remove this call to xfs_iext_count_may_overflow(). -- chandan