On Mon, Jan 04, 2021 at 04:01:10PM +0530, Chandan Babu R wrote: > Directory entry addition can cause the following, > 1. Data block can be added/removed. > A new extent can cause extent count to increase by 1. > 2. Free disk block can be added/removed. > Same behaviour as described above for Data block. > 3. Dabtree blocks. > XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these > can be new extents. Hence extent count can increase by > XFS_DA_NODE_MAXDEPTH. > > Directory entry remove and rename (applicable only to the source > directory entry) operations are handled specially to allow them to > succeed in low extent count availability scenarios > i.e. xfs_bmap_del_extent_real() will now return -ENOSPC when a possible > extent count overflow is detected. -ENOSPC is already handled by higher > layers of XFS by letting, > 1. Empty Data/Free space index blocks to linger around until a future > remove operation frees them. > 2. Dabtree blocks would be swapped with the last block in the leaf space > followed by unmapping of the new last block. > > Also, Extent overflow check is performed for the target directory entry > of the rename operation only when the entry does not exist and a > non-zero space reservation is obtained successfully. > > Signed-off-by: Chandan Babu R <chandanrlinux@xxxxxxxxx> > --- > fs/xfs/libxfs/xfs_bmap.c | 15 ++++++++++++ > fs/xfs/libxfs/xfs_inode_fork.h | 13 ++++++++++ > fs/xfs/xfs_inode.c | 45 ++++++++++++++++++++++++++++++++++ > fs/xfs/xfs_symlink.c | 5 ++++ > 4 files changed, 78 insertions(+) > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c > index 32aeacf6f055..5fd804534e67 100644 > --- a/fs/xfs/libxfs/xfs_bmap.c > +++ b/fs/xfs/libxfs/xfs_bmap.c > @@ -5151,6 +5151,21 @@ xfs_bmap_del_extent_real( > /* > * Deleting the middle of the extent. > */ > + > + /* > + * For directories, -ENOSPC will be handled by higher layers of > + * XFS by letting the corresponding empty Data/Free blocks to > + * linger around until a future remove operation. Dabtree blocks > + * would be swapped with the last block in the leaf space and > + * then the new last block will be unmapped. > + */ > + if (S_ISDIR(VFS_I(ip)->i_mode) && > + whichfork == XFS_DATA_FORK && > + xfs_iext_count_may_overflow(ip, whichfork, 1)) { > + error = -ENOSPC; > + goto done; Hmm... it strikes me as a little odd that we're checking file mode and fork type in the middle of the bmap code. However, I think it's the case that the only place where anyone would punch a hole in the /middle/ of an extent is xattr trees and regular files, right? And both of those cases are checked before we end up in the bmap code, right? So we only really need this check to prevent extent count overflows when removing dirents from directories, like the comment says, and only because directories don't have a hard requirement that the bunmapi succeeds. And I think this logic covers xfs_remove too? That's a bit subtle, but as there's no extent count check in that function, there's not much to attach a comment to... :) Hm. I think I'd like xfs_rename to get a brief comment that we're protected from extent count overflows in xfs_remove() by virtue of this "leave the dir block in place if we ENOSPC" capability: /* * NOTE: We don't need to check for extent overflows here * because the dir removename code will leave the dir block * in place if the extent count would overflow. */ error = xfs_dir_removename(...); Do xattr trees also have the same ability? I think they do, at least for the dabtree part...? I think I would've split this patch into three pieces: - create, link, and symlink in one patch (adding dirents), - the xfs_bmap_del_extent_real change and a comment for xfs_remove (removing dirents) - all the xfs_rename changes (adding and removing dirents) Though I dunno, this series is already 14 patches, and the part that I care most about is not leaving that subtlety in xfs_remove(). :) Other than that, I follow the logic in this patch and will give it a testrun tonight. --D > + } > + > old = got; > > got.br_blockcount = del->br_startoff - got.br_startoff; > diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h > index bcac769a7df6..ea1a9dd8a763 100644 > --- a/fs/xfs/libxfs/xfs_inode_fork.h > +++ b/fs/xfs/libxfs/xfs_inode_fork.h > @@ -47,6 +47,19 @@ struct xfs_ifork { > */ > #define XFS_IEXT_PUNCH_HOLE_CNT (1) > > +/* > + * Directory entry addition can cause the following, > + * 1. Data block can be added/removed. > + * A new extent can cause extent count to increase by 1. > + * 2. Free disk block can be added/removed. > + * Same behaviour as described above for Data block. > + * 3. Dabtree blocks. > + * XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these can be new > + * extents. Hence extent count can increase by XFS_DA_NODE_MAXDEPTH. > + */ > +#define XFS_IEXT_DIR_MANIP_CNT(mp) \ > + ((XFS_DA_NODE_MAXDEPTH + 1 + 1) * (mp)->m_dir_geo->fsbcount) > + > /* > * Fork handling. > */ > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c > index b7352bc4c815..0db21368c7e1 100644 > --- a/fs/xfs/xfs_inode.c > +++ b/fs/xfs/xfs_inode.c > @@ -1042,6 +1042,11 @@ xfs_create( > if (error) > goto out_trans_cancel; > > + error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK, > + XFS_IEXT_DIR_MANIP_CNT(mp)); > + if (error) > + goto out_trans_cancel; > + > /* > * A newly created regular or special file just has one directory > * entry pointing to them, but a directory also the "." entry > @@ -1258,6 +1263,11 @@ xfs_link( > xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL); > xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL); > > + error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK, > + XFS_IEXT_DIR_MANIP_CNT(mp)); > + if (error) > + goto error_return; > + > /* > * If we are using project inheritance, we only allow hard link > * creation in our tree when the project IDs are the same; else > @@ -3106,6 +3116,35 @@ xfs_rename( > /* > * Check for expected errors before we dirty the transaction > * so we can return an error without a transaction abort. > + * > + * Extent count overflow check: > + * > + * From the perspective of src_dp, a rename operation is essentially a > + * directory entry remove operation. Hence the only place where we check > + * for extent count overflow for src_dp is in > + * xfs_bmap_del_extent_real(). xfs_bmap_del_extent_real() returns > + * -ENOSPC when it detects a possible extent count overflow and in > + * response, the higher layers of directory handling code do the > + * following: > + * 1. Data/Free blocks: XFS lets these blocks linger around until a > + * future remove operation removes them. > + * 2. Dabtree blocks: XFS swaps the blocks with the last block in the > + * Leaf space and unmaps the last block. > + * > + * For target_dp, there are two cases depending on whether the > + * destination directory entry exists or not. > + * > + * When destination directory entry does not exist (i.e. target_ip == > + * NULL), extent count overflow check is performed only when transaction > + * has a non-zero sized space reservation associated with it. With a > + * zero-sized space reservation, XFS allows a rename operation to > + * continue only when the directory has sufficient free space in its > + * data/leaf/free space blocks to hold the new entry. > + * > + * When destination directory entry exists (i.e. target_ip != NULL), all > + * we need to do is change the inode number associated with the already > + * existing entry. Hence there is no need to perform an extent count > + * overflow check. > */ > if (target_ip == NULL) { > /* > @@ -3116,6 +3155,12 @@ xfs_rename( > error = xfs_dir_canenter(tp, target_dp, target_name); > if (error) > goto out_trans_cancel; > + } else { > + error = xfs_iext_count_may_overflow(target_dp, > + XFS_DATA_FORK, > + XFS_IEXT_DIR_MANIP_CNT(mp)); > + if (error) > + goto out_trans_cancel; > } > } else { > /* > diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c > index 1f43fd7f3209..0b8136a32484 100644 > --- a/fs/xfs/xfs_symlink.c > +++ b/fs/xfs/xfs_symlink.c > @@ -220,6 +220,11 @@ xfs_symlink( > if (error) > goto out_trans_cancel; > > + error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK, > + XFS_IEXT_DIR_MANIP_CNT(mp)); > + if (error) > + goto out_trans_cancel; > + > /* > * Allocate an inode for the symlink. > */ > -- > 2.29.2 >