Re: [PATCH RESEND v2 11/18] xfs: extend transaction reservations for parent attributes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2022-08-09 at 10:48 -0700, Darrick J. Wong wrote:
> On Thu, Aug 04, 2022 at 12:40:06PM -0700, Allison Henderson wrote:
> > We need to add, remove or modify parent pointer attributes during
> > create/link/unlink/rename operations atomically with the dirents in
> > the
> > parent directories being modified. This means they need to be
> > modified
> > in the same transaction as the parent directories, and so we need
> > to add
> > the required space for the attribute modifications to the
> > transaction
> > reservations.
> 
> While we're on the topic of log reservations ... Dave and I noticed
> during the 5.19 cycle that xfs_log_calc_max_attrsetm_res has a unit
> conversion problem when it's trying to compute the minimum log size:
> 
> STATIC int
> xfs_log_calc_max_attrsetm_res(
> 	struct xfs_mount	*mp)
> {
> 	int			size;
> 	int			nblks;
> 
> 	size = xfs_attr_leaf_entsize_local_max(mp->m_attr_geo->blksize) 
> -
> 	       MAXNAMELEN - 1;
> 
> Notice here that @size is the maximum amount of space that a local
> format attribute can use in an xattr leaf block.  The computation is
> in
> units of bytes.
> 
> 	nblks = XFS_DAENTER_SPACE_RES(mp, XFS_ATTR_FORK);
> 	nblks += XFS_B_TO_FSB(mp, size);
> 
> ...and here we convert bytes to fs blocks for the block count
> computation...
> 
> 	nblks += XFS_NEXTENTADD_SPACE_RES(mp, size, XFS_ATTR_FORK);
> 
> ...but here we pass the byte count into a macro that takes a block
> count
> as its second parameter and returns the number of bmbt blocks needed
> to
> add that many blocks to an attribute fork.  Oops!
> 
> I would like to fix this incorrect code, but it's never a good idea
> to
> adjust downwards the min log size calculation for existing
> filesystems,
> because this can result in the situation where new mkfs formats a
> filesystem with a small enough log that an old kernel won't mount it.
> 
> Therefore, the corrected logic would have to be gated on whatever
> happens to be the next new ondisk feature.  It's probably too late to
> do
> this for large extent counts, but fixing the calculation would be (I
> think) appropriate for parent pointers, since it's still undergoing
> review and won't be an easy upgrade, which eliminates the legacy
> problem.
> 
> I'll attach the patches that I've written as patches 19 and 20 to
> this
> patchset, if you don't 
Sure, I will keep an eye out for them then.

> 
> 	return  M_RES(mp)->tr_attrsetm.tr_logres +
> 		M_RES(mp)->tr_attrsetrt.tr_logres * nblks;
> }
> 
> 
> > [achender: rebased]
> > 
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > Signed-off-by: Allison Henderson <allison.henderson@xxxxxxxxxx>
> > ---
> >  fs/xfs/libxfs/xfs_trans_resv.c | 105 +++++++++++++++++++++++++++
> > ------
> >  1 file changed, 86 insertions(+), 19 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_trans_resv.c
> > b/fs/xfs/libxfs/xfs_trans_resv.c
> > index e9913c2c5a24..b43ac4be7564 100644
> > --- a/fs/xfs/libxfs/xfs_trans_resv.c
> > +++ b/fs/xfs/libxfs/xfs_trans_resv.c
> > @@ -909,24 +909,67 @@ xfs_calc_sb_reservation(
> >  	return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize);
> >  }
> >  
> > -void
> > -xfs_trans_resv_calc(
> > -	struct xfs_mount	*mp,
> > -	struct xfs_trans_resv	*resp)
> > +STATIC void
> > +xfs_calc_parent_ptr_reservations(
> > +	struct xfs_mount     *mp)
> >  {
> > -	int			logcount_adj = 0;
> > +	struct xfs_trans_resv   *resp = M_RES(mp);
> >  
> > -	/*
> > -	 * The following transactions are logged in physical format and
> > -	 * require a permanent reservation on space.
> > -	 */
> > -	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp,
> > false);
> > -	resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
> > -	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> > +	/* Calculate extra space needed for parent pointer attributes
> > */
> 
> This might be better expressed as a comment just prior to the
> function
> declaration above.
Alrighty, will move upwards

> 
> > +	if (!xfs_has_parent(mp))
> > +		return;
> >  
> > -	resp->tr_itruncate.tr_logres =
> > xfs_calc_itruncate_reservation(mp, false);
> > -	resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
> > -	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> > +	/* rename can add/remove/modify 4 parent attributes */
> > +	resp->tr_rename.tr_logres += 4 * max(resp-
> > >tr_attrsetm.tr_logres,
> > +					 resp->tr_attrrm.tr_logres);
> 
> Why does the per-transaction reservation increase by 4x the amount of
> space needed to set (or delete) an xattr?  The pptr patchset now uses
> logged xattrs, which means that each xattr update needed to commit
> the
> rename operation will happen in a separate transaction.  IOWs, each
> transaction in the chain does not have to handle *every* update that
> must be made during the entire chain, it only has to handle one step
> of
> the full process.
Oh, I think initially this might have been pre-larp code, so probably
we  can just drop it now

> 
> Doesn't that mean that the size of tr_rename.tr_logres only needs to
> increase by the amount of space needed to log the four(?) xattr items
> to
> the first transaction in the chain?  AFAICT, it also can't be smaller
> than max(resp->tr_attrsetm.tr_logres, resp->tr_attrrm.tr_logres);
I think so, probably we can just leave it as max

> 
> (I'm also not sure why four -- the patch for xfs_rename only creates
> three xfs_parent_defer objects.)
Hrmm, well initially I think it was for 4 inodes, but then we
remembered wip in the last review, so now it's 5 right? That's why we
expanded XFS_DEFER_OPS_NR_INODES in patch 1.  So probably the 4 inodes
needs to turn into a 5 in this patch too.


> 
> I also think that adjusting tr_rename to account for parent pointers
> is
> something that should be done in xfs_calc_rename_reservation, not a
> separate function:
> 
> /*
>  * In renaming a files we can modify (t1):
>  *    the four inodes involved: 4 * inode size
5?

>  *    the two directory btrees: 2 * (max depth + v2) * dir block size
>  *    the two directory bmap btrees: 2 * max depth * block size
>  * And the bmap_finish transaction can free dir and bmap blocks (two
> sets
>  *	of bmap blocks) giving (t2):
>  *    the agf for the ags in which the blocks live: 3 * sector size
>  *    the agfl for the ags in which the blocks live: 3 * sector size
>  *    the superblock for the free block count: sector size
>  *    the allocation btrees: 3 exts * 2 trees * (2 * max depth - 1) *
> block size
>  * If parent pointers are enabled (t3), then each transaction in the
> chain
>  *    must be capable of setting or removing the extended attribute
>  *    containing the parent information.  It must also be able to
> handle
>  *    the three xattr intent items that track the progress of the
> parent
>  *    pointer update.
>  */
> STATIC uint
> xfs_calc_rename_reservation(
> 	struct xfs_mount	*mp)
> {
> 	unsigned int		overhead = XFS_DQUOT_LOGRES(mp);
> 	unsigned int		t1, t2, t3 = 0;
> 
> 	t1 = xfs_calc_inode_res(mp, 4) +
and then the same 5 goes in here too...

> 	     xfs_calc_buf_res(2 * XFS_DIROP_LOG_COUNT(mp),
> 			XFS_FSB_TO_B(mp, 1));
> 
> 	t2 = xfs_calc_buf_res(7, mp->m_sb.sb_sectsize) +
> 	     xfs_calc_buf_res(xfs_allocfree_block_count(mp, 3),
> 			XFS_FSB_TO_B(mp, 1))));
> 
> 	if (xfs_has_parent(mp)) {
> 		t3 = max(resp->tr_attrsetm.tr_logres,
> 				resp->tr_attrrm.tr_logres);
> 		overhead += 3 * (size of a pptr xattr intent item);
> 	}
> 
> 	return overhead + max3(t1, t2, t3);
> }
> 
> > +	resp->tr_rename.tr_logcount += 4 * max(resp-
> > >tr_attrsetm.tr_logcount,
> > +					   resp-
> > >tr_attrrm.tr_logcount);
> 
> Looks correct, module the 4 vs. 3 thing.
> 
I think that looks right?  5 inodes 3 pptrs?


> > +
> > +	/* create will add 1 parent attribute */
> > +	resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres;
> > +	resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount;
> > +
> > +	/* mkdir will add 1 parent attribute */
> > +	resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres;
> > +	resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount;
> > +
> > +	/* link will add 1 parent attribute */
> > +	resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres;
> > +	resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount;
> > +
> > +	/* symlink will add 1 parent attribute */
> > +	resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres;
> > +	resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount;
> > +
> > +	/* remove will remove 1 parent attribute */
> > +	resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres;
> > +	resp->tr_remove.tr_logcount += resp->tr_attrrm.tr_logcount;
> > +}
> > +
> > +/*
> > + * Namespace reservations.
> > + *
> > + * These get tricky when parent pointers are enabled as we have
> > attribute
> > + * modifications occurring from within these transactions. Rather
> > than confuse
> > + * each of these reservation calculations with the conditional
> > attribute
> > + * reservations, add them here in a clear and concise manner. This
> > assumes that
> > + * the attribute reservations have already been calculated.
> > + *
> > + * Note that we only include the static attribute reservation
> > here; the runtime
> > + * reservation will have to be modified by the size of the
> > attributes being
> > + * added/removed/modified. See the comments on the attribute
> > reservation
> > + * calculations for more details.
> > + *
> > + * Note for rename: rename will vastly overestimate requirements.
> > This will be
> > + * addressed later when modifications are made to ensure parent
> > attribute
> 
> Later?  I took a look at the rename patch, and it looks like we're
> using
> logged xattrs from the start.
I think it's a stale comment.  Will remove.  Thanks for the reviews!

Allison

> 
> --D
> 
> > + * modifications can be done atomically with the rename operation.
> > + */
> > +STATIC void
> > +xfs_calc_namespace_reservations(
> > +	struct xfs_mount	*mp,
> > +	struct xfs_trans_resv	*resp)
> > +{
> > +	ASSERT(resp->tr_attrsetm.tr_logres > 0);
> >  
> >  	resp->tr_rename.tr_logres = xfs_calc_rename_reservation(mp);
> >  	resp->tr_rename.tr_logcount = XFS_RENAME_LOG_COUNT;
> > @@ -948,15 +991,37 @@ xfs_trans_resv_calc(
> >  	resp->tr_create.tr_logcount = XFS_CREATE_LOG_COUNT;
> >  	resp->tr_create.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >  
> > +	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
> > +	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
> > +	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> > +
> > +	xfs_calc_parent_ptr_reservations(mp);
> > +}
> > +
> > +void
> > +xfs_trans_resv_calc(
> > +	struct xfs_mount	*mp,
> > +	struct xfs_trans_resv	*resp)
> > +{
> > +	int			logcount_adj = 0;
> > +
> > +	/*
> > +	 * The following transactions are logged in physical format and
> > +	 * require a permanent reservation on space.
> > +	 */
> > +	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp,
> > false);
> > +	resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
> > +	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> > +
> > +	resp->tr_itruncate.tr_logres =
> > xfs_calc_itruncate_reservation(mp, false);
> > +	resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
> > +	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> > +
> >  	resp->tr_create_tmpfile.tr_logres =
> >  			xfs_calc_create_tmpfile_reservation(mp);
> >  	resp->tr_create_tmpfile.tr_logcount =
> > XFS_CREATE_TMPFILE_LOG_COUNT;
> >  	resp->tr_create_tmpfile.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >  
> > -	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
> > -	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
> > -	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> > -
> >  	resp->tr_ifree.tr_logres = xfs_calc_ifree_reservation(mp);
> >  	resp->tr_ifree.tr_logcount = XFS_INACTIVE_LOG_COUNT;
> >  	resp->tr_ifree.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> > @@ -986,6 +1051,8 @@ xfs_trans_resv_calc(
> >  	resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT;
> >  	resp->tr_qm_dqalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >  
> > +	xfs_calc_namespace_reservations(mp, resp);
> > +
> >  	/*
> >  	 * The following transactions are logged in logical format with
> >  	 * a default log count.
> > -- 
> > 2.25.1
> > 




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux