On Thu, Aug 04, 2022 at 12:40:06PM -0700, Allison Henderson wrote: > We need to add, remove or modify parent pointer attributes during > create/link/unlink/rename operations atomically with the dirents in the > parent directories being modified. This means they need to be modified > in the same transaction as the parent directories, and so we need to add > the required space for the attribute modifications to the transaction > reservations. While we're on the topic of log reservations ... Dave and I noticed during the 5.19 cycle that xfs_log_calc_max_attrsetm_res has a unit conversion problem when it's trying to compute the minimum log size: STATIC int xfs_log_calc_max_attrsetm_res( struct xfs_mount *mp) { int size; int nblks; size = xfs_attr_leaf_entsize_local_max(mp->m_attr_geo->blksize) - MAXNAMELEN - 1; Notice here that @size is the maximum amount of space that a local format attribute can use in an xattr leaf block. The computation is in units of bytes. nblks = XFS_DAENTER_SPACE_RES(mp, XFS_ATTR_FORK); nblks += XFS_B_TO_FSB(mp, size); ...and here we convert bytes to fs blocks for the block count computation... nblks += XFS_NEXTENTADD_SPACE_RES(mp, size, XFS_ATTR_FORK); ...but here we pass the byte count into a macro that takes a block count as its second parameter and returns the number of bmbt blocks needed to add that many blocks to an attribute fork. Oops! I would like to fix this incorrect code, but it's never a good idea to adjust downwards the min log size calculation for existing filesystems, because this can result in the situation where new mkfs formats a filesystem with a small enough log that an old kernel won't mount it. Therefore, the corrected logic would have to be gated on whatever happens to be the next new ondisk feature. It's probably too late to do this for large extent counts, but fixing the calculation would be (I think) appropriate for parent pointers, since it's still undergoing review and won't be an easy upgrade, which eliminates the legacy problem. I'll attach the patches that I've written as patches 19 and 20 to this patchset, if you don't mind having a look and adding them? return M_RES(mp)->tr_attrsetm.tr_logres + M_RES(mp)->tr_attrsetrt.tr_logres * nblks; } > [achender: rebased] > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > Signed-off-by: Allison Henderson <allison.henderson@xxxxxxxxxx> > --- > fs/xfs/libxfs/xfs_trans_resv.c | 105 +++++++++++++++++++++++++++------ > 1 file changed, 86 insertions(+), 19 deletions(-) > > diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c > index e9913c2c5a24..b43ac4be7564 100644 > --- a/fs/xfs/libxfs/xfs_trans_resv.c > +++ b/fs/xfs/libxfs/xfs_trans_resv.c > @@ -909,24 +909,67 @@ xfs_calc_sb_reservation( > return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize); > } > > -void > -xfs_trans_resv_calc( > - struct xfs_mount *mp, > - struct xfs_trans_resv *resp) > +STATIC void > +xfs_calc_parent_ptr_reservations( > + struct xfs_mount *mp) > { > - int logcount_adj = 0; > + struct xfs_trans_resv *resp = M_RES(mp); > > - /* > - * The following transactions are logged in physical format and > - * require a permanent reservation on space. > - */ > - resp->tr_write.tr_logres = xfs_calc_write_reservation(mp, false); > - resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT; > - resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > + /* Calculate extra space needed for parent pointer attributes */ This might be better expressed as a comment just prior to the function declaration above. > + if (!xfs_has_parent(mp)) > + return; > > - resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp, false); > - resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT; > - resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > + /* rename can add/remove/modify 4 parent attributes */ > + resp->tr_rename.tr_logres += 4 * max(resp->tr_attrsetm.tr_logres, > + resp->tr_attrrm.tr_logres); Why does the per-transaction reservation increase by 4x the amount of space needed to set (or delete) an xattr? The pptr patchset now uses logged xattrs, which means that each xattr update needed to commit the rename operation will happen in a separate transaction. IOWs, each transaction in the chain does not have to handle *every* update that must be made during the entire chain, it only has to handle one step of the full process. Doesn't that mean that the size of tr_rename.tr_logres only needs to increase by the amount of space needed to log the four(?) xattr items to the first transaction in the chain? AFAICT, it also can't be smaller than max(resp->tr_attrsetm.tr_logres, resp->tr_attrrm.tr_logres); (I'm also not sure why four -- the patch for xfs_rename only creates three xfs_parent_defer objects.) I also think that adjusting tr_rename to account for parent pointers is something that should be done in xfs_calc_rename_reservation, not a separate function: /* * In renaming a files we can modify (t1): * the four inodes involved: 4 * inode size * the two directory btrees: 2 * (max depth + v2) * dir block size * the two directory bmap btrees: 2 * max depth * block size * And the bmap_finish transaction can free dir and bmap blocks (two sets * of bmap blocks) giving (t2): * the agf for the ags in which the blocks live: 3 * sector size * the agfl for the ags in which the blocks live: 3 * sector size * the superblock for the free block count: sector size * the allocation btrees: 3 exts * 2 trees * (2 * max depth - 1) * block size * If parent pointers are enabled (t3), then each transaction in the chain * must be capable of setting or removing the extended attribute * containing the parent information. It must also be able to handle * the three xattr intent items that track the progress of the parent * pointer update. */ STATIC uint xfs_calc_rename_reservation( struct xfs_mount *mp) { unsigned int overhead = XFS_DQUOT_LOGRES(mp); unsigned int t1, t2, t3 = 0; t1 = xfs_calc_inode_res(mp, 4) + xfs_calc_buf_res(2 * XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1)); t2 = xfs_calc_buf_res(7, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 3), XFS_FSB_TO_B(mp, 1)))); if (xfs_has_parent(mp)) { t3 = max(resp->tr_attrsetm.tr_logres, resp->tr_attrrm.tr_logres); overhead += 3 * (size of a pptr xattr intent item); } return overhead + max3(t1, t2, t3); } > + resp->tr_rename.tr_logcount += 4 * max(resp->tr_attrsetm.tr_logcount, > + resp->tr_attrrm.tr_logcount); Looks correct, module the 4 vs. 3 thing. > + > + /* create will add 1 parent attribute */ > + resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres; > + resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount; > + > + /* mkdir will add 1 parent attribute */ > + resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres; > + resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount; > + > + /* link will add 1 parent attribute */ > + resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres; > + resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount; > + > + /* symlink will add 1 parent attribute */ > + resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres; > + resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount; > + > + /* remove will remove 1 parent attribute */ > + resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres; > + resp->tr_remove.tr_logcount += resp->tr_attrrm.tr_logcount; > +} > + > +/* > + * Namespace reservations. > + * > + * These get tricky when parent pointers are enabled as we have attribute > + * modifications occurring from within these transactions. Rather than confuse > + * each of these reservation calculations with the conditional attribute > + * reservations, add them here in a clear and concise manner. This assumes that > + * the attribute reservations have already been calculated. > + * > + * Note that we only include the static attribute reservation here; the runtime > + * reservation will have to be modified by the size of the attributes being > + * added/removed/modified. See the comments on the attribute reservation > + * calculations for more details. > + * > + * Note for rename: rename will vastly overestimate requirements. This will be > + * addressed later when modifications are made to ensure parent attribute Later? I took a look at the rename patch, and it looks like we're using logged xattrs from the start. --D > + * modifications can be done atomically with the rename operation. > + */ > +STATIC void > +xfs_calc_namespace_reservations( > + struct xfs_mount *mp, > + struct xfs_trans_resv *resp) > +{ > + ASSERT(resp->tr_attrsetm.tr_logres > 0); > > resp->tr_rename.tr_logres = xfs_calc_rename_reservation(mp); > resp->tr_rename.tr_logcount = XFS_RENAME_LOG_COUNT; > @@ -948,15 +991,37 @@ xfs_trans_resv_calc( > resp->tr_create.tr_logcount = XFS_CREATE_LOG_COUNT; > resp->tr_create.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > > + resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp); > + resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT; > + resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > + > + xfs_calc_parent_ptr_reservations(mp); > +} > + > +void > +xfs_trans_resv_calc( > + struct xfs_mount *mp, > + struct xfs_trans_resv *resp) > +{ > + int logcount_adj = 0; > + > + /* > + * The following transactions are logged in physical format and > + * require a permanent reservation on space. > + */ > + resp->tr_write.tr_logres = xfs_calc_write_reservation(mp, false); > + resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT; > + resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > + > + resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp, false); > + resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT; > + resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > + > resp->tr_create_tmpfile.tr_logres = > xfs_calc_create_tmpfile_reservation(mp); > resp->tr_create_tmpfile.tr_logcount = XFS_CREATE_TMPFILE_LOG_COUNT; > resp->tr_create_tmpfile.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > > - resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp); > - resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT; > - resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > - > resp->tr_ifree.tr_logres = xfs_calc_ifree_reservation(mp); > resp->tr_ifree.tr_logcount = XFS_INACTIVE_LOG_COUNT; > resp->tr_ifree.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > @@ -986,6 +1051,8 @@ xfs_trans_resv_calc( > resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT; > resp->tr_qm_dqalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES; > > + xfs_calc_namespace_reservations(mp, resp); > + > /* > * The following transactions are logged in logical format with > * a default log count. > -- > 2.25.1 >