On Tue, Sep 29, 2020 at 10:43:38AM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > When xfs_defer_capture extracts the deferred ops and transaction state > from a transaction, it should absorb the remaining block reservation so > that when we continue the dfops chain, we still have those blocks to > use. > > This adds the requirement that every log intent item recovery function > must be careful to reserve enough blocks to handle both itself and all > defer ops that it can queue. On the other hand, this enables us to do > away with the handwaving block estimation nonsense that was going on in > xlog_finish_defer_ops. > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > Reviewed-by: Christoph Hellwig <hch@xxxxxx> > Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx> > --- > fs/xfs/libxfs/xfs_defer.c | 5 +++++ > fs/xfs/libxfs/xfs_defer.h | 1 + > fs/xfs/xfs_log_recover.c | 18 +----------------- > 3 files changed, 7 insertions(+), 17 deletions(-) > > > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c > index 85c371d29e8d..0cceebb390c4 100644 > --- a/fs/xfs/libxfs/xfs_defer.c > +++ b/fs/xfs/libxfs/xfs_defer.c > @@ -575,6 +575,10 @@ xfs_defer_ops_capture( > dfc->dfc_tpflags = tp->t_flags & XFS_TRANS_LOWMODE; > tp->t_flags &= ~XFS_TRANS_LOWMODE; > > + /* Capture the block reservation along with the dfops. */ > + dfc->dfc_blkres = tp->t_blk_res - tp->t_blk_res_used; > + tp->t_blk_res = tp->t_blk_res_used; > + > return dfc; > } > > @@ -632,6 +636,7 @@ xfs_defer_ops_continue( > /* Move captured dfops chain and state to the transaction. */ > list_splice_init(&dfc->dfc_dfops, &tp->t_dfops); > tp->t_flags |= dfc->dfc_tpflags; > + tp->t_blk_res += dfc->dfc_blkres; > > kmem_free(dfc); > } Seems sane, but I'm curious why we need to modify the transactions directly in both of these contexts. Rather than building up and holding a growing block reservation across transactions during intent processing, could we just sample the unused blocks in the transaction at capture time and use that as a resblks parameter when we allocate the transaction to continue the chain? Then we at least have some validation via the traditional allocation path if we ever screw up the accounting.. Brian > diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h > index 3af82ebc1249..b1c7b761afd5 100644 > --- a/fs/xfs/libxfs/xfs_defer.h > +++ b/fs/xfs/libxfs/xfs_defer.h > @@ -75,6 +75,7 @@ struct xfs_defer_capture { > /* Deferred ops state saved from the transaction. */ > struct list_head dfc_dfops; > unsigned int dfc_tpflags; > + unsigned int dfc_blkres; > }; > > /* > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c > index 550d0fa8057a..b06c9881a13d 100644 > --- a/fs/xfs/xfs_log_recover.c > +++ b/fs/xfs/xfs_log_recover.c > @@ -2439,26 +2439,10 @@ xlog_finish_defer_ops( > { > struct xfs_defer_capture *dfc, *next; > struct xfs_trans *tp; > - int64_t freeblks; > - uint64_t resblks; > int error = 0; > > list_for_each_entry_safe(dfc, next, capture_list, dfc_list) { > - /* > - * We're finishing the defer_ops that accumulated as a result > - * of recovering unfinished intent items during log recovery. > - * We reserve an itruncate transaction because it is the > - * largest permanent transaction type. Since we're the only > - * user of the fs right now, take 93% (15/16) of the available > - * free blocks. Use weird math to avoid a 64-bit division. > - */ > - freeblks = percpu_counter_sum(&mp->m_fdblocks); > - if (freeblks <= 0) > - return -ENOSPC; > - > - resblks = min_t(uint64_t, UINT_MAX, freeblks); > - resblks = (resblks * 15) >> 4; > - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, resblks, > + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, > 0, XFS_TRANS_RESERVE, &tp); > if (error) > return error; >