On Mon, May 04, 2020 at 06:13:39PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
>
> When we replay unfinished intent items that have been recovered from the
> log, it's possible that the replay will cause the creation of more
> deferred work items.  As outlined in commit 509955823cc9c ("xfs: log
> recovery should replay deferred ops in order"), later work items have an
> implicit ordering dependency on earlier work items.  Therefore, recovery
> must replay the items (both recovered and created) in the same order
> that they would have been during normal operation.
>
> For log recovery, we enforce this ordering by using an empty transaction
> to collect deferred ops that get created in the process of recovering a
> log intent item to prevent them from being committed before the rest of
> the recovered intent items.  After we finish committing all the
> recovered log items, we allocate a transaction with an enormous block
> reservation, splice our huge list of created deferred ops into that
> transaction, and commit it, thereby finishing all those ops.
>
> This is /really/ hokey -- it's the one place in XFS where we allow
> nested transactions; the splicing of the defer ops list is is inelegant
> and has to be done twice per recovery function; and the broken way we
> handle inode pointers and block reservations cause subtle use-after-free
> and allocator problems that will be fixed by this patch and the two
> patches after it.
>
> Therefore, replace the hokey empty transaction with a structure designed
> to capture each chain of deferred ops that are created as part of
> recovering a single unfinished log intent.  Finally, refactor the loop
> that replays those chains to do so using one transaction per chain.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

FWIW, I don't like the "freezer" based naming here. It's too
easily confused with freezing and thawing the filesystem....
I know, "delayed deferred ops" isn't much better, but at least it
won't get confused with existing unrelated functionality.

I've barely looked at the code, so no real comments on that yet,
but I did notice this:

> @@ -2495,35 +2515,59 @@ xlog_recover_process_data(
>  /* Take all the collected deferred ops and finish them in order. */
>  static int
>  xlog_finish_defer_ops(
> -	struct xfs_trans	*parent_tp)
> +	struct xfs_mount	*mp,
> +	struct list_head	*dfops_freezers)
>  {
> -	struct xfs_mount	*mp = parent_tp->t_mountp;
> +	struct xfs_defer_freezer *dff, *next;
>  	struct xfs_trans	*tp;
>  	int64_t			freeblks;
>  	uint			resblks;
....
> +	resblks = min_t(int64_t, UINT_MAX, freeblks);
> +	resblks = (resblks * 15) >> 4;

Can overflow when freeblks > (UINT_MAX / 15).

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
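
The overflow noted above can be reproduced in userspace. What follows is only a sketch, not the fix that went into the patch: it assumes `uint` is a 32-bit `unsigned int`, that `freeblks` is non-negative, and both function names are hypothetical. The first function mirrors the arithmetic of the quoted hunk; the second does the same clamp but performs the multiply in 64 bits, which is one possible way to avoid the wrap.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Mirrors the quoted hunk: clamp freeblks to UINT_MAX, then take 15/16
 * of it.  The multiply happens in 32-bit unsigned arithmetic, so the
 * product wraps modulo 2^32 once resblks exceeds UINT_MAX / 15.
 */
static unsigned int resblks_narrow(int64_t freeblks)
{
	unsigned int resblks;

	resblks = (unsigned int)(freeblks < (int64_t)UINT_MAX ?
				 freeblks : (int64_t)UINT_MAX);
	resblks = (resblks * 15) >> 4;
	return resblks;
}

/*
 * Hypothetical alternative: same clamp, but multiply in 64 bits.
 * UINT_MAX * 15 fits in a uint64_t, and the shifted result is always
 * below UINT_MAX, so the final narrowing cast cannot truncate.
 */
static unsigned int resblks_wide(int64_t freeblks)
{
	uint64_t clamped = freeblks < (int64_t)UINT_MAX ?
			   (uint64_t)freeblks : UINT_MAX;

	return (unsigned int)((clamped * 15) >> 4);
}
```

The two functions agree up to freeblks == UINT_MAX / 15 (286331153) and diverge from the next value on, where the 32-bit product wraps and the reservation collapses to a much smaller number.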