On Tue, Aug 18, 2020 at 04:59:59PM -0700, Darrick J. Wong wrote:
> On Wed, Aug 12, 2020 at 07:25:47PM +1000, Dave Chinner wrote:
> > From: Gao Xiang <hsiangkao@xxxxxxxxxx>
> >
> > We currently keep unlinked lists short on disk by hashing the inodes
> > across multiple buckets. We don't need to keep them short anymore
> > as we no longer need to traverse the entire list to remove an inode
> > from it. The in-memory back reference index provides the previous
> > inode in the list for us instead.
> >
> > Log recovery still has to handle existing filesystems that use all
> > 64 on-disk buckets so we detect and handle this case specially so
> > that inode eviction can still work properly in recovery.
> >
> > [dchinner: imported into parent patch series early on and modified
> > to fit cleanly. ]
> >
> > Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxx>
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > ---
> >  fs/xfs/xfs_inode.c | 49 +++++++++++++++++++++++++++-------------------
> >  1 file changed, 29 insertions(+), 20 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index f2f502b65691..fa92bdf6e0da 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -33,6 +33,7 @@
> >  #include "xfs_symlink.h"
> >  #include "xfs_trans_priv.h"
> >  #include "xfs_log.h"
> > +#include "xfs_log_priv.h"
> >  #include "xfs_bmap_btree.h"
> >  #include "xfs_reflink.h"
> >
> > @@ -2092,25 +2093,32 @@ xfs_iunlink_update_bucket(
> >  	struct xfs_trans	*tp,
> >  	xfs_agnumber_t		agno,
> >  	struct xfs_buf		*agibp,
> > -	unsigned int		bucket_index,
> > +	xfs_agino_t		old_agino,
> >  	xfs_agino_t		new_agino)
> >  {
> > +	struct xlog		*log = tp->t_mountp->m_log;
> >  	struct xfs_agi		*agi = agibp->b_addr;
> >  	xfs_agino_t		old_value;
> > -	int			offset;
> > +	unsigned int		bucket_index;
> > +	int			offset;
> >
> >  	ASSERT(xfs_verify_agino_or_null(tp->t_mountp, agno, new_agino));
> >
> > +	bucket_index = 0;
> > +	/* During recovery, the old multiple bucket index can be applied */
> > +	if (!log || log->l_flags & XLOG_RECOVERY_NEEDED) {
>
> Does the flag test need parentheses?

Yes, will fix.

> It feels a little funny that we pass in old_agino (having gotten it from
> agi_unlinked) and then compare it with agi_unlinked, but as the commit
> log points out, I guess this is a wart of having to support the old
> unlinked list behavior. It makes sense to me that if we're going to
> change the unlinked list behavior we could be a little more careful
> about double-checking things.
>
> Question: if a newer kernel crashes with a super-long unlinked list and
> the fs gets recovered on an old kernel, will this lead to insanely high
> recovery times? I think the answer is no, because recovery is single
> threaded and the hash only existed to reduce AGI contention during
> normal unlinking operations?

Right, the answer is no because log recovery even on old kernels
always recovers the inode at the head of the list. It does no
traversal, so it doesn't matter if it's recovering one list or 64
lists, the recovery time is the same.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
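
On the parentheses question above: in C, bitwise `&` binds tighter than
logical `||`, so the flag test as posted should already evaluate the same
way as the parenthesized form, and adding the parentheses only makes the
intent explicit. A minimal standalone sketch of that equivalence follows.
`struct xlog_sketch`, the helper names, and the flag value are stand-ins
invented for illustration; they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in value for illustration only; the real flag is defined in xfs_log_priv.h. */
#define XLOG_RECOVERY_NEEDED	0x4u

/* Hypothetical, stripped-down stand-in for struct xlog. */
struct xlog_sketch {
	unsigned int	l_flags;
};

/* The test as posted in the patch: '&' is evaluated before '||'. */
static inline int legacy_buckets_as_posted(const struct xlog_sketch *log)
{
	return !log || log->l_flags & XLOG_RECOVERY_NEEDED;
}

/* The same test with the grouping spelled out. */
static inline int legacy_buckets_parenthesized(const struct xlog_sketch *log)
{
	return !log || (log->l_flags & XLOG_RECOVERY_NEEDED);
}

/* Compare both forms for a NULL log and for a handful of flag combinations. */
static inline void check_equivalence(void)
{
	struct xlog_sketch log;
	unsigned int flags;

	assert(legacy_buckets_as_posted(NULL) ==
	       legacy_buckets_parenthesized(NULL));

	for (flags = 0; flags < 8; flags++) {
		log.l_flags = flags;
		assert(legacy_buckets_as_posted(&log) ==
		       legacy_buckets_parenthesized(&log));
	}
}
```

Both helpers return nonzero when there is no log or when the recovery flag
is set, which is the case where the old multi-bucket index would apply.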