On Thu, Feb 01, 2024 at 11:30:15AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> To allow us to recycle inodes that are awaiting inactivation, we
> need to enable lazy removal of inodes from the list. Th elist is a

s/Th elist/The list/

> lockless single linked variant, so we can't just remove inodes from
> the list at will.
> 
> Instead, we can remove them lazily whenever inodegc runs by enabling
> the inodegc processing to determine whether inactivation needs to be
> done at processing time rather than queuing time.
> 
> We've already modified the queuing code to only queue the inode if
> it isn't already queued, so here all we need to do is modify the
> queue processing to determine if inactivation needs to be done.
> 
> Hence we introduce the behaviour that we can cancel inactivation
> processing simply by clearing the XFS_NEED_INACTIVE flag on the
> inode. Processing will check this flag and skip inactivation
> processing if it is not set. The flag is always set at queuing time,
> regardless of whether the inode is already on the queues or not.
> Hence if it is not set at processing time, it means that something
> has cancelled the inactivation and we should just remove it from the
> list and then leave it alone.
> 
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_icache.c | 38 ++++++++++++++++++++++++++++++--------
>  1 file changed, 30 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index 2dd1559aade2..10588f78f679 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -1877,15 +1877,23 @@ xfs_inodegc_worker(
>  		int	error;
>  
>  		/*
> -		 * Switch state to inactivating and remove the inode from the
> -		 * gclist. This allows the use of llist_on_list() in the queuing
> -		 * code to determine if the inode is already on an inodegc
> -		 * queue.
> +		 * Remove the inode from the gclist and determine if it needs to
> +		 * be processed. The XFS_NEED_INACTIVE flag gets cleared if the
> +		 * inode is reactivated after queuing, but the list removal is
> +		 * lazy and left up to us.
> +		 *
> +		 * We always remove the inode from the list to allow the use of
> +		 * llist_on_list() in the queuing code to determine if the inode
> +		 * is already on an inodegc queue.
>  		 */
>  		spin_lock(&ip->i_flags_lock);
> +		init_llist_node(&ip->i_gclist);
> +		if (!(ip->i_flags & XFS_NEED_INACTIVE)) {
> +			spin_unlock(&ip->i_flags_lock);
> +			continue;
> +		}
>  		ip->i_flags |= XFS_INACTIVATING;
>  		ip->i_flags &= ~XFS_NEED_INACTIVE;
> -		init_llist_node(&ip->i_gclist);

Nit: unnecessary churn from the last patch.

So if I understand this correctly, if we think a released inode needs
inactivation, we put it on the percpu gclist and set NEEDS_INACTIVE.
Once it's on there, only the inodegc worker can remove it from that
list.

The novel part here is that now we serialize the i_gclist update with
i_flags_lock, which means that the inodegc worker can observe that
NEEDS_INACTIVE fell off the inode, and ignore the inode.

This sounds pretty similar to the v8 deferred inode inactivation series
where one could untag a NEED_INACTIVE inode to prevent the inodegc
worker from finding it, though now ported to lockless lists that showed
up for v9.
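To convince myself I had the protocol right, I sketched it as a tiny
user-space model (below). Everything in it is invented for illustration:
the struct, the pthread mutex standing in for i_flags_lock, and a plain
pointer list standing in for the per-cpu llist. Treat it as a sketch of
the idea, not the kernel code. Queuing always sets the flag and only
adds the inode if it isn't already on a list, reactivation just clears
the flag, and the worker does the list removal lazily and skips anything
whose flag has been cleared in the meantime.

/* toy_inodegc.c - user-space sketch of the lazy-removal protocol, not XFS code */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NEED_INACTIVE	0x1
#define INACTIVATING	0x2

struct toy_inode {
	struct toy_inode	*gcnext;	/* self-pointer means "not on a list" */
	unsigned int		flags;
	pthread_mutex_t		flags_lock;	/* stand-in for i_flags_lock */
	int			ino;
};

static struct toy_inode *gc_list;		/* stand-in for the per-cpu gclist */

static void init_gcnode(struct toy_inode *ip)	{ ip->gcnext = ip; }
static bool on_gclist(struct toy_inode *ip)	{ return ip->gcnext != ip; }

/* Eviction: always set the flag, only add to the list if not already there. */
static void toy_inodegc_queue(struct toy_inode *ip)
{
	pthread_mutex_lock(&ip->flags_lock);
	ip->flags |= NEED_INACTIVE;
	if (!on_gclist(ip)) {
		ip->gcnext = gc_list;
		gc_list = ip;
	}
	pthread_mutex_unlock(&ip->flags_lock);
}

/* Reactivation: cancel the pending work but leave the list entry alone. */
static void toy_reactivate(struct toy_inode *ip)
{
	pthread_mutex_lock(&ip->flags_lock);
	ip->flags &= ~NEED_INACTIVE;
	pthread_mutex_unlock(&ip->flags_lock);
}

/* Worker: lazily drop cancelled inodes, inactivate the rest. */
static void toy_inodegc_worker(void)
{
	struct toy_inode *ip = gc_list;

	gc_list = NULL;				/* grab the whole batch */
	while (ip) {
		struct toy_inode *next = ip->gcnext;

		pthread_mutex_lock(&ip->flags_lock);
		init_gcnode(ip);		/* lazy list removal happens here */
		if (!(ip->flags & NEED_INACTIVE)) {
			pthread_mutex_unlock(&ip->flags_lock);
			printf("inode %d: cancelled, skipped\n", ip->ino);
			ip = next;
			continue;
		}
		ip->flags |= INACTIVATING;
		ip->flags &= ~NEED_INACTIVE;
		pthread_mutex_unlock(&ip->flags_lock);
		printf("inode %d: inactivated\n", ip->ino);
		ip = next;
	}
}

int main(void)
{
	struct toy_inode a = { .ino = 1 }, b = { .ino = 2 };

	init_gcnode(&a); init_gcnode(&b);
	pthread_mutex_init(&a.flags_lock, NULL);
	pthread_mutex_init(&b.flags_lock, NULL);

	toy_inodegc_queue(&a);
	toy_inodegc_queue(&b);
	toy_reactivate(&b);		/* b's inactivation is cancelled */
	toy_inodegc_worker();		/* should inactivate 1 and skip 2 */
	return 0;
}

Built with "cc toy_inodegc.c -pthread", this should report inode 2 as
cancelled/skipped and inode 1 as inactivated, which matches my reading
of the patch.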
>  		spin_unlock(&ip->i_flags_lock);
>  
>  		error = xfs_inodegc_inactivate(ip);
> @@ -2153,7 +2161,6 @@ xfs_inode_mark_reclaimable(
>  	struct xfs_inode	*ip)
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
> -	bool			need_inactive;
>  
>  	XFS_STATS_INC(mp, vn_reclaim);
>  
> @@ -2162,8 +2169,23 @@ xfs_inode_mark_reclaimable(
>  	 */
>  	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_ALL_IRECLAIM_FLAGS));
>  
> -	need_inactive = xfs_inode_needs_inactive(ip);
> -	if (need_inactive) {
> +	/*
> +	 * If the inode is already queued for inactivation because it was
> +	 * re-activated and is now being reclaimed again (e.g. fs has been
> +	 * frozen for a while) we must ensure that the inode waits for inodegc
> +	 * to be run and removes it from the inodegc queue before it moves to
> +	 * the reclaimable state and gets freed.
> +	 *
> +	 * We don't care about races here. We can't race with a list addition
> +	 * because only one thread can be evicting the inode from the VFS cache,
> +	 * hence false negatives can't occur and we only need to worry about
> +	 * list removal races. If we get a false positive from a list removal
> +	 * race, then the inode goes through the inactive list whether it needs
> +	 * to or not. This will slow down reclaim of this inode slightly but
> +	 * should have no other side effects.

That makes sense to me.

> +	 */
> +	if (llist_on_list(&ip->i_gclist) ||
> +	    xfs_inode_needs_inactive(ip)) {

With the nits fixed,
Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>

--D

>  		xfs_inodegc_queue(ip);
>  		return;
>  	}
> -- 
> 2.43.0
> 
> 
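One more illustrative scrap, in case it helps future readers of this
thread: my reading of the "false positive from a list removal race"
described in the comment above, modelled in user space with the same
self-pointer convention that llist_on_list()/init_llist_node() use.
All names here are made up and none of this is the XFS code; it just
shows the window where the worker has grabbed the batch but has not yet
reset the node, so an eviction still sees the inode "on a list".

/* toy_false_positive.c - sketch of the benign list-removal race window */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct gcnode { struct gcnode *next; };

static void node_init(struct gcnode *n)    { n->next = n; }	/* "not on a list" */
static bool node_on_list(struct gcnode *n) { return n->next != n; }

int main(void)
{
	struct gcnode inode_gclist, list_head = { .next = NULL };

	node_init(&inode_gclist);

	/* eviction queues the inode: link it onto the list */
	inode_gclist.next = list_head.next;
	list_head.next = &inode_gclist;

	/* the worker grabs the whole batch, as llist_del_all() would ... */
	struct gcnode *batch = list_head.next;
	list_head.next = NULL;

	/*
	 * ... but has not reset this node yet. An eviction racing in here
	 * still sees the inode "on a list", so it takes the inodegc path
	 * whether inactivation is strictly needed or not -- the harmless
	 * slow path the comment describes.
	 */
	assert(node_on_list(&inode_gclist));

	/* the worker's lazy removal, done under i_flags_lock in the patch */
	node_init(batch);
	assert(!node_on_list(&inode_gclist));
	return 0;
}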