On Thu, Apr 27, 2023 at 03:49:26PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
>
> I've been noticing odd racing behavior in the inodegc code that could
> only be explained by one cpu adding an inode to its inactivation llist
> at the same time that another cpu is processing that cpu's llist.
> Preemption is disabled between get/put_cpu_ptr, so the only explanation
> is scheduler mayhem. I inserted the following debug code into
> xfs_inodegc_worker (see the next patch):
>
>	ASSERT(gc->cpu == smp_processor_id());
>
> This assertion tripped during overnight tests on the arm64 machines,
> but curiously not on x86_64. I think we haven't observed any resource
> leaks here because the lockfree list code can handle simultaneous
> llist_add and llist_del_all functions operating on the same list.
> However, the whole point of having percpu inodegc lists is to take
> advantage of warm memory caches by inactivating inodes on the last
> processor to touch the inode.
>
> The incorrect scheduling seems to occur after an inodegc worker is
> subjected to mod_delayed_work(). This wraps mod_delayed_work_on with
> WORK_CPU_UNBOUND specified as the cpu number. Unbound allows for
> scheduling on any cpu, not necessarily the same one that scheduled the
> work.

Ugh, that's a bit of a landmine.

> Because preemption is disabled for as long as we have the gc pointer, I
> think it's safe to use current_cpu() (aka smp_processor_id) to queue
> the delayed work item on the correct cpu.
>
> Fixes: 7cf2b0f9611b ("xfs: bound maximum wait time for inodegc work")
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_icache.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)

Looks good.

Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>

-- 
Dave Chinner
david@xxxxxxxxxxxxx
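
(For readers following along, a rough sketch of the kind of change the
description above implies: queue the delayed work explicitly on the
current cpu instead of letting mod_delayed_work() pick an unbound one.
This is not the hunk from the patch itself; the surrounding names --
mp->m_inodegc, mp->m_inodegc_wq, ip->i_gclist, queue_delay -- are
assumptions based on the discussion, and current_cpu() is the XFS alias
for the current processor id mentioned in the commit message.)

	struct xfs_inodegc	*gc;

	/* Preemption stays disabled from get_cpu_ptr() to put_cpu_ptr(). */
	gc = get_cpu_ptr(mp->m_inodegc);
	llist_add(&ip->i_gclist, &gc->list);

	/*
	 * mod_delayed_work() passes WORK_CPU_UNBOUND, so the worker may run
	 * on any cpu. Pin the work to this cpu so the worker drains the same
	 * percpu list that the inode was just added to.
	 */
	mod_delayed_work_on(current_cpu(), mp->m_inodegc_wq, &gc->work,
			queue_delay);

	put_cpu_ptr(gc);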