> > it could race with xfs_inodegc_queue() processing the
> > same struct xfs_inodegc. If xfs_inodegc_queue() adds the last inode
> > to the gc list during this race, that inode might never be processed
> > and reclaimed due to cpumask not set. This may lead to memory leaks
> > after filesystem unmount; I'm unsure if there are other more serious
> > implications.
>
> xfs_inodegc_stop() should handle this all just fine. It removes
> the enabled flag, then moves into a loop that should catch list adds
> that were in progress when the enabled flag was cleared.
>

I don't think xfs_inodegc_stop() adequately handles this scenario.
xfs_inodegc_queue_all() only processes the CPUs that are set in
mp->m_inodegc_cpumask. If a CPU's corresponding bit is not set in the
cpumask, then any inodes waiting for gc on that CPU won't have a chance
to be processed.

Thanks,
Long Li

> > > > Current problematic sequence:
> > > >
> > > >   CPU0                                  CPU1
> > > >   --------------------                  ---------------------
> > > >   xfs_inodegc_queue()                   xfs_inodegc_worker()
> > > >                                           llist_del_all(&gc->list)
> > > >   llist_add(&ip->i_gclist, &gc->list)
> > > >   cpumask_test_and_set_cpu()
> > > >                                           cpumask_clear_cpu()
> > > >                                           < cpumask not set >
> > > >
> > > > Fix this by moving llist_del_all() after cpumask_clear_cpu() to ensure
> > > > proper ordering. This change ensures that when the worker thread clears
> > > > the cpumask, any concurrent queue operations will either properly set
> > > > the cpumask bit or have already emptied the list.
> > > >
> > > > Also remove unnecessary smp_mb__{before/after}_atomic() barriers since
> > > > the llist_* operations already provide the required ordering semantics.
> > > > It makes the code cleaner.
> > >
> > > IIRC, the barriers were for ordering the cpumask bitmap ops against
> > > llist operations. There are calls elsewhere to for_each_cpu() that
> > > then use llist_empty() checks (e.g. xfs_inodegc_queue_all/wait_all),
> > > so on relaxed architectures (like alpha) I think we have to ensure
> > > the bitmask ops carry full ordering against the independent llist
> > > ops themselves. i.e. llist_empty() just uses READ_ONCE, so it only
> > > orders against other llist ops and won't guarantee any specific
> > > ordering against cpumask modifications.
> > >
> > > I could be remembering incorrectly, but I think that was the
> > > original reason for the barriers. Can you please confirm that the
> > > cpumask iteration/llist_empty checks do not need these bitmask
> > > barriers anymore? If that's ok, then the change looks fine.
> >
> > Even on architectures with relaxed memory ordering (like alpha), I noticed
> > that llist_add() already has full barrier semantics, so I think the
> > smp_mb__before_atomic barrier in xfs_inodegc_queue() can be removed.
>
> Ok. Seems reasonable to remove it if everything uses full memory
> barriers for the llist_add() operation.
>
> -Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
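
For illustration only, below is a minimal C sketch of the two worker
orderings being debated. The *_sketch structures and helper names are
made-up stand-ins for this note, not the real definitions in
fs/xfs/xfs_mount.h and fs/xfs/xfs_icache.c, and the worker is reduced
to just the two operations whose relative order matters; the barrier
calls present in the real code are omitted here.

/*
 * Minimal sketch of the ordering change under discussion.  The
 * *_sketch structures below are cut-down stand-ins invented for this
 * note, not the real XFS definitions.
 */
#include <linux/llist.h>
#include <linux/cpumask.h>

struct xfs_mount_sketch {
        cpumask_t               m_inodegc_cpumask;
};

struct xfs_inodegc_sketch {
        struct llist_head       list;
        unsigned int            cpu;
};

/*
 * Current order: drain the list, then clear the cpumask bit.  An
 * xfs_inodegc_queue() call racing in between can llist_add() an inode
 * that the drain above missed and still find the bit set, so it does
 * not set it again; the clear below then leaves an inode queued on a
 * CPU that xfs_inodegc_queue_all() will never visit.
 */
static struct llist_node *
inodegc_worker_drain_current(struct xfs_inodegc_sketch *gc,
                             struct xfs_mount_sketch *mp)
{
        struct llist_node *node = llist_del_all(&gc->list);

        cpumask_clear_cpu(gc->cpu, &mp->m_inodegc_cpumask);
        return node;
}

/*
 * Proposed order: clear the bit first, then drain.  A racing
 * llist_add() either lands before the drain (and is picked up here) or
 * lands after it, in which case the queueing side sets the cpumask bit
 * again and the non-empty list stays visible to later scans.
 */
static struct llist_node *
inodegc_worker_drain_proposed(struct xfs_inodegc_sketch *gc,
                              struct xfs_mount_sketch *mp)
{
        cpumask_clear_cpu(gc->cpu, &mp->m_inodegc_cpumask);
        return llist_del_all(&gc->list);
}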
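
A similarly simplified sketch of the queueing side, reusing the
stand-in structures above. The claim in the thread is that llist_add()
succeeds via a full-barrier cmpxchg, so the list insertion is already
ordered before the cpumask bit set that follows it, which is what a
for_each_cpu()/llist_empty() scan like xfs_inodegc_queue_all() relies
on. The _sketch function names and the explicit cpu_nr parameter are
simplifications, not the real interfaces.

/*
 * Queueing side, reusing the stand-in structures from the sketch
 * above.  The insertion below is visible before the cpumask bit is
 * set because llist_add() carries full barrier semantics, which is
 * the basis for dropping the explicit smp_mb__before_atomic().
 */
#include <linux/percpu.h>

static void
inodegc_queue_sketch(struct xfs_mount_sketch *mp,
                     struct xfs_inodegc_sketch *gc,
                     struct llist_node *i_gclist, unsigned int cpu_nr)
{
        /* full memory barrier on successful list add */
        llist_add(i_gclist, &gc->list);

        if (!cpumask_test_cpu(cpu_nr, &mp->m_inodegc_cpumask))
                cpumask_test_and_set_cpu(cpu_nr, &mp->m_inodegc_cpumask);
}

/*
 * The reader that ordering protects: only CPUs whose bit is set are
 * scanned, so a set bit has to imply the corresponding llist_add() is
 * already visible, otherwise queued inodes are skipped.
 */
static bool
inodegc_queue_all_sketch(struct xfs_mount_sketch *mp,
                         struct xfs_inodegc_sketch __percpu *inodegc)
{
        bool ret = false;
        int cpu;

        for_each_cpu(cpu, &mp->m_inodegc_cpumask) {
                struct xfs_inodegc_sketch *gc = per_cpu_ptr(inodegc, cpu);

                if (!llist_empty(&gc->list)) {
                        /* the real code kicks the per-cpu gc worker here */
                        ret = true;
                }
        }
        return ret;
}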