On Tue, Aug 29, 2023 at 04:07:52PM -0700, Darrick J. Wong wrote: > Hi all, > > Ritesh and Eric separately reported crashes in XFS's hook function for > CPU hot remove if the remove event races with a filesystem being > mounted. I also noticed via generic/650 that once in a while the log > will shut down over an apparent overrun of a transaction reservation; > this turned out to be due to CIL percpu list aggregation failing to pick > up the percpu list items from a dying CPU. > > Either way, the solution here is to eliminate the need for a CPU dying > hook by using a private cpumask to track which CPUs have added to their > percpu lists directly, and iterating with that mask. This fixes the log > problems and (I think) solves a theoretical UAF bug in the inodegc code > too. > > v2: fix a few put_cpu uses, add necessary memory barriers, and use > atomic cpumask operations > > If you're going to start using this code, I strongly recommend pulling > from my git trees, which are linked below. > > This has been running on the djcloud for months with no problems. Enjoy! > Comments and questions are, as always, welcome. These changes look good. Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx> -- Dave Chinner david@xxxxxxxxxxxxx