On Mon, Jun 03, 2024 at 11:43:27AM -0700, Darrick J. Wong wrote: > On Mon, Jun 03, 2024 at 05:23:40PM +0000, Wengang Wang wrote: > > > What about the sb_icount and sb_ifree counters? They are also percpu > > > counters, and they can return transient negative numbers, too, > > > right? If they end up in the log, the same as this transient > > > negative sb_fdblocks count, won't that also cause exactly the same > > > issue? > > > > > > > Yes, sb_icount and sb_ifree are also percpu counters. They have been addressed by > > commit 59f6ab40fd8735c9a1a15401610a31cc06a0bbd6, right? > > icount and ifree don't go through xfs_mod_freecounter, which means that > they never do the "subtract and see if we went negative" dance that > fdblocks/frextents does. percpu_counter_sum_positive isn't necessary. I'm pretty sure that what xfs_mod_freecounter() does has no bearing on whether percpu_counter_sum() can return a negative number. percpu_counter_sum() is intentionally designed to be racy w.r.t. sub-batch percpu counter updates. That is, percpu_counter_sum() takes the fbc->lock while summing only to bound the maximum sum error to +/-(nr_cpus * batch size). It does not serialise against sub-batch percpu count modifications - add/sub only serialise against a sum when they go over the batch size and need folding. Hence the sum will serialise against folding operations, thereby bounding the maximum error across the sum to +/-batch size on every CPU. For example, if we have a counter with a batch of 128, 4 CPUs, fbc->count = 0 and the current percpu count values are: CPU value 0 32 1 0 2 -32 3 0 We have a sum of zero when everything is static and unchanging. However, when we look at the dynamic nature of the sum, we can have percpu counters change during the summing iteration. For example, say between the sum starting and finishing on CPU 3, we have a racing subtraction on CPU 0 of 64, and and add of 64 on CPU 1. Neither trigger batch overruns so don't serialise against a sum in progress. If the running sum samples CPU 0 after it updates, and CPU 1 before it updates, the sum will see: fbc->count = 0 CPU value 0 -32 1 0 2 -32 3 0 and the sum will be -64. If we then rerun the sum and it doesn't race with any updates, it will see: fbc->count = 0 CPU value 0 -32 1 64 2 -32 3 0 and the sum will be zero. The above example could actually happen with ifree/icount. Modifications to ifree and icount in xfs_trans_unreserve_and_mod_sb() aren't serialised against each other (i.e. multiple alloc/free transactions can be committing and updating the ifree/icount counters concurrently) and they can also be running concurrently with the superblock update summing in xfs_log_sb(). Hence the inherent raciness in percpu_counter_sum() vs percpu_counter_add() is not mitigated in any way, and so ifree and icount could have a transient sum that is less than zero.... So AFAICT, then need the _sum_positive() treatment as well. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx