On Wed, Dec 22, 2010 at 09:56:15PM -0600, Alex Elder wrote: > In __percpu_counter_add_unless_lt(), there is a test to see if > a call to __percpu_counter_add() can be made, knowing the result > will not be less than the given threshhold and the called function > will do the addition without taking a lock. > > Unfortunately, it is using the wrong value in its comparison--the > amount to add is not properly being taken into account. As a > result, __percpu_counter_add() could end up adding a value when it > should not. And even if the result will be non-negative, the called > function may take the lock anyway because it does so if the *sum* of > the (percpu) delta count and the given amount (and not just the amount > alone) is greater than the batch size (or less than its negative). > > Fix the comparison, and since it __percpu_counter_add() will do > the right thing (and might lock anyway), just call it regardless > of what amount is getting added. > > Signed-off-by: Alex Elder <aelder@xxxxxxx> > > --- > lib/percpu_counter.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > Index: b/lib/percpu_counter.c > =================================================================== > --- a/lib/percpu_counter.c > +++ b/lib/percpu_counter.c > @@ -239,10 +239,10 @@ int __percpu_counter_add_unless_lt(struc > goto out; > > /* > - * If the counter is over the threshold and the change is less than the > - * batch size, we might be able to avoid locking. > + * If the updated counter will be over the threshold we know > + * we can safely add, and might be able to avoid locking. > */ > - if (count > threshold + error && abs(amount) < batch) { > + if (count + amount > threshold + error) { > __percpu_counter_add(fbc, amount, batch); > ret = 1; > goto out; What you have is what I first implemented following your exact logic. However, after having several assert failures n the XFS code, I realised that the logic fails when there are concurrent modifications with abs(amount) > batch. To demonstrate, take the initial conditions: threshold = 0 amount = -39 (for ease of maths) batch = 32 error = 256 (2 * 32 * 4 cpus) with the per CPU counter values: fbc->count = 296 CPU 0 1 2 3 value -31 -31 -31 -31 And the start the concurrent modification such that every racing CPU sees fbc->count = 296 at their initial sample. All evaluate it as: count = 296 if (296 - 39 > 0 + 256) { and so take the __percpu_counter_add() path. Assuming CPUs 1-3 complete before CPU 0, the counter changes value like: cpu 1 pcp_cnt_add_lt(-39) CPU 0 1 2 3 value -31 0 -31 -31 fbc->count = 216 cpu 2 pcp_cnt_add_lt(-39) CPU 0 1 2 3 value -31 0 0 -31 fbc->count = 136 cpu 3 pcp_cnt_add_lt(-39) CPU 0 1 2 3 value -31 0 0 0 fbc->count = 56 And the we run the counter add on CPU 0: __percpu_counter_add(fbc, -39, 32); CPU 0 1 2 3 value 0 0 0 0 fbc->count = -24 And we've just blown through the threshold. We modified the counter and pushed the result less than the threshold which we are not supposed to be, and worst yet is the fact we returning a positive number indicating that we _didn't_ bust the threshold. Analysis of the failures lead me to this lead me to this logic for the unlocked check in my proposed patch: We can do unlocked modifications concurrently on all CPUs IFF 1. the code cannot be preempted between sampling fbc->count and calling __percpu_counter_add() 2. -batch < amount < batch 3. the error bound is set to 2 * batch * nr_cpus To demonstrate the worst case initial conditions, but being bound by the above rules: threshold = 0 amount = -31 batch = 32 error = 256 (2 * 32 * 4 cpus) per CPU counters are at worst case lower edge for lt detection: CPU 0 1 2 3 value -31 -31 -31 -31 fbc->count = 258 and the racing CPUs are all doing equivalent add_unless_lt of -31. Every racing CPU sees fbc->count = 258 at their initial sample. what we end up with is: CPU 0: Racing CPUs count = 258 if (258 > 0 + 256) { cpu 1 pcp_cnt_add_lt(-31) CPU 0 1 2 3 value -31 0 -31 -31 fbc->count = 196 cpu 2 pcp_cnt_add_lt(-31) CPU 0 1 2 3 value -31 0 0 -31 fbc->count = 134 cpu 3 pcp_cnt_add_lt(-31) CPU 0 1 2 3 value -31 0 0 0 fbc->count = 70 __percpu_counter_add(fbc, -31, 32); CPU 0 1 2 3 value 0 0 0 0 fbc->count = 6 And so the return value of 1 is valid because 6 > 0. Indeed, looking at this I can probably change the rule to -batch <= amount <= batch, because the above with amount = -32 would give and end value of fbc->count = 2.... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs