Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> Can you re-iterate the exact problem? I know we talked about this in the
> past, but I seem to have misplaced those memories :/

Take this for example:

	void afs_put_call(struct afs_call *call)
	{
		struct afs_net *net = call->net;
		int n = atomic_dec_return(&call->usage);
		int o = atomic_read(&net->nr_outstanding_calls);

		trace_afs_call(call, afs_call_trace_put, n + 1, o,
			       __builtin_return_address(0));

		ASSERTCMP(n, >=, 0);
		if (n == 0) {
			...
		}
	}

I am printing the usage count in the afs_call tracepoint so that I can use
it to debug refcount bugs. If I do it like this:

	void afs_put_call(struct afs_call *call)
	{
		struct afs_net *net = call->net;
		int n = refcount_read(&call->usage);
		int o = atomic_read(&net->nr_outstanding_calls);

		trace_afs_call(call, afs_call_trace_put, n, o,
			       __builtin_return_address(0));

		if (refcount_dec_and_test(&call->usage)) {
			...
		}
	}

then there's a window between the usage count being read and the actual
atomic decrement in which another CPU can alter the count, so the value I
trace may not be the value the decrement acted on. The window can be
widened by an interrupt occurring, a softirq occurring or someone enabling
the tracepoint.

I can't do the tracepoint after the decrement if refcount_dec_and_test()
returns false unless I first save all the values from the object that I
might need, as the object could be destroyed at any time from that point
on. In this particular case, that's just call->debug_id, but it could be
other things in other cases.

Note that I also can't touch the afs_net object in that situation either,
and the outstanding calls count that I record will potentially be out of
date - but there's not a lot I can do about that.

David
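The "save what you need, then decrement and use the returned value" pattern described above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: it uses C11 `<stdatomic.h>` rather than the kernel's atomic_t or refcount_t, and `struct call`, `call_alloc()`, `call_put()` and `trace_put()` are hypothetical stand-ins, not AFS or kernel interfaces.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct afs_call; only what the sketch needs. */
struct call {
	atomic_int usage;	/* reference count */
	unsigned int debug_id;	/* identifier wanted in the trace line */
};

/* Hypothetical stand-in for a tracepoint: remembers and prints its input. */
static unsigned int last_traced_id;
static int last_traced_usage;

static void trace_put(unsigned int debug_id, int usage)
{
	last_traced_id = debug_id;
	last_traced_usage = usage;
	printf("put call=%08x usage=%d\n", debug_id, usage);
}

static struct call *call_alloc(unsigned int debug_id, int usage)
{
	struct call *call = malloc(sizeof(*call));

	if (!call)
		return NULL;
	atomic_init(&call->usage, usage);
	call->debug_id = debug_id;
	return call;
}

/* Returns true if this put dropped the last reference and freed the call. */
static bool call_put(struct call *call)
{
	/* Copy everything the trace needs while our reference is still held. */
	unsigned int debug_id = call->debug_id;

	/*
	 * atomic_fetch_sub() returns the old value, so old - 1 is the count
	 * produced by this very decrement; there is no read/decrement gap.
	 */
	int n = atomic_fetch_sub(&call->usage, 1) - 1;

	/* Unless n == 0, another thread may free the call from here on,
	 * so only the saved copies are used. */
	trace_put(debug_id, n);

	assert(n >= 0);
	if (n == 0) {
		free(call);
		return true;
	}
	return false;
}
```

The point of the sketch is the ordering: the object is only dereferenced before the decrement, and the traced count comes from the decrement itself rather than from a separate read.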