On Tue, Jan 31, 2017 at 8:55 AM, Tejun Heo <tj@xxxxxxxxxx> wrote: > > Douglas found and fixed a ref leak bug in percpu_ref_tryget[_live](). > The bug is caused by storing the return value of > atomic_long_inc_not_zero() into an int temp variable before returning > it as a bool. The interim cast to int loses the upper bits and can > lead to false negatives. As percpu_ref uses a high bit to mark a > draining counter, this can happen relatively easily. Fixed by using > bool for the temp variable. I think this fix is wrong. The fact is, atomic_long_inc_not_zero() shouldn't be returning anything with high bits.. Casting to "int" should have absolutely no impact. "int" is the traditional C truth value (with zero/nonzero being false/true), and while we're generally moving towards "bool" for true/false return values, I do think that code that assumes that these functions can be cast to "int" are right. For example, we used to have similar bugs in "test_bit()" returning the actual bit value (which could be high). And when users hit that problem, we fixed "test_bit()", not the users of it. So I'd rather fix the places that (insanely) return a 64-bit value. Is this purely a ppc64 issue, or does it happen somewhere else too? Basically, the functions atomic*_sub_and_test() atomic*_dec_and_test() atomic*_inc_and_test() atomic*_add_negative() atomic*_add_unless() should all be returning a proper bool/int truth value (preferably 0/1 so that we don't need to worry about it), not some random crazy value. I've pulled this, but I really think it's papering over the real issue. Adding "linux-arch" mailing list to ask architecture maintainers to check their implementation of the atomic ops that return a truth value. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html