On Tue, Mar 17, 2020 at 11:06:12PM +0000, George Spelvin wrote:
> The most serious is that if two threads simultaneously observe
> rand_bits == 1, but do their decrements separately, you end up
> with rand_bits = 255 and you generate 255 consecutive 0 bits
> before refilling the buffer.
>
> Since we're only generating random bits, a screwed-up answer
> occasionally doesn't really matter (like the comment says "lack of
> locking is deliberate"), but 255 screwed up bits is a bit much.

Okay, I'm on board! :) Thanks for spelling this race out; I hadn't seen
quite how nasty it could get. (Perhaps mention it in the commit log for
v2?)

> I avoided changing the underlying locking model because I didn't
> feel up to such an invasive change; I managed to fix the problems
> I saw without going there. And shrink the code; that seemed like
> enough of a win to justify it to me.

Fair enough.

> The compiler is allowed to (in John Woods' memorable explanation)
> produce code that makes demons fly out of your nose. (More plausibly,
> it may simply crash.)

So one thing I see here that is still in the nasal demon realm is the
left-shift of a signed value, which is technically undefined behavior
in C. (See the comment on check_shl_overflow().) Doing a signedness
check is very cheap in the resulting machine code; but I suspect
sticking to unsigned and reversing direction for a bottom-bit test
wouldn't be too bad? i.e.:

	static unsigned long rand_bits;
	unsigned long r = READ_ONCE(rand_bits), rshift = r >> 1;

	if (unlikely(rshift == 0)) {
		r = get_random_long();
		rshift = (r >> 1) | (0x1UL << (BITS_PER_LONG - 1));
	}
	WRITE_ONCE(rand_bits, rshift);

	if (r & 1)
		add_to...
	else
		add_to...tail

--
Kees Cook
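
(For context, below is a minimal sketch of the unlocked pattern the race
above applies to, assuming the function being patched looked roughly
like mm/shuffle.c's add_to_free_area_random() of that era. coin_flip()
is a hypothetical stand-in that returns the random bit instead of
calling the elided add_to... helpers, and the page/free-area plumbing is
omitted.)

	#include <linux/random.h>	/* get_random_u64() */
	#include <linux/types.h>	/* u64, u8, bool */

	/* Hypothetical stand-in for the real page/free-area function. */
	static bool coin_flip(void)
	{
		static u64 rand;
		static u8 rand_bits;
		bool heads;

		/* Racy check: two threads can both see rand_bits == 1. */
		if (rand_bits == 0) {
			rand_bits = 64;
			rand = get_random_u64();
		}

		heads = rand & 1;
		rand >>= 1;

		/*
		 * Racy read-modify-write: if thread A's decrement
		 * (1 -> 0) lands first, thread B then computes 0 - 1
		 * on a u8 and stores 255, so the next 255 calls keep
		 * shifting an already-exhausted, all-zero rand word
		 * instead of refilling it.
		 */
		rand_bits--;
		return heads;
	}

The sentinel top bit in the unsigned variant above makes the state word
self-describing: there is no separate counter to underflow, so the worst
a race can do is duplicate or drop a bit, not fabricate 255 zeros.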