On Wed, Mar 18, 2020 at 01:44:10AM +0000, George Spelvin wrote: > The old code had separate "rand" and "rand_count" variables, > which could get out of sync with bad results. > > In the worst case, two threads would see rand_count == 1 and > both decrement it, resultint in rand_count = 255 and rand being typo: resultint -> resulting > filled with zeros for the next 255 calls. > > Instead, pack them both into a single, atomically updatable, > variable. This makes it a lot easier to reason about race > conditions. They are still there - the code deliberately eschews > locking - but basically harmless on the rare occasions that > they happen. > > Second, use READ_ONCE and WRITE_ONCE. Without them, we are deep > in the land of nasal demons. The compiler would be free to spill > temporaries to the static variables in arbitrary perverse ways > and create hard-to-find bugs. > > (Alternatively, we could declare the static variable "volatile", > one of the few places in the Linux kernel that would be correct, > but it would probably annoy Linus.) > > Third, use long rather than u64. This not only keeps the > state atomically updatable, it also speeds up the fast path > on 32-bit machines. Saving at least three instructions on > the fast path (one load, one add-with-carry, and one store) > is worth exchanging one call to get_random_u64 for two > calls to get_random_u32. The fast path of get_random_* is > less than the 3*64 = 192 instructions saved, and the slow > path happens every 64 bytes so isn't affectrd by the change. > > I've tried a few variants. Keeping random lsbits with > a most-significant end marker, and using an explicit bool > flag rather than testing r both increase code size slightly. > > x86_64 i386 > This code 94 95 > Explicit bool 103 99 > Lsbits 99 101 > Both 96 100 > > Signed-off-by: George Spelvin <lkml@xxxxxxx> And with Randy's other fix, please consider this: Acked-by: Kees Cook <keescook@xxxxxxxxxxxx> -Kees > Cc: Dan Williams <dan.j.williams@xxxxxxxxx> > Cc: Kees Cook <keescook@xxxxxxxxxxxx> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Cc: linux-mm@xxxxxxxxx > --- > mm/shuffle.c | 26 ++++++++++++++++---------- > 1 file changed, 16 insertions(+), 10 deletions(-) > > diff --git a/mm/shuffle.c b/mm/shuffle.c > index e0ed247f8d90..4ba3ba84764d 100644 > --- a/mm/shuffle.c > +++ b/mm/shuffle.c > @@ -186,22 +186,28 @@ void __meminit __shuffle_free_memory(pg_data_t *pgdat) > void add_to_free_area_random(struct page *page, struct free_area *area, > int migratetype) > { > - static u64 rand; > - static u8 rand_bits; > + static long rand; /* 0..BITS_PER_LONG-1 buffered random bits */ > + unsigned long r = READ_ONCE(rand), rshift = r << 1;; > > /* > - * The lack of locking is deliberate. If 2 threads race to > - * update the rand state it just adds to the entropy. > + * rand holds some random msbits, with a 1 bit appended, followed > + * by zero-padding in the lsbits. This allows us to maintain > + * the pre-generated bits and the count of bits in a single, > + * atomically updatable, variable. > + * > + * The lack of locking is deliberate. If two threads race to > + * update the rand state it just adds to the entropy. The > + * worst that can happen is a random bit is used twice, or > + * get_random_long is called redundantly. > */ > - if (rand_bits == 0) { > - rand_bits = 64; > - rand = get_random_u64(); > + if (unlikely(rshift == 0)) { > + r = get_random_long(); > + rshift = r << 1 | 1; > } > + WRITE_ONCE(rand, rshift); > > - if (rand & 1) > + if ((long)r < 0) > add_to_free_area(page, area, migratetype); > else > add_to_free_area_tail(page, area, migratetype); > - rand_bits--; > - rand >>= 1; > } > -- > 2.26.0.rc2 -- Kees Cook