On Tue, Jul 14, 2020 at 05:46:05AM -0700, Andy Lutomirski wrote:
> x86 has this exact problem. At least no more than 64*8 CPUs share the cache line :)

I've seen patches for a 'sparse' bitmap to solve related problems.

It's basically the same code, except it multiplies everything (size, bit-nr) by a constant to reduce the number of active bits per line.

This sadly doesn't take topology into account, but reducing contention is still good of course.
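
To make the 'multiply everything by a constant' idea concrete, here is a minimal userspace sketch. It is not taken from the patches mentioned above; struct sparse_bitmap, SPARSE_STRIDE and the helper names are all hypothetical, and the bit operations are plain (non-atomic) for brevity, where a real kernel version would presumably use the atomic bitops. Assuming 64-byte cache lines (512 bits), a stride of 64 leaves at most 8 active bits per line instead of 512.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical spread factor: logical bit nr lives at physical bit nr * SPARSE_STRIDE. */
#define SPARSE_STRIDE 64UL
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

struct sparse_bitmap {
	unsigned long *words;	/* backing storage, SPARSE_STRIDE times larger than a dense map */
	size_t nbits;		/* number of logical bits */
};

/* Words needed to hold nbits logical bits once the size is scaled by the stride. */
static size_t sparse_words(size_t nbits)
{
	return (nbits * SPARSE_STRIDE + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Allocate a zeroed map; returns 0 on success, -1 on allocation failure. */
static int sparse_bitmap_init(struct sparse_bitmap *b, size_t nbits)
{
	b->words = calloc(sparse_words(nbits), sizeof(unsigned long));
	b->nbits = nbits;
	return b->words ? 0 : -1;
}

/* Same as a dense set_bit(), except the bit number is scaled first. */
static void sparse_set_bit(struct sparse_bitmap *b, size_t nr)
{
	size_t phys = nr * SPARSE_STRIDE;

	assert(nr < b->nbits);
	b->words[phys / BITS_PER_WORD] |= 1UL << (phys % BITS_PER_WORD);
}

static void sparse_clear_bit(struct sparse_bitmap *b, size_t nr)
{
	size_t phys = nr * SPARSE_STRIDE;

	assert(nr < b->nbits);
	b->words[phys / BITS_PER_WORD] &= ~(1UL << (phys % BITS_PER_WORD));
}

static int sparse_test_bit(const struct sparse_bitmap *b, size_t nr)
{
	size_t phys = nr * SPARSE_STRIDE;

	assert(nr < b->nbits);
	return (b->words[phys / BITS_PER_WORD] >> (phys % BITS_PER_WORD)) & 1UL;
}
```

The cost is memory: the map grows by the stride factor. And as noted, a fixed stride knows nothing about topology, so CPUs that still end up sharing a line are chosen by number, not by proximity.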