On Sat, Nov 28, 2020 at 07:54:57PM -0800, Andy Lutomirski wrote:

> Version (b) seems fairly straightforward to implement -- add RCU
> protection and a atomic_t special_ref_cleared (initially 0) to struct
> mm_struct itself.  After anyone clears a bit to mm_cpumask (which is
> already a barrier),

No it isn't. clear_bit() implies no barrier whatsoever. That's x86
you're thinking about.

> they read mm_users.  If it's zero, then they scan
> mm_cpumask and see if it's empty.  If it is, they atomically swap
> special_ref_cleared to 1.  If it was zero before the swap, they do
> mmdrop().  I can imagine some tweaks that could make this a big
> faster, at least in the limit of a huge number of CPUs.
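
For illustration only, the sequence quoted above could be modelled in
userspace with C11 atomics roughly as follows. This is a sketch under
assumptions, not kernel code: struct mm_model, mmdrop_model() and
cpu_leaves_mm() are hypothetical names, mm_cpumask is reduced to a single
unsigned long, and RCU protection is left out. The explicit fence marks the
spot where the kernel version would need something like
smp_mb__after_atomic(), since clear_bit() by itself orders nothing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct mm_struct. */
struct mm_model {
    atomic_int   mm_users;            /* user references */
    atomic_int   mm_count;            /* the count that mmdrop() would decrement */
    atomic_ulong cpumask;             /* stands in for mm_cpumask, one bit per CPU */
    atomic_int   special_ref_cleared; /* initially 0 */
};

/* Stand-in for mmdrop(): drop one reference on the model. */
static void mmdrop_model(struct mm_model *mm)
{
    atomic_fetch_sub(&mm->mm_count, 1);
}

/*
 * Called when `cpu` stops using `mm`. Returns true if this caller was the
 * one that dropped the special reference.
 */
static bool cpu_leaves_mm(struct mm_model *mm, unsigned int cpu)
{
    assert(cpu < 8 * sizeof(unsigned long));

    /* Models clear_bit(): atomic, but with no ordering implied. */
    atomic_fetch_and_explicit(&mm->cpumask, ~(1UL << cpu),
                              memory_order_relaxed);

    /* Explicit full barrier before the mm_users read. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load(&mm->mm_users) != 0)
        return false;               /* still has users */
    if (atomic_load(&mm->cpumask) != 0)
        return false;               /* some other CPU still has its bit set */

    /* Only the caller that flips 0 -> 1 performs the drop. */
    if (atomic_exchange(&mm->special_ref_cleared, 1) != 0)
        return false;

    mmdrop_model(mm);
    return true;
}
```

With mm_users at zero and two bits set in the mask, the first CPU to leave
does nothing further, the second one wins the exchange and drops the
reference, and any later caller sees special_ref_cleared already set.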