Excerpts from peterz@xxxxxxxxxxxxx's message of August 28, 2020 9:15 pm:
> On Fri, Aug 28, 2020 at 08:00:19PM +1000, Nicholas Piggin wrote:
>
>> Closing this race only requires interrupts to be disabled while ->mm
>> and ->active_mm are being switched, but the TLB problem requires also
>> holding interrupts off over activate_mm. Unfortunately not all archs
>> can do that yet, e.g., arm defers the switch if irqs are disabled and
>> expects finish_arch_post_lock_switch() to be called to complete the
>> flush; um takes a blocking lock in activate_mm().
>
> ARM at least has activate_mm() := switch_mm(), so it could be made to
> work.
>

Yeah, so long as that post_lock_switch switch did the right thing with
respect to its TLB flushing. It should do because arm doesn't seem to
check ->mm or ->active_mm (and if it was broken, the scheduler context
switch would be suspect too). I don't think the fix would be hard, just
that I don't have a good way to test it and qemu isn't great for testing
this kind of thing.

um too I think could probably defer that lock until after interrupts are
enabled again.

I might throw a bunch of arch conversion patches over the wall if this
gets merged and try to move things along.

Thanks,
Nick