On Wed, Aug 29, 2018 at 8:17 AM, Rik van Riel <riel@xxxxxxxxxxx> wrote: > On Tue, 2018-08-28 at 20:46 -0700, Andy Lutomirski wrote: >> On Tue, Aug 28, 2018 at 10:56 AM, Rik van Riel <riel@xxxxxxxxxxx> >> wrote: >> > On Mon, 27 Aug 2018 16:04:16 -0700 >> > Andy Lutomirski <luto@xxxxxxxxxx> wrote: >> > >> > > The 0day bot is still chewing on this, but I've tested it a bit >> > > locally >> > > and it seems to do the right thing. >> > >> > Hi Andy, >> > >> > the version of the patch below should fix the bug we talked about >> > in email yesterday. It should automatically cover kernel threads >> > in lazy TLB mode, because current->mm will be NULL, while the >> > cpu_tlbstate.loaded_mm should never be NULL. >> > >> >> That's better than mine. I tweaked it a bit and added some >> debugging, >> and I got this: >> >> > https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/fixes&id=dd956eba16646fd0b15c3c0741269dfd84452dac >> >> I made the loaded_mm handling a little more conservative to make it >> more obvious that switch_mm_irqs_off() is safe regardless of exactly >> when it gets called relative to switching current. > > I am not convinced that the dance of writing > cpu_tlbstate.loaded_mm twice, with a barrier on > each end, is useful or necessary. > > At the time switch_mm_irqs_off returns, nmi_uaccess_ok() > will still return false, because we have not switched > "current" to the task that owns the next mm_struct yet. > > We just have to make sure to: > 1) Change cpu_tlbstate.loaded_mm before we manipulate > CR3, and > 2) Change "current" only once enough of the mm stuff has > been switched, __switch_to seems to get that right. > > Between the time switch_mm_irqs_off() sets cpu_tlbstate > to the next mm, and __switch_to moves() over current, > nmi_uaccess_ok() will return false. All true, but I think it stops working as soon as someone starts calling switch_mm_irqs_off() for some other reason, such as during text_poke(). And that was the original motivation for this patch.