On Thu, May 25, 2017 at 05:40:16PM -0700, Andy Lutomirski wrote:
> On Thu, May 25, 2017 at 4:24 PM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Thu, May 25, 2017 at 1:33 PM, Kirill A. Shutemov
> > <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> >> Here' my first attempt to bring boot-time between 4- and 5-level paging.
> >> It looks not too terrible to me. I've expected it to be worse.
> >
> > If I read this right, you just made it a global on/off thing.
> >
> > May I suggest possibly a different model entirely? Can you make it a
> > per-mm flag instead?
> >
> > And then we
> >
> >  (a) make all kthreads use the 4-level page tables
> >
> >  (b) which means that all the init code uses the 4-level page tables
> >
> >  (c) which means that all those checks for "start_secondary" etc can
> > just go away, because those all run with 4-level page tables.
> >
> > Or is it just much too expensive to switch between 4-level and 5-level
> > paging at run-time?
> >
>
> Even ignoring expensiveness, I'm not convinced it's practical. AFAICT
> you can't atomically switch the paging mode and CR3, so either you
> need some magic page table with trampoline that works in both modes
> (which is presumably doable with some trickery) or you need to flip
> paging off. Good luck if an NMI hits in the mean time. There was
> code like that once upon a time for EFI mixed mode, but it got deleted
> due to triple-faults.

According to Intel's documentation you pretty much have to disable
paging anyway:

"The processor allows software to modify CR4.LA57 only outside of IA-32e
mode. In IA-32e mode, an attempt to modify CR4.LA57 using the MOV CR
instruction causes a general-protection exception (#GP)."

(If it weren't for that, maybe you could point the last entry in the PML4
at the PML4 itself, so it also works as a PML5 for accessing kernel
addresses? And of course make sure nothing gets loaded above
0xffffff8000000000).

- Kevin
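
For anyone wondering where 0xffffff8000000000 comes from, here is a rough sketch of the index arithmetic behind the self-referencing idea. It only does the bit slicing (PML5 index from VA bits 56:48, PML4 index from bits 47:39) and touches no real page tables. All names in it (SELF_MAP_BASE, is_4level_kernel_va, translates_same_with_self_map, self_map_checks) are made up for illustration and are not kernel symbols. It also assumes the hypothetical case where the same table could be walked in both modes, which the CR4.LA57 restriction quoted above rules out without turning paging off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ENTRIES_PER_TABLE 512u
#define LAST_ENTRY        (ENTRIES_PER_TABLE - 1)

/* Hypothetical name: lowest address whose PML4 index is the last entry. */
#define SELF_MAP_BASE     0xffffff8000000000ULL

/* Top-level index used by a 5-level walk: VA bits 56:48. */
static unsigned pml5_index(uint64_t va)
{
    return (unsigned)((va >> 48) & 0x1ff);
}

/* PML4 index (top level in 4-level mode, second level in 5-level): bits 47:39. */
static unsigned pml4_index(uint64_t va)
{
    return (unsigned)((va >> 39) & 0x1ff);
}

/* Canonical kernel-half address under 4-level paging: bits 63:47 all set. */
static bool is_4level_kernel_va(uint64_t va)
{
    return (va >> 47) == 0x1ffff;
}

/*
 * Assumption: the last PML4 entry points back at the PML4 itself and the
 * same table is used as the top level of a 5-level walk. Returns true if
 * a 4-level kernel address would then reach the same PML4 entry in both
 * modes.
 */
static bool translates_same_with_self_map(uint64_t va)
{
    if (!is_4level_kernel_va(va))
        return false;

    /*
     * First step of the 5-level walk. For a 4-level kernel address bits
     * 56:48 are all ones, so this always selects the self-referencing
     * entry and lands back in the PML4; the check is kept for clarity.
     */
    if (pml5_index(va) != LAST_ENTRY)
        return false;

    /*
     * Second step indexes the PML4 with bits 47:39. If that is the last
     * entry again, the walk loops into the PML4 once more instead of
     * reaching a PDPT, so such addresses must stay unused.
     */
    return pml4_index(va) != LAST_ENTRY;
}

/* Small self-check of the boundary named in the mail. */
static void self_map_checks(void)
{
    assert(pml4_index(SELF_MAP_BASE) == LAST_ENTRY);
    assert(pml4_index(SELF_MAP_BASE - 1) == LAST_ENTRY - 1);
    assert(translates_same_with_self_map(SELF_MAP_BASE - 1));
    assert(!translates_same_with_self_map(SELF_MAP_BASE));
}
```

Under these assumptions everything from 0xffff800000000000 up to just below 0xffffff8000000000 resolves through the same PML4 entries in either mode, while the top 512GB would not. Note that the usual x86-64 kernel text mapping at 0xffffffff80000000 falls in that excluded range.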