On Thu, Jan 11, 2018 at 11:47:23PM +0100, Thomas Gleixner wrote:
> On Thu, 11 Jan 2018, Steven Sistare wrote:
> > On 1/11/2018 5:30 PM, Thomas Gleixner wrote:
> > > On Thu, 11 Jan 2018, Thomas Gleixner wrote:
> > >> On Thu, 11 Jan 2018, Linus Torvalds wrote:
> > >>
> > >>> On Thu, Jan 11, 2018 at 12:37 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > >>>>
> > >>>> 67a9108ed431 ("x86/efi: Build our own page table structures")
> > >>>>
> > >>>> got rid of EFI depending on real_mode_header->trampoline_pgd
> > >>>
> > >>> So I think it only got rid of by default - the codepath is still
> > >>> there, the allocation is still there, it's just that it's not actually
> > >>> used unless somebody does that "efi=old_mmap" thing.
> > >>
> > >> Yes, the trampoline_pgd is still around, but I can't figure out how it
> > >> would be used after boot. Confused, digging more.
> > >
> > > So coming back to the same commit. From the changelog:
> > >
> > >     This is caused by mapping EFI regions with RWX permissions.
> > >     There isn't much we can do to restrict the permissions for these
> > >     regions due to the way the firmware toolchains mix code and
> > >     data, but we can at least isolate these mappings so that they do
> > >     not appear in the regular kernel page tables.
> > >
> > >     In commit d2f7cbe7b26a ("x86/efi: Runtime services virtual
> > >     mapping") we started using 'trampoline_pgd' to map the EFI
> > >     regions because there was an existing identity mapping there
> > >     which we use during the SetVirtualAddressMap() call and for
> > >     broken firmware that accesses those addresses.
> > >
> > > So this very commit gets rid of the (ab)use of trampoline_pgd and allocates
> > > efi_pgd, which we made use the proper size.
> > >
> > > trampoline_pgd is since then only used to get into long mode in
> > > realmode/rm/trampoline_64.S and for reboot in machine_real_restart().
> > >
> > > The runtime services stuff does not use it in kernel versions >= 4.6
> > >
> > > Thanks,
> > >
> > > tglx
> >
> > Yes, and addressing Linus' concern about EFI_OLD_MEMMAP, those paths are
> > independent of it. When EFI_OLD_MMAP is enabled, the efi pgd is not
> > used, and the bug will not bite.
>
> We have a fix queued in tip/x86/pti which addresses a missing NX clear, but
> that's a different story.
>

Since you are talking about NX, I see this in last night's -next:

kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel paging request at fffffe0000007000
IP: 0xfffffe0000006e9d
PGD ffd6067 P4D ffd6067 PUD ffd5067 PMD ff73067 PTE 800000000fc09063
Oops: 0011 [#1] PREEMPT SMP PTI
Modules linked in:
CPU: 0 PID: 1 Comm: init Tainted: G W 4.15.0-rc7-next-20180111-yocto-standard #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:0xfffffe0000006e9d
RSP: 0018:ffffaee28000ffd0 EFLAGS: 00000006
RAX: 000000000000000c RBX: 0000000000400040 RCX: 00007f2c4186ad6a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb6a00000
RBP: 0000000000000008 R08: 000000000000037f R09: 0000000000000064
R10: 00000000078bfbfd R11: 0000000000000246 R12: 00007f2c41856a60
R13: 0000000000000000 R14: 0000000000402368 R15: 0000000000001000
FS: 0000000000000000(0000) GS:ffff95fecfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffe0000007000 CR3: 000000000d88a000 CR4: 00000000003406f0
Call Trace:
Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <90> 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
RIP: 0xfffffe0000006e9d RSP: ffffaee28000ffd0
CR2: fffffe0000007000
---[ end trace a82b8742114c1785 ]---

Is this the issue you are talking about, or is the fix triggering
the crash ?

Guenter