On Wed, Oct 14, 2015 at 2:00 PM, Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, 14 Oct, at 09:22:03AM, Andy Lutomirski wrote:
>> On Wed, Oct 14, 2015 at 6:52 AM, Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx> wrote:
>> > (Pulling in luto for low-level x86 fu)
>> >
>> > On Wed, 14 Oct, at 01:30:45PM, Paolo Bonzini wrote:
>> >> On 32-bit systems, the initial_page_table is reused by
>> >> efi_call_phys_prolog as an identity map to call
>> >> SetVirtualAddressMap. efi_call_phys_prolog takes care of
>> >> converting the current CPU's GDT to a physical address too.
>> >>
>> >> For PAE kernels the identity mapping is achieved by aliasing the
>> >> first PDPE for the kernel memory mapping into the first PDPE
>> >> of initial_page_table. This makes the EFI stub's trick "just work".
>> >>
>> >> However, for non-PAE kernels there is no guarantee that the identity
>> >> mapping in the initial_page_table extends as far as the GDT; in this
>> >> case, accesses to the GDT will cause a page fault (which quickly becomes
>> >> a triple fault). Fix this by copying the kernel mappings from
>> >> swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at
>> >> identity mapping.
>> >
>> > Oops, good catch guys. This is clearly a bug, but...
>> >
>> >> For some reason, this is only reproducible with QEMU's dynamic translation
>> >> mode, and not for example with KVM. However, even under KVM one can clearly
>> >> see that the page table is bogus:
>>
>> I haven't looked at the code, but it wouldn't surprise me if this is
>> some kind of TLB issue. With the hardware TLB (which is in use on
>> KVM), it seems quite likely that the GDT is pretty much always in the
>> TLB and, if nothing flushes global mappings, then it'll probably stick
>> around.
>
> From some quick experiments it appears that you can skate past this
> issue if you don't receive any interrupts while the bogus GDT pointer
> is loaded, or if you avoid reloading the segment registers in general.
> Which is interesting because I assumed that writing to GDTR took
> immediate effect.

Trivia for your amusement: AFAICT it's entirely permissible for the
GDTR and/or LDT descriptor to point to unmapped memory. Any attempt
to use them (segment loads, interrupts, IRET, etc.) will try to access
that memory as if the access came from CPL 0 and, if the access fails,
will generate a valid page fault with CR2 pointing into the GDT or
LDT. Xen is nuts^Wclever and actually uses this.

Of course, if your #PF vector references a GDT or LDT descriptor and
trying to load that descriptor results in a page fault, you get a
double fault.

I learned this while trying to puzzle out why v1 of my LDT
synchronization patch caused random faults on Xen.

--Andy
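
For anyone following along, here is a rough user-space model of the
non-PAE fix described in the quoted changelog ("copying the kernel
mappings from swapper_pg_dir to initial_page_table twice"). It is only
a sketch: the page directory is a plain array where a nonzero entry
means "present", PAGE_OFFSET is assumed to be the usual 0xC0000000, and
toy_map_lowmem() and toy_copy_kernel_mappings_twice() are hypothetical
helpers invented for illustration. clone_pgd_range() and the
KERNEL_PGD_* names mirror the kernel's, but this is not the patch
itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Non-PAE 32-bit x86: one page directory, 1024 entries, 4 MiB each. */
#define PTRS_PER_PGD        1024u
#define PGDIR_SHIFT         22
#define PAGE_OFFSET         0xC0000000u  /* assumption: default 3G/1G split */
#define KERNEL_PGD_BOUNDARY (PAGE_OFFSET >> PGDIR_SHIFT)         /* 768 */
#define KERNEL_PGD_PTRS     (PTRS_PER_PGD - KERNEL_PGD_BOUNDARY) /* 256 */

typedef uint32_t pgd_t;  /* toy entry: nonzero means "present" */

/* Copy `count` directory entries from src to dst. */
static void clone_pgd_range(pgd_t *dst, const pgd_t *src, unsigned count)
{
    memcpy(dst, src, count * sizeof(pgd_t));
}

/* True if the directory has a present entry covering virtual address va. */
static bool pgd_maps(const pgd_t *pgd, uint32_t va)
{
    return pgd[va >> PGDIR_SHIFT] != 0;
}

/* Hypothetical helper: map the first `mib` MiB of RAM at PAGE_OFFSET. */
static void toy_map_lowmem(pgd_t *swapper, unsigned mib)
{
    unsigned n = (mib + 3) / 4;  /* number of 4 MiB directory entries */

    assert(n <= KERNEL_PGD_PTRS);
    for (unsigned i = 0; i < n; i++)
        swapper[KERNEL_PGD_BOUNDARY + i] = ((uint32_t)i << PGDIR_SHIFT) | 1u;
}

/*
 * Hypothetical helper: copy the kernel mappings twice, once at
 * PAGE_OFFSET and once at address 0, so that every physical address
 * the kernel maps (including the GDT) is also identity-mapped.
 */
static void toy_copy_kernel_mappings_twice(pgd_t *initial, const pgd_t *swapper)
{
    unsigned low = KERNEL_PGD_PTRS < KERNEL_PGD_BOUNDARY
                 ? KERNEL_PGD_PTRS : KERNEL_PGD_BOUNDARY;

    clone_pgd_range(initial + KERNEL_PGD_BOUNDARY,
                    swapper + KERNEL_PGD_BOUNDARY, KERNEL_PGD_PTRS);
    clone_pgd_range(initial, swapper + KERNEL_PGD_BOUNDARY, low);
}
```

To use it, fill a toy swapper_pg_dir with toy_map_lowmem(), give
initial_page_table only a couple of low identity entries, and check
pgd_maps() for a GDT physical address above that range before and
after calling toy_copy_kernel_mappings_twice(). The model deliberately
ignores TLB contents, which is the part that seems to hide the problem
under KVM.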