On 21/07/2015 22:53, Boris Ostrovsky wrote: > On 07/21/2015 03:59 PM, Andy Lutomirski wrote: >> --- a/arch/x86/include/asm/mmu_context.h >> +++ b/arch/x86/include/asm/mmu_context.h >> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct >> *mm) {} >> #endif >> /* >> + * ldt_structs can be allocated, used, and freed, but they are never >> + * modified while live. >> + */ >> +struct ldt_struct { >> + int size; >> + int __pad; /* keep the descriptors naturally aligned. */ >> + struct desc_struct entries[]; >> +}; > > > > This breaks Xen which expects LDT to be page-aligned. Not sure why. > > Jan, Andrew? PV guests are not permitted to have writeable mappings to the frames making up the GDT and LDT, so it cannot make unaudited changes to loadable descriptors. In particular, for a 32bit PV guest, it is only the segment limit which protects Xen from the ring1 guest kernel. A lot of this code hasn't been touched in years, and it certainly predates me. The alignment requirement appears to come from the virtual region Xen uses to map the guests GDT and LDT. Strict alignment is required for the GDT so Xen's descriptors starting at 0xe0xx are correct, but the LDT alignment seems to be a side effect of similar codepaths. For an LDT smaller than 8192 entries, I can't see any specific reason for enforcing alignment, other than "that's the way it has always been". However, the guest would still have to relinquish write access to all frames which make up the LDT, which looks to be a bit of an issue given the snippet above. I think I have a solution, but I doubt it is going to be very popular. * Make a new paravirt hook for allocation of ldt_struct, so the paravirt backend can choose an alignment if needed * Make absolutely certain that __pad has the value 0 (so size and __pad combined don't look like a present descriptor) * Never hand selector 0x0008 to unsuspecting users. This will allow ldt_struct itself to be page aligned, and for the size field to sit across the base/limit field of what would logically be selector 0x0008 There would be some issues accessing size. To load frames as an LDT, a guest must drop all refs to the page so that its type may be changed from writeable to segdesc. After that, an update_descriptor hypercall can be used to change size, and I believe the guest may subsequently recreate read-only mappings to the frames in question (although frankly it is getting late so you will want to double check all of this). Anyhow, this looks like an issue which should be fixed up with slightly more PVOps, rather than enforcing a Xen view of the world on native Linux. ~Andrew -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html