On Tue, Jul 21, 2015 at 7:04 PM, Boris Ostrovsky
<boris.ostrovsky@xxxxxxxxxx> wrote:
>
>
> On 07/21/2015 08:49 PM, Andrew Cooper wrote:
>>
>> On 22/07/2015 01:28, Andy Lutomirski wrote:
>>>
>>> On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
>>> <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>
>>>> On 22/07/2015 01:07, Andy Lutomirski wrote:
>>>>>
>>>>> On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
>>>>> <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> On 21/07/2015 22:53, Boris Ostrovsky wrote:
>>>>>>>
>>>>>>> On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
>>>>>>>>
>>>>>>>> --- a/arch/x86/include/asm/mmu_context.h
>>>>>>>> +++ b/arch/x86/include/asm/mmu_context.h
>>>>>>>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct *mm) {}
>>>>>>>>  #endif
>>>>>>>>  /*
>>>>>>>> + * ldt_structs can be allocated, used, and freed, but they are never
>>>>>>>> + * modified while live.
>>>>>>>> + */
>>>>>>>> +struct ldt_struct {
>>>>>>>> +       int size;
>>>>>>>> +       int __pad;      /* keep the descriptors naturally aligned. */
>>>>>>>> +       struct desc_struct entries[];
>>>>>>>> +};
>>>>>>>
>>>>>>> This breaks Xen which expects LDT to be page-aligned. Not sure why.
>>>>>>>
>>>>>>> Jan, Andrew?
>>>>>>
>>>>>> PV guests are not permitted to have writeable mappings to the frames
>>>>>> making up the GDT and LDT, so it cannot make unaudited changes to
>>>>>> loadable descriptors. In particular, for a 32bit PV guest, it is only
>>>>>> the segment limit which protects Xen from the ring1 guest kernel.
>>>>>>
>>>>>> A lot of this code hasn't been touched in years, and it certainly
>>>>>> predates me. The alignment requirement appears to come from the virtual
>>>>>> region Xen uses to map the guests GDT and LDT. Strict alignment is
>>>>>> required for the GDT so Xen's descriptors starting at 0xe0xx are
>>>>>> correct, but the LDT alignment seems to be a side effect of similar
>>>>>> codepaths.
>>>>>>
>>>>>> For an LDT smaller than 8192 entries, I can't see any specific reason
>>>>>> for enforcing alignment, other than "that's the way it has always
>>>>>> been".
>>>>>>
>>>>>> However, the guest would still have to relinquish write access to all
>>>>>> frames which make up the LDT, which looks to be a bit of an issue given
>>>>>> the snippet above.
>>>>>
>>>>> Does the LDT itself need to be aligned or just the address passed to
>>>>> paravirt_alloc_ldt?
>>>>
>>>> The address which Xen receives needs to be aligned.
>>>>
>>>> It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
>>>> it is passed is page aligned, and passes it straight through.
>>>
>>> xen_alloc_ldt is just fiddling with protection though, I think. Isn't
>>> it xen_set_ldt that's the meat? We could easily pass xen_alloc_ldt a
>>> pointer to the ldt_struct.
>>
>> So it is. It is the linear_addr in xen_set_ldt() which Xen currently
>> audits to be page aligned.
>>
>>>>>> This will allow ldt_struct itself to be page aligned, and for the size
>>>>>> field to sit across the base/limit field of what would logically be
>>>>>> selector 0x0008 There would be some issues accessing size. To load
>>>>>> frames as an LDT, a guest must drop all refs to the page so that its
>>>>>> type may be changed from writeable to segdesc. After that, an
>>>>>> update_descriptor hypercall can be used to change size, and I believe
>>>>>> the guest may subsequently recreate read-only mappings to the frames in
>>>>>> question (although frankly it is getting late so you will want to double
>>>>>> check all of this).
>>>>>>
>>>>>> Anyhow, this looks like an issue which should be fixed up with slightly
>>>>>> more PVOps, rather than enforcing a Xen view of the world on native
>>>>>> Linux.
>>>>>>
>>>>> I could presumably make the allocation the other way around so the
>>>>> size is at the end. I could even use two separate allocations if
>>>>> needed.
>
>
> Why not wrap mm_context_t's ldt and size into a struct (just like ldt_struct
> but without __pad) and have a single allocation of ldt?
>
> I.e.
>
> struct ldt_struct {
>         int size;
>         struct desc_struct *entries;
> }
>
> --- a/arch/x86/include/asm/mmu.h
> +++ b/arch/x86/include/asm/mmu.h
> @@ -9,8 +9,7 @@
>   * we put the segment information here.
>   */
>  typedef struct {
> -       void *ldt;
> -       int size;
> +       struct ldt_struct ldt;
>  #ifdef CONFIG_X86_64
>         /* True if mm supports a task running in 32 bit compatibility mode. */

I want the atomic read of both of them.  The current code makes
interesting assumptions about ordering that may or may not be correct
but are certainly not obviously correct.

--Andy
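
For anyone following the thread, the "atomic read of both of them" point
can be sketched in userspace C11. This is only an illustration under
stated assumptions, not the kernel code: struct desc stands in for
struct desc_struct, and current_ldt, ldt_build(), ldt_publish() and
ldt_read() are hypothetical names. The idea is that size and entries
live in one object that is never modified after it is published, so a
single pointer load gives a reader a matching pair. With two separate
fields embedded in mm_context_t, a reader could see a new size next to
an old entries pointer unless the ordering is reasoned out by hand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the kernel's 8-byte descriptor type (assumption). */
struct desc {
	unsigned long long raw;
};

/* Never modified once published: size and entries travel together. */
struct ldt_snapshot {
	int size;
	int pad;		/* keeps entries[] naturally aligned */
	struct desc entries[];
};

/* Hypothetical slot holding the live table; NULL means "no LDT". */
static _Atomic(struct ldt_snapshot *) current_ldt;

/* Allocate a snapshot and fill it in completely before anyone sees it. */
static struct ldt_snapshot *ldt_build(const struct desc *src, int size)
{
	struct ldt_snapshot *l;

	assert(size >= 0);
	l = malloc(sizeof(*l) + (size_t)size * sizeof(l->entries[0]));
	if (!l)
		return NULL;
	l->size = size;
	l->pad = 0;
	if (size > 0)
		memcpy(l->entries, src, (size_t)size * sizeof(l->entries[0]));
	return l;
}

/*
 * Writer: swap in the new snapshot and hand back the old one.  The
 * caller may free the old one only once no reader can still hold it
 * (how to know that is outside the scope of this sketch).
 */
static struct ldt_snapshot *ldt_publish(struct ldt_snapshot *new_ldt)
{
	return atomic_exchange_explicit(&current_ldt, new_ldt,
					memory_order_acq_rel);
}

/* Reader: one acquire load yields a consistent size/entries pair. */
static const struct ldt_snapshot *ldt_read(void)
{
	return atomic_load_explicit(&current_ldt, memory_order_acquire);
}
```

A reader calls ldt_read() once and uses only the returned pointer for
both ->size and ->entries. Re-reading the slot between the two accesses
would bring back the mismatch the single allocation is meant to avoid.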