On 29/06/2016 19:25, Quentin Casasnovas wrote:
> On Fri, Jun 24, 2016 at 03:10:03PM +0200, Paolo Bonzini wrote:
>> On 24/06/2016 15:04, Quentin Casasnovas wrote:
>>> On Thu, Jun 23, 2016 at 06:03:01PM +0200, Paolo Bonzini wrote:
>>>>
>>>>
>>>> On 18/06/2016 11:01, Quentin Casasnovas wrote:
>>>>> Cross-checking the KVM/VMX VMREAD emulation code with the Intel Software
>>>>> Developer Manual Volume 3C - "VMREAD - Read Field from Virtual-Machine
>>>>> Control Structure", I found that we're enforcing that the destination
>>>>> operand is NOT located in a read-only data segment or any code segment when
>>>>> the L1 is in long mode - BUT that check should only happen when it is in
>>>>> protected mode.
>>>>>
>>>>> Shuffling the code a bit to make our emulation follow the specification
>>>>> allows me to boot a Xen dom0 in a nested KVM and start HVM L2 guests
>>>>> without problems.
>>>>
>>>> That's great, and I'm applying the patch, but it's also pretty weird. :)
>>>> Do you have a pointer to Xen source code that does a VMREAD into a
>>>> read-only data segment or a code segment?
>>>
>>> It is indeed pretty weird.  Looking at the Xen stack trace, it looks like
>>> the vmread is writing to an on-stack buffer, and surely it must be writable
>>> so I wonder if Xen might not be using an executable stack for some reason?
>>> That would be a bit scary so I'm surely missing something.
>>>
>>> Is there an easy way to know from my KVM host the different segment
>>> permissions set up by the guest?
>>
>> Remove your patch, call dump_vmcs() where the #GP is injected, and
>> you'll find the VMCS (including segment permissions, but not the
>> instruction info field---you probably should add it) in dmesg.
>
> Thanks for the heads up :)
>
> I've had a bit more time to spend on this this morning and attached is the
> VMCS dump.
> I've looked at the vmcs_instruction_info and it appears the
> segment referenced is SS (which is in sync with the backtrace where the
> instruction causing the vmexit is "vmread %rbp, %rbp"), and it has awkward
> attributes:
>
>   SS:   sel=0x0000, attr=0x1c000, limit=0xffffffff, base=0x0000000000000000
>
> The lower 14 bits are all zero so KVM VMX emulation was injecting the GP(0)
> because we were about to write to a read-only segment.  At least the stack
> isn't executable from what I can tell!

Yes, that was my reading of the VMCS dump too.  The weird attributes
come from the (non)handling of selectors in 64-bit mode.

Paolo

> Attached is the full VMCS dump where I've added a printk() to show the
> 'type' (all zeroes) and vmcs_instruction_info in case my above analysis is
> complete nonsense.
>
> Quentin
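
For anyone who wants to check the attr=0x1c000 value from the dump by hand, here is a minimal C sketch that decodes a VMCS segment access-rights field and applies a protected-mode style "read-only data segment or any code segment" test to it. The bit positions follow the guest segment access-rights layout described in the Intel SDM; the names struct seg_ar, decode_seg_ar() and dest_is_unwritable() are hypothetical and are not taken from the KVM source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of a VMCS guest segment access-rights field.
 * Hypothetical helper type for illustration only. */
struct seg_ar {
	unsigned type;     /* bits 3:0   segment type */
	bool s;            /* bit 4      1 = code/data, 0 = system */
	unsigned dpl;      /* bits 6:5   descriptor privilege level */
	bool present;      /* bit 7 */
	bool l;            /* bit 13     64-bit code segment */
	bool db;           /* bit 14     default operation size */
	bool g;            /* bit 15     granularity */
	bool unusable;     /* bit 16     segment unusable */
};

static struct seg_ar decode_seg_ar(uint32_t ar)
{
	struct seg_ar d;

	d.type     = ar & 0xf;
	d.s        = (ar >> 4) & 1;
	d.dpl      = (ar >> 5) & 3;
	d.present  = (ar >> 7) & 1;
	d.l        = (ar >> 13) & 1;
	d.db       = (ar >> 14) & 1;
	d.g        = (ar >> 15) & 1;
	d.unusable = (ar >> 16) & 1;
	return d;
}

/* True if a write through this segment would be refused by a
 * protected-mode check: any code segment (type bit 3 set), or a
 * data segment whose writable bit (type bit 1) is clear. */
static bool dest_is_unwritable(uint32_t ar)
{
	unsigned type = ar & 0xf;
	bool is_code = (type & 0x8) != 0;
	bool writable = (type & 0x2) != 0;

	return is_code || !writable;
}
```

With attr=0x1c000 only bits 14, 15 and 16 are set, so the decoded type is 0 and the segment is marked unusable; dest_is_unwritable(0x1c000) returns true, which matches the read-only classification discussed in the thread. An ordinary read/write data segment such as attr=0xc093 (type 3) does not trip the check.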