Re: [PATCH] KVM: nVMX: Fix L2 guest hang if shadow page tables on EPT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 22/03/2017 14:44, Wanpeng Li wrote:
> 2017-03-22 20:00 GMT+08:00 Ladi Prosek <lprosek@xxxxxxxxxx>:
>> On Sat, Mar 18, 2017 at 7:37 AM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>>> 2017-03-18 1:28 GMT+08:00 Ladi Prosek <lprosek@xxxxxxxxxx>:
>>>> On Fri, Mar 17, 2017 at 3:41 PM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>>>>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>>>
>>>>> The L2 guest hang if shadow page tables on EPT, the trace on L1 shows that
>>>>> L2 kvm_exit reason EXCEPTION_NMI and page fault repeatedly:
>>>>>
>>>>> qemu-system-x86-2821  [003] d..2    45.848814: kvm_entry: vcpu 0
>>>>> qemu-system-x86-2821  [003] ...1    45.848827: kvm_exit: reason EXCEPTION_NMI rip 0xe05b info fe05b 80000b0e
>>>>> qemu-system-x86-2821  [003] ...1    45.848827: kvm_page_fault: address fe05b error_code 14
>>>>>
>>>>> Commit 7ca29de21362 (KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT)
>>>>> prevents to load L2's PDPTRs according to dereferencing L2's CR3 since it is
>>>>> uninitialized in real mode. Hyper-V L1 will emulate L2 real mode with PAE
>>>>> paging and EPT enabled. However, there is a progress to switch from Legacy
>>>>> mode's such-mode Protected mode to Long mode during system boot, the check
>>>>> in nested_vmx_load_cr3() will prevent to load PDPTRs if it is still in
>>>>> Protected mode w/ PAE paging and nested EPT/shadow page tables on EPT. Actually
>>>>> the original commit should just intended to prevent to dereference L2's CR3
>>>>> if the L1 hypervisor emulates L2's real mode through vm8086.
>>>>>
>>>>> This patch fixes it by allowing load PDPTRs if PAE paing, EPT enabled and
>>>>> !vm86_active.
>>>>>
>>>>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>>>>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>>>>> Cc: Ladi Prosek <lprosek@xxxxxxxxxx>
>>>>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>>> ---
>>>>>  arch/x86/kvm/vmx.c | 4 ++--
>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>>> index c664365..2b2a05f 100644
>>>>> --- a/arch/x86/kvm/vmx.c
>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>> @@ -9933,7 +9933,7 @@ static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
>>>>>  static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
>>>>>                                u32 *entry_failure_code)
>>>>>  {
>>>>> -       if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
>>>>> +       if (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu)) {
>>>>>                 if (!nested_cr3_valid(vcpu, cr3)) {
>>>>>                         *entry_failure_code = ENTRY_FAIL_DEFAULT;
>>>>>                         return 1;
>>>>> @@ -9944,7 +9944,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
>>>>>                  * must not be dereferenced.
>>>>>                  */
>>>>>                 if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
>>>>> -                   !nested_ept) {
>>>>> +                   !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {
>>>>
>>>> This change breaks Hyper-V on KVM. L2 hangs on start-up, same symptoms
>>>> as before 7ca29de21362.
>>>
>>> Hmm, I miss the function pdptrs_changed() will also dereference CR3.
>>> How about something like this:
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index c664365..d7ebf03 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -9933,7 +9933,9 @@ static bool nested_cr3_valid(struct kvm_vcpu
>>> *vcpu, unsigned long val)
>>>  static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long
>>> cr3, bool nested_ept,
>>>                     u32 *entry_failure_code)
>>>  {
>>> -    if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
>>> +    if (cr3 != kvm_read_cr3(vcpu) ||
>>> +        (!(nested_ept && to_vmx(vcpu)->rmode.vm86_active) &&
>>> +        pdptrs_changed(vcpu))) {
>>>          if (!nested_cr3_valid(vcpu, cr3)) {
>>>              *entry_failure_code = ENTRY_FAIL_DEFAULT;
>>>              return 1;
>>> @@ -9944,7 +9946,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu
>>> *vcpu, unsigned long cr3, bool ne
>>>           * must not be dereferenced.
>>>           */
>>>          if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
>>> -            !nested_ept) {
>>> +            !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {
>>>              if (!load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)) {
>>>                  *entry_failure_code = ENTRY_FAIL_PDPTE;
>>>                  return 1;
>>
>> Still the same, Hyper-V is broken. The problem is not in real vs.
>> protected mode. The way nested_ept_enabled is computed is incorrect.
>>
>> I can run both Hyper-V and KVM with EPT = 0 in L1 with this patch. Can
>> you please give it a try?
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 98e82ee..9145c94 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -10121,7 +10121,7 @@ static int prepare_vmcs02(struct kvm_vcpu
>> *vcpu, struct vmcs12 *vmcs12,
>>                                 vmcs12->guest_intr_status);
>>                 }
>>
>> -               nested_ept_enabled = (exec_control &
>> SECONDARY_EXEC_ENABLE_EPT) != 0;
>> +               nested_ept_enabled =
>> (vmcs12->secondary_vm_exec_control & SECONDARY_EXEC_ENABLE_EPT) != 0;
>>
>>                 /*
>>                  * Write an illegal value to APIC_ACCESS_ADDR. Later,
> 
> You are right, it works. Please send out a formal patch and add the
> kvm-unit-tests as Paolo mentioned.

I do believe it would be more fair if _you_ send out the kvm-unit-tests
patch.

Paolo



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux