On Fri, May 12, 2017 at 1:45 AM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
>
> On 11/05/2017 21:51, David Matlack wrote:
>>> No, I've tried the tests on upstream Linux with eptad=0 (so that EPT A/D
>>> is not used by KVM on the host) and they also hang with an infinite stream
>>> of EPT violations.
>>
>> I think the failures are caused by this code in handle_ept_violation,
>> which clears the ACC_WRITE bit of the exit qualification before
>> handling the fault, when EPT A/D is disabled:
>>
>>         if (is_guest_mode(vcpu)
>>             && !(exit_qualification & EPT_VIOLATION_GVA_TRANSLATED)) {
>>                 /*
>>                  * Fix up exit_qualification according to whether guest
>>                  * page table accesses are reads or writes.
>>                  */
>>                 u64 eptp = nested_ept_get_cr3(vcpu);
>>                 if (!(eptp & VMX_EPT_AD_ENABLE_BIT))
>>                         exit_qualification &= ~EPT_VIOLATION_ACC_WRITE;
>>         }
>
> I thought so as well; that's why my hypervisor patch moved
> trace_kvm_page_fault before munging exit_qualification. Then I searched
> for 0x82 or 0x8a in the traces, didn't find it, and got terribly confused.
>
> What actually happens doesn't match what the SDM says. You correctly cite this:
>
>> Per 28.2.3.2 EPT Violations: "Writes by the logical processor to guest
>> paging structures to update accessed and dirty flags are considered to
>> be data writes."
>
> though here you mean EPT_VIOLATION_GVA_TRANSLATED=0, EPT_VIOLATION_ACC_WRITE=1:
>
>> In other words, it's valid for
>> EPT_VIOLATION_GVA_TRANSLATED and EPT_VIOLATION_ACC_WRITE to both be
>> set in the exit qual when EPT A/D is disabled.
>
> What really happens is that the processor first reads the page tables,
> then translates the resulting GPA, and then finally does _another_ EPT
> translation for the page tables, this time with EPT_VIOLATION_ACC_READ=1
> and EPT_VIOLATION_ACC_WRITE=1:
>
> qemu-kvm-3683 [042] 2364.196479: kvm_entry: vcpu 0
> qemu-kvm-3683 [042] 2364.196479: kvm_exit: reason EPT_VIOLATION rip 0x40519e info 8b 0
> qemu-kvm-3683 [042] 2364.196479: kvm_nested_vmexit: rip: 0x000000000040519e reason: EPT_VIOLATION ext_inf1: 0x000000000000008b ext_inf2: 0x0000000000000000 ext_int: 0x00000000 ext_int_err: 0x00000000
> qemu-kvm-3683 [042] 2364.196480: kvm_page_fault: address 479000 error_code 8b
> qemu-kvm-3683 [042] 2364.196480: kvm_mmu_pagetable_walk: addr 479000 pferr 7 P|W|U
> qemu-kvm-3683 [042] 2364.196480: kvm_mmu_paging_element: pte 472007 level 4
> qemu-kvm-3683 [042] 2364.196480: kvm_mmu_paging_element: pte 473007 level 3
> qemu-kvm-3683 [042] 2364.196480: kvm_mmu_paging_element: pte 67c007 level 2
> qemu-kvm-3683 [042] 2364.196480: kvm_mmu_paging_element: pte 479001 level 1
>
> (This trace comes from vanilla Linux 4.11.0). This matches your kvm-unit-tests
> patch ("x86: fix ept_access_test_paddr exit qualifications"). Aargh.
>
> So Peter is right; the only way to get this behavior is to run with
> EPT A/D bits disabled when L1 disables them. I'll wait for you guys to send
> a patch for this---we have plenty of time to include it in 4.12.0. In the
> meanwhile Radim shall drop patch 2 from the KVM series.

Sounds good, we'll get you that patch. When does 4.12.0 close?

>
> Thanks,
>
> Paolo