On 04/08/2017 02:30, Brijesh Singh wrote:
>
>
> On 8/2/17 5:42 AM, Paolo Bonzini wrote:
>> On 01/08/2017 15:36, Brijesh Singh wrote:
>>>> The flow is:
>>>>
>>>> hardware walks page table; L2 page table points to read-only memory
>>>>  -> pf_interception (code =
>>>>   -> kvm_handle_page_fault (need_unprotect = false)
>>>>    -> kvm_mmu_page_fault
>>>>     -> paging64_page_fault (for example)
>>>>        -> try_async_pf
>>>>           map_writable set to false
>>>>        -> paging64_fetch(write_fault = true, map_writable = false,
>>>>                          prefault = false)
>>>>           -> mmu_set_spte(speculative = false, host_writable = false,
>>>>                           write_fault = true)
>>>>              -> set_spte
>>>>                 mmu_need_write_protect returns true
>>>>                 return true
>>>>              write_fault == true -> set emulate = true
>>>>              return true
>>>>           return true
>>>>        return true
>>>>     emulate
>>>>
>>>> Without this patch, emulation would have called
>>>>
>>>> ..._gva_to_gpa_nested
>>>>  -> translate_nested_gpa
>>>>   -> paging64_gva_to_gpa
>>>>    -> paging64_walk_addr
>>>>     -> paging64_walk_addr_generic
>>>>        set fault (nested_page_fault=true)
>>>>
>>>> and then:
>>>>
>>>> kvm_propagate_fault
>>>>  -> nested_svm_inject_npf_exit
>>>>
>>> Maybe then the safer thing would be to qualify the new error_code check
>>> with !mmu_is_nested(vcpu) or something like that, so that it would run
>>> for the L1 guest and not the L2 guest. I believe that would restrict it
>>> enough to avoid hitting this case. Are you okay with this change?
>> Or check "vcpu->arch.mmu.direct_map"? That would be true when not using
>> shadow pages.
>
> Yes, that can be used.

Are you going to send a patch for this?

Paolo

>>> IIRC, the main place where this check was valuable was when the L1 guest
>>> had a fault (when coming out of the L2 guest) and emulation was not needed.
>> How do I measure the effect? I tried counting the number of emulations,
>> and any difference from the patch was lost in noise.
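[Editor's note: the qualification discussed above can be sketched as follows. This is a toy model, not KVM code: the struct and helper names below are invented for illustration; only the `direct_map` field mirrors the real `kvm_mmu.direct_map` flag that Paolo suggests checking.]

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of the proposed check: only trust the
 * error-code fast path (skipping the guest page-table walk) when the
 * vCPU's MMU is in direct-map mode, i.e. TDP/NPT translates guest
 * physical addresses directly and KVM is not shadowing page tables. */
struct mmu_model {
    bool direct_map;    /* true when not using shadow pages */
};

static bool can_use_gpa_fast_path(const struct mmu_model *mmu)
{
    /* With shadow paging (direct_map == false), the GPA recorded by
     * hardware may belong to a page-table page rather than to the
     * faulting instruction, so the fast path must be skipped. */
    return mmu->direct_map;
}
```

This captures why `vcpu->arch.mmu.direct_map` is the tidier condition compared to `!mmu_is_nested(vcpu)`: it gates the fast path on "not using shadow pages" directly, which is the property the fast path actually depends on.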
>
> I think this patch is necessary for functional reasons (not just
> perf), because we added the other patch to look at the GPA and stop
> walking the guest page tables on a NPF.
>
> The issue, I think, was that hardware had taken an NPF because the page
> table was marked RO, and it saved the GPA in the VMCB. KVM then went on
> to emulate the instruction, and it saw that a GPA was available. But
> that GPA was not the GPA of the instruction it was emulating, since it
> was the GPA of the tablewalk page that had the fault. It was debugged
> at the time, and we realized that emulating the instruction was
> unnecessary, so we added this new code, which fixed the functional
> issue and helps perf.
>
> I don't have any data on how much it helps perf; as I recall, it was most
> effective when the L1 guest page tables and the L2 nested page tables were
> exactly the same. In that case, it avoided emulations for code that L1
> executes, which I think could be as much as one emulation per 4kb code page.
>
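[Editor's note: the tablewalk-GPA pitfall described above can be modeled in a few lines. All names here are hypothetical; the real logic lives in KVM's NPF/emulation path, and only the underlying idea comes from the discussion: a GPA saved on an NPF taken during the hardware page-table walk belongs to a page-table page, not to the instruction being emulated, so it must not be reused.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the hazard: on an NPF, hardware saves a GPA in the
 * VMCB, but if the fault was taken while walking the guest page
 * tables (e.g. a read-only PTE page), that GPA identifies the
 * page-table page, not the data the faulting instruction accesses. */
struct npf_info {
    uint64_t gpa;          /* GPA hardware saved in the VMCB */
    bool from_tablewalk;   /* fault hit a guest page-table page */
};

/* Return true only when the saved GPA really is the instruction's
 * GPA and may short-circuit the guest page-table walk on emulation. */
static bool saved_gpa_usable(const struct npf_info *npf)
{
    return !npf->from_tablewalk;
}
```

The fix described in the mail is essentially the `from_tablewalk == true` branch: instead of emulating with a bogus GPA, KVM recognizes the case and avoids the unnecessary emulation altogether.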