On Tue, Jul 17, 2018 at 05:15:28PM +0300, Denis Plotnikov wrote: > The patch enables userspace to be notified about failed memeory accesses. > > This can be employed by userspace, say QEMU, to build the memory access > tracker while userfaultfd doesn't support detection of protected memeory > writes. > > In case of QEMU, it would allow to make a background snapshot mode. > > Signed-off-by: Denis Plotnikov <dplotnikov@xxxxxxxxxxxxx> > --- > arch/x86/kvm/vmx.c | 17 ++++++++++++++++- > include/uapi/linux/kvm.h | 6 ++++++ > 2 files changed, 22 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index 1689f433f3a0..9c9b840537de 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -7269,6 +7269,7 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) > unsigned long exit_qualification; > gpa_t gpa; > u64 error_code; > + int r; > > exit_qualification = vmcs_readl(EXIT_QUALIFICATION); > > @@ -7305,7 +7306,21 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) > PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK; > > vcpu->arch.exit_qualification = exit_qualification; > - return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0); > + > + r = kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0); > + > + if (r == -EFAULT) { > + unsigned long hva; > + > + hva = kvm_vcpu_gfn_to_hva(vcpu, gpa >> PAGE_SHIFT); > + > + vcpu->run->exit_reason = KVM_EXIT_FAIL_MEM_ACCESS; > + vcpu->run->hw.hardware_exit_reason = EXIT_REASON_EPT_VIOLATION; > + vcpu->run->fail_mem_access.hva = hva | (gpa & (PAGE_SIZE-1)); > + r = 0; > + } > + > + return r; What I feel like is that we should do this inside kvm_mmu_page_fault() or even deeper inside otherwise this will only work for EPT but not the other use cases as I mentioned in my reply in your QEMU series (e.g., no paging, or shadow pagings). Another thing that I am not sure is that whether we should capture all the -EFAULT into FAIL_MEM_ACCESS event. AFAIU what we really care about is that when kvm vcpus try to write read-only pages setup by mprotect() before hand. So not sure we should only trap this at the specific point where we found that we're writting to a read-only page. IIUC that's somewhere inside hva_to_pfn() and when hva_to_pfn_slow() failed into: ... } else { if (async && vma_is_valid(vma, write_fault)) *async = true; pfn = KVM_PFN_ERR_FAULT; <===== the -EFAULT should be translated from this } ... So maybe we should only trap here instead of all the -EFAULT errors? And this is natually servicing all paging modes, and possibly AMD as well. Thanks, > } > > static int handle_ept_misconfig(struct kvm_vcpu *vcpu) > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > index b6270a3b38e9..6dee323d87a7 100644 > --- a/include/uapi/linux/kvm.h > +++ b/include/uapi/linux/kvm.h > @@ -235,6 +235,7 @@ struct kvm_hyperv_exit { > #define KVM_EXIT_S390_STSI 25 > #define KVM_EXIT_IOAPIC_EOI 26 > #define KVM_EXIT_HYPERV 27 > +#define KVM_EXIT_FAIL_MEM_ACCESS 28 > > /* For KVM_EXIT_INTERNAL_ERROR */ > /* Emulate instruction failed. */ > @@ -392,6 +393,11 @@ struct kvm_run { > } eoi; > /* KVM_EXIT_HYPERV */ > struct kvm_hyperv_exit hyperv; > + > + /* KVM_EXIT_FAIL_MEM_ACCESS */ > + struct { > + __u64 hva; > + } fail_mem_access; > /* Fix the size of the union. */ > char padding[256]; > }; > -- > 2.17.0 > -- Peter Xu