On 09/02/2013 05:49 PM, Gleb Natapov wrote: > On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote: >> On 09/01/2013 05:17 PM, Gleb Natapov wrote: >>> On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote: >>>> Page tables in a read-only memory slot will currently cause a triple >>>> fault because the page walker uses gfn_to_hva and it fails on such a slot. >>>> >>>> OVMF uses such a page table; however, real hardware seems to be fine with >>>> that as long as the accessed/dirty bits are set. Save whether the slot >>>> is readonly, and later check it when updating the accessed and dirty bits. >>>> >>> The fix looks OK to me, but some comment below. >>> >>>> Cc: stable@xxxxxxxxxxxxxxx >>>> Cc: gleb@xxxxxxxxxx >>>> Cc: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxxxxxxxxx> >>>> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> >>>> --- >>>> CCing to stable@ since the regression was introduced with >>>> support for readonly memory slots. >>>> >>>> arch/x86/kvm/paging_tmpl.h | 7 ++++++- >>>> include/linux/kvm_host.h | 1 + >>>> virt/kvm/kvm_main.c | 14 +++++++++----- >>>> 3 files changed, 16 insertions(+), 6 deletions(-) >>>> >>>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h >>>> index 0433301..dadc5c0 100644 >>>> --- a/arch/x86/kvm/paging_tmpl.h >>>> +++ b/arch/x86/kvm/paging_tmpl.h >>>> @@ -99,6 +99,7 @@ struct guest_walker { >>>> pt_element_t prefetch_ptes[PTE_PREFETCH_NUM]; >>>> gpa_t pte_gpa[PT_MAX_FULL_LEVELS]; >>>> pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS]; >>>> + bool pte_writable[PT_MAX_FULL_LEVELS]; >>>> unsigned pt_access; >>>> unsigned pte_access; >>>> gfn_t gfn; >>>> @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu, >>>> if (pte == orig_pte) >>>> continue; >>>> >>>> + if (unlikely(!walker->pte_writable[level - 1])) >>>> + return -EACCES; >>>> + >>>> ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, orig_pte, pte); >>>> if (ret) >>>> return ret; >>>> @@ -309,7 +313,8 @@ retry_walk: >>>> goto error; >>>> real_gfn = gpa_to_gfn(real_gfn); >>>> >>>> - host_addr = gfn_to_hva(vcpu->kvm, real_gfn); >>>> + host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn, >>>> + &walker->pte_writable[walker->level - 1]); >>> The use of gfn_to_hva_read is misleading. The code can still write into >>> gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva() >>> to gfn_to_hva_write(). >> >> Yes. I agreed. >> >>> >>> This makes me think are there other places where gfn_to_hva() was >>> used, but gfn_to_hva_prot() should have been? >>> - kvm_host_page_size() looks incorrect. We never use huge page to map >>> read only memory slots currently. >> >> It only checks whether gfn have been mapped, I think we can use >> gfn_to_hva_read() instead, the real permission will be checked when we translate >> the gfn to pfn. >> > Yes, all the cases I listed should be changed to use function that looks > at both regular and RO slots. > >>> - kvm_handle_bad_page() also looks incorrect and may cause incorrect >>> address to be reported to userspace. >> >> I have no idea on this point. kvm_handle_bad_page() is called when it failed to >> translate the target gfn to pfn, then the emulator can detect the error on target gfn >> properly. no? Or i misunderstood your meaning? >> > I am talking about the following code: > > if (pfn == KVM_PFN_ERR_HWPOISON) { > kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current); > return 0; > } > > pfn will be KVM_PFN_ERR_HWPOISON gfn is backed by faulty memory, we need > to report the liner address of the faulty memory to a userspace here, > but if gfn is in a RO slot gfn_to_hva() will not return correct address > here. Got it, thanks for your explanation. BTW, if you and Paolo are busy on other things, i am happy to fix these issues. :) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html