On 09/01/2013 05:17 PM, Gleb Natapov wrote: > On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote: >> Page tables in a read-only memory slot will currently cause a triple >> fault because the page walker uses gfn_to_hva and it fails on such a slot. >> >> OVMF uses such a page table; however, real hardware seems to be fine with >> that as long as the accessed/dirty bits are set. Save whether the slot >> is readonly, and later check it when updating the accessed and dirty bits. >> > The fix looks OK to me, but some comment below. > >> Cc: stable@xxxxxxxxxxxxxxx >> Cc: gleb@xxxxxxxxxx >> Cc: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxxxxxxxxx> >> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> >> --- >> CCing to stable@ since the regression was introduced with >> support for readonly memory slots. >> >> arch/x86/kvm/paging_tmpl.h | 7 ++++++- >> include/linux/kvm_host.h | 1 + >> virt/kvm/kvm_main.c | 14 +++++++++----- >> 3 files changed, 16 insertions(+), 6 deletions(-) >> >> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h >> index 0433301..dadc5c0 100644 >> --- a/arch/x86/kvm/paging_tmpl.h >> +++ b/arch/x86/kvm/paging_tmpl.h >> @@ -99,6 +99,7 @@ struct guest_walker { >> pt_element_t prefetch_ptes[PTE_PREFETCH_NUM]; >> gpa_t pte_gpa[PT_MAX_FULL_LEVELS]; >> pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS]; >> + bool pte_writable[PT_MAX_FULL_LEVELS]; >> unsigned pt_access; >> unsigned pte_access; >> gfn_t gfn; >> @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu, >> if (pte == orig_pte) >> continue; >> >> + if (unlikely(!walker->pte_writable[level - 1])) >> + return -EACCES; >> + >> ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, orig_pte, pte); >> if (ret) >> return ret; >> @@ -309,7 +313,8 @@ retry_walk: >> goto error; >> real_gfn = gpa_to_gfn(real_gfn); >> >> - host_addr = gfn_to_hva(vcpu->kvm, real_gfn); >> + host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn, >> + &walker->pte_writable[walker->level - 1]); > The use of gfn_to_hva_read is misleading. The code can still write into > gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva() > to gfn_to_hva_write(). Yes. I agreed. > > This makes me think are there other places where gfn_to_hva() was > used, but gfn_to_hva_prot() should have been? > - kvm_host_page_size() looks incorrect. We never use huge page to map > read only memory slots currently. It only checks whether gfn have been mapped, I think we can use gfn_to_hva_read() instead, the real permission will be checked when we translate the gfn to pfn. > - kvm_handle_bad_page() also looks incorrect and may cause incorrect > address to be reported to userspace. I have no idea on this point. kvm_handle_bad_page() is called when it failed to translate the target gfn to pfn, then the emulator can detect the error on target gfn properly. no? Or i misunderstood your meaning? > - kvm_setup_async_pf() also incorrect. Makes all page fault on read > only slot to be sync. > - kvm_vm_fault() one looks OK since function assumes write only slots, > but it is obsolete and should be deleted anyway. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html