On 2012-08-21 05:02, Xiao Guangrong wrote: > In current code, if we map a readonly memory space from host to guest > and the page is not currently mapped in the host, we will get a fault > pfn and async is not allowed, then the vm will crash > > We introduce readonly memory region to map ROM/ROMD to the guest, read access > is happy for readonly memslot, write access on readonly memslot will cause > KVM_EXIT_MMIO exit > > Signed-off-by: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxxxxxxxxx> > --- > Documentation/virtual/kvm/api.txt | 10 +++- > arch/x86/include/asm/kvm.h | 1 + > arch/x86/kvm/mmu.c | 9 ++++ > arch/x86/kvm/x86.c | 1 + > include/linux/kvm.h | 6 ++- > include/linux/kvm_host.h | 7 +-- > virt/kvm/kvm_main.c | 96 ++++++++++++++++++++++++++++++------- > 7 files changed, 102 insertions(+), 28 deletions(-) > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt > index bf33aaa..b91bfd4 100644 > --- a/Documentation/virtual/kvm/api.txt > +++ b/Documentation/virtual/kvm/api.txt > @@ -857,7 +857,8 @@ struct kvm_userspace_memory_region { > }; > > /* for kvm_memory_region::flags */ > -#define KVM_MEM_LOG_DIRTY_PAGES 1UL > +#define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0) > +#define KVM_MEM_READONLY (1UL << 1) > > This ioctl allows the user to create or modify a guest physical memory > slot. When changing an existing slot, it may be moved in the guest > @@ -873,9 +874,12 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr > be identical. This allows large pages in the guest to be backed by large > pages in the host. > > -The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which > +The flags field supports two flag, KVM_MEM_LOG_DIRTY_PAGES, which > instructs kvm to keep track of writes to memory within the slot. See > -the KVM_GET_DIRTY_LOG ioctl. > +the KVM_GET_DIRTY_LOG ioctl. Another flag is KVM_MEM_READONLY when the > +KVM_CAP_READONLY_MEM capability, it indicates the guest memory is read-only, > +that means, guest is only allowed to read it. The last sentence requires some refactoring. :) How about: "The KVM_CAP_READONLY_MEM capability indicates the availability of the KVM_MEM_READONLY flag. When this flag is set for a memory region, KVM only allows read accesses." > Writes will be posted to > +userspace as KVM_EXIT_MMIO exits. > > When the KVM_CAP_SYNC_MMU capability, changes in the backing of the memory > region are automatically reflected into the guest. For example, an mmap() > diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h > index 246617e..521bf25 100644 > --- a/arch/x86/include/asm/kvm.h > +++ b/arch/x86/include/asm/kvm.h > @@ -25,6 +25,7 @@ > #define __KVM_HAVE_DEBUGREGS > #define __KVM_HAVE_XSAVE > #define __KVM_HAVE_XCRS > +#define __KVM_HAVE_READONLY_MEM > > /* Architectural interrupt line count. */ > #define KVM_NR_INTERRUPTS 256 > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index 5548971..8e312a2 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -2647,6 +2647,15 @@ static void kvm_send_hwpoison_signal(unsigned long address, struct task_struct * > > static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn) > { > + /* > + * Do not cache the mmio info caused by writing the readonly gfn > + * into the spte otherwise read access on readonly gfn also can > + * caused mmio page fault and treat it as mmio access. > + * Return 1 to tell kvm to emulate it. > + */ > + if (pfn == KVM_PFN_ERR_RO_FAULT) > + return 1; > + > if (pfn == KVM_PFN_ERR_HWPOISON) { > kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current); > return 0; > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 704680d..42bbf41 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -2175,6 +2175,7 @@ int kvm_dev_ioctl_check_extension(long ext) > case KVM_CAP_GET_TSC_KHZ: > case KVM_CAP_PCI_2_3: > case KVM_CAP_KVMCLOCK_CTRL: > + case KVM_CAP_READONLY_MEM: > r = 1; > break; > case KVM_CAP_COALESCED_MMIO: > diff --git a/include/linux/kvm.h b/include/linux/kvm.h > index 2de335d..d808694 100644 > --- a/include/linux/kvm.h > +++ b/include/linux/kvm.h > @@ -106,7 +106,8 @@ struct kvm_userspace_memory_region { > * other bits are reserved for kvm internal use which are defined in > * include/linux/kvm_host.h. > */ > -#define KVM_MEM_LOG_DIRTY_PAGES 1UL > +#define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0) > +#define KVM_MEM_READONLY (1UL << 1) > > /* for KVM_IRQ_LINE */ > struct kvm_irq_level { > @@ -621,6 +622,9 @@ struct kvm_ppc_smmu_info { > #define KVM_CAP_PPC_GET_SMMU_INFO 78 > #define KVM_CAP_S390_COW 79 > #define KVM_CAP_PPC_ALLOC_HTAB 80 > +#ifdef __KVM_HAVE_READONLY_MEM > +#define KVM_CAP_READONLY_MEM 81 > +#endif See the discussion on the userspace patches: The CAP, as userspace API, should really be defined unconditionally, kernel code should depend on __KVM_HAVE_READONLY_MEM or a to-be-introduced CONFIG_KVM_HAVE_xxx switch. This allows for cleaner userspace code. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html