Thanks for your review, Jim.

On 2017-10-13 at 09:57:45 -0700, Jim Mattson wrote:
> I'll ask before Paolo does: Can you please add kvm-unit-tests to
> exercise all of this new code?

This should be an API/ioctl tool rather than a kvm-unit-test. Actually, I
have prepared a draft version of a tool embedded in the qemu command
line, meaning that we can set/get the sub-page protection via qemu
commands. The qemu patch is attached. BTW, it is a pre-design version; I
will send a formal qemu patch to the qemu list after the API/ioctl is
finalized on the kvm side.

>
> BTW, what generation of hardware do we need to exercise this code ourselves?

As far as I know, this feature will be enabled on Intel's next-generation
Ice Lake chips.

>
> On Fri, Oct 13, 2017 at 4:11 PM, Zhang Yi <yi.z.zhang@xxxxxxxxxxxxxxx> wrote:
> > From: Zhang Yi Z <yi.z.zhang@xxxxxxxxxxxxxxx>
> >
> > Hi All,
> >
> > Here is a patch series which adds EPT-Based Sub-page Write Protection support. You can get its software developer manual from:
> >
> > https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> >
> > In Chapter 4 EPT-BASED SUB-PAGE PERMISSIONS.
> >
> > Introduction:
> >
> > EPT-Based Sub-page Write Protection, referred to as SPP, is a capability which allows Virtual Machine Monitors (VMM) to specify write permission for guest physical memory at a sub-page (128 byte) granularity. When this capability is utilized, the CPU enforces write-access permissions for sub-page regions of 4K pages as specified by the VMM. EPT-based sub-page permissions are intended to enable fine-grained memory write enforcement by a VMM for security (guest OS monitoring) and usages such as device virtualization and memory check-pointing.
> >
> > How SPP Works:
> >
> > SPP is active when the "sub-page write protection" VM-execution control is 1.
> > A new 4-level paging structure named the SPP page table (SPPT) is introduced. The SPPT is looked up with the guest physical address to derive a 64-bit "sub-page permission" value containing the sub-page write permissions. The lookup from guest-physical addresses to the sub-page region permissions is determined by a set of these SPPT paging structures.
> >
> > The SPPT is used to look up write permission bits for the 128-byte sub-page regions contained in a 4KB guest physical page. EPT specifies the 4KB page-level privileges that software is allowed when accessing the guest physical address, whereas the SPPT defines the write permissions for software at the granularity of 128-byte regions within a 4KB page. Write accesses prevented due to sub-page permissions looked up via the SPPT are reported as EPT violation VM exits. Similar to EPT, a logical processor uses the SPPT to look up sub-page region write permissions for guest-physical addresses only when those addresses are used to access memory.
> >
> > Guest write access --> GPA --> Walk EPT --> EPT leaf entry -┐
> > ┌-----------------------------------------------------------┘
> > └-> if VMexec_control.spp && ept_leaf_entry.spp_bit (bit 61)
> >      |
> >      └-> <false> --> EPT legacy behavior
> >      |
> >      |
> >      └-> <true>  --> if ept_leaf_entry.writable
> >                      |
> >                      └-> <true>  --> Ignore SPP
> >                      |
> >                      └-> <false> --> GPA --> Walk SPP 4-level table--┐
> >                                                                      |
> > ┌-----------<---------get-the-SPPT-pointer-from-VMCS-field-----<-----┘
> > |
> > Walk SPP L4E table
> > |
> > └┐--> entry misconfiguration ------------>----------┐<---------------┐
> >  |                                                  |                |
> > else                                                |                |
> >  |                                                  |                |
> >  |   ┌------------------SPP VMexit<-----------------┘                |
> >  |   |                                                               |
> >  |   └-> exit_qualification & sppt_misconfig --> sppt misconfig      |
> >  |   |                                                               |
> >  |   └-> exit_qualification & sppt_miss --> sppt miss                |
> >  └--┐                                                                |
> >     |                                                                |
> > walk SPPT L3E--┐--> if-entry-misconfiguration------------>-----------┘
> >                |                                                     |
> >               else                                                   |
> >                |                                                     |
> >                |                                                     |
> >          walk SPPT L2E --┐--> if-entry-misconfiguration------->------┘
> >                          |                                           |
> >                         else                                         |
> >                          |                                           |
> >                          |                                           |
> >                    walk SPPT L1E --┐-> if-entry-misconfiguration-->--┘
> >                                    |
> >                                   else
> >                                    |
> >                                    └-> if sub-page writable
> >                                         └-> <true>  allow write access
> >                                         └-> <false> disallow, EPT violation
> >
> > Patch-sets Description:
> >
> > Patch 1: Documentation.
> >
> > Patch 2: This patch adds reporting of the SPP capability from the VMX Procbased MSR; according to the hardware spec, bit 23 is the control of the SPP capability.
> >
> > Patch 3: Adds the new secondary processor-based VM-execution control bit, defined as "sub-page write permission"; as in the VMX Procbased MSR, bit 23 is the enable bit of SPP.
> > We also introduce a kernel parameter "enable_ept_spp"; SPP is now active when "Sub-page Write Protection" in the Secondary VM-Execution Control is set and the kernel parameter is enabled with "enable_ept_spp=1".
> >
> > Patch 4: Introduces the SPPTP and the SPP page table.
> > The sub-page permission table is referenced via a 64-bit control field called the Sub-Page Permission Table Pointer (SPPTP), which contains a 4K-aligned physical address. The index and encoding for this VMCS field is defined as 0x2030 at this time. The format of SPPTP is shown in below figure 2:
> > This patch introduces the SPP paging structures, whose root page is created at kvm mmu page initialization.
> > We also add an mmu page role type "spp" to distinguish an SPP page from an EPT page.
> >
> > Patch 5: Introduces the SPP-induced VM exit and its handler.
> > Accesses using guest-physical addresses may cause SPP-induced VM exits due to an SPPT misconfiguration or an SPPT miss. The basic VM exit reason code reported for SPP-induced VM exits is 66.
> >
> > Also introduces the new exit qualification for SPPT-induced VM exits.
> >
> > | Bit   | Contents                                                          |
> > | :---- | :---------------------------------------------------------------- |
> > | 10:0  | Reserved (0).                                                     |
> > | 11    | SPPT VM exit type. Set for SPPT Miss, cleared for SPPT Misconfig. |
> > | 12    | NMI unblocking due to IRET                                        |
> > | 63:13 | Reserved (0)                                                      |
> >
> > Patch 6: Adds a handler for the EPT sub-page write protection fault.
> > A control bit in EPT leaf paging-structure entries is defined as “Sub-Page Permission” (SPP bit). The bit position is 61; it is chosen from among the bits that are currently ignored by the processor and available to software.
> > While hardware walks the SPP page table, if the sub-page region write permission bit is set, the write is allowed; otherwise the write is disallowed and results in an EPT violation.
> > We need to catch this case in the EPT violation handler and trigger a user-space exit, returning the write-protected address (GVA) to the user (qemu).
> >
> > Patch 7: Introduce ioctls to set/get Sub-Page Write Protection.
> > We introduce 2 ioctls to let a user application set/get the sub-page write protection bitmap per gfn; each gfn corresponds to one bitmap.
> > The user application (qemu, or some other security control daemon) will set the protection bitmap via this ioctl.
> > The API is defined as:
> > struct kvm_subpage {
> >         __u64 base_gfn;
> >         __u64 npages;
> >         /* sub-page write-access bitmap array */
> >         __u32 access_map[SUBPAGE_MAX_BITMAP];
> > }sp;
> > kvm_vm_ioctl(s, KVM_SUBPAGES_SET_ACCESS, &sp)
> > kvm_vm_ioctl(s, KVM_SUBPAGES_GET_ACCESS, &sp)
> >
> > Patch 8 ~ Patch 9: Set up the SPP page table and update the EPT leaf entry indicated with the SPP enable bit.
> > If the sub-page write permission VM-execution control is set, treatment of write accesses to guest-physical addresses depends on the state of the accumulated write-access bit (position 1) and sub-page permission bit (position 61) in the EPT leaf paging-structure.
> > Software will update the EPT leaf entry sub-page permission bit during kvm_set_subpage (patch 7).
> > If the EPT write-access bit is set to 0 and the SPP bit is set to 1 in the leaf EPT paging-structure entry that maps a 4KB page, then the hardware will look up a VMM-managed Sub-Page Permission Table (SPPT), which will be prepared by the setup in kvm_set_subpage (patch 8).
> > The hardware uses the guest-physical address and bits 11:7 of the address accessed to look up the SPPT and fetch a write permission bit for the 128-byte-wide sub-page region being accessed within the 4K guest-physical page. If the sub-page region write permission bit is set, the write is allowed; otherwise the write is disallowed and results in an EPT violation.
> > Guest-physical pages mapped via leaf EPT paging-structures for which the accumulated write-access bit and the SPP bit are both clear (0) generate EPT violations on memory write accesses. Guest-physical pages mapped via EPT paging-structures for which the accumulated write-access bit is set (1) allow writes, effectively ignoring the SPP bit on the leaf EPT paging-structure.
> > Software will set up SPP page table levels 4, 3 and 2 as well as the EPT page structure, and fill the level 1 page from the 32-bit bitmap of each 4K page. A 4K page can thus be divided into 32 x 128-byte sub-pages.
> >
> > The SPP L4E, L3E and L2E format is defined in the figure below.
> >
> > | Bit    | Contents                                                               |
> > | :----- | :--------------------------------------------------------------------- |
> > | 0      | Valid entry when set; indicates whether the entry is present           |
> > | 11:1   | Reserved (0)                                                           |
> > | N-1:12 | Physical address of 4K aligned SPPT LX-1 Table referenced by the entry |
> > | 51:N   | Reserved (0)                                                           |
> > | 63:52  | Reserved (0)                                                           |
> > Note: N is the physical address width supported by the processor, X is the page level
> >
> > The SPP L1E format is defined in the figure below.
> > | Bit   | Contents                                                          |
> > | :---- | :---------------------------------------------------------------- |
> > | 0+2i  | Write permission for i-th 128 byte sub-page region.               |
> > | 1+2i  | Reserved (0).                                                     |
> > Note: `0<=i<=31`
> >
> >
> > Zhang Yi Z (10):
> >   KVM: VMX: Added EPT Subpage Protection Documentation.
> >   x86/cpufeature: Add intel Sub-Page Protection to CPU features
> >   KVM: VMX: Added VMX SPP feature flags and VM-Execution Controls.
> >   KVM: VMX: Introduce the SPPTP and SPP page table.
> >   KVM: VMX: Introduce SPP-Induced vm exit and it's handle.
> >   KVM: VMX: Added handle of SPP write protection fault.
> >   KVM: VMX: Introduce ioctls to set/get Sub-Page Write Protection.
> >   KVM: VMX: Update the EPT leaf entry indicated with the SPP enable bit.
> >   KVM: VMX: Added setup spp page structure.
> >   KVM: VMX: implement setup SPP page structure in spp miss.
> >
> >  Documentation/virtual/kvm/spp_design_kvm.txt | 272 +++++++++++++++++++++
> >  arch/x86/include/asm/cpufeatures.h           |   1 +
> >  arch/x86/include/asm/kvm_host.h              |  18 +-
> >  arch/x86/include/asm/vmx.h                   |  10 +
> >  arch/x86/include/uapi/asm/vmx.h              |   2 +
> >  arch/x86/kernel/cpu/intel.c                  |   4 +
> >  arch/x86/kvm/mmu.c                           | 340 ++++++++++++++++++++++++++-
> >  arch/x86/kvm/mmu.h                           |   1 +
> >  arch/x86/kvm/vmx.c                           | 104 ++++++++
> >  arch/x86/kvm/x86.c                           |  99 +++++++-
> >  include/linux/kvm_host.h                     |   5 +
> >  include/uapi/linux/kvm.h                     |  16 ++
> >  virt/kvm/kvm_main.c                          |  26 ++
> >  13 files changed, 893 insertions(+), 5 deletions(-)
> >  create mode 100644 Documentation/virtual/kvm/spp_design_kvm.txt
> >
> > --
> > 2.7.4
> >
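
To make the bit layouts in the cover letter concrete, here is a minimal C sketch of the L1E lookup and the exit-qualification decode. It is not code from the series: the helper names are hypothetical, and the assumption that bit i of a 32-bit access_map entry corresponds to sub-page i is mine, inferred from the "32 bit bitmaps per a single 4K page" description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPP_SUBPAGE_SHIFT      7   /* 128-byte sub-pages */
#define SPP_SUBPAGES_PER_PAGE  32  /* 4096 / 128 */

/* Sub-page index within a 4K page: bits 11:7 of the guest-physical address. */
static inline unsigned spp_subpage_index(uint64_t gpa)
{
    return (unsigned)((gpa >> SPP_SUBPAGE_SHIFT) & (SPP_SUBPAGES_PER_PAGE - 1));
}

/*
 * Expand a 32-bit write-access bitmap into the 64-bit L1E layout, where
 * bit 2i holds the write permission of sub-page i and bit 2i+1 is reserved.
 * Assumption: bit i of access_map describes sub-page i.
 */
static inline uint64_t spp_l1e_from_bitmap(uint32_t access_map)
{
    uint64_t l1e = 0;

    for (unsigned i = 0; i < SPP_SUBPAGES_PER_PAGE; i++) {
        if (access_map & (UINT32_C(1) << i))
            l1e |= UINT64_C(1) << (2 * i);
    }
    return l1e;
}

/* True if the L1E permits a write to the sub-page containing gpa. */
static inline bool spp_l1e_write_allowed(uint64_t l1e, uint64_t gpa)
{
    unsigned i = spp_subpage_index(gpa);

    assert(i < SPP_SUBPAGES_PER_PAGE);
    return (l1e >> (2 * i)) & 1;
}

/* Exit qualification bit 11: set for SPPT miss, clear for SPPT misconfig. */
static inline bool spp_exit_is_sppt_miss(uint64_t exit_qualification)
{
    return (exit_qualification >> 11) & 1;
}
```

For example, with access_map = 0x2 only the second 128-byte region of the page (offsets 0x80..0xff) is writable, so a write to GPA 0x1080 is allowed while a write to GPA 0x1000 is not.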
From a369bed5d986dccb3ca36dc5a27c6220ca2d1405 Mon Sep 17 00:00:00 2001
From: Zhang Yi Z <yi.z.zhang@xxxxxxxxxxxxxxx>
Date: Tue, 14 Mar 2017 15:11:38 +0800
Subject: [PATCH] x86: Intel Sub-Page Protection support

Signed-off-by: He Chen <he.chen@xxxxxxxxxxxxxxx>
Signed-off-by: Zhang Yi Z <yi.z.zhang@xxxxxxxxxxxxxxx>
---
 hmp-commands.hx           | 26 ++++++++++++++++++++++++++
 hmp.c                     | 26 ++++++++++++++++++++++++++
 hmp.h                     |  2 ++
 include/sysemu/kvm.h      |  2 ++
 kvm-all.c                 | 40 ++++++++++++++++++++++++++++++++++++++++
 linux-headers/linux/kvm.h | 15 +++++++++++++++
 qapi-schema.json          | 41 +++++++++++++++++++++++++++++++++++++++++
 qmp.c                     | 43 +++++++++++++++++++++++++++++++++++++++++++
 target/i386/kvm.c         | 22 ++++++++++++++++++++++
 9 files changed, 217 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 8819281..7a57411 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1766,6 +1766,32 @@ Set QOM property @var{property} of object at location @var{path} to value @var{v
 ETEXI
 
     {
+        .name       = "get-subpage",
+        .args_type  = "base_gfn:l,npages:l,filename:str",
+        .params     = "base_gfn npages filename",
+        .help       = "get the write-protect bitmap setting of sub-page protection",
+        .cmd        = hmp_get_subpage,
+    },
+
+STEXI
+@item get-subpage @var{base_gfn} @var{npages} @var{filename}
+Get the write-protect bitmap setting of sub-page protection in the range of @var{base_gfn} to @var{base_gfn} + @var{npages}
+ETEXI
+
+    {
+        .name       = "set-subpage",
+        .args_type  = "base_gfn:l,npages:l,wp_map:i",
+        .params     = "base_gfn npages wp_map",
+        .help       = "set the write-protect bitmap setting of sub-page protection",
+        .cmd        = hmp_set_subpage,
+    },
+
+STEXI
+@item set-subpage @var{base_gfn} @var{npages} @var{wp_map}
+Set the write-protect bitmap setting of sub-page protection in the range of @var{base_gfn} to @var{base_gfn} + @var{npages}
+ETEXI
+
+    {
         .name       = "info",
         .args_type  = "item:s?",
         .params     = "[subcommand]",
diff --git a/hmp.c b/hmp.c
index 261843f..7d217e9 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2614,3 +2614,29 @@ void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict)
     }
     qapi_free_GuidInfo(info);
 }
+
+void hmp_get_subpage(Monitor *mon, const QDict *qdict)
+{
+    uint64_t base_gfn = qdict_get_int(qdict, "base_gfn");
+    uint64_t npages = qdict_get_int(qdict, "npages");
+    const char *filename = qdict_get_str(qdict, "filename");
+    Error *err = NULL;
+
+    monitor_printf(mon, "base_gfn: %" PRIu64 ", npages: %" PRIu64 ", file: %s\n", base_gfn, npages, filename);
+
+    qmp_get_subpage(base_gfn, npages, filename, &err);
+    hmp_handle_error(mon, &err);
+}
+
+void hmp_set_subpage(Monitor *mon, const QDict *qdict)
+{
+    uint64_t base_gfn = qdict_get_int(qdict, "base_gfn");
+    uint64_t npages = qdict_get_int(qdict, "npages");
+    uint32_t wp_map = qdict_get_int(qdict, "wp_map");
+    Error *err = NULL;
+
+    monitor_printf(mon, "base_gfn: %" PRIu64 ", npages: %" PRIu64 ", wp_map: %" PRIu32 "\n", base_gfn, npages, wp_map);
+
+    qmp_set_subpage(base_gfn, npages, wp_map, &err);
+    hmp_handle_error(mon, &err);
+}
diff --git a/hmp.h b/hmp.h
index 799fd37..b72143f 100644
--- a/hmp.h
+++ b/hmp.h
@@ -138,5 +138,7 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
 void hmp_info_dump(Monitor *mon, const QDict *qdict);
 void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
 void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
+void hmp_get_subpage(Monitor *mon, const QDict *qdict);
+void hmp_set_subpage(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 24281fc..f7c1340 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -528,4 +528,6 @@ int kvm_set_one_reg(CPUState *cs, uint64_t id, void *source);
  */
 int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
 int kvm_get_max_memslots(void);
+int kvm_get_subpage_wp_map(uint64_t base_gfn, uint32_t *buf, uint64_t len);
+int kvm_set_subpage_wp_map(uint64_t base_gfn, uint64_t npages, uint32_t wp_map);
 #endif
diff --git a/kvm-all.c b/kvm-all.c
index 9040bd5..58cc0a4 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -2593,6 +2593,46 @@ int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target)
     return r;
 }
 
+int kvm_get_subpage_wp_map(uint64_t base_gfn, uint32_t *buf,
+                           uint64_t len)
+{
+    KVMState *s = kvm_state;
+    struct kvm_subpage sp = {};
+    int n = len; /* number of bitmap entries requested by the caller */
+
+    sp.base_gfn = base_gfn;
+    sp.npages = len;
+
+
+    if (kvm_vm_ioctl(s, KVM_SUBPAGES_GET_ACCESS, &sp) < 0) {
+        DPRINTF("ioctl failed %d\n", errno);
+        return -1;
+    }
+
+    memcpy(buf, sp.access_map, n * sizeof(uint32_t));
+
+    return n;
+}
+
+int kvm_set_subpage_wp_map(uint64_t base_gfn, uint64_t npages,
+                           uint32_t wp_map)
+{
+    KVMState *s = kvm_state;
+    struct kvm_subpage sp = {};
+
+    sp.base_gfn = base_gfn;
+    sp.npages = npages;
+    sp.access_map[0] = wp_map;
+
+
+    if (kvm_vm_ioctl(s, KVM_SUBPAGES_SET_ACCESS, &sp) < 0) {
+        DPRINTF("ioctl failed %d\n", errno);
+        return -1;
+    }
+
+    return 0;
+}
+
 static void kvm_accel_class_init(ObjectClass *oc, void *data)
 {
     AccelClass *ac = ACCEL_CLASS(oc);
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 4e082a8..69de005 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -205,6 +205,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI        25
 #define KVM_EXIT_IOAPIC_EOI       26
 #define KVM_EXIT_HYPERV           27
+#define KVM_EXIT_SPP              28
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -360,6 +361,10 @@ struct kvm_run {
 		struct {
 			__u8 vector;
 		} eoi;
+		/* KVM_EXIT_SPP */
+		struct {
+			__u64 addr;
+		} spp;
 		/* KVM_EXIT_HYPERV */
 		struct kvm_hyperv_exit hyperv;
 		/* Fix the size of the union. */
@@ -1126,6 +1131,8 @@ enum kvm_device_type {
 					struct kvm_userspace_memory_region)
 #define KVM_SET_TSS_ADDR          _IO(KVMIO,   0x47)
 #define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO,  0x48, __u64)
+#define KVM_SUBPAGES_GET_ACCESS   _IOR(KVMIO,  0x49, __u64)
+#define KVM_SUBPAGES_SET_ACCESS   _IOW(KVMIO,  0x4a, __u64)
 
 /* enable ucontrol for s390 */
 struct kvm_s390_ucas_mapping {
@@ -1354,4 +1361,12 @@ struct kvm_assigned_msix_entry {
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
 #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
 
+/* for KVM_SUBPAGES_GET_ACCESS and KVM_SUBPAGES_SET_ACCESS */
+#define SUBPAGE_MAX_BITMAP   256
+struct kvm_subpage {
+	__u64 base_gfn;
+	__u64 npages;
+	__u32 access_map[SUBPAGE_MAX_BITMAP]; /* sub-page write-access bitmap array */
+};
+
 #endif /* __LINUX_KVM_H */
diff --git a/qapi-schema.json b/qapi-schema.json
index 32b4a4b..d6b46bb 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -6267,3 +6267,44 @@
 # Since 2.9
 ##
 { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
+
+##
+# @get-subpage:
+#
+# This command will get setting information of sub-page
+# protection.
+#
+# Since: 2.10
+#
+# Example:
+#
+# -> { "execute": "get-subpage",
+#      "arguments": { "base_gfn": 4096,
+#                     "npages": 10,
+#                     "filename": "/tmp/spp_info" } }
+# <- { "return": {} }
+#
+##
+{ 'command': 'get-subpage',
+  'data': {'base_gfn': 'uint64', 'npages': 'uint64', 'filename': 'str'} }
+
+
+##
+# @set-subpage:
+#
+# This command will set sub-page protection for given GFNs.
+#
+# Since: 2.10
+#
+# Example:
+#
+# -> { "execute": "set-subpage",
+#      "arguments": { "base_gfn": 4096,
+#                     "npages": 10,
+#                     "wp_map": 4294901760 } }
+# <- { "return": {} }
+#
+##
+{ 'command': 'set-subpage',
+  'data': {'base_gfn': 'uint64', 'npages': 'uint64', 'wp_map': 'uint32'} }
+
diff --git a/qmp.c b/qmp.c
index fa82b59..274efdb 100644
--- a/qmp.c
+++ b/qmp.c
@@ -717,3 +717,46 @@ ACPIOSTInfoList *qmp_query_acpi_ospm_status(Error **errp)
 
     return head;
 }
+
+#define SUBPAGE_BUF_LEN 256
+void qmp_get_subpage(uint64_t base_gfn, uint64_t npages,
+                     const char *filename, Error **errp)
+{
+    FILE *f;
+    uint64_t n;
+    uint32_t buf[SUBPAGE_BUF_LEN];
+
+    f = fopen(filename, "wb");
+    if (!f) {
+        error_setg_file_open(errp, errno, filename);
+        return;
+    }
+
+    while (npages != 0) {
+        n = npages;
+        if (n > SUBPAGE_BUF_LEN)
+            n = SUBPAGE_BUF_LEN;
+        if (kvm_get_subpage_wp_map(base_gfn, buf, n) < 0) {
+            error_setg(errp, QERR_IO_ERROR);
+            goto exit;
+        }
+        if (fwrite(buf, 4, n, f) != n) {
+            error_setg(errp, QERR_IO_ERROR);
+            goto exit;
+        }
+        base_gfn += n;
+        npages -= n;
+    }
+
+exit:
+    fclose(f);
+}
+
+void qmp_set_subpage(uint64_t base_gfn, uint64_t npages,
+                     uint32_t wp_map, Error **errp)
+{
+    if (kvm_set_subpage_wp_map(base_gfn, npages, wp_map) < 0)
+        error_setg(errp, QERR_IO_ERROR);
+
+}
+
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 472399f..18a43d7 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -3147,6 +3147,23 @@ static int kvm_handle_debug(X86CPU *cpu,
     return ret;
 }
 
+static int kvm_handle_spp(uint64_t addr)
+{
+    /*
+    uint64_t base_gfn = addr >> 12;
+    uint64_t offset = addr & ((1 << 12) - 1);
+    int subpage_index = offset >> 7;
+    uint32_t mask;
+
+    kvm_get_subpage_wp_map(base_gfn, &mask, 1);
+    mask |= 1UL << subpage_index;
+    return kvm_set_subpage_wp_map(base_gfn, 1, mask);
+    */
+
+    fprintf(stderr, "QEMU-SPP: we are in kvm_handle_spp now, addr=0x%" PRIx64 "!\n", addr);
+    return 0;
+}
+
 void kvm_arch_update_guest_debug(CPUState *cpu, struct kvm_guest_debug *dbg)
 {
     const uint8_t type_code[] = {
@@ -3240,6 +3257,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         ioapic_eoi_broadcast(run->eoi.vector);
         ret = 0;
         break;
+    case KVM_EXIT_SPP:
+        DPRINTF("handle_spp\n");
+        kvm_handle_spp(run->spp.addr);
+        ret = 0;
+        break;
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
         ret = -1;
-- 
2.7.4
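
For reference, the address arithmetic in the commented-out body of kvm_handle_spp() can be tried in isolation. The sketch below is only an illustration of that arithmetic: struct spp_fault and both helper names are hypothetical and not part of the patch, and it assumes, as the commented-out code does, that setting a bit in the map makes the matching 128-byte sub-page writable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for a decoded KVM_EXIT_SPP address. */
struct spp_fault {
    uint64_t gfn;       /* guest frame number: addr >> 12 */
    unsigned subpage;   /* 128-byte sub-page index inside the page, 0..31 */
};

/* Split the faulting address into its frame number and sub-page index. */
static inline struct spp_fault spp_decode_fault(uint64_t addr)
{
    struct spp_fault f;
    uint64_t offset = addr & ((UINT64_C(1) << 12) - 1);

    f.gfn = addr >> 12;
    f.subpage = (unsigned)(offset >> 7);
    return f;
}

/* Return wp_map with the bit of the faulting sub-page set. */
static inline uint32_t spp_allow_subpage(uint32_t wp_map, unsigned subpage)
{
    assert(subpage < 32);
    return wp_map | (UINT32_C(1) << subpage);
}
```

A handler built on these would read the current map for f.gfn with kvm_get_subpage_wp_map(), pass it through spp_allow_subpage(), and write it back with kvm_set_subpage_wp_map(), which is what the disabled code in the patch appears to intend.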