On 06/22/2010 07:53 PM, Nadav Har'El wrote:
On Tue, Jun 22, 2010, Nadav Har'El wrote about "Re: [PATCH 5/24] Introduce vmcs12: a VMCS structure for L1":
Note that this structure becomes an ABI, it cannot change except in a
backward compatible way due to the need for live migration. So I'd like
a documentation patch that adds a description of the content to
Documentation/kvm/. It can be as simple as listing the structure
definition.
I decided that if I add a file in Documentation/kvm, it would be very useful
for it to describe the nested vmx feature in general, in addition to the
structure that you asked documented. So here is the new patch I propose:
Great, that's always helpful.
----
Subject: [PATCH 25/25] Documentation
This patch includes a brief introduction to the nested vmx feature in the
Documentation/kvm directory. The document also includes a copy of the
vmcs12 structure, as requested by Avi Kivity.
+
+We describe in much greater detail the theory behind the nested VMX feature,
+its implementation and its performance characteristics, in IBM Research report
+H-0282, "The Turtles Project: Design and Implementation of Nested
+Virtualization", available at:
+
+ http://bit.ly/a0o9te
Please put the true url in here.
+
+Known limitations
+-----------------
+
+The current code support running Linux under a nested KVM using shadow
+page table (with bypass_guest_pf disabled).
Might as well remove this, since nvmx will not be merged with such a
gaping hole.
In theory I ought to reject anything that doesn't comply with the spec.
In practice I'll accept deviations from the spec, so long as
- those features aren't used by common guests
- when the features are attempted to be used, kvm will issue a warning
I don't think PFEC matching ought to present any implementation difficulty.
+ABIs
+----
+
+Nested VMX aims to present a standard and (eventually) fully-functional VMX
+implementation for the a guest hypervisor to use. As such, the official
+specification of the ABI that it provides is Intel's VMX specification,
+namely volume 3B of their "Intel 64 and IA-32 Architectures Software
+Developer's Manual". Not all of VMX's features are currently fully supported,
+but the goal is to eventually support them all, starting with the VMX features
+which are used in practice by popular hypervisors (KVM and others).
+
+As a VMX implementation, nested VMX presents a VMCS structure to L1.
+As mandated by the spec, other than the two fields revision_id and abort,
+this structure is *opaque* to its user, who is not supposed to know or care
+about its internal structure. Rather, the structure is accessed through the
+VMREAD and VMWRITE instructions.
+Still, for debugging purposes, KVM developers might be interested to know the
+internals of this structure; This is struct vmcs12 from arch/x86/kvm/vmx.c.
+For convenience, we repeat its content here. If the internals of this structure
+changes, this can break live migration across KVM versions. VMCS12_REVISION
+(from vmx.c) should be changed if struct vmcs12 or its inner struct shadow_vmcs
+is ever changed.
This is indeed great for debugging, we can add qemu commands to inspect
a vmcs (and add the vmptr to 'info registers').
+
+struct __packed vmcs12 {
+ /* According to the Intel spec, a VMCS region must start with the
+ * following two fields. Then follow implementation-specific data.
+ */
+ u32 revision_id;
+ u32 abort;
+
+ struct shadow_vmcs shadow_vmcs;
+
+ bool launch_state; /* set to 0 by VMCLEAR, to 1 by VMLAUNCH */
+
+ int cpu;
+ int launched;
Why is cpu needed? In what way is launched != launch_state?
Please use explicitly sized types.
+struct __packed shadow_vmcs {
+ u32 host_ia32_sysenter_cs;
u32 pad;
+ unsigned long cr0_guest_host_mask;
+ unsigned long cr4_guest_host_mask;
+ unsigned long cr0_read_shadow;
+ unsigned long cr4_read_shadow;
+ unsigned long cr3_target_value0;
+ unsigned long cr3_target_value1;
+ unsigned long cr3_target_value2;
+ unsigned long cr3_target_value3;
+ unsigned long exit_qualification;
+ unsigned long guest_linear_address;
+ unsigned long guest_cr0;
+ unsigned long guest_cr3;
+ unsigned long guest_cr4;
+ unsigned long guest_es_base;
+ unsigned long guest_cs_base;
+ unsigned long guest_ss_base;
+ unsigned long guest_ds_base;
+ unsigned long guest_fs_base;
+ unsigned long guest_gs_base;
+ unsigned long guest_ldtr_base;
+ unsigned long guest_tr_base;
+ unsigned long guest_gdtr_base;
+ unsigned long guest_idtr_base;
+ unsigned long guest_dr7;
+ unsigned long guest_rsp;
+ unsigned long guest_rip;
+ unsigned long guest_rflags;
+ unsigned long guest_pending_dbg_exceptions;
+ unsigned long guest_sysenter_esp;
+ unsigned long guest_sysenter_eip;
+ unsigned long host_cr0;
+ unsigned long host_cr3;
+ unsigned long host_cr4;
+ unsigned long host_fs_base;
+ unsigned long host_gs_base;
+ unsigned long host_tr_base;
+ unsigned long host_gdtr_base;
+ unsigned long host_idtr_base;
+ unsigned long host_ia32_sysenter_esp;
+ unsigned long host_ia32_sysenter_eip;
+ unsigned long host_rsp;
+ unsigned long host_rip;
Use u64 instead of unsigned long, otherwise the size changes during live
migration from a 32-bit host to a 64-bit host.
Reserve tons of space here.
+};
+
+
--
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html