Thanks for taking a look Jan. More below...
On 12/8/2017 12:49 AM, Jan Beulich wrote:
On 07.12.17 at 23:45, <maran.wilson@xxxxxxxxxx> wrote:
The start info structure that is defined as part of the x86/HVM direct
boot ABI and used for starting Xen PVH guests would be more versatile if
it also included a way to efficiently pass information about the memory
map to the guest.
That way Xen PVH guests would not be forced to use a hypercall to get the
information and would make it easier for KVM guests to share the PVH
entry point.
---
include/xen/interface/hvm/start_info.h | 34 +++++++++++++++++++++++++++++++---
1 file changed, 31 insertions(+), 3 deletions(-)
First of all such a change should be submitted against the canonical
copy of the header, which lives in the Xen tree.
Understood. Will do that when this converts from RFC to actual patch.
The argument of avoiding a hypercall doesn't really count imo - this
isn't in any way performance critical code. The argument of making
re-use easier is fine, though.
Okay, I will reword the commit message.
--- a/include/xen/interface/hvm/start_info.h
+++ b/include/xen/interface/hvm/start_info.h
@@ -33,7 +33,7 @@
* | magic | Contains the magic value XEN_HVM_START_MAGIC_VALUE
* | | ("xEn3" with the 0x80 bit of the "E" set).
* 4 +----------------+
- * | version | Version of this structure. Current version is 0. New
+ * | version | Version of this structure. Current version is 1. New
* | | versions are guaranteed to be backwards-compatible.
* 8 +----------------+
* | flags | SIF_xxx flags.
@@ -48,6 +48,12 @@
* 32 +----------------+
* | rsdp_paddr | Physical address of the RSDP ACPI data structure.
* 40 +----------------+
+ * | memmap_paddr | Physical address of the memory map. Only present in
+ * | | version 1 and newer of the structure.
+ * 48 +----------------+
+ * | memmap_entries | Number of entries in the memory map table. Only
+ * | | present in version 1 and newer of the structure.
+ * 52 +----------------+
Please let's make this optional even in v1 (and later), i.e. spell out
that it may be zero. That way Xen code could continue to use the
hypercall approach even.
Yes, my intention was to make this optional. I will spell it out.
Also please spell out a 4-byte reserved entry at the end, to make
the specified structure a multiple of 8 in size again regardless of
bitness of the producer/consumer.
Sure, I can add that.
@@ -62,6 +68,17 @@
* | reserved |
* 32 +----------------+
*
+ * The layout of each entry in the memory map table is as follows and no
+ * padding is used between entries in the array:
+ *
+ * 0 +----------------+
+ * | addr | Base address
+ * 8 +----------------+
+ * | size | Size of mapping
+ * 16 +----------------+
+ * | type | E820_TYPE_xxx
+ * 20 +----------------|
I'm not convinced of re-using E820 types here. I can see that this
might ease the consumption in Linux, but I don't think there should
be any connection to x86 aspects here - the data being supplied is
x86-agnostic, and Linux'es placement of the header is also making
no connection to x86 (oddly enough, the current placement in the
Xen tree does, for a reason which escapes me).
I could also imagine reasons to add new types without them being
sanctioned by whoever maintains E820 type assignments.
So there are three aspects to discuss here.
1) The addition of the "E820_TYPE_xxx" comment. I am fine with just
changing that to "mapping type" and leaving it as something to be
coordinated between the hypervisor and the guest OS being started by
that hypervisor.
2) x86 vs x86-agnostic. While I'm trying to keep this interface generic
in terms of guest OS (like Linux, FreeBSD, possible other guests in the
future) and hypervisor type (Xen, QEMU/KVM, etc), I was actually under
the impression that we are dealing with an ABI that is very much x86
specific.
The canonical document describing the ABI
(https://xenbits.xen.org/docs/unstable/misc/pvh.html) is titled "x86/HVM
direct boot ABI" and goes on to describe an interface in very
x86-specific terms. i.e. The ebx register must contain a pointer, cs,
ds, es must be set a certain way, etc.
That is probably why Xen's placement of the header file is in a x86
section of the tree. And also why there already exist a number of "x86"
references in the existing header file. A quick grep of the existing
header file will show lines like:
"C representation of the x86/HVM start info layout"
"Start of day structure passed to PVH guests and to HVM guests in %ebx"
"Xen on x86 will always try to place all the data below the 4GiB"
If at some point in the future someone decides to implement a similar
ABI for a different CPU architecture while re-using this same
hvm_start_info struct, then this header will have to be redone a bit
anyway. But I'm not aware of any other such ABI that exists or is
currently in the works.
3) The (packed) layout of the hvm_memmap_table_entry struct. I did
initially consider just making this a new structure that did not
necessarily match struct e820_entry in its array layout. But, it's not
just the consumer that has an easier time digesting it in the e820_entry
array format. It's also the producer side (QEMU for instance) where code
already exists to lay out this information in e820_entry array format.
And since this is all x86 specific anyway, it just seemed like I would
be needlessly making more work for both ends by inventing a completely
new memory map layout just for the sake of being different. Especially
when there doesn't seem to be anything terribly broken about the
existing e820_entry array format as a general purpose memory map.
Would that be acceptable?
As to the size field - you need to spell out whether these are bytes
or pages (it might be worthwhile to also make this explicit for the
addr one, but there I view it as less of a problem, since "address"
doesn't commonly mean a page granular entity).
Sure, I will change that to "Size of the mapping in bytes".
Also this again lacks a 4-byte reserved field at the end.
For the reasons described above, I'd much prefer to leave this struct
packed and without the 4-byte reserved field. But let me know if there
is a compelling reason that I have missed.
@@ -86,13 +103,24 @@ struct hvm_start_info {
uint64_t cmdline_paddr; /* Physical address of the command line. */
uint64_t rsdp_paddr; /* Physical address of the RSDP ACPI data */
/* structure.
*/
-};
+ uint64_t memmap_paddr; /* Physical address of an array of */
+ /* hvm_memmap_table_entry. Only present in */
+ /* Ver 1 or later. For e820 mem map table. */
+ uint32_t memmap_entries; /* Only present in Ver 1 or later. Number of */
+ /* entries in the memmap table. */
+} __attribute__((packed));
No packed attribute here and below please, at least not in the
canonical (non-Linux) variant of the header.
Sure, with the addition of the 4 byte reserved field you mentioned
earlier, I'll remove the packed attribute for the two existing
structures. But as discussed above, would prefer to leave it in place
for the new memory map array struct.
Thanks,
-Maran
Jan