On Fri, Jun 14, 2024 at 08:49:57AM +0100, Daniel P. Berrangé wrote: > On Fri, Jun 14, 2024 at 09:04:33AM +0800, Xiaoyao Li wrote: > > On 6/13/2024 4:35 PM, Duan, Zhenzhong wrote: > > > > > > > > > > -----Original Message----- > > > > From: Li, Xiaoyao <xiaoyao.li@xxxxxxxxx> > > > > Subject: Re: [PATCH v5 25/65] i386/tdx: Add property sept-ve-disable for > > > > tdx-guest object > > > > > > > > On 6/6/2024 6:45 PM, Daniel P. Berrangé wrote: > > > > > Copying Zhenzhong Duan as my point relates to the proposed libvirt > > > > > TDX patches. > > > > > > > > > > On Thu, Feb 29, 2024 at 01:36:46AM -0500, Xiaoyao Li wrote: > > > > > > Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it > > > > disables > > > > > > EPT violation conversion to #VE on guest TD access of PENDING pages. > > > > > > > > > > > > Some guest OS (e.g., Linux TD guest) may require this bit as 1. > > > > > > Otherwise refuse to boot. > > > > > > > > > > > > Add sept-ve-disable property for tdx-guest object, for user to configure > > > > > > this bit. > > > > > > > > > > > > Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx> > > > > > > Acked-by: Gerd Hoffmann <kraxel@xxxxxxxxxx> > > > > > > Acked-by: Markus Armbruster <armbru@xxxxxxxxxx> > > > > > > --- > > > > > > Changes in v4: > > > > > > - collect Acked-by from Markus > > > > > > > > > > > > Changes in v3: > > > > > > - update the comment of property @sept-ve-disable to make it more > > > > > > descriptive and use new format. (Daniel and Markus) > > > > > > --- > > > > > > qapi/qom.json | 7 ++++++- > > > > > > target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++ > > > > > > 2 files changed, 30 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/qapi/qom.json b/qapi/qom.json > > > > > > index 220cc6c98d4b..89ed89b9b46e 100644 > > > > > > --- a/qapi/qom.json > > > > > > +++ b/qapi/qom.json > > > > > > @@ -900,10 +900,15 @@ > > > > > > # > > > > > > # Properties for tdx-guest objects. > > > > > > # > > > > > > +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling > > > > > > +# of EPT violation conversion to #VE on guest TD access of PENDING > > > > > > +# pages. Some guest OS (e.g., Linux TD guest) may require this to > > > > > > +# be set, otherwise they refuse to boot. > > > > > > +# > > > > > > # Since: 9.0 > > > > > > ## > > > > > > { 'struct': 'TdxGuestProperties', > > > > > > - 'data': { }} > > > > > > + 'data': { '*sept-ve-disable': 'bool' } } > > > > > > > > > > So this exposes a single boolean property that gets mapped into one > > > > > specific bit in the TD attributes: > > > > > > > > > > > + > > > > > > +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error > > > > **errp) > > > > > > +{ > > > > > > + TdxGuest *tdx = TDX_GUEST(obj); > > > > > > + > > > > > > + if (value) { > > > > > > + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; > > > > > > + } else { > > > > > > + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; > > > > > > + } > > > > > > +} > > > > > > > > > > If I look at the documentation for TD attributes > > > > > > > > > > https://download.01.org/intel-sgx/latest/dcap- > > > > latest/linux/docs/Intel_TDX_DCAP_Quoting_Library_API.pdf > > > > > > > > > > Section "A.3.4. TD Attributes" > > > > > > > > > > I see "TD attributes" is a 64-bit int, with 5 bits currently > > > > > defined "DEBUG", "SEPT_VE_DISABLE", "PKS", "PL", "PERFMON", > > > > > and the rest currently reserved for future use. This makes me > > > > > wonder about our modelling approach into the future ? > > > > > > > > > > For the AMD SEV equivalent we've just directly exposed the whole > > > > > field as an int: > > > > > > > > > > 'policy' : 'uint32', > > > > > > > > > > For the proposed SEV-SNP patches, the same has been done again > > > > > > > > > > https://lists.nongnu.org/archive/html/qemu-devel/2024- > > > > 06/msg00536.html > > > > > > > > > > '*policy': 'uint64', > > > > > > > > > > > > > > > The advantage of exposing individual booleans is that it is > > > > > self-documenting at the QAPI level, but the disadvantage is > > > > > that every time we want to expose ability to control a new > > > > > bit in the policy we have to modify QEMU, libvirt, the mgmt > > > > > app above libvirt, and whatever tools the end user has to > > > > > talk to the mgmt app. > > > > > > > > > > If we expose a policy int, then newly defined bits only require > > > > > a change in QEMU, and everything above QEMU will already be > > > > > capable of setting it. > > > > > > > > > > In fact if I look at the proposed libvirt patches, they have > > > > > proposed just exposing a policy "int" field in the XML, which > > > > > then has to be unpacked to set the individual QAPI booleans > > > > > > > > > > > > > > https://lists.libvirt.org/archives/list/devel@xxxxxxxxxxxxxxxxx/message/WXWX > > > > EESYUA77DP7YIBP55T2OPSVKV5QW/ > > > > > > > > > > On balance, I think it would be better if QEMU just exposed > > > > > the raw TD attributes policy as an uint64 at QAPI, instead > > > > > of trying to unpack it to discrete bool fields. This gives > > > > > consistency with SEV and SEV-SNP, and with what's proposed > > > > > at the libvirt level, and minimizes future changes when > > > > > more policy bits are defined. > > > > > > > > The reasons why introducing individual bit of sept-ve-disable instead of > > > > a raw TD attribute as a whole are that > > > > > > > > 1. other bits like perfmon, PKS, KL are associated with cpu properties, > > > > e.g., > > > > > > > > perfmon -> pmu, > > > > pks -> pks, > > > > kl -> keylokcer feature that QEMU currently doesn't support > > > > > > > > If allowing configuring attribute directly, we need to deal with the > > > > inconsistence between attribute vs cpu property. > > > > > > What about defining those bits associated with cpu properties reserved > > > But other bits work as Daniel suggested way. > > > > I don't understand. Do you mean we provide the interface to configure raw 64 > > bit attributes while some bits of it are reserved? > > Just have a mask of what bits are permitted to be set, and report an > error if the user sets non-permitted bits. Looking at the IGVM patches, the CnofidentialGuestSupport class is gaining a "set_guest_policy" method, which takes a "uint64_t policy". If (when) IGVM support is extended to cover TDX too, then we're going to need to accept the user providing the policy in integer format via the IGVM file metadata. This will require adding code to check for any inconsistency between the policy bitmask, and the CPU flags. https://lists.nongnu.org/archive/html/qemu-devel/2024-06/msg04037.html so to me this is another reason to just expose the policy as an integer in the QAPI/QOM structure too. Everywhere just wants to be working with policy in integer format. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|