On 8/23/22 08:52, Daniel P. Berrangé wrote:
On Mon, Aug 22, 2022 at 03:17:34PM -0400, Stefan Berger wrote:
On 8/22/22 12:46, Daniel P. Berrangé wrote:
On Mon, Aug 22, 2022 at 08:05:53AM -0400, Stefan Berger wrote:
When share storage for the TPM state files has been setup betwen hosts then
remove the TPM state files and directory only when undefining a VM and only
if the attribute persistent_state is not set. Avoid removing the TPM state
files and directory structure when a VM is migrated and shared storage is
used since this would also remove those files and directory structure on
the destination side.
I think our current undefine behaviour is probably flawed. We go to the
trouble of refusing to remove the firmware NVRAM when undefining because
it contains important VM state, but then happily blow away the TPM state.
Totally inconsistent behaviour :-( Its too late to change the default
behaviour, but we likely ought to add a flag
VIR_DOMAIN_UNDEFINE_KEEP_TPM
and plumb that through the varius code paths, which would remove the
need for this specific 'qemuDomainUndefineReason' enum.
I think the granularity encoded in the reason is necessary for the following
patch I was going to post later on:
Subject: [PATCH] qemu: tpm: Remove TPM state with non-shared storage
This patch 'fixes' the behavior of the persistent_state TPM domain XML
attribute that intends to preserve the state of the TPM but should not
keep the state around on all the hosts a VM has been migrated to. It
removes the TPM state directory structure from the source host upon
successful migration when non-shared storage is used. Similarly, it
removes it from the destination host upon migration failure when
non-shared storage is used.
'persistent_state' is something that applies to transient VMs.
It's an attribute set in the TPM's domain XML and affects the TPM state
permanently:
persistent_state
The persistent_state attribute indicates whether 'swtpm' TPM state
is kept or not when a transient domain is powered off or undefined. This
option can be used for preserving TPM state. By default the value is no.
This attribute only works with the emulator backend. The accepted values
are yes and no. Since 7.0.0
The documentation may say transient but the code seems to just not
remove the state at all once a user has set this in the domain XML:
static void
qemuTPMEmulatorCleanupHost(virDomainTPMDef *tpm)
{
if (!tpm->data.emulator.persistent_state)
qemuTPMEmulatorDeleteStorage(tpm);
}
https://github.com/libvirt/libvirt/blob/master/src/qemu/qemu_tpm.c#L707
We cannot get rid of the persistent_state in the domain XML but with
--keep-tpm we would have a second method to keep the state. My guess is
the author of the persistent_state patch did not want to instrument
callers but achieve the effect with just producing a different XML.
What I'm suggesting is that we should have some control over
this for persistent VMs, when calling 'undefine'.
ie virsh undefine --keep-nvram --keep-tpm $VMNAME
if we have this feature in the public API, then its impl should
support both this feature and your future patch for
'persistent_state'.
And how do we handle removal decisions on shared storage so that the
directory structure is not removed when a VM is undefined on the source
machine versus we want to have the directory structure removed on the
source machine upon successful migration when no shared storage is used?
This TPM-specific decision for removal seems better on the qemu_tpm.c
level rather than having to decide whether to keep the TPM state around
at all the call-sites that patch 1/7 instrumented with the reason
parameter (albeit unspecific in most cases) and pass down a KEEP_TPM
parameter instead. Getting the reason for the undefine on the qemu_tpm.c
level lets us make this TPM-specific decision there rather than at all
sites further up in the call stack.
Regards,
Stefan