Re: [libvirt PATCH v3 5/5] qemu: enable asynchronous teardown on s390x hosts by default

Daniel P. Berrangé <berrange@xxxxxxxxxx> · Tue, 11 Jul 2023 14:57:45 +0100

On Tue, Jul 11, 2023 at 03:48:25PM +0200, Claudio Imbrenda wrote:
> On Tue, 11 Jul 2023 09:17:00 +0100
> Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
> 
> [...]
> 
> > > We could add additional time depending on the guest memory size BUT with
> > > Secure Execution the timeout would need to be increased by factors (two
> > > digits). Also for libvirt it is not possible to detect if the guest is in
> > > Secure Execution mode.  
> > 
> > What component is causing this 2 orders of magnitude delay in shutting
> 
> Secure Execution (protected VMs)

So it's the hardware that imposes the penalty, rather than something
the kernel is doing ?

Can anything else mitigate this ?  e.g. does using huge pages make it
faster than normal pages ?

> > down a guest ? If the host can't tell if Secure Execution mode is
> > enabled or not, why would any code path be different & slower ?
> 
> The host kernel (and QEMU) know if a specific VM is running in
> secure mode, but there is no meaningful way for this information to be
> communicated outwards (e.g. to libvirt)

Can we expose this in one of the QMP commands, or a new one ? It feels
like a mgmt app is going to want to know if a guest is running in secure
mode or not, so it can know if this shutdown penalty is going to be
present.

> During teardown, the host kernel will need to do some time-consuming
> extra cleanup for each page that belonged to a secure guest.
> 
> > 
> > > I also assume that timeouts of +1h are not acceptable. Wouldn't a long
> > > timeout cause other trouble like stalling "virsh list" run in parallel?  
> > 
> > Well a 1 hour timeout is pretty insane, even with the async teardown
> 
> I think we all agree, and that's why asynchronous teardown was
> implemented
> 
> > that's terrible as RAM is unable to be used for any new guest for
> > an incredibly long time.
> 
> I'm not sure what you mean here. RAM is not kept aside until the
> teardown is complete; cleared pages are returned to the free pool
> immediately as they are cleared. i.e. when the cleanup is halfway
> through, half of the memory will have been freed.

Yes, it is incrementally released, but in practice most hypervisors are
memory constrained. So if you stop a 2 TB guest, and want to then boot it
again, unless you have a couple of free TB of RAM hanging around, you're
going to need to wait for almost all of the original RAM to be reclaimed.
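
To put rough numbers on that, here is a minimal Python sketch. It assumes
pages are released at a constant rate over the whole teardown, which is a
simplification; the 0.25 TB of spare RAM and the function names are made up
for illustration, and the 1 hour figure is just the timeout mentioned above,
not a measurement.

```python
def fraction_to_reclaim(guest_tb: float, spare_tb: float) -> float:
    """Share of the old guest's RAM that must be freed before a
    guest of the same size fits on the host again."""
    missing_tb = max(guest_tb - spare_tb, 0.0)
    return missing_tb / guest_tb


def wait_before_restart(guest_tb: float, spare_tb: float,
                        teardown_seconds: float) -> float:
    """Seconds until enough RAM is free, assuming (hypothetically)
    that teardown releases pages at a constant rate."""
    return fraction_to_reclaim(guest_tb, spare_tb) * teardown_seconds


if __name__ == "__main__":
    # 2 TB guest, 0.25 TB spare on the host, 1 hour teardown
    wait = wait_before_restart(2.0, 0.25, 3600.0)
    print(f"wait {wait:.0f}s of a 3600s teardown")
```

With those assumed inputs the guest cannot be restarted until 87.5% of the
teardown has completed, so async teardown barely shortens the wait in the
memory constrained case.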

Async cleanup definitely helps, but there's only so much it can do.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|