Re: [PATCH] qemuProcessStop: Don't unlock domain until most of the cleanup is done

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 23, 2024 at 02:41:13PM +0200, Peter Krempa wrote:
> On Fri, Jul 19, 2024 at 14:46:49 +0200, Michal Privoznik wrote:
> > Imagine two threads. Thread A is executing qemuProcessStop() and
> > thread B is executing qemuDomainCreateXML(). To make things more
> > interesting, the domain XML passed to qemuDomainCreateXML matches
> > name + UUID of that the virDomainObj that qemuProcessStop() is
> > currently working with. Here's what happens.
> > 
> > 1) Thread A locks @vm, enters qemuProcessStop().
> > 
> > 2) Thread B parses given XML, calls virDomainObjListAdd() ->
> >    virDomainObjListAdd() -> virDomainObjListAddLocked() ->
> >    virDomainObjListFindByUUIDLocked(). Since there's a match on
> >    UUID, an virDomainObj object is returned and the thread
> >    proceeds to calling virObjectLock(). NB, it's the same object
> >    as in thread A.
> > 
> > 3) Thread A sets vm->def->id = -1; this means that from this
> >    point on, virDomainObjIsActive() will return false.
> > 
> > 4) Thread A calls qemuDomainObjStopWorker() which unlocks the
> >    @vm.
> > 
> > 5) Thread B acquires the @vm lock and since
> >    virDomainObjIsActive() returns false, it proceeds to calling
> >    virDomainObjAssignDef() where vm->def is replaced.
> > 
> > 6) Thread B then calls qemuProcessBeginJob() which unlocks the
> >    @vm temporarily.
> > 
> > 7) Thread A, still in qemuDomainObjStopWorker() acquires @vm lock
> >    and proceeds with cleanup.
> > 
> > 8) Thread A finds different definition than the one needing
> >    cleanup.
> > 
> > In my testing I've seen stale pointers, e.g.
> > vm->def->nets[0]->priv was NULL, which lead to a SIGSEGV as
> > there's  'QEMU_DOMAIN_NETWORK_PRIVATE(net)->created' line when
> > cleaning up nets. Your mileage may vary.
> > 
> > Even if we did not crash, the plain fact that vm->def is changed
> > in the middle of qemuProcessStop() means we might be cleaning up
> > something else than intended.
> 
> This paragraph is the important bit. The root cause of the problem here
> is that 'virDomainObjListAdd' inside 'qemuDomainCreateXML' can modify
> 'vm->def' whithout holding any _MODIFY-type JOB on the domain object
> which we normally require for any modification of 'vm->def' related data.
> 
> This wasn't a problem until now as we've relinquished the lock on @vm
> only in situations when the @vm object was considered live:
> 
>  1) Before the per-VM thread cleanup was added to qemuProcessStop it
>     never unlocked
>  2) After the per-VM thread cleanup was added, this unlock was done
>     prior to setting vm->def->id to '-1'
>  3) All other cases are done only when the VM is live.
> 
> > As a fix, I'm moving all lines that obviously touch vm->def
> > before the domain object is unlocked. That left
> > virHookCall(VIR_HOOK_QEMU_OP_STOPPED, VIR_HOOK_SUBOP_END) nearly
> > next to virHookCall(VIR_HOOK_QEMU_OP_RELEASE,
> > VIR_HOOK_SUBOP_END) which I figured is not something we want. So
> > I've shuffled things a bit more.
> 
> This feels like a fix for symptoms of 'virDomainObjListAdd' not
> honouring the _MODIFY-type job expectation, and we're shuffling code
> around so that it doesn't care about the broken expectation.
> 
> Since I don't currently have a better idea of how to fix this I'm okay
> with this patch given the following conditions:
> 
> > Fixes: 3865410e7f67ca4ec66e9a905e75f452762a97f0
> 
> Explain that the above commit inverted the order of setting the VM as
> inactive and unlocking thus allowing the above sequence of events to
> happen, and,

Why not just revert that change ? It claimed to be making things
safer, but did the opposite.   Even with this fixup below I'm
pretty uncomfortable with setting 'id = -1' and unlocking the
@vm, before we've done all our cleanup.

> 
> > Resolves: https://issues.redhat.com/browse/RHEL-49607
> > Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx>
> > ---
> >  src/qemu/qemu_process.c | 122 ++++++++++++++++++++--------------------
> >  1 file changed, 62 insertions(+), 60 deletions(-)
> > 
> > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> > index 25dfd04272..9ea6c678b8 100644
> > --- a/src/qemu/qemu_process.c
> > +++ b/src/qemu/qemu_process.c
> > @@ -8530,6 +8530,18 @@ void qemuProcessStop(virQEMUDriver *driver,
> >                                   VIR_QEMU_PROCESS_KILL_FORCE|
> >                                   VIR_QEMU_PROCESS_KILL_NOCHECK));
> >  
> > +    vm->pid = 0;
> > +
> > +    /* now that we know it's stopped call the hook if present */
> > +    if (virHookPresent(VIR_HOOK_DRIVER_QEMU)) {
> > +        g_autofree char *xml = qemuDomainDefFormatXML(driver, NULL, vm->def, 0);
> > +
> > +        /* we can't stop the operation even if the script raised an error */
> > +        ignore_value(virHookCall(VIR_HOOK_DRIVER_QEMU, vm->def->name,
> > +                                 VIR_HOOK_QEMU_OP_STOPPED, VIR_HOOK_SUBOP_END,
> > +                                 NULL, xml, NULL));
> > +    }
> > +
> >      if (priv->agent) {
> >          g_clear_pointer(&priv->agent, qemuAgentClose);
> >      }
> > @@ -8553,25 +8565,6 @@ void qemuProcessStop(virQEMUDriver *driver,
> >  
> >      qemuDBusStop(driver, vm);
> >  
> > -    /* Only after this point we can reset 'priv->beingDestroyed' so that
> > -     * there's no point at which the VM could be considered as alive between
> > -     * entering the destroy job and this point where the active "flag" is
> > -     * cleared.
> > -     */
> > -    vm->def->id = -1;
> > -    priv->beingDestroyed = false;
> > -
> > -    /* Wake up anything waiting on domain condition */
> > -    virDomainObjBroadcast(vm);
> > -
> > -    /* IMPORTANT: qemuDomainObjStopWorker() unlocks @vm in order to prevent
> > -     * deadlocks with the per-VM event loop thread. This MUST be done after
> > -     * marking the VM as dead */
> > -    qemuDomainObjStopWorker(vm);
> > -
> > -    if (!!g_atomic_int_dec_and_test(&driver->nactive) && driver->inhibitCallback)
> > -        driver->inhibitCallback(false, driver->inhibitOpaque);
> > -
> >      /* Clear network bandwidth */
> >      virDomainClearNetBandwidth(vm->def);
> >  
> > @@ -8588,18 +8581,6 @@ void qemuProcessStop(virQEMUDriver *driver,
> >          }
> >      }
> >  
> > -    virPortAllocatorRelease(priv->nbdPort);
> > -    priv->nbdPort = 0;
> > -
> > -    if (priv->monConfig) {
> > -        if (priv->monConfig->type == VIR_DOMAIN_CHR_TYPE_UNIX)
> > -            unlink(priv->monConfig->data.nix.path);
> > -        g_clear_pointer(&priv->monConfig, virObjectUnref);
> > -    }
> > -
> > -    /* Remove the master key */
> > -    qemuDomainMasterKeyRemove(priv);
> > -
> >      ignore_value(virDomainChrDefForeach(vm->def,
> >                                          false,
> >                                          qemuProcessCleanupChardevDevice,
> > @@ -8609,22 +8590,6 @@ void qemuProcessStop(virQEMUDriver *driver,
> >      /* Its namespace is also gone then. */
> >      qemuDomainDestroyNamespace(driver, vm);
> >  
> > -    virFileDeleteTree(priv->libDir);
> > -    virFileDeleteTree(priv->channelTargetDir);
> > -
> > -    /* Stop autodestroy in case guest is restarted */
> > -    virCloseCallbacksDomainRemove(vm, NULL, qemuProcessAutoDestroy);
> > -
> > -    /* now that we know it's stopped call the hook if present */
> > -    if (virHookPresent(VIR_HOOK_DRIVER_QEMU)) {
> > -        g_autofree char *xml = qemuDomainDefFormatXML(driver, NULL, vm->def, 0);
> > -
> > -        /* we can't stop the operation even if the script raised an error */
> > -        ignore_value(virHookCall(VIR_HOOK_DRIVER_QEMU, vm->def->name,
> > -                                 VIR_HOOK_QEMU_OP_STOPPED, VIR_HOOK_SUBOP_END,
> > -                                 NULL, xml, NULL));
> > -    }
> > -
> >      /* Reset Security Labels unless caller don't want us to */
> >      if (!(flags & VIR_QEMU_PROCESS_STOP_NO_RELABEL))
> >          qemuSecurityRestoreAllLabel(driver, vm,
> > @@ -8672,8 +8637,6 @@ void qemuProcessStop(virQEMUDriver *driver,
> >          virResctrlAllocRemove(vm->def->resctrls[i]->alloc);
> >      }
> >  
> > -    qemuProcessRemoveDomainStatus(driver, vm);
> > -
> >      /* Remove VNC and Spice ports from port reservation bitmap, but only if
> >         they were reserved by the driver (autoport=yes)
> >      */
> > @@ -8706,20 +8669,9 @@ void qemuProcessStop(virQEMUDriver *driver,
> >          }
> >      }
> >  
> > -    for (i = 0; i < vm->ndeprecations; i++)
> > -        g_free(vm->deprecations[i]);
> > -    g_clear_pointer(&vm->deprecations, g_free);
> > -    vm->ndeprecations = 0;
> > -    vm->taint = 0;
> > -    vm->pid = 0;
> > -    virDomainObjSetState(vm, VIR_DOMAIN_SHUTOFF, reason);
> >      for (i = 0; i < vm->def->niothreadids; i++)
> >          vm->def->iothreadids[i]->thread_id = 0;
> >  
> > -    /* clean up a possible backup job */
> > -    if (priv->backup)
> > -        qemuBackupJobTerminate(vm, VIR_DOMAIN_JOB_STATUS_CANCELED);
> > -
> >      /* Do this explicitly after vm->pid is reset so that security drivers don't
> >       * try to enter the domain's namespace which is non-existent by now as qemu
> >       * is no longer running. */
> > @@ -8753,6 +8705,56 @@ void qemuProcessStop(virQEMUDriver *driver,
> >  
> >      qemuSecurityReleaseLabel(driver->securityManager, vm->def);
> >  
> > +    /* Only after this point we can reset 'priv->beingDestroyed' so that
> > +     * there's no point at which the VM could be considered as alive between
> > +     * entering the destroy job and this point where the active "flag" is
> > +     * cleared.
> > +     */
> > +    vm->def->id = -1;
> > +    priv->beingDestroyed = false;
> > +
> > +    /* Wake up anything waiting on domain condition */
> > +    virDomainObjBroadcast(vm);
> > +
> > +    /* IMPORTANT: qemuDomainObjStopWorker() unlocks @vm in order to prevent
> > +     * deadlocks with the per-VM event loop thread. This MUST be done after
> > +     * marking the VM as dead */
> 
> 
> Extend the comment here stating how virDomainObjAssignDef() in
> combination with vm->def->id being -1 can cause update of vm->def and
> thus no code below can ever access it any more.
> 
> > +    qemuDomainObjStopWorker(vm);
> > +
> > +    if (!!g_atomic_int_dec_and_test(&driver->nactive) && driver->inhibitCallback)
> > +        driver->inhibitCallback(false, driver->inhibitOpaque);
> > +
> > +    virPortAllocatorRelease(priv->nbdPort);
> > +    priv->nbdPort = 0;
> > +
> > +    if (priv->monConfig) {
> > +        if (priv->monConfig->type == VIR_DOMAIN_CHR_TYPE_UNIX)
> > +            unlink(priv->monConfig->data.nix.path);
> > +        g_clear_pointer(&priv->monConfig, virObjectUnref);
> > +    }
> > +
> > +    /* Remove the master key */
> > +    qemuDomainMasterKeyRemove(priv);
> > +
> > +    virFileDeleteTree(priv->libDir);
> > +    virFileDeleteTree(priv->channelTargetDir);
> > +
> > +    /* Stop autodestroy in case guest is restarted */
> > +    virCloseCallbacksDomainRemove(vm, NULL, qemuProcessAutoDestroy);
> > +
> > +    qemuProcessRemoveDomainStatus(driver, vm);
> > +
> > +    for (i = 0; i < vm->ndeprecations; i++)
> > +        g_free(vm->deprecations[i]);
> > +    g_clear_pointer(&vm->deprecations, g_free);
> > +    vm->ndeprecations = 0;
> > +    vm->taint = 0;
> > +    virDomainObjSetState(vm, VIR_DOMAIN_SHUTOFF, reason);
> > +
> > +    /* clean up a possible backup job */
> > +    if (priv->backup)
> > +        qemuBackupJobTerminate(vm, VIR_DOMAIN_JOB_STATUS_CANCELED);
> > +
> >      /* clear all private data entries which are no longer needed */
> >      qemuDomainObjPrivateDataClear(priv);
> >  
> > -- 
> > 2.44.2
> > 
> 
> Reviewed-by: Peter Krempa <pkrempa@xxxxxxxxxx>
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]

  Powered by Linux