On 7/25/24 14:18, Daniel P. Berrangé wrote: > On Thu, Jul 25, 2024 at 12:57:59PM +0200, Michal Privoznik wrote: >> Currently, qemuProcessStop() unlocks given domain object right in >> the middle of cleanup process. This is dangerous because there >> might be another thread which is executing virDomainObjListAdd(). >> And since the domain object is on the list of domain objects AND >> by the time qemuProcessStop() unlocks it the object is also >> marked as inactive, the other thread acquires the lock and >> switches vm->def pointer. >> >> The unlocking of domain object is needed though, to allow even >> processing thread finish its queue. Well, the processing can be >> done before any cleanup is attempted. >> >> Therefore, use freshly introduced virEventThreadStop() to join >> the event thread and drop lock/unlock from the middle of >> qemuProcessStop(). >> >> Fixes: 3865410e7f67ca4ec66e9a905e75f452762a97f0 >> Resolves: https://issues.redhat.com/browse/RHEL-49607 >> Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx> >> --- >> src/qemu/qemu_process.c | 14 ++++++++++---- >> 1 file changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c >> index 6255ba92e7..0869307a90 100644 >> --- a/src/qemu/qemu_process.c >> +++ b/src/qemu/qemu_process.c >> @@ -8615,6 +8615,14 @@ void qemuProcessStop(virQEMUDriver *driver, >> * reporting so we don't squash a legit error. */ >> virErrorPreserveLast(&orig_err); >> >> + /* By unlocking the domain object the events processing thread is >> + * allowed to finish its job. */ >> + if (priv->eventThread) { >> + virObjectUnlock(vm); >> + virEventThreadStop(priv->eventThread); >> + virObjectLock(vm); >> + } >> + > > I wonder if this is a little too early. > > We don't call qemuMOnitorClose until a short while later, > just after we have done qemuProcessKill. > > If we stop the event loop here, then I worry that we're > at risk of missing some final monitor events that might > be desirable ? eg the final lifecycle events indicating > that we're stopping/stopped ? But can we close the monitor without a job? And IIUC, we shouldn't try to stop the event loop thread with a job held (as those callbacks try to acquire _MODIFY job on their own). Michal