Re: [PATCH 2/2] qemu: Use virEventThreadStop() in qemuProcessStop()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/25/24 14:33, Peter Krempa wrote:
> On Thu, Jul 25, 2024 at 12:57:59 +0200, Michal Privoznik wrote:
>> Currently, qemuProcessStop() unlocks given domain object right in
>> the middle of cleanup process. This is dangerous because there
>> might be another thread which is executing virDomainObjListAdd().
>> And since the domain object is on the list of domain objects AND
>> by the time qemuProcessStop() unlocks it the object is also
>> marked as inactive, the other thread acquires the lock and
>> switches vm->def pointer.
>>
>> The unlocking of domain object is needed though, to allow even
>> processing thread finish its queue. Well, the processing can be
>> done before any cleanup is attempted.
>>
>> Therefore, use freshly introduced virEventThreadStop() to join
>> the event thread and drop lock/unlock from the middle of
>> qemuProcessStop().
>>
>> Fixes: 3865410e7f67ca4ec66e9a905e75f452762a97f0
>> Resolves: https://issues.redhat.com/browse/RHEL-49607
>> Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx>
>> ---
>>  src/qemu/qemu_process.c | 14 ++++++++++----
>>  1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index 6255ba92e7..0869307a90 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>> @@ -8615,6 +8615,14 @@ void qemuProcessStop(virQEMUDriver *driver,
>>       * reporting so we don't squash a legit error. */
>>      virErrorPreserveLast(&orig_err);
>>  
>> +    /* By unlocking the domain object the events processing thread is
>> +     * allowed to finish its job. */
>> +    if (priv->eventThread) {
>> +        virObjectUnlock(vm);
>> +        virEventThreadStop(priv->eventThread);
>> +        virObjectLock(vm);
>> +    }
> 
> Moving this unlocking above the line where we reset vm->def->id below
> directly contradicts ...
> 
> 
>> +
>>      if (asyncJob != VIR_ASYNC_JOB_NONE) {
>>          if (virDomainObjBeginNestedJob(vm, asyncJob) < 0)
>>              goto cleanup;
>> @@ -8681,10 +8689,8 @@ void qemuProcessStop(virQEMUDriver *driver,
>>      /* Wake up anything waiting on domain condition */
>>      virDomainObjBroadcast(vm);
>>  
>> -    /* IMPORTANT: qemuDomainObjStopWorker() unlocks @vm in order to prevent
>> -     * deadlocks with the per-VM event loop thread. This MUST be done after
>> -     * marking the VM as dead */
> 
> ... this message here. Now due to other patches I've pushed, this is now
> not a problem directly as we now keep the 'beingDestroyed' flag set
> during the whole time thanks to my patch and Jirka's patch fixing few
> cases where the flag would be leaked.
> 
> What I'm missing is a note why this is safe which I'd expect in the
> commit message:
> 
>  This patch moves the unlocking of the VM object once again above the
>  code which marks the VM as inactive by clearing 'vm->def->id'.
> 
>  At this point it's now safe to do as other qemu driver code which may
>  be synchronized on the VM lock now also checks the
>  'priv->beingDestroyed' flag in addition to 'vm->def->id'.
> 
> I'd also most likely prefer if 'qemuProcessStop' explicitly sets
> 'priv->beingDestroyed' to true before unlocking. The flag will be later
> cleared but I didn't really go through all places where it's called to
> see if we assert the flag before.

I can set the flag, sure. And to some extent it feels safer. But, since
the domain object is no longer unlocked I'm wondering whether the flag
is needed at all?

I mean, with this patch merged, we no longer have a race condition, do we?

Michal



[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]

  Powered by Linux