Re: [PATCH 3/3] Avoid rare race when undefining domain

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 31.10.2014 07:35, Martin Kletzander wrote:
On Thu, Oct 30, 2014 at 05:44:23PM +0000, Daniel P. Berrange wrote:
On Thu, Oct 30, 2014 at 06:39:56PM +0100, Michal Privoznik wrote:
On 30.10.2014 16:04, Martin Kletzander wrote:
>When one domain is being undefined and at the same time started, for
>example, there is a possibility of a rare problem occuring.
>
>  - Thread 1 does virDomainUndefine(), has the lock, checks that the
>    domain is active and because it's not, calls
>    virDomainObjListRemove().
>
>  - Thread 2 does virDomainCreate() and tries to lock the domain.
>
>  - Thread 1 needs to lock domain list in order to remove the domain
from
>    it, but must unlock domain first (proper order is to lock domain
list
>    first and the domain itself second).
>
>  - Thread 2 grabs the lock, starts the domain and releases the lock.
>
>  - Thread 1 grabs the lock and removes the domain from list.
>
>With this patch:
>
>  - The undefining domain gets marked as "to undefine" before it is
>    unlocked.
>
>  - If domain is found in any of the search APIs, it's returned only if
>    it is not marked as "to undefine".  The check is done while the
>    domain is locked.
>
>Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1150505
>
>Signed-off-by: Martin Kletzander <mkletzan@xxxxxxxxxx>
>---
>  src/conf/domain_conf.c | 23 ++++++++++++++++++++---
>  src/conf/domain_conf.h |  1 +
>  2 files changed, 21 insertions(+), 3 deletions(-)
>
>diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
>index 43574e1..b92a58a 100644
>--- a/src/conf/domain_conf.c
>+++ b/src/conf/domain_conf.c
>@@ -1054,8 +1054,13 @@ virDomainObjPtr
virDomainObjListFindByID(virDomainObjListPtr doms,
>      virDomainObjPtr obj;
>      virObjectLock(doms);
>      obj = virHashSearch(doms->objs, virDomainObjListSearchID, &id);
>-    if (obj)
>+    if (obj) {
>          virObjectLock(obj);
>+        if (obj->undefining) {
>+            virObjectUnlock(obj);
>+            obj = NULL;
>+        }
>+    }
>      virObjectUnlock(doms);
>      return obj;
>  }

I find this too hackish, Wouldn't it be better to hold the domain list
locked during the whole undefine procedure? On one hand the
throughput would
get hurt but on the other, the code would be cleaner. Or maybe we can
just
make the Undefine() API to grab the QEMU_JOB_MODIFY job which would
make the
APIs serialize.

Yep, it feels like "Undefine" could be treated as a job, so that all
the other APIs get blocked/serialized by it.


That won't work for anything else than QEMU.  I tried creating
something like a job in the generic level which would fix it in
e.g. LXC as well.

I've always felt that we should have a job control in other drivers too. In the meantime, either we should go with domain list locked through some APIs, or just solve this in QEMU driver and postpone the fix in other drivers until such time we introduce job control to them.

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list




[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]