On Tue, Jan 26, 2016 at 01:31:25PM +0300, Maxim Nestratov wrote:
26.01.2016 12:29, Martin Kletzander пишет:On Mon, Jan 25, 2016 at 12:16:12PM +0300, Maxim Nestratov wrote:A pretty nasty deadlock occurs while trying to rename a VM in parallel with virDomainObjListNumOfDomains. The short description of the problem is as follows: Thread #1: qemuDomainRename: ------> aquires domain lock by qemuDomObjFromDomain ---------> waits for domain list lock in any of the listed functions: - virDomainObjListFindByName - virDomainObjListRenameAddNew - virDomainObjListRenameRemove Thread #2: virDomainObjListNumOfDomains: ------> aquires domain list lock ---------> waits for domain lock in virDomainObjListCountOuch, the rename API should hold the list lock all the time it's doing something and not acquire it when any domain is locked. But you are duplicating a lot of code,Not that much
I meant to edit that out after reading the whole e-mail, my apologies.
The patch establishes a single order of taking locks: driver->domains list first, then a particular VM. This is done by implementing a set of virDomainObjListXxxLocked functions working with driver->domains that assume that list lock is already aquired by calling functions. Signed-off-by: Maxim Nestratov <mnestratov@xxxxxxxxxxxxx> --- src/conf/virdomainobjlist.c | 74 +++++++++++++++++++++++++++++++++++++-------- src/conf/virdomainobjlist.h | 9 ++++++ src/libvirt_private.syms | 4 +++ src/qemu/qemu_driver.c | 40 +++++++++++++++++++++--- 4 files changed, 110 insertions(+), 17 deletions(-) diff --git a/src/conf/virdomainobjlist.c b/src/conf/virdomainobjlist.c index 7568f93..89e28a8 100644 --- a/src/conf/virdomainobjlist.c +++ b/src/conf/virdomainobjlist.c @@ -132,21 +132,19 @@ virDomainObjPtr virDomainObjListFindByID(virDomainObjListPtr doms, static virDomainObjPtr -virDomainObjListFindByUUIDInternal(virDomainObjListPtr doms, +virDomainObjListFindByUUIDInternalLocked(virDomainObjListPtr doms, const unsigned char *uuid, bool ref)Indentation is off for renamed functions.Ok. I'll fix it if we decide to go on with my approach.{ char uuidstr[VIR_UUID_STRING_BUFLEN]; virDomainObjPtr obj; - virObjectLock(doms); virUUIDFormat(uuid, uuidstr); obj = virHashLookup(doms->objs, uuidstr); - if (ref) { + if (ref) virObjectRef(obj); - virObjectUnlock(doms); - } + if (obj) { virObjectLock(obj); if (obj->removing) { @@ -156,8 +154,19 @@ virDomainObjListFindByUUIDInternal(virDomainObjListPtr doms, obj = NULL; } } - if (!ref) - virObjectUnlock(doms); + return obj; +} + +static virDomainObjPtr +virDomainObjListFindByUUIDInternal(virDomainObjListPtr doms, + const unsigned char *uuid, + bool ref) +{ + virDomainObjPtr obj; + + virObjectLock(doms); + obj = virDomainObjListFindByUUIDInternalLocked(doms, uuid, ref); + virObjectUnlock(doms);This won't work for ref == true as in that case the unlock is not the last thing done in virDomainObjListFindByUUIDInternalLocked().I still think that this won't cause any problem as far as list lock is aquired. Otherwise we would have seen them in virDomainObjListNumOfDomains
That's kind of why there's the difference between ref=true and ref=false. The thing is that ref=false is prone to deadlocks even though they are not as easy to reproduce. We'll see what the decision will be. Martin
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list