Eric Blake wrote: > On 06/20/2012 03:07 PM, Jim Fehlig wrote: > >> I'm looking into a libvirtd deadlock on daemon shutdown. The deadlock >> occurs when shutting down virNetServer. If the handling of a job is in >> flight in virNetServerHandleJob(), the virNetServer lock is acquired >> when freeing job->prog (src/rpc/virnetserver.c:167). But the lock is >> already held in virNetServerFree(), which is blocked in >> virThreadPoolFree() waiting for all the workers to finish. No progress >> can be made. >> >> The attached hack fixes the problem, but I'm not convinced this is an >> appropriate fix. Is it necessary to hold the virNetServer lock when >> calling virNetServerProgramFree(job->prog)? I notice the lock is not >> held in the error path of virNetServerHandleJob(). >> >> > > >> +++ b/src/rpc/virnetserver.c >> @@ -774,7 +774,9 @@ void virNetServerFree(virNetServerPtr srv) >> for (i = 0 ; i < srv->nservices ; i++) >> virNetServerServiceToggle(srv->services[i], false); >> >> + virNetServerUnlock(srv); >> virThreadPoolFree(srv->workers); >> + virNetServerLock(srv); >> > > As written, this can't be right, because it reads a field of srv outside > the locks. But maybe you meant: > > <type> tmp = srv->workers; > srv->workers = NULL; > virNetServerUnlock(srv); > virThreadPoolFree(tmp); > virNetServerLock(srv); > > as a possible fix that doesn't violate the safety rules of reading > fields from srv outside a lock. > Perhaps, but it _currently_ appears srv->workers is only set in virNetServerNew(), which is called when libvirtd starts. I suppose that could change in the future, causing a bug as I wrote it. > I hope danpb chimes in on this one, as he is more of an expert when it > comes to the locking rules in virnet*. > Agreed. I think there is a better fix here. Thanks, Jim -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list