On Thu, Apr 02, 2015 at 06:21:01PM +0200, Michal Privoznik wrote: > https://bugzilla.redhat.com/show_bug.cgi?id=1200149 > > Even though we have a mutex mechanism so that two clients don't spawn > two daemons, it's not strong enough. It can happen that while one > client is spawning the daemon, the other one fails to connect. > Basically two possible errors can happen: > > error: Failed to connect socket to '/home/mprivozn/.cache/libvirt/libvirt-sock': Connection refused > > or: > > error: Failed to connect socket to '/home/mprivozn/.cache/libvirt/libvirt-sock': No such file or directory > > The problem in both cases is, the daemon is only starting up, while we > are trying to connect (and fail). We should postpone the connecting > phase until the daemon is started (by the other thread that is > spawning it). In order to do that, create a file lock 'libvirt-lock' > in the directory where session daemon would create its socket. So even > when called from multiple processes, spawning a daemon will serialize > on the file lock. So only the first to come will spawn the daemon. > > Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx> > --- > > diff to v1: > -hopefully this one is race free > > src/rpc/virnetsocket.c | 148 ++++++++++++------------------------------------- > 1 file changed, 36 insertions(+), 112 deletions(-) ACK, with a few changes to the lock file path > > diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c > index 6b019cc..2b891ae 100644 > --- a/src/rpc/virnetsocket.c > +++ b/src/rpc/virnetsocket.c > @@ -543,10 +539,10 @@ int virNetSocketNewConnectUNIX(const char *path, > const char *binary, > virNetSocketPtr *retsock) > { > - char *binname = NULL; > - char *pidpath = NULL; > - int fd, passfd = -1; > - int pidfd = -1; > + char *lockpath = NULL; > + int lockfd = -1; > + int fd = -1; > + int retries = 100; > virSocketAddr localAddr; > virSocketAddr remoteAddr; > > @@ -561,6 +557,22 @@ int virNetSocketNewConnectUNIX(const char *path, > return -1; > } > > + if (spawnDaemon) { > + if (virPidFileConstructPath(false, NULL, "libvirt-lock", &lockpath) < 0) > + goto error; Here the lock gets named: .cache/libvirt/libvirt-lock.pid, even though it does not contain a pid and we could be using this function to spawn virlockd too, not just libvirtd. Can you move the last_component call here and name it "%s.lock", binname? Jan
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list