On Tue, Sep 22, 2020 at 12:07 PM Ján Tomko <jtomko@xxxxxxxxxx> wrote:
>
> On a Tuesday in 2020, Douglas, William wrote:
> >On Tue, Sep 22, 2020 at 1:11 AM Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
> >>
> >> On Mon, Sep 21, 2020 at 12:05:48PM -0700, Douglas, William wrote:
> >> > On Tue, Sep 15, 2020 at 5:53 AM Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
> >> > >
> >> > > On Thu, Aug 27, 2020 at 11:24:32AM -0700, William Douglas wrote:
> >> >
> >> > <snip>
> >> >
> >> > > > +virCHMonitorPtr
> >> > > > +virCHMonitorNew(virDomainObjPtr vm, const char *socketdir)
> >> > > > +{
> >> > > > +    virCHMonitorPtr ret = NULL;
> >> > > > +    virCHMonitorPtr mon = NULL;
> >> > > > +    virCommandPtr cmd = NULL;
> >> > > > +    int pings = 0;
> >> > > > +
> >> > > > +    if (virCHMonitorInitialize() < 0)
> >> > > > +        return NULL;
> >> > > > +
> >> > > > +    if (!(mon = virObjectLockableNew(virCHMonitorClass)))
> >> > > > +        return NULL;
> >> > > > +
> >> > > > +    mon->socketpath = g_strdup_printf("%s/%s-socket", socketdir, vm->def->name);
> >> > > > +
> >> > > > +    /* prepare to launch Cloud-Hypervisor socket */
> >> > > > +    if (!(cmd = chMonitorBuildSocketCmd(vm, mon->socketpath)))
> >> > > > +        goto cleanup;
> >> > > > +
> >> > > > +    if (virFileMakePath(socketdir) < 0) {
> >> > > > +        virReportSystemError(errno,
> >> > > > +                             _("Cannot create socket directory '%s'"),
> >> > > > +                             socketdir);
> >> > > > +        goto cleanup;
> >> > > > +    }
> >> > > > +
> >> > > > +    /* launch Cloud-Hypervisor socket */
> >> > > > +    if (virCommandRunAsync(cmd, &mon->pid) < 0)
> >> > > > +        goto cleanup;
> >> > > > +
> >> > > > +    /* get a curl handle */
> >> > > > +    mon->handle = curl_easy_init();
> >> > > > +
> >> > > > +    /* try to ping VMM socket 5 times to make sure it is ready */
> >> > > > +    while (pings < 5) {
> >> > > > +        if (virCHMonitorPingVMM(mon) == 0)
> >> > > > +            break;
> >> > > > +        if (pings == 5)
> >> > > > +            goto cleanup;
> >> > > > +
> >> > > > +        g_usleep(100 * 1000);
> >> > > > +    }
> >> > >
> >> > > This is highly undesirable. Is there no way to launch the CH process
> >> > > such that the socket is guaranteed to be accepting requests by the
> >> > > time it has forked into the background? Or is there a way to pass
> >> > > in a pre-opened FD for the monitor socket?
> >> > >
> >> > > This kind of wait with timeout will cause startup failures due to
> >> > > timeout under load.
> >> >
> >> > Currently the cloud-hypervisor process doesn't fork into the
> >> > background, so that was initially what I was thinking to add to
> >> > cloud-hypervisor. Would tracking the pid of the cloud-hypervisor
> >> > running in the background be required at that point (assuming the
> >> > socket path to control the daemon would be known and working given
> >> > virCommandRun returns successfully)?
> >>
> >> It doesn't especially matter whether cloud-hypervisor forks into
> >> the background itself, or whether libvirt forks it into the
> >> background when launching it.
> >>
> >> The important thing is simply to ensure that whichever startup
> >> approach is used can be implemented in a race-free manner without
> >> needing busy-waits or timeouts.
> >>
> >> If cloud-hypervisor forks into the background, then it would
> >> need to write a pidfile so libvirt can determine the pid that
> >> ended up running. The act of libvirt waiting for the pidfile
> >> to be written is potentially racy though, so that might not
> >> be best.
> >>
> >> Based on what we learnt from QEMU, I think being able to pass
> >> in a pre-created UNIX listener socket is the best. That lets
> >> libvirt immediately issue a connect() straight after forking the
> >> cloud-hypervisor process. If cloud-hypervisor successfully
> >> starts up, then the client socket becomes live and can be used.
> >> If it fails to start up, then the client socket libvirt has
> >> will get EOF and thus we'll see the startup failure. This
> >> avoids any looping/sleeping/timeout.
> >
> >I think I'm misreading/missing something from the qemu setup, but
> >looking at qemu_monitor.c's qemuMonitorOpenUnix function I see a retry
> >connect loop based on a timeout when trying to connect to a socket.
>
> That loop is guarded by: if (retry)
>
> The important code path:
>   qemuProcessLaunch
>     `_ qemuProcessWaitForMonitor
>        `_ qemuConnectMonitor
>
> qemuProcessWaitForMonitor sets retry=false if we're dealing with
> a recent enough QEMU that supports FD passing for its chardevs.
>
> The logic that creates the socket on libvirt's side lives (sadly)
> in qemuBuildChrChardevStr.
>

Ah, I was missing the possibility of retry being set to false (I saw it
would be set to false on a reconnect, I think, but didn't look much
further) and that qemuBuildChrChardevStr is the key part of the setup.
Thank you very much!
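For reference, below is a minimal sketch of the pre-created listener approach
Daniel describes, written with plain POSIX calls rather than the real libvirt
or cloud-hypervisor APIs. The helper names (make_listener, connect_client) are
hypothetical, and the hand-off of the listening FD to the child process (e.g.
via virCommandPassFD() plus whatever command-line option the VMM grows to
accept an inherited socket) is only noted in comments, since that interface is
exactly what is being discussed in this thread.

/*
 * Hypothetical sketch only: the parent creates and listens on the UNIX
 * socket before launching the VMM, hands the listening FD to the child
 * (e.g. via virCommandPassFD()), and then connect()s immediately. The
 * kernel queues the connection on the already-listening socket, so if
 * the VMM comes up and accepts, the connection goes live; if the VMM
 * exits without accepting, the client end sees EOF (or an error).
 * No polling, sleeping, or timeouts.
 */
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

static int
make_listener(const char *path)
{
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    unlink(path);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }

    return fd;  /* this listening FD is what would be passed to the child */
}

static int
connect_client(const char *path)
{
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    /* Succeeds straight away because the socket is already listening;
     * a later read() reports EOF/error if the VMM never comes up. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    return fd;
}

On the libvirt side the two helpers would bracket virCommandRunAsync(): create
the listener, pass its FD to the command, launch it, then connect as a client,
which would make the ping loop and g_usleep() in the patch above unnecessary.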