On 03/05/2018 10:39 AM, Wuzongyong (Euler Dept) wrote:
>
>
> Thanks,
> Zongyong Wu

[Please don't top post on technical lists]

>
>
>> -----Original Message-----
>> From: Michal Privoznik [mailto:mprivozn@xxxxxxxxxx]
>> Sent: Monday, March 05, 2018 5:27 PM
>> To: Wuzongyong (Euler Dept) <cordius.wu@xxxxxxxxxx>;
>> libvir-list@xxxxxxxxxx
>> Cc: Wanzongshun (Vincent) <wanzongshun@xxxxxxxxxx>; weijinfen
>> <weijinfen@xxxxxxxxxx>
>> Subject: Re: [Question]Libvirt doesn't care about qemu monitor
>> event if fail to destroy qemu process
>>
>> On 03/05/2018 03:20 AM, Wuzongyong (Euler Dept) wrote:
>>> Hi,
>>>
>>> We unregister qemu monitor after sending QEMU_PROCESS_EVENT_MONITOR_EOF
>>> to workerPool:
>>>
>>> static void
>>> qemuProcessHandleMonitorEOF(qemuMonitorPtr mon,
>>>                             virDomainObjPtr vm,
>>>                             void *opaque) {
>>>     virQEMUDriverPtr driver = opaque;
>>>     qemuDomainObjPrivatePtr priv;
>>>     struct qemuProcessEvent *processEvent;
>>>     ...
>>>     processEvent->eventType = QEMU_PROCESS_EVENT_MONITOR_EOF;
>>>     processEvent->vm = vm;
>>>
>>>     virObjectRef(vm);
>>>     if (virThreadPoolSendJob(driver->workerPool, 0, processEvent) < 0) {
>>>         ignore_value(virObjectUnref(vm));
>>>         VIR_FREE(processEvent);
>>>         goto cleanup;
>>>     }
>>>
>>>     /* We don't want this EOF handler to be called over and over while the
>>>      * thread is waiting for a job.
>>>      */
>>>     qemuMonitorUnregister(mon);
>>>     ...
>>> }
>>>
>>> Then we handle QEMU_PROCESS_EVENT_MONITOR_EOF in processMonitorEOFEvent
>>> function:
>>>
>>> static void
>>> processMonitorEOFEvent(virQEMUDriverPtr driver,
>>>                        virDomainObjPtr vm) {
>>>     ...
>>>     if (qemuProcessBeginStopJob(driver, vm, QEMU_JOB_DESTROY, true) < 0)
>>>         return;
>>>     ...
>>> }
>>>
>>> Here, libvirt will show that the vm state is running all the time if
>>> qemuProcessBeginStopJob returns -1, even though qemu may terminate or be
>>> killed later.
>>>
>>> So, maybe we should re-register the monitor when
>>> qemuProcessBeginStopJob failed?
>>
>> The fact that processMonitorEOFEvent() failed to grab DESTROY job means
>> that we screwed up earlier and now you're just seeing effects of it.
>> Threads should be able to acquire DESTROY job at any point, regardless of
>> other jobs set on the domain object.
>>
>> Can you please:
>> a) try to turn on debug logs [1] and tell us why acquiring DESTROY job
>> failed? You should see an error message like this:
>>
>>     error: cannot acquire state change lock ..
>>
>> b) tell us what is your libvirt version and if you're able to reproduce
>> this with the latest git HEAD?
>>
>
> I said "qemuProcessBeginStopJob failed" means that:

Oh, I thought that the message you've sent earlier is related to this:

https://www.redhat.com/archives/libvir-list/2018-March/msg00148.html

So you are not accidentally sending SIGKILL to qemu then?

> we failed to kill qemu process in 15 seconds (refer to virProcessKillPainfully).
> IOW, we send SIGTERM and SIGKILL but the qemu process doesn't exit in 15s, and
> then libvirt will think qemu is still in running state even though qemu does
> indeed exit after the 15s loop in virProcessKillPainfully.

What state is the qemu process in then? I mean, how can we see EOF if the
process still exists?

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
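
For readers following the thread, the escalation described in the quoted
text (send SIGTERM, later SIGKILL, give up after a fixed window) can be
sketched in plain POSIX C. This is not libvirt's virProcessKillPainfully();
the helper names, the 50 ms poll step and the timeout parameters are
assumptions made up for illustration only.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper, not libvirt code: non-zero if pid still exists. */
static int process_exists(pid_t pid)
{
    /* Signal 0 only performs the existence/permission check. */
    return kill(pid, 0) == 0 || errno != ESRCH;
}

/* Hypothetical helper: sleep for roughly ms milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/*
 * Send SIGTERM, escalate to SIGKILL once term_ms have passed, and stop
 * waiting after total_ms. Returns 0 if the process is gone, -1 if it
 * still exists when the window closes.
 */
static int kill_with_escalation(pid_t pid, long term_ms, long total_ms)
{
    const long step_ms = 50;   /* poll interval, an arbitrary choice */
    long waited = 0;
    int sent_kill = 0;

    if (kill(pid, SIGTERM) < 0)
        return errno == ESRCH ? 0 : -1;

    while (waited < total_ms) {
        if (!process_exists(pid))
            return 0;
        if (!sent_kill && waited >= term_ms) {
            if (kill(pid, SIGKILL) < 0 && errno == ESRCH)
                return 0;
            sent_kill = 1;
        }
        sleep_ms(step_ms);
        waited += step_ms;
    }
    /* Final check after the window closes. */
    return process_exists(pid) ? -1 : 0;
}
```

A return value of -1 here corresponds to the case under discussion: the
caller stopped waiting while the process still existed, so whatever happens
to that process afterwards has to be noticed by some other means.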