On 2020/05/04 17:13, Michal Privoznik wrote:
>On 5/4/20 10:07 AM, Peter Krempa wrote:
>> On Fri, May 01, 2020 at 16:09:04 +0900, MIKI Nobuhiro wrote:
>>> The waiting time to acquire the lock times out, which leads to a segment fault.
>>
>> Could you please elaborate here? Adding this band-aid is pointless if it
>> can timeout later. We do want to fix any locking issue but without
>> information we can't really.
>>
>>> In essence we should make improvements around locks, but as a workaround we
>>> will change the timeout to allow the user to increase it.
>>> This value was defined as 30 seconds, so use it as the default value.
>>> The logs are as follows:
>>>
>>> ```
>>> Timed out during operation: cannot acquire state change lock \
>>> (held by monitor=remoteDispatchDomainCreateWithFlags)
>>> libvirtd.service: main process exited, code=killed,status=11/SEGV
>>> ```
>>
>> Unfortunately I don't consider this a proper justification for the
>> change below. Either re-state why you want this, e.g. saying that
>> shortening time may give users greater feedback, but mentioning that it
>> works around a crash is not acceptable as a justification for something
>> which doesn't fix the crash.
>
>Agreed. Allowing users to configure the timeout makes sense - we already
>do that for other timeouts, but if it is masking a real bug we need to
>fix it first. Do you have any steps to reproduce the bug? Are you able
>to get the stack trace from the coredump?

Here is a stack trace from the coredump. However, I tested again today on the
master branch (commit eea5d63a221a8f36a3ed5b1189fe619d4fa1fde2) and every
virtual machine booted successfully, so it seems this bug has already been
fixed. I apologize for any time you may have spent on this.

(gdb) p mon
$1 = (qemuMonitor *) 0x7fe0dc0142e0
(gdb) p mon->msg
$2 = (qemuMonitorMessagePtr) 0x0
# I suspect that mon is shared between worker threads and some thread may
# have set mon->msg = NULL; a standalone sketch of that pattern follows the
# backtrace below.
(gdb) bt
#0  qemuMonitorSend (mon=mon@entry=0x7fe0dc0142e0, msg=msg@entry=0x7fe0e3f32350)
    at qemu/qemu_monitor.c:981
#1  0x00007fe0d23c4428 in qemuMonitorJSONCommandWithFd (mon=0x7fe0dc0142e0,
    cmd=cmd@entry=0x7fe0dc014660, scm_fd=scm_fd@entry=-1,
    reply=reply@entry=0x7fe0e3f323e0) at qemu/qemu_monitor_json.c:333
#2  0x00007fe0d23c61cf in qemuMonitorJSONCommand (reply=0x7fe0e3f323e0,
    cmd=0x7fe0dc014660, mon=<optimized out>) at qemu/qemu_monitor_json.c:358
#3  qemuMonitorJSONSetCapabilities (mon=<optimized out>)
    at qemu/qemu_monitor_json.c:1611
#4  0x00007fe0d23b6453 in qemuMonitorSetCapabilities (mon=<optimized out>)
    at qemu/qemu_monitor.c:1582
#5  0x00007fe0d2394e43 in qemuProcessInitMonitor (asyncJob=QEMU_ASYNC_JOB_START,
    vm=0x7fe0cc028670, driver=0x7fe0801290c0) at qemu/qemu_process.c:1928
#6  qemuConnectMonitor (driver=driver@entry=0x7fe0801290c0,
    vm=vm@entry=0x7fe0cc028670, asyncJob=asyncJob@entry=6,
    retry=retry@entry=false, logCtxt=logCtxt@entry=0x7fe0dc044b40)
    at qemu/qemu_process.c:2003
#7  0x00007fe0d239b69c in qemuProcessWaitForMonitor (logCtxt=0x7fe0dc044b40,
    asyncJob=6, vm=0x7fe0cc028670, driver=0x7fe0801290c0)
    at qemu/qemu_process.c:2413
#8  qemuProcessLaunch (conn=conn@entry=0x7fe0c4000a00,
    driver=driver@entry=0x7fe0801290c0, vm=vm@entry=0x7fe0cc028670,
    asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0,
    snapshot=snapshot@entry=0x0,
    vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=flags@entry=17)
    at qemu/qemu_process.c:6993
#9  0x00007fe0d239f8f2 in qemuProcessStart (conn=conn@entry=0x7fe0c4000a00,
    driver=driver@entry=0x7fe0801290c0, vm=vm@entry=0x7fe0cc028670,
    updatedCPU=updatedCPU@entry=0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START,
    migrateFrom=migrateFrom@entry=0x0, migrateFd=migrateFd@entry=-1,
    migratePath=migratePath@entry=0x0, snapshot=snapshot@entry=0x0,
    vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17, flags@entry=1)
    at qemu/qemu_process.c:7230
#10 0x00007fe0d2402d59 in qemuDomainObjStart (conn=0x7fe0c4000a00,
    driver=driver@entry=0x7fe0801290c0, vm=0x7fe0cc028670, flags=flags@entry=0,
    asyncJob=QEMU_ASYNC_JOB_START) at qemu/qemu_driver.c:7650
#11 0x00007fe0d2403436 in qemuDomainCreateWithFlags (dom=0x7fe0dc0050d0, flags=0)
    at qemu/qemu_driver.c:7703
#12 0x00007fe0f394f88d in virDomainCreateWithFlags (domain=domain@entry=0x7fe0dc0050d0,
    flags=0) at libvirt-domain.c:6600
#13 0x000055d9e00348a2 in remoteDispatchDomainCreateWithFlags (server=0x55d9e1c95140,
    msg=0x55d9e1cb7d10, ret=0x7fe0dc004b80, args=0x7fe0dc005110,
    rerr=0x7fe0e3f32c10, client=<optimized out>)
    at remote/remote_daemon_dispatch_stubs.h:4819
#14 remoteDispatchDomainCreateWithFlagsHelper (server=0x55d9e1c95140,
    client=<optimized out>, msg=0x55d9e1cb7d10, rerr=0x7fe0e3f32c10,
    args=0x7fe0dc005110, ret=0x7fe0dc004b80)
    at remote/remote_daemon_dispatch_stubs.h:4797
#15 0x00007fe0f387c0d9 in virNetServerProgramDispatchCall (msg=0x55d9e1cb7d10,
    client=0x55d9e1cb6ce0, server=0x55d9e1c95140, prog=0x55d9e1cb3a40)
    at rpc/virnetserverprogram.c:435
#16 virNetServerProgramDispatch (prog=0x55d9e1cb3a40,
    server=server@entry=0x55d9e1c95140, client=0x55d9e1cb6ce0, msg=0x55d9e1cb7d10)
    at rpc/virnetserverprogram.c:302
#17 0x00007fe0f388137d in virNetServerProcessMsg (msg=<optimized out>,
    prog=<optimized out>, client=<optimized out>, srv=0x55d9e1c95140)
    at rpc/virnetserver.c:137
#18 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55d9e1c95140)
    at rpc/virnetserver.c:158
#19 0x00007fe0f37a9c31 in virThreadPoolWorker (opaque=opaque@entry=0x55d9e1c94e50)
    at util/virthreadpool.c:163
#20 0x00007fe0f37a9038 in virThreadHelper (data=<optimized out>)
    at util/virthread.c:196
#21 0x00007fe0f0d8ce65 in start_thread () from /lib64/libpthread.so.0
#22 0x00007fe0f0ab588d in clone () from /lib64/libc.so.6

>> Changes to news.xml always must be in a separate commit.
>
>Just a short explanation - this is to ease possible backports. For
>instance, if there is a bug fix in version X, but a distro wants to
>backport it to version X-1, then the news.xml looks completely different
>there and the cherry-pick won't apply cleanly.

Thank you for your reviews. I think making the timeout configurable might
still be useful in other situations, so I will rework this patch and submit
it again.
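
For reference, here is a minimal, standalone sketch of the kind of race I
suspect. This is not libvirt code: the "monitor" struct and the
sender_thread()/resetter_thread() functions are made-up names, and the real
interaction inside qemuMonitorSend() may well be different; it only
illustrates how one worker thread clearing a shared message pointer can crash
another thread that still dereferences it.

```c
/* Hypothetical illustration only -- "monitor", "sender_thread" and
 * "resetter_thread" are made-up names, not libvirt identifiers. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct monitor {
    pthread_mutex_t lock;
    char *msg;                  /* shared between threads */
};

/* Unsafe: dereferences mon->msg without holding mon->lock, so it races
 * with resetter_thread() below. */
static void *sender_thread(void *opaque)
{
    struct monitor *mon = opaque;
    /* If the resetter ran first, mon->msg is NULL here and strlen()
     * dereferences a NULL pointer -> SIGSEGV, as in the coredump. */
    printf("sending %zu bytes\n", strlen(mon->msg));
    return NULL;
}

static void *resetter_thread(void *opaque)
{
    struct monitor *mon = opaque;
    pthread_mutex_lock(&mon->lock);
    free(mon->msg);
    mon->msg = NULL;            /* what I suspect happens to mon->msg */
    pthread_mutex_unlock(&mon->lock);
    return NULL;
}

int main(void)
{
    struct monitor mon = { PTHREAD_MUTEX_INITIALIZER, strdup("query-status") };
    pthread_t t1, t2;

    pthread_create(&t1, NULL, resetter_thread, &mon);
    pthread_create(&t2, NULL, sender_thread, &mon);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_mutex_destroy(&mon.lock);
    return 0;
}
```

Built with `gcc -pthread`, running it a few times may or may not crash,
depending on which thread wins the race, which is consistent with the crash
being hard to reproduce.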
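
Roughly, what I have in mind for the reworked patch is the usual pattern of
keeping the current hard-coded 30 seconds as the compiled-in default and only
overriding it when the admin sets a value in the driver configuration. A
minimal standalone sketch of that pattern (the names below are placeholders,
not the actual libvirt identifiers or config keys):

```c
/* Placeholder names only -- DEFAULT_JOB_WAIT_SECONDS, driver_config and
 * effective_job_wait() are not real libvirt identifiers. */
#include <stdio.h>

#define DEFAULT_JOB_WAIT_SECONDS 30   /* the value that is hard-coded today */

struct driver_config {
    unsigned int job_wait_seconds;    /* 0 means "not set in the config file" */
};

/* Use the admin-provided value when present, otherwise keep the old default. */
static unsigned int effective_job_wait(const struct driver_config *cfg)
{
    return cfg->job_wait_seconds ? cfg->job_wait_seconds
                                 : DEFAULT_JOB_WAIT_SECONDS;
}

int main(void)
{
    struct driver_config unset = { 0 };    /* nothing configured */
    struct driver_config tuned = { 120 };  /* admin raised the timeout */

    printf("default: %u s\n", effective_job_wait(&unset));  /* prints 30 */
    printf("tuned:   %u s\n", effective_job_wait(&tuned));  /* prints 120 */
    return 0;
}
```

That way existing installations keep the current behaviour unless they
explicitly opt in to a longer (or shorter) timeout.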