libvirt segfaults with "internal error: Missing monitor reply object" during block live-migration

Dear libvirt community,


Using the recent OpenStack Stein packages from the Ubuntu Cloud Archive, we are observing random libvirtd crashes on the target host during live migration.
libvirtd segfaults in the qemu driver. Transferring the block devices usually works without issues.
However, the memory transfer that follows randomly causes the target libvirtd to close its socket, and the migration is rolled back. I can reproduce this with large VMs that have a large amount of memory.

The last error message we see in libvirt logs is:
error : qemuMonitorJSONCommandWithFd:315 : internal error: Missing monitor reply object

Right after this message, libvirtd segfaults and restarts.
Before we encountered this issue, we were running an older nova-compute package (19.0.3).
I am not sure whether the upgrade changed how the libvirt API is used.
Since the upgrade, we also see many recurring errors during migration:

warning : qemuDomainObjBeginJobInternal:7044 : Cannot start job (query, none, none) for domain instance-00008f56; current job is (none, none, migration in) owned by (0 <null>, 0 <null>, 0 remoteDispatchDomainMigratePrepare3Params (flags=0x809b)) for (0s, 0s, 14834s)
error : qemuDomainObjBeginJobInternal:7066 : Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainMigratePrepare3Params)

They do not abort the running migration, but they are written to the systemd journal every minute.
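
For anyone who wants to tally these messages or see how long a job has been holding the lock, here is a minimal Python sketch that pulls the fields out of the warning line. The pattern is derived only from the single sample line quoted above, so the field layout is an assumption, and `parse_job_warning` is a hypothetical helper name, not part of libvirt or nova. Reading the three trailing durations as matching the three positions of the job tuple is likewise an assumption.

```python
import re

# Pattern derived from the one sample warning quoted above; the field
# layout is an assumption and may differ in other libvirt versions.
JOB_WARNING = re.compile(
    r"Cannot start job \((?P<wanted>[^)]*)\) for domain (?P<domain>\S+); "
    r"current job is \((?P<current>[^)]*)\)"
    r".* for \((?P<durations>[^)]*)\)\s*$"
)


def _split(fields):
    """Split a comma-separated tuple body into stripped strings."""
    return [part.strip() for part in fields.split(",")]


def parse_job_warning(line):
    """Return the fields of a 'Cannot start job' warning, or None.

    Hypothetical helper for illustration only.
    """
    match = JOB_WARNING.search(line)
    if match is None:
        return None
    return {
        "domain": match.group("domain"),
        "wanted": _split(match.group("wanted")),
        "current": _split(match.group("current")),
        # Durations are printed as e.g. "14834s"; drop the unit suffix.
        "durations_s": [int(d.rstrip("s")) for d in _split(match.group("durations"))],
    }


if __name__ == "__main__":
    sample = (
        "warning : qemuDomainObjBeginJobInternal:7044 : Cannot start job "
        "(query, none, none) for domain instance-00008f56; current job is "
        "(none, none, migration in) owned by (0 <null>, 0 <null>, "
        "0 remoteDispatchDomainMigratePrepare3Params (flags=0x809b)) "
        "for (0s, 0s, 14834s)"
    )
    print(parse_job_warning(sample))
```

Feeding it the journal output line by line and counting the non-`None` results per domain would give a rough picture of how often the lock contention recurs.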

Source and destination run the same packages:

Ubuntu 18.04.4 LTS (GNU/Linux 4.15.0-99-generic x86_64)
OpenStack Stein (Ubuntu Cloud Archive)
Libvirt+QEMU_x86
keystone-common 2:15.0.1-0ubuntu1~cloud0
libvirt-daemon 5.0.0-1ubuntu2.6~cloud0
qemu-system-x86 1:3.1+dfsg-2ubuntu3.7~cloud0
neutron-linuxbridge-agent 2:14.2.0-0ubuntu1~cloud0
neutron-plugin-ml2 2:14.2.0-0ubuntu1~cloud0
nova-compute 2:19.2.0-0ubuntu1~cloud0
nova-compute-libvirt 2:19.2.0-0ubuntu1~cloud0

The debug logs from libvirtd and nova-compute on both the source and the destination host are available here:

https://denzelx.ddns.net/index.php/s/KPJ7vv4aTcb69XD

Any help would be appreciated!


Best Regards
-- 
M.Sc Alex Walender
de.NBI Cloud Bielefeld Administrator
Center for Biotechnology (CeBiTec)

University of Bielefeld
33594 Bielefeld
Germany
room: M3-118
phone: +49 (521) 106 2907
