On Tue, Jun 04, 2019 at 14:44:29 +0200, Lentes, Bernd wrote:
> Hi,

Hi,

>
> i have several domains running on a 2-node HA-cluster.
> Each night i create snapshots of the domains, after copying the consistent raw file to a CIFS server i blockcommit the changes into the raw files.
> That's running quite well.
> But recent the blockcommit didn't work for one domain:
> I create a logfile from the whole procedure:
> ===============================================================
> ...
> Sat Jun 1 03:05:24 CEST 2019
> Target     Source
> ------------------------------------------------
> vdb        /mnt/snap/severin.sn
> hdc        -
>
> /usr/bin/virsh blockcommit severin /mnt/snap/severin.sn --verbose --active --pivot
> Block commit: [ 0 %]Block commit: [ 15 %]Block commit: [ 28 %]Block commit: [ 35 %]Block commit: [ 43 %]Block commit: [ 53 %]Block commit: [ 63 %]Block commit: [ 73 %]Block commit: [ 82 %]Block commit: [ 89 %]Block commit: [ 98 %]Block commit: [100 %]Target     Source
> ------------------------------------------------
> vdb        /mnt/snap/severin.sn
> ...
> ==============================================================
>
> The libvirtd-log says (it's UTC IIRC):
> =============================================================
> ...
> 2019-05-31 20:31:34.481+0000: 4170: error : qemuMonitorIO:719 : internal error: End of file from qemu monitor
> 2019-06-01 01:05:32.233+0000: 4170: error : qemuMonitorIO:719 : internal error: End of file from qemu monitor

This message is printed if qemu crashes for some reason and then closes the
monitor socket unexpectedly.
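
Independently of the monitor error, the script log quoted above already shows
that the pivot did not take effect: the table printed after the blockcommit
still lists vdb on /mnt/snap/severin.sn. A minimal sketch of a check the
nightly script could run after the commit, assuming that table is
`virsh domblklist` output captured as text. The function names and the
/mnt/snap prefix check are illustrative, not part of libvirt:

```python
def parse_domblklist(text):
    """Parse a two-column 'Target  Source' table into {target: source}."""
    disks = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank lines and the dashed separator have no second column
        target, source = parts[0], parts[1].strip()
        if target == "Target":
            continue  # header row
        disks[target] = source
    return disks


def disks_still_on_snapshot(text, snapshot_dir):
    """Return the targets whose source still lives under snapshot_dir."""
    prefix = snapshot_dir.rstrip("/") + "/"
    table = parse_domblklist(text)
    return sorted(t for t, s in table.items() if s.startswith(prefix))


# Table as printed in the script log after the blockcommit.
AFTER_COMMIT = """\
Target     Source
------------------------------------------------
vdb        /mnt/snap/severin.sn
"""

leftover = disks_still_on_snapshot(AFTER_COMMIT, "/mnt/snap")
if leftover:
    print("pivot did not happen for:", ", ".join(leftover))
```

A non-empty result means the domain is still writing to the snapshot file, so
the script should not remove it and should raise an alert instead of
continuing.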
> 2019-06-01 01:05:43.804+0000: 22605: warning : qemuGetProcessInfo:1461 : cannot parse process status data
> 2019-06-01 01:05:43.848+0000: 22596: warning : qemuGetProcessInfo:1461 : cannot parse process status data
> 2019-06-01 01:06:11.438+0000: 26112: warning : qemuDomainObjBeginJobInternal:4865 : Cannot start job (destroy, none) for domain severin; current job is (modify, none) owned by (5372 remoteDispatchDomainBlockJobAbort, 0 <null>) for (39s, 0s)
> 2019-06-01 01:06:11.438+0000: 26112: error : qemuDomainObjBeginJobInternal:4877 : Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainBlockJobAbort)

So this means that the virDomainBlockJobAbort API, which is also used for
--pivot, got stuck for some time.

This is kind of strange if the VM crashed. There might also be a bug in the
synchronous block job handling, but it's hard to tell from this log.

> 2019-06-01 01:06:13.976+0000: 5369: warning : qemuGetProcessInfo:1461 : cannot parse process status data
> 2019-06-01 01:06:14.028+0000: 22596: warning : qemuGetProcessInfo:1461 : cannot parse process status data
> 2019-06-01 01:06:44.165+0000: 5371: warning : qemuGetProcessInfo:1461 : cannot parse process status data
> 2019-06-01 01:06:44.218+0000: 22605: warning : qemuGetProcessInfo:1461 : cannot parse process status data
> 2019-06-01 01:07:14.343+0000: 5369: warning : qemuGetProcessInfo:1461 : cannot parse process status data
> 2019-06-01 01:07:14.387+0000: 22598: warning : qemuGetProcessInfo:1461 : cannot parse process status data
> 2019-06-01 01:07:44.495+0000: 22605: warning : qemuGetProcessInfo:1461 : cannot parse process status data
> ...
> ===========================================================
> and "cannot parse process status data" continuously until the end of the logfile.
>
> The syslog from the domain itself didn't reveal anything, it just continues to run.
> The libvirt log from the domains just says:
> qemu-system-x86_64: block/mirror.c:864: mirror_run: Assertion `((&bs->tracked_requests)->lh_first == ((void *)0))' failed.

So that's interesting. Usually an assertion failure in qemu leads to calling
abort(), and thus the VM would have crashed. Didn't your HA solution restart it?

At any rate it would be really beneficial if you could collect debug logs for
libvirtd which also contain the monitor interactions with qemu:

https://wiki.libvirt.org/page/DebugLogs

The qemu assertion failure above should ideally be reported to qemu, but if
you are able to reproduce the problem with libvirtd debug logs enabled, I can
extract more useful info from there, which the qemu project would ask you for
anyway.
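
Until debug logs are available, the existing libvirtd log can at least be
scanned for the job-lock timeouts discussed above, to see which API call was
holding the domain job each time. A minimal sketch, assuming the log lines
have the format shown in the quoted excerpt and that wrapped lines have been
rejoined. The names LOCK_RE and lock_timeouts are made up for this example:

```python
import re

# Matches libvirtd error lines of the form quoted above:
#   <date> <time>: <thread id>: error : ... cannot acquire state change lock (held by <API>)
LOCK_RE = re.compile(
    r"^(?P<ts>\S+ \S+): (?P<tid>\d+): error : .*"
    r"cannot acquire state change lock \(held by (?P<holder>[^)]+)\)"
)


def lock_timeouts(lines):
    """Yield (timestamp, thread id, lock holder) for each job-lock timeout."""
    for line in lines:
        match = LOCK_RE.match(line)
        if match:
            yield match.group("ts"), int(match.group("tid")), match.group("holder")


# Three lines taken from the libvirtd log quoted in this thread.
SAMPLE_LOG = """\
2019-06-01 01:05:32.233+0000: 4170: error : qemuMonitorIO:719 : internal error: End of file from qemu monitor
2019-06-01 01:05:43.804+0000: 22605: warning : qemuGetProcessInfo:1461 : cannot parse process status data
2019-06-01 01:06:11.438+0000: 26112: error : qemuDomainObjBeginJobInternal:4877 : Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainBlockJobAbort)
"""

for ts, tid, holder in lock_timeouts(SAMPLE_LOG.splitlines()):
    print(f"{ts}: thread {tid} timed out, job held by {holder}")
```

On a real system the same generator can be fed an open log file instead of
SAMPLE_LOG. The timestamps are UTC, so they need to be shifted before being
compared with the CEST times in the backup script's log.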
_______________________________________________ libvirt-users mailing list libvirt-users@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvirt-users