Dear Adam,

First, we were using a Python script that ran on 4 threads and was therefore removing 4 snapshots at a time throughout the cluster; that still caused problems. I have now taken the snapshot removal out of the threaded part and am just looping through each snapshot on each VM one after another, even with "sleeps" in between, but the problem remains. (A rough sketch of this serialized loop is appended at the end of this mail.)

I am getting the impression that it is a problem with the number of snapshots deleted within a certain time: if I delete manually and one after another (meaning every 10 min or so), I do not have problems; if I delete manually and do several at once, or on one VM start the next one right after one finished, the risk seems to increase. I do not think it is the number of VMs, because we had this on hosts with only 3 or 4 VMs running.

I will try restarting libvirtd and see what happens.

We are not using RHEL 7.1, only CentOS 7.1.

Is there anything else we can look at when this happens again?

Regards
Soeren


On 02/06/15 18:53, "Adam Litke" <alitke@xxxxxxxxxx> wrote:

>Hello Soeren.
>
>I've started to look at this issue and I'd agree that at first glance
>it looks like a libvirt issue. The 'cannot acquire state change lock'
>messages suggest a locking bug or severe contention at least. To help
>me better understand the problem I have a few questions about your
>setup.
>
>From your earlier report it appears that you have 15 VMs running on
>the failing host. Are you attempting to remove snapshots from all VMs
>at the same time? Have you tried with fewer concurrent operations?
>I'd be curious to understand if the problem is connected to the
>number of VMs running or the number of active block jobs.
>
>Have you tried RHEL-7.1 as a hypervisor host?
>
>Rather than rebooting the host, does restarting libvirtd cause the VMs
>to become responsive again? Note that this operation may cause the
>host to move to Unresponsive state in the UI for a short period of
>time.
>
>Thanks for your report.
>
>On 31/05/15 23:39 +0000, Soeren Malchow wrote:
>>And sorry, another update: it does kill the VM partly. It was still
>>pingable when I wrote the last mail, but no SSH and no SPICE console
>>possible.
>>
>>From: Soeren Malchow
>><soeren.malchow@xxxxxxxx<mailto:soeren.malchow@xxxxxxxx>>
>>Date: Monday 1 June 2015 01:35
>>To: Soeren Malchow
>><soeren.malchow@xxxxxxxx<mailto:soeren.malchow@xxxxxxxx>>,
>>"libvirt-users@xxxxxxxxxx<mailto:libvirt-users@xxxxxxxxxx>"
>><libvirt-users@xxxxxxxxxx<mailto:libvirt-users@xxxxxxxxxx>>, users
>><users@xxxxxxxxx<mailto:users@xxxxxxxxx>>
>>Subject: Re: [ovirt-users] Bug in Snapshot Removing
>>
>>Small addition again:
>>
>>This error shows up in the log while removing snapshots WITHOUT
>>rendering the VMs unresponsive:
>>
>>--
>>Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net libvirtd[1657]:
>>Timed out during operation: cannot acquire state change lock
>>Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net vdsm[6839]: vdsm
>>vm.Vm ERROR vmId=`56848f4a-cd73-4eda-bf79-7eb80ae569a9`::Error getting
>>block job info
>>
>>Traceback (most recent call last):
>>  File
>>"/usr/share/vdsm/virt/vm.py", line 5759, in queryBlockJobs…
>>
>>--
>>
>>
>>From: Soeren Malchow
>><soeren.malchow@xxxxxxxx<mailto:soeren.malchow@xxxxxxxx>>
>>Date: Monday 1 June 2015 00:56
>>To: "libvirt-users@xxxxxxxxxx<mailto:libvirt-users@xxxxxxxxxx>"
>><libvirt-users@xxxxxxxxxx<mailto:libvirt-users@xxxxxxxxxx>>, users
>><users@xxxxxxxxx<mailto:users@xxxxxxxxx>>
>>Subject: [ovirt-users] Bug in Snapshot Removing
>>
>>Dear all,
>>
>>I am not sure if the mail just did not get any attention between all the
>>mails; this time it is also going to the libvirt mailing list.
>>
>>I am experiencing a problem with VMs becoming unresponsive when removing
>>snapshots (Live Merge), and I think there is a serious problem.
>>
>>Here are the previous mails:
>>
>>http://lists.ovirt.org/pipermail/users/2015-May/033083.html
>>
>>The problem is on a system with everything on the latest version, CentOS
>>7.1 and oVirt 3.5.2.1 with all upgrades applied.
>>
>>This problem did NOT exist before upgrading to CentOS 7.1, in an
>>environment running oVirt 3.5.0 and 3.5.1 on Fedora 20 with the
>>libvirt-preview repo activated.
>>
>>I think this is a bug in libvirt, not oVirt itself, but I am not sure.
>>The actual file throwing the exception is in VDSM
>>(/usr/share/vdsm/virt/vm.py, line 697).
>>
>>We are very willing to help, test and supply log files in any way we can.
>>
>>Regards
>>Soeren
>>
>
>>_______________________________________________
>>Users mailing list
>>Users@xxxxxxxxx
>>http://lists.ovirt.org/mailman/listinfo/users
>
>
>--
>Adam Litke

_______________________________________________
libvirt-users mailing list
libvirt-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvirt-users
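
--

For reference, here is a minimal sketch of the serialized removal loop described at the top of this thread (one snapshot at a time, waiting for each merge to finish, with a pause in between). The helpers remove_snapshot() and snapshot_gone(), as well as the timing values, are placeholders only, since the original script is not part of this thread; the real deletion would go through whatever oVirt SDK or REST call that script already uses.

# Minimal sketch of the serialized snapshot-removal loop, one snapshot
# at a time across the cluster. remove_snapshot() and snapshot_gone()
# are hypothetical placeholders for the SDK/API calls used by the
# original (unpublished) script; the timing values are illustrative.

import time

PAUSE_BETWEEN_DELETES = 600   # seconds, roughly the manual "every 10 min" pace
POLL_INTERVAL = 15            # seconds between completion checks


def remove_snapshot(vm, snapshot):
    """Placeholder: trigger the live merge / snapshot deletion for one snapshot."""
    raise NotImplementedError("replace with the real oVirt SDK/API call")


def snapshot_gone(vm, snapshot):
    """Placeholder: return True once the engine reports the snapshot as removed."""
    raise NotImplementedError("replace with the real oVirt SDK/API check")


def remove_all_snapshots_serially(vms_with_snapshots):
    """Delete snapshots strictly one after another, never in parallel."""
    for vm, snapshots in vms_with_snapshots:
        for snap in snapshots:
            remove_snapshot(vm, snap)
            # wait for this deletion to finish before starting the next one
            while not snapshot_gone(vm, snap):
                time.sleep(POLL_INTERVAL)
            # pause between deletions to mimic the manual pace that did not
            # seem to trigger the problem
            time.sleep(PAUSE_BETWEEN_DELETES)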