On 03/06/15 07:36 +0000, Soeren Malchow wrote:
> Dear Adam,
>
> First we were using a Python script that ran 4 threads and therefore removed 4 snapshots at a time throughout the cluster; that still caused problems.
>
> Now I have taken the snapshot removal out of the threaded part and am just looping through each snapshot on each VM one after another, even with "sleeps" in between, but the problem remains.
>
> But I am getting the impression that the problem is related to the number of snapshots deleted within a certain time. If I delete manually, one after another (meaning every 10 min or so), I do not have problems. If I delete manually and do several at once, and on one VM start the next one just after the previous one has finished, the risk seems to increase.
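For illustration only, the serial loop described above could be sketched roughly as follows. This is not the script from the thread: `start_removal` and `removal_finished` are hypothetical callables standing in for whatever the management API provides, and the 600-second default gap is taken only from the "every 10 min or so" observation, not from any known safe value.

```python
import time
from typing import Callable, Iterable, List, Tuple


def remove_snapshots_serially(
    snapshots: Iterable[Tuple[str, str]],
    start_removal: Callable[[str, str], None],      # hypothetical: kicks off one removal
    removal_finished: Callable[[str, str], bool],   # hypothetical: True once it is done
    min_gap_seconds: float = 600.0,                 # assumption based on "every 10 min or so"
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[str, str]]:
    """Remove (vm, snapshot) pairs one at a time, spacing the starts apart."""
    removed: List[Tuple[str, str]] = []
    last_start = None
    for vm, snap in snapshots:
        # Keep at least min_gap_seconds between the starts of two removals.
        if last_start is not None:
            wait = min_gap_seconds - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        start_removal(vm, snap)
        # Poll until this removal has finished before touching the next snapshot.
        deadline = last_start + timeout_seconds
        while not removal_finished(vm, snap):
            if clock() >= deadline:
                raise TimeoutError(f"removal of {snap} on {vm} did not finish")
            sleep(poll_seconds)
        removed.append((vm, snap))
    return removed
```

Passing `sleep` and `clock` as parameters is only a convenience so the pacing can be exercised without real waiting; in normal use the defaults would apply.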
Hmm. In our lab we extensively tested removing a snapshot for a VM with 4 disks. This means 4 block jobs running simultaneously. Less than 10 minutes later (closer to 1 minute) we would remove a second snapshot for the same VM (again involving 4 block jobs). I guess we should rerun this flow on a fully updated CentOS 7.1 host to see about local reproduction. Seems your case is much simpler than this though. Is this happening every time or intermittently?
> I do not think it is the number of VMs, because we had this on hosts with only 3 or 4 VMs running.
>
> I will try restarting libvirt and see what happens.
>
> We are not using RHEL 7.1, only CentOS 7.1.
>
> Is there anything else we can look at when this happens again?
I'll defer to Eric Blake for the libvirt side of this. Eric, would enabling debug logging in libvirtd help to shine some light on the problem?

--
Adam Litke

_______________________________________________
libvirt-users mailing list
libvirt-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvirt-users