Hi,

This is not happening every time. The last time I had this, a script was running and something like the 9th VM and the 23rd VM had a problem, and it is not always the same VMs. It is also not about the OS (it happens for Windows and Linux alike).

And as I said, it also happened when I tried to remove the snapshots sequentially. Here is the code (I know it is probably not the most elegant way, but I am not a developer); the actual script has correct indentation, restored below.

<― snip ―>

# Python 2; Connect() and the api object are set up in the part snipped above
print "Snapshot deletion"
try:
    # wait 5 minutes, connect to the engine and fetch all VMs
    time.sleep(300)
    Connect()
    vms = api.vms.list()

    for vm in vms:
        print ("Deleting snapshots for %s ") % vm.name

        snapshotlist = vm.snapshots.list()
        for snapshot in snapshotlist:
            # "Active VM" is the current state, not a real snapshot -- skip it
            if snapshot.description != "Active VM":
                time.sleep(30)
                snapshot.delete()
                try:
                    # poll until the snapshot leaves the "locked" state
                    while api.vms.get(name=vm.name).snapshots.get(id=snapshot.id).snapshot_status == "locked":
                        print("Waiting for snapshot %s on %s deletion to finish") % (snapshot.description, vm.name)
                        time.sleep(60)
                except Exception as e:
                    # the lookup fails once the snapshot is gone -> deletion finished
                    print ("Snapshot %s does not exist anymore") % snapshot.description

        print ("Snapshot deletion for %s done") % vm.name

    print ("Deletion of snapshots done")
    api.disconnect()

except Exception as e:
    print ("Something went wrong when deleting the snapshots\n%s") % str(e)

<― snip ―>

Cheers
Soeren


On 03/06/15 15:20, "Adam Litke" <alitke@xxxxxxxxxx> wrote:

>On 03/06/15 07:36 +0000, Soeren Malchow wrote:
>>Dear Adam,
>>
>>First we were using a Python script that worked on 4 threads and was
>>therefore removing 4 snapshots at a time throughout the cluster; that
>>still caused problems.
>>
>>Now I have taken the snapshot removal out of the threaded part and I am
>>just looping through each snapshot on each VM one after another, even
>>with "sleeps" in between, but the problem remains.
>>I am getting the impression that it is a matter of how many snapshots
>>are deleted in a certain time: if I delete manually, one after another
>>(meaning every 10 min or so), I do not have problems; if I delete
>>manually but do several at once, starting the next one on a VM right
>>after the previous one finished, the risk seems to increase.
>
>Hmm. In our lab we extensively tested removing a snapshot for a VM
>with 4 disks. This means 4 block jobs running simultaneously. Less
>than 10 minutes later (closer to 1 minute) we would remove a second
>snapshot for the same VM (again involving 4 block jobs). I guess we
>should rerun this flow on a fully updated CentOS 7.1 host to see about
>local reproduction. Your case seems much simpler than this, though.
>Is this happening every time or intermittently?
>
>>I do not think it is the number of VMs, because we had this on hosts
>>with only 3 or 4 VMs running.
>>
>>I will try restarting libvirt and see what happens.
>>
>>We are not using RHEL 7.1, only CentOS 7.1.
>>
>>Is there anything else we can look at when this happens again?
>
>I'll defer to Eric Blake for the libvirt side of this. Eric, would
>enabling debug logging in libvirtd help to shine some light on the
>problem?
>
>--
>Adam Litke
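
For the pacing problem Soeren describes (waiting long enough for one snapshot removal to really finish before starting the next), a small variation of the wait loop from the script above can cap the total wait instead of relying on fixed sleeps. This is only a sketch built on the same calls the script already uses (api.vms.get, snapshots.get, snapshot_status); the helper name and the 1800-second cap and 60-second poll interval are illustrative values, not anything from the original script.

import time

def wait_for_snapshot_removal(api, vm_name, snapshot_id, description,
                              max_wait=1800, poll_interval=60):
    # Poll until the snapshot is gone or no longer "locked"; give up after
    # max_wait seconds so a stuck removal cannot block the whole run.
    waited = 0
    while waited < max_wait:
        try:
            snap = api.vms.get(name=vm_name).snapshots.get(id=snapshot_id)
        except Exception:
            return True    # lookup failed: the snapshot no longer exists
        if snap is None or snap.snapshot_status != "locked":
            return True    # removal finished (or snapshot already gone)
        print("Waiting for snapshot %s on %s deletion to finish" % (description, vm_name))
        time.sleep(poll_interval)
        waited += poll_interval
    return False           # still locked after max_wait; caller decides what to do

Called right after snapshot.delete(), this would replace the inner while loop; on a False return the script could log the VM and snapshot name and move on instead of polling forever.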
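
On Adam's question about debug logging in libvirtd: the usual way to get debug output on a CentOS 7.1 host is through /etc/libvirt/libvirtd.conf, roughly as below. The filter list shown is only one common choice, not something specified in this thread, and the daemon has to be restarted for the change to take effect.

# /etc/libvirt/libvirtd.conf
# levels: 1 = debug, 2 = info, 3 = warning, 4 = error
log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"

# then restart the daemon:
# systemctl restart libvirtd

With qemu at debug level the resulting libvirtd.log grows quickly, so it is worth turning the settings back off once a failing snapshot removal has been captured.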