The patch works well. Thanks for your reply.
Sorry for the late reply; I have been busy with other work recently.

> -----Original Message-----
> From: Daniel P. Berrange [mailto:berrange@xxxxxxxxxx]
> Sent: Monday, October 28, 2013 7:55 PM
> To: Wangyufei (A)
> Cc: libvir-list@xxxxxxxxxx; Xuchao (H); Wangrui (K)
> Subject: Re: When vm's status file being left over, some persistent
> but inactive vms will be lost by libvirtd after libvirtd rebooting.
>
> On Fri, Oct 18, 2013 at 03:00:22AM +0000, Wangyufei (A) wrote:
> > Hello,
> > I found the following problem:
> > A VM's status file may be left over in /var/run/libvirt/qemu in
> > some situations, such as a host reboot. When a status file is left
> > over, some persistent but inactive VMs are lost by libvirtd after
> > it is restarted. To reproduce the problem:
> > 1. Create a VM and start it with "virsh define vm-xml" and
> >    "virsh start vm-name".
> > 2. Stop libvirtd with "service libvirtd stop".
> > 3. Kill the qemu process for the VM, so that the VM's status file
> >    is left over.
> > 4. Start libvirtd.
> > After libvirtd starts, "virsh list --all" no longer shows the VM.
> > We expect the VM to be shown with shut off status. Is that the
> > right expectation?
> >
> > The reason for the problem is:
> > During startup, libvirtd first loads the VM status files under
> > /var/run/libvirt/qemu, creates a virDomainObj for each VM and adds
> > it to the driver->domains list.
> > It then creates a thread to connect to the qemu process for each
> > virDomainObj in the domains list. Because the qemu process has been
> > killed, connecting to qemu fails. When that happens, the connection
> > thread does the following:
> > 1. Check whether vm->persistent is 1.
> > 2. If vm->persistent is not 1, call qemuDomainRemoveInactive() to
> >    remove the virDomainObj.
> > 3. The following call sequence then occurs:
> >    qemuDomainRemoveInactive() --> virDomainObjListRemove()
> >    --> virHashRemoveEntry(). Around virHashRemoveEntry(), domlist
> >    and dom are locked and unlocked in sequence.
> > The problem with these steps is that vm->persistent may already
> > have been set to 1 by the libvirtd main thread by the time the
> > connection thread calls virHashRemoveEntry() to remove the dom.
> > That is, a persistent virDomainObj is removed during libvirtd
> > startup.
> >
> > There are two ways to resolve this:
> > 1. Extend the locking range: lock the virDomainObj and the
> >    virDomainObjList in the connection thread before checking
> >    vm->persistent.
> > 2. Check vm->persistent again before calling virHashRemoveEntry().
> >
> > Do you agree that this is a problem? Which of the two ways above is
> > more suitable, or is there a better idea? Any suggestions?
>
> The problem here is really that we should have loaded the persistent
> configs before we started the thread to reconnect. That ensures that
> the VM is marked persistent before the thread runs.
>
> Can you test the patch I've just sent for this?
>
> BTW, also please configure your email client to add line breaks at 80
> characters or less.
>
>
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
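
For anyone following the thread later, here is a rough model of the race
discussed above and of the two ideas that came up: loading the persistent
configs before the reconnect threads start, and re-checking the persistent
flag under the lock before removal. This is only a Python sketch, not
libvirt code and not the patch that was tested. Domain, DomainList,
mark_persistent, remove_if_transient and startup are hypothetical names
that stand in for virDomainObj, virDomainObjList and the startup path.

```python
import threading


class Domain:
    """Hypothetical stand-in for virDomainObj (only the fields needed here)."""

    def __init__(self, name, persistent=False):
        self.name = name
        self.persistent = persistent
        self.lock = threading.Lock()


class DomainList:
    """Hypothetical stand-in for virDomainObjList: a locked name -> Domain map."""

    def __init__(self):
        self.lock = threading.Lock()
        self.doms = {}

    def add(self, dom):
        with self.lock:
            self.doms[dom.name] = dom

    def remove_if_transient(self, name):
        """Re-check 'persistent' while holding the locks, then remove.

        Returns True if the domain was removed, False if it was kept.
        """
        with self.lock:
            dom = self.doms.get(name)
            if dom is None:
                return False
            with dom.lock:
                if dom.persistent:
                    return False  # marked persistent in the meantime: keep it
            # The list lock is still held, so nobody can change the flag now.
            del self.doms[name]
            return True


def mark_persistent(domains, name):
    """Models the main thread loading a persistent config for 'name'."""
    with domains.lock:
        dom = domains.doms.get(name)
        if dom is None:
            dom = Domain(name)
            domains.doms[name] = dom
        with dom.lock:
            dom.persistent = True


def startup(status_names, config_names, qemu_alive):
    """Startup with configs loaded before any reconnect thread is started."""
    domains = DomainList()
    for name in status_names:        # leftover or live status files
        domains.add(Domain(name))
    for name in config_names:        # persistent configs are loaded first
        mark_persistent(domains, name)
    workers = []
    for name in status_names:        # only now start the reconnect threads
        if name not in qemu_alive:   # reconnect fails: the qemu process is gone
            t = threading.Thread(target=domains.remove_if_transient,
                                 args=(name,))
            t.start()
            workers.append(t)
    for t in workers:
        t.join()
    return sorted(domains.doms)
```

With this ordering, startup(["vm1"], ["vm1"], set()) keeps "vm1" even
though its qemu process is gone, while a transient domain with a stale
status file is still dropped. Either measure alone closes the window in
this model; the sketch uses both only to show each one.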