On 5/15/22 11:48 AM, Digimer wrote:
Hi all,
I've got a series of programs that monitor various things on a CentOS
Stream 8 VM host. All of these scripts work when called directly.
However, when I have a parent program that calls all the little programs
in series, I found that some virsh calls hang.
Is your script being called from a libvirt "hook" script?
(https://libvirt.org/hooks.html) If so, that won't work - a libvirt hook
script is called from within libvirt, and can't call back into libvirt.
Other than that, is there anything different about the context the
script is being run from vs. the context you're directly running virsh from?
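
One way to answer that question is to dump the same details from both contexts and diff them. The following is only a rough sketch: the function name describe_context and the choice of fields are my own assumptions, not anything libvirt provides. The environment variables listed are ones virsh and libvirt clients are documented to read, plus a few generic ones.

```python
import os
import sys


def _is_tty(stream):
    """Return True if the stream exists, is open, and is attached to a terminal."""
    try:
        return stream is not None and stream.isatty()
    except ValueError:
        # Raised when the stream has already been closed.
        return False


def describe_context():
    """Collect details that often differ between a login shell and a parent program."""
    env_names = (
        "LIBVIRT_DEFAULT_URI",
        "VIRSH_DEFAULT_CONNECT_URI",
        "XDG_RUNTIME_DIR",
        "HOME",
        "PATH",
    )
    return {
        "uid": os.getuid(),
        "euid": os.geteuid(),
        "cwd": os.getcwd(),
        "stdin_is_tty": _is_tty(sys.stdin),
        "stdout_is_tty": _is_tty(sys.stdout),
        # None means the variable is not set in this context.
        "env": {name: os.environ.get(name) for name in env_names},
    }


if __name__ == "__main__":
    for key, value in describe_context().items():
        print(f"{key}: {value!r}")
```

Running this once from the shell and once from the parent program, then comparing the two outputs, should show whether the user, the connection URI, or the standard streams differ.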
Initially, there were two scripts that were hanging repeatedly. One
called 'virsh net-list --all --name', so I changed it to check for
configs in '/etc/libvirt/qemu/networks/', and that script started
working. The other script though calls 'virsh list --all', and that
can't be easily swapped out, so I really need to find the source of
these hangs.
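
Until the cause is found, a call like 'virsh list --all' could at least be bounded so that the parent program is not blocked indefinitely. This is a minimal sketch, assuming the monitoring scripts can shell out from Python. The helper name run_with_timeout and the 10 second default are hypothetical. It also points the child's stdin at /dev/null, which is a guess at one possible difference between the two contexts, not a confirmed fix.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=10.0):
    """Run cmd; return (returncode, output), or (None, partial output) on timeout."""
    try:
        done = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,   # never let the child wait on inherited stdin
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # fold error messages into the same output
            text=True,
            timeout=timeout_s,
        )
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() kills the child before re-raising the timeout.
        partial = exc.output or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return None, partial
```

Calling run_with_timeout(["virsh", "list", "--all"]) would then return a return code of None when the call hangs, which the parent program can log along with the time of the hang.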
Whenever the hang happens, about 30~45 seconds later, I see
'libvirtd[1643714]: Cannot recv data: Connection reset by peer'.
I think the issue is striking other scripts that run, but this
scenario is happening predictably and consistently right now.
I thought it might be a concurrent connect limit or a problem with
how many times virsh is called by a script, so I wrote a test script
that kept calling 'virsh list --all' each second, but it was close to
100 calls without hanging, far more than all the calls in my scripts
combined, so I don't think that's it.
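
Since the repeat test passed when run on its own, it may be worth repeating it from inside the parent program, where the hang actually occurs. A rough sketch of such a loop follows; the function name repeat_call and its defaults are hypothetical, and it records how long each call took so that slow calls stand out as well as outright hangs.

```python
import subprocess
import time


def repeat_call(cmd, count=100, interval_s=1.0, timeout_s=10.0):
    """Run cmd `count` times; return a list of (iteration, seconds, outcome)."""
    results = []
    for i in range(1, count + 1):
        start = time.monotonic()
        try:
            proc = subprocess.run(
                cmd,
                stdin=subprocess.DEVNULL,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
            outcome = f"exit {proc.returncode}"
        except subprocess.TimeoutExpired:
            # The child is killed by subprocess.run() when the timeout expires.
            outcome = "timeout"
        results.append((i, time.monotonic() - start, outcome))
        time.sleep(interval_s)
    return results
```

For example, repeat_call(["virsh", "list", "--all"]) invoked by the parent program would show whether the hang depends on the calling context or on the number of calls.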
Any advice/guidance would be very much appreciated!
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould