I've noticed that in some cases systemd is quick enough that, even
though libvirt-guests.service is ordered to start after
libvirtd.service, my guests were not resumed because libvirt-guests.sh
failed to connect. The reason is simple: systemd correctly starts
libvirt-guests after it execs libvirtd, but the daemon cannot accept
connections right from the start. It first performs some
initialization, which may take a long time.

The problem is not limited to systemd. Any init system that can start
services in parallel (e.g. OpenRC) may run into this situation. The fix
is to retry the connection several times, with a short sleep between
attempts, instead of trying only once. The number of attempts and the
delay are set by the new CONNECT_RETRIES (default 10) and RETRIES_SLEEP
(default .5 seconds) variables.

Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx>
---
 tools/libvirt-guests.sh.in | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/tools/libvirt-guests.sh.in b/tools/libvirt-guests.sh.in
index 38e93c5..f14598e 100644
--- a/tools/libvirt-guests.sh.in
+++ b/tools/libvirt-guests.sh.in
@@ -37,6 +37,8 @@ SHUTDOWN_TIMEOUT=300
 PARALLEL_SHUTDOWN=0
 START_DELAY=0
 BYPASS_CACHE=0
+CONNECT_RETRIES=10
+RETRIES_SLEEP=.5
 
 test -f "$sysconfdir"/sysconfig/libvirt-guests &&
     . "$sysconfdir"/sysconfig/libvirt-guests
@@ -87,12 +89,17 @@ test_connect()
 {
     uri=$1
 
-    run_virsh "$uri" connect 2>/dev/null
-    if [ $? -ne 0 ]; then
-        eval_gettext "Can't connect to \$uri. Skipping."
-        echo
-        return 1
-    fi
+    for ((i = 0; i < ${CONNECT_RETRIES}; i++)); do
+        run_virsh "$uri" connect 2>/dev/null
+        if [ $? -eq 0 ]; then
+            return 0;
+        fi
+        sleep ${RETRIES_SLEEP}
+        eval_gettext "Unable to connect to libvirt currently. Retrying .. \$i"
+    done
+    eval_gettext "Can't connect to \$uri. Skipping."
+    echo
+    return 1
 }
 
 # list_guests URI PERSISTENT
-- 
1.9.0
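
For illustration only, the retry-with-sleep idea from the commit message can also be written as a standalone helper. This is a minimal sketch, not the code in the patch: the name retry_connect and its argument order are hypothetical, and it uses a POSIX while loop because the for ((...)) form in the patch is a bash construct that a plain /bin/sh may reject. It also skips the sleep after the final failed attempt.

```shell
# Hypothetical helper (not part of libvirt-guests.sh.in): run a probe
# command up to $1 times, sleeping $2 seconds between failed attempts.
# Returns 0 on the first success, 1 if every attempt fails.
retry_connect() {
    rc_retries=$1
    rc_delay=$2
    shift 2
    rc_try=1
    while [ "$rc_try" -le "$rc_retries" ]; do
        # Stop at the first successful attempt.
        if "$@" 2>/dev/null; then
            return 0
        fi
        # Sleep only when another attempt will follow.
        if [ "$rc_try" -lt "$rc_retries" ]; then
            sleep "$rc_delay"
        fi
        rc_try=$((rc_try + 1))
    done
    return 1
}
```

In the script this would presumably be called as something like retry_connect "$CONNECT_RETRIES" "$RETRIES_SLEEP" run_virsh "$uri" connect, although that wiring is an assumption and not what the patch does. Note that a fractional delay such as .5 depends on the sleep implementation accepting it (GNU sleep does; POSIX only guarantees integers).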