On Fri, Mar 05, 2010 at 03:33:10PM -0500, Laine Stump wrote: > I applied this patch and tried it out. It appears to work as > advertised, which is useful. Would be even better if there was some > higher level handling to automatically retry the commands that fail > due to sigpipe. > > Beyond that, I noticed a few typos, but don't have enough expertise > about signals to know the definitive answer about resetting the > signal handler (multiple times does work properly for me, so at > least on Linux it seems to be unnecessary). > > On 03/05/2010 05:11 AM, Daniel Veillard wrote: > > This is a usability issue for virsh in case of disconnections, > >for example if the remote libvirtd is restarted: > > https://bugzilla.redhat.com/show_bug.cgi?id=526656 > > > >the patch catch those and tries to automatically reconnect instead > >of virsh exitting. The typical interaction with this patch is that > > s/exitting/exiting/ > > >the command fails, but virsh automatically reconnects instead of > >exitting, but it won't try to restart the failed command (since this > > s/exitting/exiting/ > > >could have significant side effects). Example of such interraction: > > s/interraction/interaction/ > > (your key repeat setting is too quick ;-) I blame my new expresso machine, delivering way more caffeeine for the same amount (and brand) of coffee than the previous one, I'm still in the tuning phase >:-> > >------------------------------------------------------- > > > >virsh # list --all > > Id Name State > >---------------------------------- > > - RHEL-5.4-64 shut off > > - WinXP shut off > > > >virsh # list --all > >error: Failed to list active domains > >error: cannot send data: Broken pipe > >error: Reconnected to the hypervisor > > > >virsh # list --all > > Id Name State > >---------------------------------- > > - RHEL-5.4-64 shut off > > - WinXP shut off > > > >virsh # > > > >------------------------------------------------------- > > > > The only thing I'm unsure is if the signal handler should be reset > >once it was received once. I don't in this patch and that seems to > >work fine, but I somehow remember the fact that in some circumstances > >a signal handler needs to be rearmed when received once. As is this > >seems to work fine with SIGPIPE and linux. > > > > > > > >Make virsh reconnect when loosing connection > > s/loosing/losing/ okay > >Right now when the connected libvirtd restarts virsh gets a SIGPIPE > >and dies, this change the behaviour to try to reconnect if the > >signal was received or command error indicated a connection or RPC > >failure. Note that the failing command is not restarted. > > > >* tools/virsh.c: catch SIGPIPE signals as well as connection related > > failures, add some automatic reconnection code and appropriate error > > messages. > > > >diff --git a/tools/virsh.c b/tools/virsh.c > >index 65487ed..2ec9cfc 100644 > >--- a/tools/virsh.c > >+++ b/tools/virsh.c > >@@ -30,6 +30,7 @@ > > #include<errno.h> > > #include<sys/stat.h> > > #include<inttypes.h> > >+#include<signal.h> > > > > #include<libxml/parser.h> > > #include<libxml/tree.h> > >@@ -397,6 +398,58 @@ out: > > last_error = NULL; > > } > > > >+/* > >+ * Detection of disconnections and automatic reconnection support > >+ */ > >+static int disconnected = 0; /* we may have been disconnected */ > >+ > >+/* > >+ * vshCatchDisconnect: > >+ * > >+ * We get here when a SIGPIPE is being raised, we can't do much in the > >+ * handler, just save the fact it was raised > >+ */ > >+static void vshCatchDisconnect(int sig, siginfo_t * siginfo, > >+ void* context ATTRIBUTE_UNUSED) { > >+ if ((sig == SIGPIPE) || (siginfo->si_signo == SIGPIPE)) > >+ disconnected++; > >+} > >+ > >+/* > >+ * vshSetupSignals: > >+ * > >+ * Catch SIGPIPE signals which may arise when desconnection from libvirtd occurs */ > > s/desconnection/disconnection/ also split in 2 lines, it was too long > >+static int > >+vshSetupSignals(void) { > >+ struct sigaction sig_action; > >+ > >+ sig_action.sa_sigaction = vshCatchDisconnect; > >+ sig_action.sa_flags = SA_SIGINFO; > >+ sigemptyset(&sig_action.sa_mask); > >+ > >+ sigaction(SIGPIPE,&sig_action, NULL); > >+} > >+ > >+/* > >+ * vshReconnect: > >+ * > >+ * Reconnect after an > >+ * > >+ */ > >+static int > >+vshReconnect(vshControl *ctl) { > >+ if (ctl->conn != NULL) > >+ virConnectClose(ctl->conn); > >+ > >+ ctl->conn = virConnectOpenAuth(ctl->name, > >+ virConnectAuthPtrDefault, > >+ ctl->readonly ? VIR_CONNECT_RO : 0); > >+ if (!ctl->conn) > >+ vshError(ctl, "%s", _("Failed to reconnect to the hypervisor")); > >+ else > >+ vshError(ctl, "%s", _("Reconnected to the hypervisor")); > >+ disconnected = 0; > >+} > > > > /* --------------- > > * Commands > >@@ -8332,6 +8385,9 @@ vshCommandRun(vshControl *ctl, const vshCmd *cmd) > > while (cmd) { > > struct timeval before, after; > > > >+ if ((ctl->conn == NULL) || (disconnected != 0)) > >+ vshReconnect(ctl); > >+ > > if (ctl->timing) > > GETTIMEOFDAY(&before); > > > >@@ -8343,6 +8399,17 @@ vshCommandRun(vshControl *ctl, const vshCmd *cmd) > > if (ret == FALSE) > > virshReportError(ctl); > > > >+ /* try to catch automatically disconnections */ > > s/catch automatically/automatically catch/ okay > >+ ((disconnected != 0) || > >+ ((last_error != NULL)&& > >+ (((last_error->code == VIR_ERR_SYSTEM_ERROR)&& > >+ (last_error->domain == 13)) || > > 13 == VIR_FROM_REMOTE, right? Why use the literal? no reason, fixed ! > >+ (last_error->code == VIR_ERR_RPC) || > >+ (last_error->code == VIR_ERR_NO_CONNECT) || > >+ (last_error->code == VIR_ERR_INVALID_CONN))))) > >+ vshReconnect(ctl); > >+ > > if (STREQ(cmd->def->name, "quit")) /* hack ... */ > > return ret; > > > >@@ -8673,9 +8740,11 @@ vshError(vshControl *ctl, const char *format, ...) > > { > > va_list ap; > > > >- va_start(ap, format); > >- vshOutputLogFile(ctl, VSH_ERR_ERROR, format, ap); > >- va_end(ap); > >+ if (ctl != NULL) { > >+ va_start(ap, format); > >+ vshOutputLogFile(ctl, VSH_ERR_ERROR, format, ap); > >+ va_end(ap); > >+ } > > > > fputs(_("error: "), stderr); > > > >@@ -8751,6 +8820,9 @@ vshInit(vshControl *ctl) > > /* set up the library error handler */ > > virSetErrorFunc(NULL, virshErrorHandler); > > > >+ /* set up the signals handlers to catch disconnections */ > >+ vshSetupSignals(); > >+ > > ctl->conn = virConnectOpenAuth(ctl->name, > > virConnectAuthPtrDefault, > > ctl->readonly ? VIR_CONNECT_RO : 0); > > Okay, applied and pushed, thanks for the feedback ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@xxxxxxxxxxxx | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/ -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list