On 07.09.2012 14:47, Daniel P. Berrange wrote:
> On Fri, Sep 07, 2012 at 02:44:03PM +0200, Christophe Fergeau wrote:
>> On Fri, Sep 07, 2012 at 01:24:35PM +0100, Daniel P. Berrange wrote:
>>> A nice long detailed explanation. I agree that this scenario you
>>> outline is plausible as an explanation for why Boxes sometimes
>>> stops getting events from libvirtd.
>>
>> I've ran more tests in the mean time without this patch applied, but
>> with the one below to add some debugging:
>>
>> diff --git a/src/conf/domain_event.c b/src/conf/domain_event.c
>> index 43ecdcf..33d90fb 100644
>> --- a/src/conf/domain_event.c
>> +++ b/src/conf/domain_event.c
>> @@ -1501,7 +1501,13 @@ virDomainEventStateRegisterID(virConnectPtr conn,
>>      int ret = -1;
>>
>>      virDomainEventStateLock(state);
>> +    VIR_WARN("RegisterID");

[1]

>>
>> +    if ((state->callbacks->count == 0) && (state->timer == -1)) {
>> +        if (state->queue->count != 0) {
>> +            VIR_WARN("REG: queue's not empty: %d", state->queue->count);

I understand the WARN level here but not in [1]. Isn't DEBUG just enough there?

>> +        }
>> +    }
>>      if ((state->callbacks->count == 0) &&
>>          (state->timer == -1) &&
>>          (state->timer = virEventAddTimeout(-1,
>> @@ -1584,6 +1590,7 @@ virDomainEventStateDeregisterID(virConnectPtr conn,
>>  {
>>      int ret;
>>
>> +    VIR_WARN("DeregisterID");

ditto

>>      virDomainEventStateLock(state);
>>      if (state->isDispatching)
>>          ret = virDomainEventCallbackListMarkDeleteID(conn,
>> @@ -1596,6 +1603,9 @@ virDomainEventStateDeregisterID(virConnectPtr conn,
>>          state->timer != -1) {
>>          virEventRemoveTimeout(state->timer);
>>          state->timer = -1;
>> +        if (state->queue->count != 0) {
>> +            VIR_WARN("DEREG: queue's not empty: %d", state->queue->count);
>> +        }
>>      }
>>
>>      virDomainEventStateUnlock(state);
>>
>>
>> I've hit the event lost issue once, and right when this started happening,
>> the log was:
>> 2012-09-06 11:37:06.094+0000: 30498: warning :
>> virDomainEventStateDeregisterID:1593 : DeregisterID
>> 2012-09-06 11:37:06.094+0000: 30498: warning :
>> virDomainEventStateDeregisterID:1607 : DEREG: queue's not empty: 1
>> 2012-09-06 11:45:42.363+0000: 30502: warning :
>> virDomainEventStateRegisterID:1504 : RegisterID
>> 2012-09-06 11:45:42.363+0000: 30502: warning :
>> virDomainEventStateRegisterID:1508 : REG: queue's not empty: 1
>>
>> and after that, no events and these warnings kept happening with an
>> increasing number of queued events which is consistent with the hypothesis I made
>> in this patch.
>
> Great, that's pretty encouraging then.
>
> Daniel
>

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
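For anyone reading this in the archives, the log sequence quoted above (DEREG with one queued event, then REG with the queue still non-empty, then a queue that only grows) can be reproduced with a toy model. This is only a sketch under stated assumptions, not libvirt code: the struct and every function name below are made up for illustration, and the key assumption is that the dispatch timer is armed only when the queue goes from empty to one entry. The flush_queue flag is a hypothetical remedy used for comparison; it is not claimed to be what the patch under discussion does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the event state; not libvirt's types. */
struct event_state {
    int callbacks;     /* number of registered callbacks */
    int timer;         /* -1 when no timer exists */
    bool timer_armed;  /* true once the timer is scheduled to fire */
    int queued;        /* events waiting to be dispatched */
};

static int next_timer_id = 1;

void state_init(struct event_state *s)
{
    s->callbacks = 0;
    s->timer = -1;
    s->timer_armed = false;
    s->queued = 0;
}

/* First registration creates a timer that starts out disabled. */
void state_register(struct event_state *s)
{
    if (s->callbacks == 0 && s->timer == -1) {
        s->timer = next_timer_id++;
        s->timer_armed = false;
    }
    s->callbacks++;
}

/* Last deregistration removes the timer. Clearing the queue here is a
 * hypothetical remedy, enabled only when flush_queue is true. */
void state_deregister(struct event_state *s, bool flush_queue)
{
    assert(s->callbacks > 0);
    s->callbacks--;
    if (s->callbacks == 0 && s->timer != -1) {
        s->timer = -1;
        s->timer_armed = false;
        if (flush_queue)
            s->queued = 0;
    }
}

/* ASSUMPTION: the timer is armed only on the empty -> non-empty transition. */
void state_queue(struct event_state *s)
{
    if (s->timer == -1)
        return;                 /* nobody is listening, drop the event */
    s->queued++;
    if (s->queued == 1)
        s->timer_armed = true;
}

/* Fires the timer if it is armed; returns the number of events dispatched. */
int state_run_timer(struct event_state *s)
{
    int dispatched;

    if (s->timer == -1 || !s->timer_armed)
        return 0;
    dispatched = s->queued;
    s->queued = 0;
    s->timer_armed = false;
    return dispatched;
}

/* Replays the logged sequence; returns how many events are left stuck. */
int simulate(bool flush_queue)
{
    struct event_state s;

    state_init(&s);
    state_register(&s);
    state_queue(&s);                    /* one event queued, timer armed */
    state_deregister(&s, flush_queue);  /* timer removed before it fired */
    state_register(&s);                 /* new timer, created disabled */
    state_queue(&s);
    state_queue(&s);
    state_run_timer(&s);
    return s.queued;
}
```

In this model, simulate(false) leaves every event stuck in the queue because the stale entry prevents the new timer from ever being armed, while simulate(true) dispatches normally. Whether the real queueing code arms the timer this way should be checked against domain_event.c itself.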