On Thu, 2024-07-04 at 08:10 +0100, Daniel P. Berrangé wrote:
> On Wed, Jul 03, 2024 at 02:44:37PM +0200, Tim Wiederhake wrote:
> > `pthread_mutex_destroy`, `pthread_mutex_lock` and
> > `pthread_mutex_unlock` return an error code that is currently
> > ignored.
> >
> > Add debug information if one of these operations failed, e.g. when
> > there is an attempt to destroy a still locked mutex or unlock an
> > already unlocked mutex. Both scenarios are considered undefined
> > behavior.
> >
> > Signed-off-by: Tim Wiederhake <twiederh@xxxxxxxxxx>
> > ---
> >  src/util/virthread.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/util/virthread.c b/src/util/virthread.c
> > index 5422bb74fd..14116a2221 100644
> > --- a/src/util/virthread.c
> > +++ b/src/util/virthread.c
> > @@ -35,7 +35,10 @@
> >
> >  #include "viralloc.h"
> >  #include "virthreadjob.h"
> > +#include "virlog.h"
> >
> > +#define VIR_FROM_THIS VIR_FROM_THREAD
> > +VIR_LOG_INIT("util.thread");
> >
> >  int virOnce(virOnceControl *once, virOnceFunc init)
> >  {
> > @@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m)
> >
> >  void virMutexDestroy(virMutex *m)
> >  {
> > -    pthread_mutex_destroy(&m->lock);
> > +    if (pthread_mutex_destroy(&m->lock)) {
> > +        VIR_WARN("Failed to destroy mutex=%p", m);
> > +    }
> >  }
> >
> >  void virMutexLock(virMutex *m)
> >  {
> > -    pthread_mutex_lock(&m->lock);
> > +    if (pthread_mutex_lock(&m->lock)) {
> > +        VIR_WARN("Failed to lock mutex=%p", m);
> > +    }
> >  }
> >
> >  void virMutexUnlock(virMutex *m)
> >  {
> > -    pthread_mutex_unlock(&m->lock);
> > +    if (pthread_mutex_unlock(&m->lock)) {
> > +        VIR_WARN("Failed to unlock mutex=%p", m);
> > +    }
> >  }
>
> I'd be surprised if these lock/unlock warnings ever trigger, since
> IIUC they would need us to be using an error checking mutex, not a
> regular mutex. IOW, aren't these just adding condition test overhead +
> unreachable code to the lock calls ?
>
> The 2nd patch shows failures in the destroy calls IIUC.
>
> With regards,
> Daniel

I have now looked more closely into the issue. pthread_mutex_lock and
pthread_mutex_unlock do indeed never return a non-zero value here,
because we are not using error-checking mutexes.

During my last attempt at fixing the issues I had a patch that
explicitly counted the lockings and unlockings of mutexes, and I believe
I recall seeing problems in that area as well. Sadly, I cannot reproduce
that now, at least not reliably: ignoring the warnings for
pthread_mutex_destroy, virnetdaemontest does seem to trigger my "number
of locks == number of unlocks" check in about 3 out of 10,000 runs.
Sometimes the frequency is 1 in 10, and sometimes the check does not
trigger at all.

In any case, I do not consider the checks for locking / unlocking dead
code. So far I have been using the test suite to check for obvious
issues, but I cannot rule out that libvirt itself has race conditions
too.

I would advocate for merging this patch as is, and adding a patch to
enable error checking for the mutexes.

Regards,
Tim
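
For illustration only, here is a minimal standalone sketch of what "error checking for the mutexes" means at the POSIX level. It is not the proposed libvirt patch: `make_errorcheck_mutex` and `checked_unlock` are hypothetical names, and the `fprintf` call merely stands in for the `VIR_WARN` calls quoted above.

```c
#define _XOPEN_SOURCE 700 /* exposes PTHREAD_MUTEX_ERRORCHECK */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not libvirt API): initialise an error-checking
 * mutex. Returns 0 on success or a pthread error number. */
static int make_errorcheck_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical stand-in for the VIR_WARN pattern in the patch: unlock
 * the mutex and report a failure instead of ignoring it. Returns the
 * pthread_mutex_unlock() result. */
static int checked_unlock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_unlock(m);

    if (rc != 0)
        fprintf(stderr, "Failed to unlock mutex=%p: %s\n",
                (void *)m, strerror(rc));
    return rc;
}
```

With a mutex initialised this way, POSIX specifies that unlocking a mutex the calling thread does not own fails with `EPERM`, and that relocking it from the owning thread fails with `EDEADLK`. With the default mutex type both cases are undefined behavior, which is why the return-value checks never fire there.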