Originally, the code checked whether another client was in the queue
and inferred ownership of the buck from that. Commit
fa9595003d043df9f2efe95521c008 added a separate variable to track the
buck. As a result, a new call could enter and claim the buck while
another thread had already been signalled to take it. Both threads
then believed they held the buck and both entered poll(). The race
arises when threads are woken up on the client lock mutex.

This caused multi-threaded clients to hang, most visibly and
reproducibly with Python-based clients such as virt-manager.

This patch makes threads that have been signalled to take the buck
re-check whether the buck is already held by another thread.
---
 src/rpc/virnetclient.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
index 5165c8d..07d8a60 100644
--- a/src/rpc/virnetclient.c
+++ b/src/rpc/virnetclient.c
@@ -1545,6 +1545,7 @@ static int virNetClientIO(virNetClientPtr client,
     virNetClientCallQueue(&client->waitDispatch, thiscall);
 
     /* Check to see if another thread is dispatching */
+recheck:
     if (client->haveTheBuck) {
         char ignore = 1;
 
@@ -1592,7 +1593,13 @@ static int virNetClientIO(virNetClientPtr client,
             goto cleanup;
         }
 
-        /* Grr, someone passed the buck onto us ... */
+        /* Grr, someone might have passed the buck onto us ... */
+
+        /* We need to re-check if the buck has been passed to this thread
+         * as this thread might have been signalled to wake up, but the another
+         * call might acquire the lock before this thread manages to wake up.
+         * This could cause that two threads claim they have the buck */
+        goto recheck;
     }
 
     VIR_DEBUG("We have the buck %p %p", client->waitDispatch, thiscall);
-- 
1.7.3.4
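
For reference, the re-check added by this patch follows the usual rule for
condition variables: a woken thread has to test the predicate again after it
reacquires the mutex, because another thread may have taken the lock first.
Below is a minimal standalone sketch of that pattern, assuming POSIX threads.
It is not libvirt code: demo_client, demo_call and run_demo are hypothetical
names made up for illustration, and only the have_the_buck flag is meant to
mirror the role of client->haveTheBuck.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the client state; not libvirt's structure. */
struct demo_client {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool have_the_buck;  /* plays the role of client->haveTheBuck */
    int holders;         /* threads currently acting as dispatcher */
    int max_holders;     /* highest value of holders ever observed */
};

static void *demo_call(void *opaque)
{
    struct demo_client *c = opaque;

    pthread_mutex_lock(&c->lock);

    /* Re-check after every wakeup. A thread that was signalled may find
     * that a newer call took the lock first and already owns the buck.
     * The while loop here does the job of "goto recheck" in the patch. */
    while (c->have_the_buck)
        pthread_cond_wait(&c->cond, &c->lock);

    c->have_the_buck = true;
    c->holders++;
    if (c->holders > c->max_holders)
        c->max_holders = c->holders;

    /* Drop the lock while "dispatching" so that other calls can arrive
     * and observe that the buck is taken. */
    pthread_mutex_unlock(&c->lock);
    sched_yield();
    pthread_mutex_lock(&c->lock);

    /* Give up the buck and wake one waiter, which re-checks in turn. */
    c->holders--;
    c->have_the_buck = false;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Runs nthreads concurrent calls and returns the largest number of
 * threads that held the buck at the same time. */
int run_demo(int nthreads)
{
    struct demo_client c;
    pthread_t *tids = malloc(sizeof(*tids) * (size_t)nthreads);
    int i;
    int rc;

    assert(tids != NULL);
    pthread_mutex_init(&c.lock, NULL);
    pthread_cond_init(&c.cond, NULL);
    c.have_the_buck = false;
    c.holders = 0;
    c.max_holders = 0;

    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], NULL, demo_call, &c);
        assert(rc == 0);
        (void)rc;
    }
    for (i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    pthread_cond_destroy(&c.cond);
    pthread_mutex_destroy(&c.lock);
    free(tids);
    return c.max_holders;
}
```

Calling run_demo(8) from a main() and building with -pthread should never
report more than one holder. If the while loop were replaced by a single
"if" followed by one wait, the sketch would be open to the same kind of
double claim described above.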