Re: [PATCH 3/3] rpc: Rework rpc notifications in main and side thread

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 15, 2023 at 02:32:19AM +0800, Fima Shevrin wrote:
> RPC client implementation uses the following paradigm. The critical
> section is organized via virObjectLock(client)/virObjectUnlock(client)
> braces. Though this is potentially problematic as
>     main thread:                            side thread:
>     virObjectUnlock(client);
>                                         virObjectLock(client);
>                                         g_main_loop_quit(client->eventLoop);
>                                         virObjectUnlock(client);
>     g_main_loop_run(client->eventLoop);
> 
> This means in particular that is the main thread is executing very long
> request like VM migration, the wakeup from the side thread could be
> stuck until the main request will be fully completed.

Ah, I understand this now. The flag set from g_main_loop_quit() is
ignored and overwritten by g_main_loop_run(). So the interruption
from the side thread never happens.

Your approach to solving this is by delaying the virObjectUnlock
call until g_main_loop_run is running, which is clever but I don't
much like playing games with the mutex locking.

We need a way to interrupt the main loop, even if it hasn't started
running yet. The previous impl in libvirt used a pipe for this trick,
effectively as a dummy event source that would be watched.

I think we can do the same here, though without needing an actual
pipe which causes Windows portability problems.

Glib already has an internal mechansim for breaking out of poll(),
via its (private) GWakeup object which encapsulates a pipe. We just
need a way of triggering this mechanism.

I believe this can be achieved using just an idle source ie

static gboolean
dummy_cb(void *opaque)
{
    return G_SOURCE_REMOVE;
}

...

   GSource *dummy = g_idle_source_new()
   g_source_set_callback(dummy, dummy_cb, NULL, NULL);
   g_source_attach(dummy, client->eventCtx);


The g_source_attach() method is what breaks the main loop out of
its poll() sleep. If it wasn't currently in poll, it is effectively
a no-op.

> 
> Discrubed case is easily reproducible with the simple python scripts doing slow
> and fast requests in parallel from two different threads.
> 
> Our idea is to release the lock at the prepare stage and avoid libvirt stuck
> during the interaction between main and side threads.
> 
> Co-authored-by: Denis V. Lunev <den@xxxxxxxxxx>
> Co-authored-by: Nikolai Barybin <nikolai.barybin@xxxxxxxxxxxxx>
> 
> Signed-off-by: Fima Shevrin <efim.shevrin@xxxxxxxxxxxxx>
> ---
>  src/rpc/virnetclient.c       | 17 ++++++++++++-----
>  src/util/vireventglibwatch.c | 28 ++++++++++++++++++++++++++--
>  2 files changed, 38 insertions(+), 7 deletions(-)
> 
> diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
> index de8ebc2da9..63bd42ed3a 100644
> --- a/src/rpc/virnetclient.c
> +++ b/src/rpc/virnetclient.c
> @@ -987,6 +987,9 @@ int virNetClientSetTLSSession(virNetClient *client,
>       * etc.  If we make the grade, it will send us a '\1' byte.
>       */
>  
> +    /* Here we are not passing the client to virEventGLibAddSocketWatch,
> +     * since the entire virNetClientSetTLSSession function requires a lock.
> +     */
>      source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock),
>                                          G_IO_IN,
>                                          client->eventCtx,
> @@ -1692,14 +1695,18 @@ static int virNetClientIOEventLoop(virNetClient *client,
>          if (client->nstreams)
>              ev |= G_IO_IN;
>  
> +        /*
> +         * We don't need to call virObjectLock(client) here,
> +         * since the .prepare function inside glib Main Loop
> +         * will do this. virEventGLibAddSocketWatch is responsible
> +         * for passing client var in glib .prepare
> +         */
>          source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock),
>                                              ev,
>                                              client->eventCtx,
> -                                            virNetClientIOEventFD, &data, NULL, NULL);
> -
> -        /* Release lock while poll'ing so other threads
> -         * can stuff themselves on the queue */
> -        virObjectUnlock(client);
> +                                            virNetClientIOEventFD, &data,
> +                                            (virObjectLockable *)client,
> +                                            NULL);
>  
>  #ifndef WIN32
>          /* Block SIGWINCH from interrupting poll in curses programs,
> diff --git a/src/util/vireventglibwatch.c b/src/util/vireventglibwatch.c
> index 7680656ba2..641b772995 100644
> --- a/src/util/vireventglibwatch.c
> +++ b/src/util/vireventglibwatch.c
> @@ -34,11 +34,23 @@ struct virEventGLibFDSource {
>  
>  
>  static gboolean
> -virEventGLibFDSourcePrepare(GSource *source G_GNUC_UNUSED,
> +virEventGLibFDSourcePrepare(GSource *source,
>                              gint *timeout)
>  {
> +    virEventGLibFDSource *ssource = (virEventGLibFDSource *)source;
>      *timeout = -1;
>  
> +    if (ssource->client != NULL)
> +        virObjectUnlock(ssource->client);
> +
> +    /*
> +     * Prepare function may be called multiple times
> +     * in glib Main Loop, thus we assign source->client
> +     * a null pointer to avoid calling pthread_mutex_unlock
> +     * on an already unlocked mutex.
> +     * */
> +    ssource->client = NULL;
> +
>      return FALSE;
>  }
>  
> @@ -123,11 +135,23 @@ struct virEventGLibSocketSource {
>  
>  
>  static gboolean
> -virEventGLibSocketSourcePrepare(GSource *source G_GNUC_UNUSED,
> +virEventGLibSocketSourcePrepare(GSource *source,
>                                  gint *timeout)
>  {
> +    virEventGLibSocketSource *ssource = (virEventGLibSocketSource *)source;
>      *timeout = -1;
>  
> +    if (ssource->client != NULL)
> +        virObjectUnlock(ssource->client);
> +
> +    /*
> +     * Prepare function may be called multiple times
> +     * in glib Main Loop, thus we assign source->client
> +     * a null pointer to avoid calling pthread_mutex_unlock
> +     * on an already unlocked mutex.
> +     * */
> +    ssource->client = NULL;
> +
>      return FALSE;
>  }
>  
> -- 
> 2.34.1
> _______________________________________________
> Devel mailing list -- devel@xxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxx

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
_______________________________________________
Devel mailing list -- devel@xxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxx




[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]

  Powered by Linux