Re: [PATCH] migration: Don't assert() if MIGRATE_DATA comes before attaching the agent

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ping ?
Does this address your concerns?

On Tue, Feb 25, 2014 at 04:14:55PM +0100, Christophe Fergeau wrote:
> Hey,
> 
> On Mon, Feb 24, 2014 at 08:26:34PM +0100, Marc-André Lureau wrote:
> > On Mon, Feb 24, 2014 at 6:44 PM, Christophe Fergeau <cfergeau@xxxxxxxxxx> wrote:
> > > During seamless migration, after switching host, if a client was connected
> > > during the migration, it will have data to send back to the new
> > > qemu/spice-server instance. This is handled through MIGRATE_DATA messages.
> > > SPICE char devices use such MIGRATE_DATA messages to restore their state.
> > >
> > > However, the MIGRATE_DATA message can arrive any time after the new qemu
> > > instance has started, this can happen before or after the SPICE char
> > > devices have been created. In order to handle this, if the migrate data
> > > arrives early, it's stored in reds->agent_state.mig_data, and
> > > attach_to_red_agent() will restore the agent state as appropriate.
> > >
> > > Unfortunately this does not work as expected as expected. If
> > > attach_to_red_agent() is called before the MIGRATE_DATA message reaches the
> > > server, all goes well, but if MIGRATE_DATA reaches the server before
> > > attach_to_red_agent() gets called, then some assert() get triggered in
> > > spice_char_device_state_restore():
> > >
> > > ((null):32507): Spice-ERROR **: char_device.c:937:spice_char_device_state_restore: assertion `dev->num_clients == 1 && dev->wait_for_migrate_data' failed
> > > Thread 3 (Thread 0x7f406b543700 (LWP 32543)):
> > > Thread 2 (Thread 0x7f40697ff700 (LWP 32586)):
> > > Thread 1 (Thread 0x7f4079b45a40 (LWP 32507)):
> > >
> > > What happens is that dev->wait_for_migrate_data is unconditionally cleared when
> > > completing handling of the MIGRATE_DATA message, so it's FALSE when
> > 
> > Isn't it going to be still FALSE after this patch?
> > 
> > > spice_char_device_state_restore() is called. Moreover, dev->num_clients is
> > > also 0 as this is only increased by spice_char_device_start() which
> > > spice_server_char_device_add_interface() calls after
> > > attach_to_red_agent()/spice_char_device_state_restore() have been called.
> > 
> > I don't see how this could have changed either.
> > 
> > > This commit changes the logic in spice_server_char_device_add_interface(),
> > > and when there is migrate data pending in reds->agent_state.mig_data, we
> > > only attempt to restore it after successfully initializing/starting the
> > > needed char device.
> > 
> > Sorry, I must be tired reading the server code, but I don't see how
> > this could change the race you described above. I am surely missing
> > something :)
> 
> I'll have to improve the commit log as I had to reproduce the bug and poke
> at the code some more in order to answer your comments :(
> See below for answers to your comments above
> 
> 
> > 
> > >
> > > This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1035184
> > > ---
> > >  server/reds.c | 26 +++++++++++++++-----------
> > >  1 file changed, 15 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/server/reds.c b/server/reds.c
> > > index 1f02553..217c74e 100644
> > > --- a/server/reds.c
> > > +++ b/server/reds.c
> > > @@ -2856,16 +2856,7 @@ static SpiceCharDeviceState *attach_to_red_agent(SpiceCharDeviceInstance *sin)
> > >      state->read_filter.discard_all = FALSE;
> > >      reds->agent_state.plug_generation++;
> > >
> > > -    if (reds->agent_state.mig_data) {
> > > -        spice_assert(reds->agent_state.plug_generation == 1);
> > > -        reds_agent_state_restore(reds->agent_state.mig_data);
> > > -        free(reds->agent_state.mig_data);
> > > -        reds->agent_state.mig_data = NULL;
> > > -    } else if (!red_channel_waits_for_migrate_data(&reds->main_channel->base)) {
> > > -        /* we will assoicate the client with the char device, upon reds_on_main_agent_start,
> > > -         * in response to MSGC_AGENT_START */
> > > -        main_channel_push_agent_connected(reds->main_channel);
> > > -    } else {
> > > +    if (red_channel_waits_for_migrate_data(&reds->main_channel->base) || reds->agent_state.mig_data) {
> 
> Note the added  || reds->agent_state.mig_data here. agent_state.mig_data
> will be non-NULL iff we got a MIGRATE_DATA message before calling
> attach_to_red_agent(). The whole block is:
> 
>     if (red_channel_waits_for_migrate_data(&reds->main_channel->base) || reds->agent_state.mig_data) {
>        spice_debug("waiting for migration data");
>         if (!spice_char_device_client_exists(reds->agent_state.base, reds_get_client())) {
>             int client_added;
> 
>             client_added = spice_char_device_client_add(reds->agent_state.base,
> 
> This call to spice_char_device_client_add() will cause dev->num_clients to
> increase
> 
>                                                         reds_get_client(),
>                                                         TRUE, /* flow control */
>                                                         REDS_VDI_PORT_NUM_RECEIVE_BUFFS,
>                                                         REDS_AGENT_WINDOW_SIZE,
>                                                         ~0,
>                                                         TRUE);
> 
> and this last argument set to TRUE is 'int wait_for_migrate_data' and its
> value is assigned to dev->wait_for_migrate_data.
> 
> So thanks to the added condition, the char device is in the state
> spice_char_device_state_restore() expects.
> 
> Christophe



> _______________________________________________
> Spice-devel mailing list
> Spice-devel@xxxxxxxxxxxxxxxxxxxxx
> http://lists.freedesktop.org/mailman/listinfo/spice-devel

Attachment: pgpSVl4dUe9rV.pgp
Description: PGP signature

_______________________________________________
Spice-devel mailing list
Spice-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/spice-devel

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]     [Monitors]