Ping ? Does this address your concerns? On Tue, Feb 25, 2014 at 04:14:55PM +0100, Christophe Fergeau wrote: > Hey, > > On Mon, Feb 24, 2014 at 08:26:34PM +0100, Marc-André Lureau wrote: > > On Mon, Feb 24, 2014 at 6:44 PM, Christophe Fergeau <cfergeau@xxxxxxxxxx> wrote: > > > During seamless migration, after switching host, if a client was connected > > > during the migration, it will have data to send back to the new > > > qemu/spice-server instance. This is handled through MIGRATE_DATA messages. > > > SPICE char devices use such MIGRATE_DATA messages to restore their state. > > > > > > However, the MIGRATE_DATA message can arrive any time after the new qemu > > > instance has started, this can happen before or after the SPICE char > > > devices have been created. In order to handle this, if the migrate data > > > arrives early, it's stored in reds->agent_state.mig_data, and > > > attach_to_red_agent() will restore the agent state as appropriate. > > > > > > Unfortunately this does not work as expected as expected. If > > > attach_to_red_agent() is called before the MIGRATE_DATA message reaches the > > > server, all goes well, but if MIGRATE_DATA reaches the server before > > > attach_to_red_agent() gets called, then some assert() get triggered in > > > spice_char_device_state_restore(): > > > > > > ((null):32507): Spice-ERROR **: char_device.c:937:spice_char_device_state_restore: assertion `dev->num_clients == 1 && dev->wait_for_migrate_data' failed > > > Thread 3 (Thread 0x7f406b543700 (LWP 32543)): > > > Thread 2 (Thread 0x7f40697ff700 (LWP 32586)): > > > Thread 1 (Thread 0x7f4079b45a40 (LWP 32507)): > > > > > > What happens is that dev->wait_for_migrate_data is unconditionally cleared when > > > completing handling of the MIGRATE_DATA message, so it's FALSE when > > > > Isn't it going to be still FALSE after this patch? > > > > > spice_char_device_state_restore() is called. Moreover, dev->num_clients is > > > also 0 as this is only increased by spice_char_device_start() which > > > spice_server_char_device_add_interface() calls after > > > attach_to_red_agent()/spice_char_device_state_restore() have been called. > > > > I don't see how this could have changed either. > > > > > This commit changes the logic in spice_server_char_device_add_interface(), > > > and when there is migrate data pending in reds->agent_state.mig_data, we > > > only attempt to restore it after successfully initializing/starting the > > > needed char device. > > > > Sorry, I must be tired reading the server code, but I don't see how > > this could change the race you described above. I am surely missing > > something :) > > I'll have to improve the commit log as I had to reproduce the bug and poke > at the code some more in order to answer your comments :( > See below for answers to your comments above > > > > > > > > > > This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1035184 > > > --- > > > server/reds.c | 26 +++++++++++++++----------- > > > 1 file changed, 15 insertions(+), 11 deletions(-) > > > > > > diff --git a/server/reds.c b/server/reds.c > > > index 1f02553..217c74e 100644 > > > --- a/server/reds.c > > > +++ b/server/reds.c > > > @@ -2856,16 +2856,7 @@ static SpiceCharDeviceState *attach_to_red_agent(SpiceCharDeviceInstance *sin) > > > state->read_filter.discard_all = FALSE; > > > reds->agent_state.plug_generation++; > > > > > > - if (reds->agent_state.mig_data) { > > > - spice_assert(reds->agent_state.plug_generation == 1); > > > - reds_agent_state_restore(reds->agent_state.mig_data); > > > - free(reds->agent_state.mig_data); > > > - reds->agent_state.mig_data = NULL; > > > - } else if (!red_channel_waits_for_migrate_data(&reds->main_channel->base)) { > > > - /* we will assoicate the client with the char device, upon reds_on_main_agent_start, > > > - * in response to MSGC_AGENT_START */ > > > - main_channel_push_agent_connected(reds->main_channel); > > > - } else { > > > + if (red_channel_waits_for_migrate_data(&reds->main_channel->base) || reds->agent_state.mig_data) { > > Note the added || reds->agent_state.mig_data here. agent_state.mig_data > will be non-NULL iff we got a MIGRATE_DATA message before calling > attach_to_red_agent(). The whole block is: > > if (red_channel_waits_for_migrate_data(&reds->main_channel->base) || reds->agent_state.mig_data) { > spice_debug("waiting for migration data"); > if (!spice_char_device_client_exists(reds->agent_state.base, reds_get_client())) { > int client_added; > > client_added = spice_char_device_client_add(reds->agent_state.base, > > This call to spice_char_device_client_add() will cause dev->num_clients to > increase > > reds_get_client(), > TRUE, /* flow control */ > REDS_VDI_PORT_NUM_RECEIVE_BUFFS, > REDS_AGENT_WINDOW_SIZE, > ~0, > TRUE); > > and this last argument set to TRUE is 'int wait_for_migrate_data' and its > value is assigned to dev->wait_for_migrate_data. > > So thanks to the added condition, the char device is in the state > spice_char_device_state_restore() expects. > > Christophe > _______________________________________________ > Spice-devel mailing list > Spice-devel@xxxxxxxxxxxxxxxxxxxxx > http://lists.freedesktop.org/mailman/listinfo/spice-devel
Attachment:
pgpSVl4dUe9rV.pgp
Description: PGP signature
_______________________________________________ Spice-devel mailing list Spice-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/spice-devel