Radim Krcmar <rkrcmar@xxxxxxxxxx> writes:

> 2016-03-18 13:33+0100, Vitaly Kuznetsov:
>> Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always
>> delivered to CPU0 regardless of what CPU we're sending CHANNELMSG_UNLOAD
>> from. vmbus_wait_for_unload() doesn't account for the fact that in case
>> we're crashing on some other CPU and CPU0 is still alive and operational
>> CHANNELMSG_UNLOAD_RESPONSE will be delivered there completing
>> vmbus_connection.unload_event, our wait on the current CPU will never
>> end.
>
> (Any chance of learning about this behavior from the spec?)
>
>> Do the following:
>> 1) Check for completion_done() in the loop. In case interrupt handler is
>>    still alive we'll get the confirmation we need.
>>
>> 2) Always read CPU0's message page as CHANNELMSG_UNLOAD_RESPONSE will be
>>    delivered there. We can race with still-alive interrupt handler doing
>>    the same but we don't care as we're checking completion_done() now.
>
> (Yeah, seems better than hv_setup_vmbus_irq(NULL) or other hacks.)
>
>> 3) Cleanup message pages on all CPUs. This is required (at least for the
>>    current CPU as we're clearing CPU0 messages now but we may want to bring
>>    up additional CPUs on crash) as new messages won't be delivered till we
>>    consume what's pending. On boot we'll place message pages somewhere else
>>    and we won't be able to read stale messages.
>
> What if HV already set the pending message bit on current message,
> do we get any guarantees that clearing once after UNLOAD_RESPONSE is
> enough?

I think so but I'd like to get a confirmation from K.Y./Alex/Haiyang.

>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>> ---
>
> I had a question about NULL below.  (Parenthesised rants aren't related
> to r-b tag. ;)
>
>> drivers/hv/channel_mgmt.c | 30 +++++++++++++++++++++++++-----
>> 1 file changed, 25 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
>> index b10e8f74..5f37057 100644
>> --- a/drivers/hv/channel_mgmt.c
>> +++ b/drivers/hv/channel_mgmt.c
>> @@ -512,14 +512,26 @@ static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_gui
>>
>>  static void vmbus_wait_for_unload(void)
>>  {
>> -	int cpu = smp_processor_id();
>> -	void *page_addr = hv_context.synic_message_page[cpu];
>> +	int cpu;
>> +	void *page_addr = hv_context.synic_message_page[0];
>>  	struct hv_message *msg = (struct hv_message *)page_addr +
>>  				  VMBUS_MESSAGE_SINT;
>>  	struct vmbus_channel_message_header *hdr;
>>  	bool unloaded = false;
>>
>> -	while (1) {
>> +	/*
>> +	 * CHANNELMSG_UNLOAD_RESPONSE is always delivered to CPU0. When we're
>> +	 * crashing on a different CPU let's hope that IRQ handler on CPU0 is
>> +	 * still functional and vmbus_unload_response() will complete
>> +	 * vmbus_connection.unload_event. If not, the last thing we can do is
>> +	 * read message page for CPU0 regardless of what CPU we're on.
>> +	 */
>> +	while (!unloaded) {
>
> (I'd feel a bit safer if this was bounded by some timeout, but all
>  scenarios where this would make a difference are unplausible ...
>  queue_work() not working while the rest is fine is the best one.)
>
>> +		if (completion_done(&vmbus_connection.unload_event)) {
>> +			unloaded = true;
>
> (No need to set unloaded when you break.)
>
>> +			break;
>> +		}
>> +
>>  		if (READ_ONCE(msg->header.message_type) == HVMSG_NONE) {
>>  			mdelay(10);
>>  			continue;
>> @@ -530,9 +542,17 @@ static void vmbus_wait_for_unload(void)
>
> (I'm not a huge fan of the unloaded variable; what about remembering the
>  header/msgtype here ...
>
>>  			unloaded = true;
>>
>>  		vmbus_signal_eom(msg);
>
>  and checking its value here?)
>

Sure, but we'll have to use a variable for that ... why would it be
better than 'unloaded'?

>> +	}
>>
>> -		if (unloaded)
>> -			break;
>> +	/*
>> +	 * We're crashing and already got the UNLOAD_RESPONSE, cleanup all
>> +	 * maybe-pending messages on all CPUs to be able to receive new
>> +	 * messages after we reconnect.
>> +	 */
>> +	for_each_online_cpu(cpu) {
>> +		page_addr = hv_context.synic_message_page[cpu];
>
> Can't this be NULL?

It can't, we allocate it from hv_synic_alloc() (and we don't support cpu
onlining/offlining on WS2012R2+).

>
>> +		msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT;
>> +		msg->header.message_type = HVMSG_NONE;
>>  	}
>
> (And, this block belongs to a separate function. ;])

I thought about moving it to hv_crash_handler() but then I decided to
leave it here as the need for this fixup is rather an artifact of how we
receive the message. But I'm flexible here.

--
  Vitaly
_______________________________________________
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxx
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
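
For anyone following the thread without the tree at hand, the control flow
the patch is aiming for can be modelled in plain userspace C roughly as
below. This is only a sketch under stated assumptions, not the kernel code:
every name prefixed with model_ or MODEL_ is hypothetical, the per-CPU
message slots are a plain array instead of SynIC message pages, the
completion is a bool, and the max_polls bound is the timeout idea from the
review above rather than something the patch itself does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for this model only; none of these are kernel APIs. */
enum model_msg_type {
	MODEL_MSG_NONE = 0,		/* plays the role of HVMSG_NONE */
	MODEL_MSG_UNLOAD_RESPONSE,	/* plays the role of CHANNELMSG_UNLOAD_RESPONSE */
	MODEL_MSG_OTHER			/* any other pending message */
};

#define MODEL_NR_CPUS 4

struct model_state {
	enum model_msg_type slot[MODEL_NR_CPUS];	/* one VMBus message slot per CPU */
	bool unload_event_done;		/* what completion_done() would report */
};

/*
 * Poll until either the (possibly still alive) CPU0 interrupt handler has
 * completed the unload event, or the response is found in CPU0's slot.
 * On success, clear the slot of every CPU so that new messages could be
 * delivered later. Returns true if the unload was confirmed within
 * max_polls iterations (the bound is an assumption of this model).
 */
static bool model_wait_for_unload(struct model_state *s, size_t ncpus,
				  unsigned int max_polls)
{
	bool unloaded = false;

	assert(ncpus >= 1 && ncpus <= MODEL_NR_CPUS);

	for (unsigned int i = 0; i < max_polls && !unloaded; i++) {
		if (s->unload_event_done) {
			/* Step 1: the handler on CPU0 already saw the response. */
			unloaded = true;
		} else if (s->slot[0] == MODEL_MSG_NONE) {
			/* Nothing pending on CPU0; the kernel loop would mdelay(10). */
			continue;
		} else {
			/* Step 2: always look at CPU0's slot, whatever CPU we run on. */
			if (s->slot[0] == MODEL_MSG_UNLOAD_RESPONSE)
				unloaded = true;
			/* Consume the message (stands in for vmbus_signal_eom()). */
			s->slot[0] = MODEL_MSG_NONE;
		}
	}

	if (!unloaded)
		return false;

	/* Step 3: drop maybe-pending messages on all CPUs. */
	for (size_t cpu = 0; cpu < ncpus; cpu++)
		s->slot[cpu] = MODEL_MSG_NONE;

	return true;
}
```

Calling model_wait_for_unload() on a state where unload_event_done is
already set returns true without reading slot 0 and leaves every slot
cleared; a state with nothing pending and no completion simply runs out of
polls and returns false with the other CPUs' slots untouched. Whether a
bounded wait is actually desirable in the crash path is exactly the open
question from the review, so treat the bound as illustration only.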