RE: [PATCH v2] vmbus: fix missed ring events on boot

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@xxxxxxxxxxxxxxxxxx]
> Sent: Saturday, March 18, 2017 9:55 PM
> To: gregkh@xxxxxxxxxxxxxxxxxxx; KY Srinivasan <kys@xxxxxxxxxxxxx>;
> Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; Stephen
> Hemminger <sthemmin@xxxxxxxxxxxxx>
> Subject: [PATCH v2] vmbus: fix missed ring events on boot
> 
> During initialization, the channel initialization code schedules the
> tasklet to scan the VMBUS receive event page (i.e. simulates an
> interrupt). The problem was that it invokes the tasklet on a different
> CPU from where it normally runs and therefore if an event is present,
> it will clear the bit but not find the associated channel.
> 
> This can lead to missed events, typically stuck tasks, during bootup
> when sub channels are being initialized. Typically seen as stuck
> boot with 8 or more CPU's.
> 
> This patch is not necessary for upstream (4.11 and later) since
> commit 631e63a9f346 ("vmbus: change to per channel tasklet").
> This changed vmbus code to get rid of common tasklet which
> caused the problem.
> 
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 638fea33aee8 ("Drivers: hv: vmbus: fix the race when querying &
> updating the percpu list")
> Signed-off-by: Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>
> ---
> v2 - simplified version (only need to change one function).
>      also don't need to wait for tasklet to be scheduled on other CPU
>      add Fixes tag.
> 
>  drivers/hv/channel_mgmt.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 0af7e39006c8..63c903b00a58 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -361,8 +361,19 @@ void hv_event_tasklet_enable(struct
> vmbus_channel *channel)
>  	tasklet = hv_context.event_dpc[channel->target_cpu];
>  	tasklet_enable(tasklet);
> 
> -	/* In case there is any pending event */
> -	tasklet_schedule(tasklet);
> +	/*
> +	 * In case there is any pending event schedule a rescan
> +	 * but must be on the correct CPU for the channel.
> +	 */
> +	if (channel->target_cpu == get_cpu()) {
> +		put_cpu();

We could be preempted here and end up scheduling the taklet on the wrong CPU.

> +		tasklet_schedule(tasklet);
> +	} else {
> +		smp_call_function_single(channel->target_cpu,
> +					 (smp_call_func_t)tasklet_schedule,
> +					 tasklet, false);
> +		put_cpu();
> +	}

Why not call put_cpu() at the end of the block to close the preemption window we have
earlier.

K. Y
>  }
> 
>  void hv_process_channel_removal(struct vmbus_channel *channel, u32
> relid)
> --
> 2.11.0





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]