RE: [PATCH] Drivers: hv: kvp/vss: Avoid accessing a ringbuffer not initialized yet

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Dexuan Cui <decui@xxxxxxxxxxxxx> Sent: Monday, September 9, 2024 9:47 AM
> 
> If the KVP (or VSS) daemon starts before the VMBus channel's ringbuffer is
> fully initialized, we can hit the panic below:
> 
> hv_utils: Registering HyperV Utility Driver
> hv_vmbus: registering driver hv_utils
> ...
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> CPU: 44 UID: 0 PID: 2552 Comm: hv_kvp_daemon Tainted: G E 6.11.0-rc3+ #1
> RIP: 0010:hv_pkt_iter_first+0x12/0xd0
> Call Trace:
> ...
>  vmbus_recvpacket
>  hv_kvp_onchannelcallback
>  vmbus_on_event
>  tasklet_action_common
>  tasklet_action
>  handle_softirqs
>  irq_exit_rcu
>  sysvec_hyperv_stimer0
>  </IRQ>
>  <TASK>
>  asm_sysvec_hyperv_stimer0
> ...
>  kvp_register_done
>  hvt_op_read
>  vfs_read
>  ksys_read
>  __x64_sys_read
> 
> This can happen because the KVP/VSS channel callback can be invoked
> even before the channel is fully opened:
> 1) as soon as hv_kvp_init() -> hvutil_transport_init() creates
> /dev/vmbus/hv_kvp, the kvp daemon can open the device file immediately and
> register itself to the driver by writing a message KVP_OP_REGISTER1 to the
> file (which is handled by kvp_on_msg() ->kvp_handle_handshake()) and
> reading the file for the driver's response, which is handled by
> hvt_op_read(), which calls hvt->on_read(), i.e. kvp_register_done().
> 
> 2) the problem with kvp_register_done() is that it can cause the
> channel callback to be called even before the channel is fully opened,
> and when the channel callback is starting to run, util_probe()->
> vmbus_open() may have not initialized the ringbuffer yet, so the
> callback can hit the panic of NULL pointer dereference.
> 
> To reproduce the panic consistently, we can add a "ssleep(10)" for KVP in
> __vmbus_open(), just before the first hv_ringbuffer_init(), and then we
> unload and reload the driver hv_utils, and run the daemon manually within
> the 10 seconds.
> 
> Fix the panic by checking the channel state in the channel callback.
> To avoid the race condition with __vmbus_open(), we disable and enable
> the channel callback temporarily in __vmbus_open().
> 
> The channel callbacks of the other VMBus devices don't need to check
> the channel state since they can't run before the channels are fully
> initialized.
> 
> Note: we would also need to fix the fcopy driver code, but that has
> been removed in commit ec314f61e4fc ("Drivers: hv: Remove fcopy driver") in
> the mainline kernel since v6.10. For old 6.x LTS kernels, and the 5.x
> and 4.x LTS kernels, the fcopy driver needs to be fixed when the
> fix is backported to the stable kernel branches.

An alternate approach occurs to me. util_probe() does these three
things in order:

1) Allocates the receive buffer
2) Calls the util_init() function, which for KVP and VSS creates the char dev
3) Sets up the VMBus channel, including calling vmbus_open()

What if the order of #2 and #3 were swapped in util_probe()? I
don't immediately see any interdependency between #2 and #3
for KVP and VSS, nor for Shutdown and Timesync. With the swap,
the VMBus channel would be fully open by the time the /dev entry
appears and the user space daemon can do anything.

I haven't though too deeply about this, so maybe there's a problem
somewhere. But if not, it seems a lot cleaner.

Michael

> 
> Fixes: e0fa3e5e7df6 ("Drivers: hv: utils: fix a race on userspace daemons registration")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> ---
>  drivers/hv/channel.c     | 11 +++++++++++
>  drivers/hv/hv_kvp.c      |  3 +++
>  drivers/hv/hv_snapshot.c |  3 +++
>  3 files changed, 17 insertions(+)
> 
> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
> index fb8cd8469328..685e407a3fdf 100644
> --- a/drivers/hv/channel.c
> +++ b/drivers/hv/channel.c
> @@ -657,6 +657,14 @@ static int __vmbus_open(struct vmbus_channel
> *newchannel,
>  			return -ENOMEM;
>  	}
> 
> +	/*
> +	 * The channel callbacks of KVP/VSS may run before __vmbus_open()
> +	 * finishes (see kvp_register_done() -> ... -> kvp_poll_wrapper()), so
> +	 * they check newchannel->state to tell the ringbuffer has been fully
> +	 * initialized or not. Disable and enable the tasklet to avoid the race.
> +	 */
> +	tasklet_disable(&newchannel->callback_event);
> +
>  	newchannel->state = CHANNEL_OPENING_STATE;
>  	newchannel->onchannel_callback = onchannelcallback;
>  	newchannel->channel_callback_context = context;
> @@ -750,6 +758,8 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
>  	}
> 
>  	newchannel->state = CHANNEL_OPENED_STATE;
> +	tasklet_enable(&newchannel->callback_event);
> +
>  	kfree(open_info);
>  	return 0;
> 
> @@ -766,6 +776,7 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
>  	hv_ringbuffer_cleanup(&newchannel->inbound);
>  	vmbus_free_requestor(&newchannel->requestor);
>  	newchannel->state = CHANNEL_OPEN_STATE;
> +	tasklet_enable(&newchannel->callback_event);
>  	return err;
>  }
> 
> diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
> index d35b60c06114..ec098067e579 100644
> --- a/drivers/hv/hv_kvp.c
> +++ b/drivers/hv/hv_kvp.c
> @@ -662,6 +662,9 @@ void hv_kvp_onchannelcallback(void *context)
>  	if (kvp_transaction.state > HVUTIL_READY)
>  		return;
> 
> +	if (channel->state != CHANNEL_OPENED_STATE)
> +		return;
> +
>  	if (vmbus_recvpacket(channel, recv_buffer, HV_HYP_PAGE_SIZE * 4, &recvlen, &requestid)) {
>  		pr_err_ratelimited("KVP request received. Could not read into recv buf\n");
>  		return;
> diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c
> index 0d2184be1691..f7924c2fc62e 100644
> --- a/drivers/hv/hv_snapshot.c
> +++ b/drivers/hv/hv_snapshot.c
> @@ -301,6 +301,9 @@ void hv_vss_onchannelcallback(void *context)
>  	if (vss_transaction.state > HVUTIL_READY)
>  		return;
> 
> +	if (channel->state != CHANNEL_OPENED_STATE)
> +		return;
> +
>  	if (vmbus_recvpacket(channel, recv_buffer, VSS_MAX_PKT_SIZE, &recvlen, &requestid)) {
>  		pr_err_ratelimited("VSS request received. Could not read into recv buf\n");
>  		return;
> --
> 2.25.1






[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux