There is a race condition for an established connection that is being
closed by the guest: the refcnt is 4 at the end of hvs_release() (Note:
here 'remove_sock' is false):

1 for the initial value;
1 for the sk being in the bound list;
1 for the sk being in the connected list;
1 for the delayed close_work.

After hvs_release() finishes, __vsock_release() -> sock_put(sk) *may*
decrease the refcnt to 3.

Concurrently, hvs_close_connection() runs in another thread:
  it calls vsock_remove_sock() to decrease the refcnt by 2;
  it calls sock_put() to decrease the refcnt to 0 and free the sk;
  next, the "release_sock(sk)" may hang due to use-after-free.

In the above, after hvs_release() finishes, if hvs_close_connection()
runs faster than "__vsock_release() -> sock_put(sk)", then there is no
issue, because at the beginning of hvs_close_connection() the refcnt is
still 4.

The issue can be resolved by taking an extra reference when the
connection is established.

Fixes: a9eeb998c28d ("hv_sock: Add support for delayed close")
Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---

Changes in v2:
  Changed the location of the sock_hold() call.
  Updated the changelog accordingly.
  Thanks Sunil for the suggestion!

With the proper kernel debugging options enabled, first a warning can
appear:

kworker/1:0/4467 is freeing memory ..., with a lock still held there!
stack backtrace:
Workqueue: events vmbus_onmessage_work [hv_vmbus]
Call Trace:
 dump_stack+0x67/0x90
 debug_check_no_locks_freed.cold.52+0x78/0x7d
 slab_free_freelist_hook+0x85/0x140
 kmem_cache_free+0xa5/0x380
 __sk_destruct+0x150/0x260
 hvs_close_connection+0x24/0x30 [hv_sock]
 vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
 process_one_work+0x241/0x600
 worker_thread+0x3c/0x390
 kthread+0x11b/0x140
 ret_from_fork+0x24/0x30

and then the following release_sock(sk) can hang:

watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:4467]
...
irq event stamp: 62890
CPU: 1 PID: 4467 Comm: kworker/1:0 Tainted: G W 5.2.0+ #39
Workqueue: events vmbus_onmessage_work [hv_vmbus]
RIP: 0010:queued_spin_lock_slowpath+0x2b/0x1e0
...
Call Trace:
 do_raw_spin_lock+0xab/0xb0
 release_sock+0x19/0xb0
 vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
 process_one_work+0x241/0x600
 worker_thread+0x3c/0x390
 kthread+0x11b/0x140
 ret_from_fork+0x24/0x30

 net/vmw_vsock/hyperv_transport.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index f2084e3f7aa4..9d864ebeb7b3 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -312,6 +312,11 @@ static void hvs_close_connection(struct vmbus_channel *chan)
 	lock_sock(sk);
 	hvs_do_close_lock_held(vsock_sk(sk), true);
 	release_sock(sk);
+
+	/* Release the refcnt for the channel that's opened in
+	 * hvs_open_connection().
+	 */
+	sock_put(sk);
 }
 
 static void hvs_open_connection(struct vmbus_channel *chan)
@@ -407,6 +412,9 @@ static void hvs_open_connection(struct vmbus_channel *chan)
 	}
 
 	set_per_channel_state(chan, conn_from_host ? new : sk);
+
+	/* This reference will be dropped by hvs_close_connection(). */
+	sock_hold(conn_from_host ? new : sk);
 	vmbus_set_chn_rescind_callback(chan, hvs_close_connection);
 
 	/* Set the pending send size to max packet size to always get
--
2.19.1
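
[Editor's illustration, not part of the patch]

The fix follows a common refcounting pattern: the path that registers an
asynchronous callback takes a reference on the object (sock_hold() in
hvs_open_connection()), and the callback drops that reference only after
its last access to the object (the sock_put() placed after release_sock()
in hvs_close_connection()). Below is a minimal userspace sketch of that
pattern, assuming C11 atomics and pthreads; obj, obj_get(), obj_put() and
rescind_callback() are hypothetical stand-ins for the sock refcount,
sock_hold()/sock_put() and hvs_close_connection(), not kernel code.

/*
 * Simplified model of "hold a reference for the async callback, drop it
 * after the callback's last access". Build with: cc -pthread sketch.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct obj {
	atomic_int refcnt;
	int state;		/* stands in for the socket state/lock */
};

static void obj_get(struct obj *o)
{
	atomic_fetch_add_explicit(&o->refcnt, 1, memory_order_relaxed);
}

static void obj_put(struct obj *o)
{
	/* Free when the last reference is dropped. */
	if (atomic_fetch_sub_explicit(&o->refcnt, 1,
				      memory_order_acq_rel) == 1) {
		printf("freeing object\n");
		free(o);
	}
}

/* Models hvs_close_connection(): may run concurrently with the
 * releasing path in main() below.
 */
static void *rescind_callback(void *arg)
{
	struct obj *o = arg;

	o->state = 0;		/* last access: models release_sock(sk) */
	obj_put(o);		/* drop the reference taken at connect time */
	return NULL;
}

int main(void)
{
	struct obj *o = calloc(1, sizeof(*o));
	pthread_t t;

	atomic_store(&o->refcnt, 1);	/* caller's reference */
	o->state = 1;

	obj_get(o);	/* models sock_hold() before registering the callback */
	pthread_create(&t, NULL, rescind_callback, o);

	/* Models __vsock_release() -> sock_put(sk): even if this drop wins
	 * the race, the callback still owns its own reference, so it never
	 * touches freed memory.
	 */
	obj_put(o);

	pthread_join(t, NULL);
	return 0;
}

The key design point mirrored from the patch is that the callback's put is
its very last statement, after any use of the object, just as the new
sock_put(sk) in hvs_close_connection() comes after release_sock(sk).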