> -----Original Message-----
> From: Dexuan Cui <decui@xxxxxxxxxxxxx>
> Sent: Tuesday, July 30, 2019 6:26 PM
> To: Sunil Muthuswamy <sunilmut@xxxxxxxxxxxxx>; David Miller <davem@xxxxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx
> Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>;
> sashal@xxxxxxxxxx; Michael Kelley <mikelley@xxxxxxxxxxxxx>; linux-hyperv@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx; jasowang@xxxxxxxxxx; vkuznets <vkuznets@xxxxxxxxxx>; marcelo.cerri@xxxxxxxxxxxxx
> Subject: [PATCH v2 net] hv_sock: Fix hang when a connection is closed
>
> There is a race condition for an established connection that is being closed
> by the guest: the refcnt is 4 at the end of hvs_release() (note: 'remove_sock'
> is false here):
>
> 1 for the initial value;
> 1 for the sk being in the bound list;
> 1 for the sk being in the connected list;
> 1 for the delayed close_work.
>
> After hvs_release() finishes, __vsock_release() -> sock_put(sk) *may*
> decrease the refcnt to 3.
>
> Concurrently, hvs_close_connection() runs in another thread:
> it calls vsock_remove_sock() to decrease the refcnt by 2;
> it calls sock_put() to decrease the refcnt to 0 and free the sk;
> next, the release_sock(sk) may hang due to use-after-free.
>
> In the above, after hvs_release() finishes, if hvs_close_connection() runs
> faster than "__vsock_release() -> sock_put(sk)", then there is no issue,
> because at the beginning of hvs_close_connection() the refcnt is still 4.
>
> The issue can be resolved by taking an extra reference when the
> connection is established.
>
> Fixes: a9eeb998c28d ("hv_sock: Add support for delayed close")
> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
>
> ---
>
> Changes in v2:
> Changed the location of the sock_hold() call.
> Updated the changelog accordingly.
>
> Thanks, Sunil, for the suggestion!
>
> With the proper kernel debugging options enabled, first a warning can
> appear:
>
> kworker/1:0/4467 is freeing memory ..., with a lock still held there!
> stack backtrace:
> Workqueue: events vmbus_onmessage_work [hv_vmbus]
> Call Trace:
>  dump_stack+0x67/0x90
>  debug_check_no_locks_freed.cold.52+0x78/0x7d
>  slab_free_freelist_hook+0x85/0x140
>  kmem_cache_free+0xa5/0x380
>  __sk_destruct+0x150/0x260
>  hvs_close_connection+0x24/0x30 [hv_sock]
>  vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
>  process_one_work+0x241/0x600
>  worker_thread+0x3c/0x390
>  kthread+0x11b/0x140
>  ret_from_fork+0x24/0x30
>
> and then the following release_sock(sk) can hang:
>
> watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:4467]
> ...
> irq event stamp: 62890
> CPU: 1 PID: 4467 Comm: kworker/1:0 Tainted: G W 5.2.0+ #39
> Workqueue: events vmbus_onmessage_work [hv_vmbus]
> RIP: 0010:queued_spin_lock_slowpath+0x2b/0x1e0
> ...
> Call Trace:
>  do_raw_spin_lock+0xab/0xb0
>  release_sock+0x19/0xb0
>  vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
>  process_one_work+0x241/0x600
>  worker_thread+0x3c/0x390
>  kthread+0x11b/0x140
>  ret_from_fork+0x24/0x30
>
>  net/vmw_vsock/hyperv_transport.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
> index f2084e3f7aa4..9d864ebeb7b3 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -312,6 +312,11 @@ static void hvs_close_connection(struct vmbus_channel *chan)
>  	lock_sock(sk);
>  	hvs_do_close_lock_held(vsock_sk(sk), true);
>  	release_sock(sk);
> +
> +	/* Release the refcnt for the channel that's opened in
> +	 * hvs_open_connection().
> +	 */
> +	sock_put(sk);
>  }
>
>  static void hvs_open_connection(struct vmbus_channel *chan)
> @@ -407,6 +412,9 @@ static void hvs_open_connection(struct vmbus_channel *chan)
>  	}
>
>  	set_per_channel_state(chan, conn_from_host ? new : sk);
> +
> +	/* This reference will be dropped by hvs_close_connection(). */
> +	sock_hold(conn_from_host ? new : sk);
>  	vmbus_set_chn_rescind_callback(chan, hvs_close_connection);
>
>  	/* Set the pending send size to max packet size to always get
> --
> 2.19.1

Reviewed-by: Sunil Muthuswamy <sunilmut@xxxxxxxxxxxxx>
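
[Editor's note] As an illustration of the pattern the patch relies on (take an extra reference before an asynchronous callback is registered, and drop it inside the callback so a concurrent release path cannot free the object underneath it), here is a minimal userspace C sketch. It is not the kernel code: struct obj, obj_hold(), obj_put() and close_callback() are hypothetical stand-ins for the sock refcount, sock_hold(), sock_put() and hvs_close_connection().

/*
 * Minimal userspace sketch (an illustration, not kernel code) of the
 * refcounting pattern used by the fix: take an extra reference on a shared
 * object before handing it to an asynchronous callback, and drop that
 * reference inside the callback. All names here are hypothetical.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct obj {
	atomic_int refcnt;
};

static struct obj *obj_alloc(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		exit(1);
	atomic_init(&o->refcnt, 1);	/* the initial reference */
	return o;
}

static void obj_hold(struct obj *o)	/* analogous to sock_hold() */
{
	atomic_fetch_add(&o->refcnt, 1);
}

static void obj_put(struct obj *o)	/* analogous to sock_put() */
{
	if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
		printf("last reference dropped, freeing object\n");
		free(o);
	}
}

/*
 * Plays the role of hvs_close_connection(): runs asynchronously and must be
 * able to rely on the object still being alive while it uses it.
 */
static void *close_callback(void *arg)
{
	struct obj *o = arg;

	/* ... safely use *o here ... */
	obj_put(o);	/* drop the reference taken before registration */
	return NULL;
}

int main(void)
{
	struct obj *o = obj_alloc();
	pthread_t t;

	/*
	 * Plays the role of hvs_open_connection(): grab a reference *before*
	 * the callback can run, so the concurrent release path below cannot
	 * free the object while the callback still uses it.
	 */
	obj_hold(o);
	pthread_create(&t, NULL, close_callback, o);

	obj_put(o);	/* the "release" path drops its own reference */
	pthread_join(t, NULL);
	return 0;
}

Built with "cc -pthread sketch.c", the object is freed exactly once, by whichever obj_put() drops the last reference; without the obj_hold() before pthread_create(), the callback could run against freed memory, which is the use-after-free behind the hang the patch fixes.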