Re: [PATCH resend] vsock/virtio: avoid potential deadlock when vsock device remove

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 12, 2021 at 01:30:56PM +0800, Longpeng(Mike) wrote:
There's a potential deadlock case when remove the vsock device or
process the RESET event:

 vsock_for_each_connected_socket:
     spin_lock_bh(&vsock_table_lock) ----------- (1)
     ...
         virtio_vsock_reset_sock:
             lock_sock(sk) --------------------- (2)
     ...
     spin_unlock_bh(&vsock_table_lock)

lock_sock() may do initiative schedule when the 'sk' is owned by
other thread at the same time, we would receivce a warning message
that "scheduling while atomic".

Even worse, if the next task (selected by the scheduler) try to
release a 'sk', it need to request vsock_table_lock and the deadlock
occur, cause the system into softlockup state.
 Call trace:
  queued_spin_lock_slowpath
  vsock_remove_bound
  vsock_remove_sock
  virtio_transport_release
  __vsock_release
  vsock_release
  __sock_release
  sock_close
  __fput
  ____fput

So we should not require sk_lock in this case, just like the behavior
in vhost_vsock or vmci.

The difference with vhost_vsock is that here we call it also when we receive an event in the event queue (for example because we are migrating the VM).

I think the idea of this lock was to prevent concurrency with RX loop, but actually if a socket is connected, it can only change state to TCP_CLOSING/TCP_CLOSE.

I don't think there is any problem not to take the lock, at most we could take the rx_lock in virtio_vsock_event_handle(), but I'm not sure it's necessary.


Cc: Stefan Hajnoczi <stefanha@xxxxxxxxxx>
Cc: Stefano Garzarella <sgarzare@xxxxxxxxxx>
Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
Cc: Jakub Kicinski <kuba@xxxxxxxxxx>

We should add:
Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko")
Signed-off-by: Longpeng(Mike) <longpeng2@xxxxxxxxxx>
---
net/vmw_vsock/virtio_transport.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index e0c2c99..4f7c99d 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -357,11 +357,14 @@ static void virtio_vsock_event_fill(struct virtio_vsock *vsock)

static void virtio_vsock_reset_sock(struct sock *sk)
{
-	lock_sock(sk);
+	/* vmci_transport.c doesn't take sk_lock here either.  At least we're
+ * under vsock_table_lock so the sock cannot disappear while we're
+	 * executing.
+	 */
+
	sk->sk_state = TCP_CLOSE;
	sk->sk_err = ECONNRESET;
	sk_error_report(sk);
-	release_sock(sk);
}

static void virtio_vsock_update_guest_cid(struct virtio_vsock *vsock)
--
1.8.3.1


With the Fixes tag added:

Reviewed-by: Stefano Garzarella <sgarzare@xxxxxxxxxx>




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux