Re: [RFC PATCH] libceph: wait for con->work to finish when cancelling con

On 3/8/22 7:45 PM, Jeff Layton wrote:
On Tue, 2022-03-08 at 17:59 +0800, xiubli@xxxxxxxxxx wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>

When reconnecting to an MDS, the con is reopened with a new IP
address, but at that point there is no guarantee that the stale work
has finished. So it's possible that the stale queued work will use
the new data.

Use cancel_delayed_work_sync() instead, so that cancelling the con
waits for any running work to finish.

URL: https://tracker.ceph.com/issues/54461
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
  net/ceph/messenger.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index d3bb656308b4..32eb5dc00583 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1416,7 +1416,7 @@ static void queue_con(struct ceph_connection *con)
 static void cancel_con(struct ceph_connection *con)
 {
-	if (cancel_delayed_work(&con->work)) {
+	if (cancel_delayed_work_sync(&con->work)) {
 		dout("%s %p\n", __func__, con);
 		con->ops->put(con);
 	}

Won't this deadlock?

This function is called from ceph_con_close() with con->mutex held.
The work function takes the same mutex, so cancel_delayed_work_sync()
would wait for work that is itself stuck waiting on that mutex. If you
want to do this, then you may also need to change it to call
cancel_con() after dropping the mutex.
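
To illustrate the ordering concern above, here is a minimal userspace
sketch using pthreads. It is only an analogy, not kernel code: struct
demo_con, demo_con_work(), demo_con_close() and demo_run() are all
hypothetical names, pthread_join() stands in for the "wait until the
work has finished" part of cancel_delayed_work_sync(), and the worker
thread stands in for con->work.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a connection object. */
struct demo_con {
	pthread_mutex_t mutex;	/* plays the role of con->mutex */
	pthread_t worker;	/* plays the role of con->work */
	bool closed;
	int work_runs;
};

/* Stand-in for the work function: like the real one, it needs the mutex. */
static void *demo_con_work(void *arg)
{
	struct demo_con *con = arg;

	pthread_mutex_lock(&con->mutex);
	if (!con->closed)
		con->work_runs++;
	pthread_mutex_unlock(&con->mutex);
	return NULL;
}

/*
 * Safe ordering: update state under the mutex, drop the mutex, and only
 * then wait for the worker. Calling pthread_join() before the unlock
 * could block forever, because the worker may be waiting for the very
 * mutex the caller still holds.
 */
static int demo_con_close(struct demo_con *con)
{
	pthread_mutex_lock(&con->mutex);
	con->closed = true;
	pthread_mutex_unlock(&con->mutex);

	return pthread_join(con->worker, NULL);
}

/* Runs the scenario once; returns 0 if close completed without hanging. */
static int demo_run(void)
{
	struct demo_con con = { .closed = false, .work_runs = 0 };
	int ret;

	pthread_mutex_init(&con.mutex, NULL);
	if (pthread_create(&con.worker, NULL, demo_con_work, &con))
		return -1;

	ret = demo_con_close(&con);
	pthread_mutex_destroy(&con.mutex);

	/* The worker ran either before or after the close, at most once. */
	return (ret == 0 && con.work_runs <= 1) ? 0 : -1;
}
```

Calling demo_run() from a main() built with -pthread should return 0.
Moving the pthread_join() call above the unlock in demo_con_close()
gives the problematic ordering: the caller waits for the worker while
the worker may be waiting for the caller's mutex.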

Yeah, correct :-)

- Xiubo




