On Tue, 2022-03-08 at 17:59 +0800, xiubli@xxxxxxxxxx wrote:
> From: Xiubo Li <xiubli@xxxxxxxxxx>
>
> When reconnecting MDS it will reopen the con with new ip address,
> but the when opening the con with new address it couldn't be sure
> that the stale work has finished. So it's possible that the stale
> work queued will use the new data.
>
> This will use cancel_delayed_work_sync() instead.
>
> URL: https://tracker.ceph.com/issues/54461
> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> ---
>  net/ceph/messenger.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index d3bb656308b4..32eb5dc00583 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -1416,7 +1416,7 @@ static void queue_con(struct ceph_connection *con)
>
>  static void cancel_con(struct ceph_connection *con)
>  {
> -	if (cancel_delayed_work(&con->work)) {
> +	if (cancel_delayed_work_sync(&con->work)) {
>  		dout("%s %p\n", __func__, con);
>  		con->ops->put(con);
>  	}

Won't this deadlock? This function is called from ceph_con_close with
the con->mutex held. The work will try to take the same mutex and will
get stuck.

If you want to do this, then you may also need to change it to call
cancel_con after dropping the mutex.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
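
To illustrate the ordering suggested above, here is a rough userspace
sketch using pthreads. It is only an analogy, not the messenger code:
struct conn, conn_work(), conn_close() and demo_close_after_unlock() are
hypothetical names, and pthread_join() stands in for
cancel_delayed_work_sync(). The point is that the waiting call happens
only after the mutex is dropped. Waiting while still holding the mutex
could hang, since the work may be blocked on that same mutex.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct ceph_connection (not the kernel type). */
struct conn {
	pthread_mutex_t mutex;
	pthread_t worker;
	int closed;
	int work_ran;
};

/* Plays the role of the queued work: it must take con->mutex to run. */
static void *conn_work(void *arg)
{
	struct conn *con = arg;

	pthread_mutex_lock(&con->mutex);
	if (!con->closed)
		con->work_ran = 1;
	pthread_mutex_unlock(&con->mutex);
	return NULL;
}

/*
 * Safe ordering: change state under the mutex, drop it, then wait.
 * Calling pthread_join() before the unlock could hang forever,
 * because conn_work() may be blocked on the same mutex.
 */
static int conn_close(struct conn *con)
{
	pthread_mutex_lock(&con->mutex);
	con->closed = 1;
	pthread_mutex_unlock(&con->mutex);

	/* pthread_join() stands in for cancel_delayed_work_sync(). */
	return pthread_join(con->worker, NULL);
}

/* Returns 0 when the close path completes without hanging. */
int demo_close_after_unlock(void)
{
	struct conn con = { .closed = 0, .work_ran = 0 };
	int ret;

	if (pthread_mutex_init(&con.mutex, NULL))
		return -1;
	if (pthread_create(&con.worker, NULL, conn_work, &con)) {
		pthread_mutex_destroy(&con.mutex);
		return -1;
	}
	ret = conn_close(&con);
	assert(con.closed == 1);
	pthread_mutex_destroy(&con.mutex);
	return ret;
}
```

Whether the work body runs before or after the close is a race in this
sketch, so it makes no claim about work_ran. It only shows that the
close path cannot block on a worker that is waiting for the mutex.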