Re: [PATCH net 5/5] net/smc: put sk reference if close work was canceled

Wenjia Zhang <wenjia@xxxxxxxxxxxxx> · Thu, 19 Oct 2023 19:40:46 +0200

On 19.10.23 09:33, D. Wythe wrote:


On 10/19/23 4:26 AM, Wenjia Zhang wrote:


On 17.10.23 04:06, D. Wythe wrote:


On 10/13/23 3:04 AM, Wenjia Zhang wrote:


On 11.10.23 09:33, D. Wythe wrote:
From: "D. Wythe" <alibuda@xxxxxxxxxxxxxxxxx>

Note that we always hold a reference to sock when attempting
to submit close_work. 
yes
Therefore, if we have successfully
canceled close_work from pending, we MUST release that reference
to avoid potential leaks.

Isn't the corresponding reference already released inside the 
smc_close_passive_work()?


Hi Wenjia,

If we successfully cancel the close work from the pending state,
it means that smc_close_passive_work() has never been executed.

You can find more details here.

/**
* cancel_work_sync - cancel a work and wait for it to finish
* @work:the work to cancel
*
* Cancel @work and wait for its execution to finish. This function
* can be used even if the work re-queues itself or migrates to
* another workqueue. On return from this function, @work is
* guaranteed to be not pending or executing on any CPU.
*
* cancel_work_sync(&delayed_work->work) must not be used for
* delayed_work's. Use cancel_delayed_work_sync() instead.
*
* The caller must ensure that the workqueue on which @work was last
* queued can't be destroyed before this function returns.
*
* Return:
* %true if @work was pending, %false otherwise.
*/
boolcancel_work_sync(structwork_struct *work)
{
return__cancel_work_timer(work, false);
}

Best wishes,
D. Wythe
As I understand, queue_work() would wake up the work if the work is 
not already on the queue. And the sock_hold() is just prio to the 
queue_work(). That means, cancel_work_sync() would cancel the work 
either before its execution or after. If your fix refers to the former 
case, at this moment, I don't think the reference can be hold, thus it 
is unnecessary to put it.


I am quite confuse about why you think when we cancel the work before 
its execution,
the reference can not be hold ?


Perhaps the following diagram can describe the problem in better way :

smc_close_cancel_work
smc_cdc_msg_recv_action


sock_hold
queue_work
                                                                if 
(cancel_work_sync())        // successfully cancel before execution
sock_put()                        //  need to put it since we already 
hold a ref before   queue_work()


ha, I already thought you might ask such question:P

I think here two Problems need to be clarified:

1) Do you think the bh_lock_sock/bh_unlock_sock in the smc_cdc_msg_recv 
does not protect the smc_cdc_msg_recv_action() from cancel_work_sync()?
Maybe that would go back to the discussion in the other patch on the 
behaviors of the locks.

2) If the queue_work returns true, as I said in the last main, the work 
should be (being) executed. How could the cancel_work_sync() cancel the 
work before execution successgully?

Fixes: 42bfba9eaa33 ("net/smc: immediate termination for SMCD link 
groups")
Signed-off-by: D. Wythe <alibuda@xxxxxxxxxxxxxxxxx>
---
  net/smc/smc_close.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/smc/smc_close.c b/net/smc/smc_close.c
index 449ef45..10219f5 100644
--- a/net/smc/smc_close.c
+++ b/net/smc/smc_close.c
@@ -116,7 +116,8 @@ static void smc_close_cancel_work(struct 
smc_sock *smc)
      struct sock *sk = &smc->sk;
        release_sock(sk);
-    cancel_work_sync(&smc->conn.close_work);
+    if (cancel_work_sync(&smc->conn.close_work))
+        sock_put(sk);
      cancel_delayed_work_sync(&smc->conn.tx_work);
      lock_sock(sk);
  }