The complete process is like this:
xprt_destroy
wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_UNINTERRUPTIBLE)
-->getlock
del_timer_sync(&xprt->timer) -->del xprt->timer
INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb)
xprt_destroy_cb
xs_destroy(xprt->ops->destroy)
cancel_delayed_work_sync -->will call
transport->connect_worker, whose callback is xs_udp_setup_socket
xs_xprt_free(xprt) -->free xprt
xs_udp_setup_socket
sock = xs_create_sock
xprt_unlock_connect
if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is XPRT_LOCKED
xprt_schedule_autodisconnect
mod_timer
internal_add_timer -->insert xprt->timer to base timer list
On 2020/6/4 23:39, Trond Myklebust wrote:
On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote:
If RPC use udp as it's transport protocol, transport->connect_worker
will call xs_udp_setup_socket.
xs_udp_setup_socket
sock = xs_create_sock
if (IS_ERR(sock))
goto out;
out:
xprt_unlock_connect
xprt_schedule_autodisconnect
mod_timer
internal_add_timer -->insert xprt->timer to base timer
list
xs_destroy
cancel_delayed_work_sync(&transport->connect_worker)
xs_xprt_free(xprt) -->free xprt
Thus use-after-free will happen.
Signed-off-by: Zheng Bin <zhengbin13@xxxxxxxxxx>
---
net/sunrpc/xprtsock.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 845d0be805ec..c796808e7f7a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt)
dprintk("RPC: xs_destroy xprt %p\n", xprt);
cancel_delayed_work_sync(&transport->connect_worker);
+ del_timer_sync(&xprt->timer);
xs_close(xprt);
cancel_work_sync(&transport->recv_worker);
cancel_work_sync(&transport->error_worker);
--
2.26.0.106.g9fadedd
I'm confused. How can this happen given that xprt_destroy() first takes
the XPRT_LOCK, and then deletes xprt->timer?
Right now, the socket code knows nothing about the details of xprt-
timer and what it is used for. I'd prefer to keep it that way if
possible.