On Tue Feb 11, 2025 at 9:04 AM CET, zhang.guanghui@xxxxxxxx wrote:
> Hi
>
> This is a race issue, I can't reproduce it stably yet. I have not tested
> the latest kernel, but in fact I've synced some nvme-tcp patches from the
> latest upstream,

Hello,

could you try this patch? queue_lock should protect against concurrent
"error recovery", while send_mutex should serialize try_recv() and
try_send(), emulating the way io_work works. Concurrent calls to
try_recv() should already be protected by sock_lock.

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 841238f38fdd..f464de04ff4d 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2653,16 +2653,24 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 {
 	struct nvme_tcp_queue *queue = hctx->driver_data;
 	struct sock *sk = queue->sock->sk;
+	int r = 0;
 
+	mutex_lock(&queue->queue_lock);
 	if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
-		return 0;
+		goto out;
 
 	set_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
 		sk_busy_loop(sk, true);
+
+	mutex_lock(&queue->send_mutex);
 	nvme_tcp_try_recv(queue);
+	r = queue->nr_cqe;
+	mutex_unlock(&queue->send_mutex);
 	clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
-	return queue->nr_cqe;
+out:
+	mutex_unlock(&queue->queue_lock);
+	return r;
 }
 
 static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size)

Thanks,
Maurizio