Hi, thanks. I will test this patch, but I am worried it may affect performance. Should we also consider null-pointer protection?

zhang.guanghui@xxxxxxxx

From: Maurizio Lombardi
Date: 2025-02-12 16:52
To: Maurizio Lombardi; zhang.guanghui@xxxxxxxx; chunguang.xu
CC: mgurtovoy; sagi; kbusch; sashal; linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request

On Wed Feb 12, 2025 at 9:11 AM CET, Maurizio Lombardi wrote:
> On Tue Feb 11, 2025 at 9:04 AM CET, zhang.guanghui@xxxxxxxx wrote:
>> Hi
>>
>> This is a race issue, I can't reproduce it stably yet. I have not tested
>> the latest kernel, but in fact I've synced some nvme-tcp patches from the
>> latest upstream.
>
> Hello, could you try this patch?
>
> queue_lock should protect against concurrent "error recovery",
> + mutex_lock(&queue->queue_lock);

Unfortunately I've just realized that queue_lock won't save us from the race
against the controller reset; it's still possible we lock a destroyed mutex.

So just try this simplified patch, I will try to figure out something else:

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 841238f38fdd..b714e1691c30 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2660,7 +2660,10 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 	set_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
 		sk_busy_loop(sk, true);
+
+	mutex_lock(&queue->send_mutex);
 	nvme_tcp_try_recv(queue);
+	mutex_unlock(&queue->send_mutex);
 	clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	return queue->nr_cqe;
 }

Maurizio
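[Editor's note: the patch above serializes the poll-side receive path against the send path by taking queue->send_mutex around nvme_tcp_try_recv(), so a completion cannot free a request the sender is still touching. The sketch below is a minimal userspace analogy of that serialization pattern, not the kernel code: pthread_mutex stands in for queue->send_mutex, and the names run_demo, worker, and ITERS are invented for illustration.]

/*
 * Userspace analogy: two threads (think "send path" and "poll/recv
 * path") mutate shared per-queue state. Taking the same mutex in both
 * paths makes the updates mutually exclusive, so no update is lost
 * and neither path observes the other mid-operation.
 */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define ITERS 100000

static pthread_mutex_t send_mutex = PTHREAD_MUTEX_INITIALIZER;
static long inflight_state; /* stands in for shared request state */

static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&send_mutex);
		inflight_state++; /* non-atomic: racy without the lock */
		pthread_mutex_unlock(&send_mutex);
	}
	return NULL;
}

long run_demo(void)
{
	pthread_t send_thread, poll_thread;

	inflight_state = 0;
	pthread_create(&send_thread, NULL, worker, NULL);
	pthread_create(&poll_thread, NULL, worker, NULL);
	pthread_join(send_thread, NULL);
	pthread_join(poll_thread, NULL);
	return inflight_state;
}

int main(void)
{
	long total = run_demo();
	assert(total == 2L * ITERS); /* no lost updates under the mutex */
	printf("total=%ld\n", total);
	return 0;
}

Note this analogy does not model the second hazard discussed above (locking a mutex whose owner has already destroyed it during controller reset); it only shows why serializing the two paths on one lock prevents the concurrent-access half of the race.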