Hello,

I'm using the nvme multipath driver (NVMF/RDMA) together with io-uring. When a path goes away, I sometimes get CQE.res = -EAGAIN in user space. This is unexpected, since the nvme multipath driver should handle path failures transparently. It's somewhat workload-related but easy to reproduce with fio.

The multipath driver uses a kblockd worker to re-queue the failed NVMe bios (https://github.com/torvalds/linux/blob/13563da6ffcf49b8b45772e40b35f96926a7ee1e/drivers/nvme/host/multipath.c#L126), and the original request is ended. When the nvme_requeue_work callback is executed, the block layer tries to allocate a new request for the bio, but that fails and the bio status is set to BLK_STS_AGAIN (https://elixir.bootlin.com/linux/v6.12.6/source/block/blk-mq.c#L2987). The allocation failure seems to be due to all tags for the queue being used up. The BLK_STS_AGAIN completion eventually makes it into io_uring's io_rw_should_reissue(), where same_thread_group(req->tctx->task, current) evaluates to false (https://github.com/torvalds/linux/blob/13563da6ffcf49b8b45772e40b35f96926a7ee1e/io_uring/rw.c#L437), because the completion runs from the kernel worker thread rather than from the submitting task's thread group. As a result, the reissue is refused and CQE.res = -EAGAIN is handed back to the user-space program.

Here's a stack dump from when we hit same_thread_group(req->tctx->task, current) = false:

kernel: [237700.098733] dump_stack_lvl+0x44/0x5c
kernel: [237700.098737] io_rw_should_reissue.cold+0x5d/0x64
kernel: [237700.098742] io_complete_rw+0x9a/0xc0
kernel: [237700.098745] blkdev_bio_end_io_async+0x33/0x80
kernel: [237700.098749] blk_mq_submit_bio+0x5b5/0x620
kernel: [237700.098756] submit_bio_noacct_nocheck+0x163/0x370
kernel: [237700.098760] ? submit_bio_noacct+0x79/0x4b0
kernel: [237700.098764] nvme_requeue_work+0x4b/0x60 [nvme_core]
kernel: [237700.098776] process_one_work+0x1c7/0x380
kernel: [237700.098782] worker_thread+0x4d/0x380
kernel: [237700.098786] ? _raw_spin_lock_irqsave+0x23/0x50
kernel: [237700.098791] ? rescuer_thread+0x3a0/0x3a0
kernel: [237700.098794] kthread+0xe9/0x110
kernel: [237700.098798] ? kthread_complete_and_exit+0x20/0x20
kernel: [237700.098802] ret_from_fork+0x22/0x30
kernel: [237700.098811] </TASK>

Is the same_thread_group() check really needed in this case? The thread groups are certainly different here... Are there any side effects if this check is removed?

Thanks,
Michael
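
P.S. In case it helps, below is a minimal sketch of the user-space pattern that surfaces the error. It only shows where the -EAGAIN appears in the CQE; it assumes liburing, and the device path, queue depth, and block size are placeholders. In practice we drive a heavier load with fio while dropping a path, since a single in-flight read on its own won't exhaust the tags.

#define _GNU_SOURCE		/* O_DIRECT */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <liburing.h>

#define BLOCK_SIZE 4096

int main(int argc, char **argv)
{
	/* placeholder device; we use the multipath head node */
	const char *dev = argc > 1 ? argv[1] : "/dev/nvme0n1";
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	void *buf;
	int fd, ret;

	fd = open(dev, O_RDONLY | O_DIRECT);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE))
		return 1;
	ret = io_uring_queue_init(64, &ring, 0);
	if (ret < 0) {
		fprintf(stderr, "queue_init: %s\n", strerror(-ret));
		return 1;
	}

	/* keep one read in flight; drop a path while this runs */
	for (;;) {
		sqe = io_uring_get_sqe(&ring);
		if (!sqe)
			break;
		io_uring_prep_read(sqe, fd, buf, BLOCK_SIZE, 0);
		io_uring_submit(&ring);

		ret = io_uring_wait_cqe(&ring, &cqe);
		if (ret < 0)
			break;
		/* the completion that should have been retried in kernel */
		if (cqe->res == -EAGAIN)
			fprintf(stderr, "got CQE.res = -EAGAIN\n");
		io_uring_cqe_seen(&ring, cqe);
	}

	io_uring_queue_exit(&ring);
	close(fd);
	free(buf);
	return 0;
}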