On 2/19/25 11:17 AM, alan.adamson@xxxxxxxxxx wrote:
On 2/18/25 11:26 PM, Shinichiro Kawasaki wrote:
CC+: Alan,
On Feb 13, 2025 / 08:18, Jens Axboe wrote:
The conditions for whether or not a request is allowed adding to a
completion batch are a bit hard to read, and they also have a few
issues. One is that ioerror may indeed be a random value on
passthrough,
and it's being checked unconditionally of whether or not the given
request is a passthrough request or not.
Rewrite the conditions to be separate for easier reading, and only
check
ioerror for non-passthrough requests. This fixes an issue with bio
unmapping on passthrough, where it fails getting added to a batch. This
both leads to suboptimal performance, and may trigger a potential
schedule-under-atomic condition for polled passthrough IO.
Fixes: f794f3351f26 ("block: add support for
blk_mq_end_request_batch()")
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
I observed the blktests test case nvme/039 failure with v6.14-rc3
kernel. I
bisected and found that this patch in the v6.14-rc3 is the trigger.
The test run output is as follows:
nvme/039 => nvme0n1 (test error logging) [failed]
runtime 5.378s ... 5.354s
--- tests/nvme/039.out 2024-09-20 11:20:26.405380875 +0900
+++
/home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad
2025-02-19 16:13:05.061387179 +0900
@@ -1,7 +1,3 @@
Running nvme/039
- Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 /
sc 0x81) DNR
- Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
- Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR
Identify(0x6), Access Denied (sct 0x2 / sc 0x86) DNR cdw10=0x1
cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0
- Read(0x2), Invalid Command Opcode (sct 0x0 / sc 0x1) DNR
cdw10=0x0 cdw11=0x0 cdw12=0x1 cdw13=0x0 cdw14=0x0 cdw15=0x0
Test complete
The test case does error injection. Test method requires
reconsideration to
adjust to this kernel change, probably. Help for fix will be
appreciated.
Not really an issue with the test, rather the error injector is
broken. I'll investigate.
The following change allows the test to pass.
- The check of (ioerror < 0) should be (ioerror != 0)
- passthroughs can also have ioerror set, so false needs to be returned.
This needs to be resolved.
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index fa2a76cc2f73..b2bd2ac40441 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -865,17 +865,17 @@ static inline bool blk_mq_add_to_batch(struct
request *req,
* 1) No batch container
* 2) Has scheduler data attached
* 3) Not a passthrough request and end_io set
- * 4) Not a passthrough request and an ioerror
+ * 4) An ioerror
*/
if (!iob)
return false;
if (req->rq_flags & RQF_SCHED_TAGS)
return false;
+ if (ioerror)
+ return false;
if (!blk_rq_is_passthrough(req)) {
if (req->end_io)
return false;
- if (ioerror < 0)
- return false;
}
Alan