From: Mike Snitzer <snitzer@xxxxxxxxxx>

nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set

BZ: 1948690
Upstream Status: RHEL-only

Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>

rhel-8.git commit 7dadadb072515f243868e6fe2f7e9c97fd3516c9
Author: Mike Snitzer <snitzer@xxxxxxxxxx>
Date:   Tue Aug 25 21:52:48 2020 -0400

    [nvme] nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set

    Message-id: <20200825215248.2291-11-snitzer@xxxxxxxxxx>
    Patchwork-id: 325180
    Patchwork-instance: patchwork
    O-Subject: [RHEL8.3 PATCH 10/10] nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set
    Bugzilla: 1843515
    RH-Acked-by: David Milburn <dmilburn@xxxxxxxxxx>
    RH-Acked-by: Gopal Tiwari <gtiwari@xxxxxxxxxx>
    RH-Acked-by: Ewan Milne <emilne@xxxxxxxxxx>

    BZ: 1843515
    Upstream Status: RHEL-only

    Based on a patch that was proposed upstream but ultimately rejected,
    see: https://www.spinics.net/lists/linux-block/msg57490.html

    Obviously I'd have made this change even if it hadn't already been
    posted, but I figured I'd give proper attribution due to their public
    post with the same code change.

    Author: Chao Leng <lengchao@xxxxxxxxxx>
    Date:   Wed Aug 12 16:18:55 2020 +0800

        nvme: allow retry for requests with REQ_FAILFAST_TRANSPORT set

        REQ_FAILFAST_TRANSPORT may have been designed for SCSI, because
        the SCSI protocol does not define a local retry mechanism. SCSI
        implements a fuzzy local retry mechanism, so
        REQ_FAILFAST_TRANSPORT is needed to allow higher-level
        multipathing software to perform failover/retry.

        NVMe differs from SCSI in this respect. It defines a local retry
        mechanism and path error codes, so NVMe should retry locally for
        non-path errors. For path-related errors, whether to retry and
        how to retry is still determined by the higher-level
        multipathing's failover.

        Unlike SCSI, NVMe shouldn't prevent retry if
        REQ_FAILFAST_TRANSPORT is set, because NVMe's local retry is
        needed -- as is NVMe-specific logic to categorize whether an
        error is path related.

        In this way, the mechanisms of NVMe multipath and other multipath
        are now equivalent: non-path-related errors are retried locally,
        and path-related errors are handled by multipath.

        Signed-off-by: Chao Leng <lengchao@xxxxxxxxxx>
        [snitzer: edited header for grammar and to make clearer]
        Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>

    Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>
    Signed-off-by: Frantisek Hrbata <fhrbata@xxxxxxxxxx>

diff a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	if (likely(nvme_req(req)->status == 0))
 		return COMPLETE;
 
-	if (blk_noretry_request(req) ||
+	/*
+	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
+	 * handles multipathing. Unlike SCSI, NVMe's error handling was
+	 * specifically designed to handle local retry for non-path errors.
+	 * As such, allow NVMe's local retry mechanism to be used for
+	 * requests marked with REQ_FAILFAST_TRANSPORT.
+	 */
+	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
 	    (nvme_req(req)->status & NVME_SC_DNR) ||
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;
-- 
https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1024
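
For illustration only, the effect of the change on the retry decision can
be sketched as a standalone userspace program. This is not kernel code:
struct mock_request, the MOCK_* flag values, mock_max_retries and the
decide_before()/decide_after() helpers are all hypothetical stand-ins, and
the sketch models only the check touched by the hunk (the multipath
failover branch that follows it in the driver is omitted). It assumes, as
in the block layer, that blk_noretry_request() tests all three FAILFAST
flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins; the real flag bits live in the kernel headers. */
#define MOCK_FAILFAST_DEV       (1u << 0)
#define MOCK_FAILFAST_TRANSPORT (1u << 1)
#define MOCK_FAILFAST_DRIVER    (1u << 2)
#define MOCK_FAILFAST_MASK \
	(MOCK_FAILFAST_DEV | MOCK_FAILFAST_TRANSPORT | MOCK_FAILFAST_DRIVER)
#define MOCK_SC_DNR             0x4000u /* assumed "do not retry" status bit */

enum mock_disposition { MOCK_COMPLETE, MOCK_RETRY };

/* Hypothetical, flattened view of the fields the check looks at. */
struct mock_request {
	unsigned int cmd_flags;
	unsigned int status;   /* 0 means success */
	unsigned int retries;
};

static const unsigned int mock_max_retries = 5; /* assumed limit */

/* Shared logic: only the set of flags that forbid a retry differs. */
static enum mock_disposition decide(const struct mock_request *req,
				    unsigned int noretry_flags)
{
	if (req->status == 0)
		return MOCK_COMPLETE;

	if ((req->cmd_flags & noretry_flags) ||
	    (req->status & MOCK_SC_DNR) ||
	    req->retries >= mock_max_retries)
		return MOCK_COMPLETE;

	return MOCK_RETRY;
}

/* Before the patch: any FAILFAST flag suppresses the local retry. */
static enum mock_disposition decide_before(const struct mock_request *req)
{
	return decide(req, MOCK_FAILFAST_MASK);
}

/* After the patch: FAILFAST_TRANSPORT alone no longer suppresses it. */
static enum mock_disposition decide_after(const struct mock_request *req)
{
	return decide(req, MOCK_FAILFAST_DEV | MOCK_FAILFAST_DRIVER);
}

/* Small self-check showing the one case whose outcome changes. */
static void mock_self_check(void)
{
	struct mock_request req = {
		.cmd_flags = MOCK_FAILFAST_TRANSPORT,
		.status = 0x2u, /* arbitrary non-zero status without DNR */
		.retries = 0,
	};

	assert(decide_before(&req) == MOCK_COMPLETE);
	assert(decide_after(&req) == MOCK_RETRY);
}
```

In this sketch a failed request carrying only the transport flag is
completed immediately by decide_before() but retried by decide_after();
requests with the DNR bit set, with the device or driver flag set, or with
exhausted retries are completed in both variants.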