Re: [PATCHv2] nvme: avoid bogus CRTO values

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/13/23 23:28, Keith Busch wrote:
From: Keith Busch <kbusch@xxxxxxxxxx>

Some devices are reporting Controller Ready Modes Supported, but return
0 for CRTO. These devices require a much higher time to ready than that,
so they are failing to initialize after the driver started preferring
that value over CAP.TO.

The spec requires CAP.TO match the appropritate CRTO value, or be set to
0xff if CRTO is larger than that. This means that CAP.TO can be used to
validate if CRTO is reliable, and provides an appropriate fallback for
setting the timeout value if not. Use whichever is larger.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217863
Reported-by: Cláudio Sampaio <patola@xxxxxxxxx>
Reported-by: Felix Yan <felixonmars@xxxxxxxxxxxxx>
Based-on-a-patch-by: Felix Yan <felixonmars@xxxxxxxxxxxxx>
Reviewed-by: Sagi Grimberg <sagi@xxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Keith Busch <kbusch@xxxxxxxxxx>
---
v1->v2:
   Warn once if driver isn't relying on CRTO values

  drivers/nvme/host/core.c | 54 ++++++++++++++++++++++++++--------------
  1 file changed, 35 insertions(+), 19 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 37b6fa7466620..0685ed4f2dc49 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2245,25 +2245,8 @@ int nvme_enable_ctrl(struct nvme_ctrl *ctrl)
  	else
  		ctrl->ctrl_config = NVME_CC_CSS_NVM;
- if (ctrl->cap & NVME_CAP_CRMS_CRWMS) {
-		u32 crto;
-
-		ret = ctrl->ops->reg_read32(ctrl, NVME_REG_CRTO, &crto);
-		if (ret) {
-			dev_err(ctrl->device, "Reading CRTO failed (%d)\n",
-				ret);
-			return ret;
-		}
-
-		if (ctrl->cap & NVME_CAP_CRMS_CRIMS) {
-			ctrl->ctrl_config |= NVME_CC_CRIME;
-			timeout = NVME_CRTO_CRIMT(crto);
-		} else {
-			timeout = NVME_CRTO_CRWMT(crto);
-		}
-	} else {
-		timeout = NVME_CAP_TIMEOUT(ctrl->cap);
-	}
+	if (ctrl->cap & NVME_CAP_CRMS_CRWMS && ctrl->cap & NVME_CAP_CRMS_CRIMS)
+		ctrl->ctrl_config |= NVME_CC_CRIME;
ctrl->ctrl_config |= (NVME_CTRL_PAGE_SHIFT - 12) << NVME_CC_MPS_SHIFT;
  	ctrl->ctrl_config |= NVME_CC_AMS_RR | NVME_CC_SHN_NONE;
@@ -2277,6 +2260,39 @@ int nvme_enable_ctrl(struct nvme_ctrl *ctrl)
  	if (ret)
  		return ret;
+ /* CAP value may change after initial CC write */
+	ret = ctrl->ops->reg_read64(ctrl, NVME_REG_CAP, &ctrl->cap);
+	if (ret)
+		return ret;
+
+	timeout = NVME_CAP_TIMEOUT(ctrl->cap);
+	if (ctrl->cap & NVME_CAP_CRMS_CRWMS) {
+		u32 crto, ready_timeout;
+
+		ret = ctrl->ops->reg_read32(ctrl, NVME_REG_CRTO, &crto);
+		if (ret) {
+			dev_err(ctrl->device, "Reading CRTO failed (%d)\n",
+				ret);
+			return ret;
+		}
+
+		/*
+		 * CRTO should always be greater or equal to CAP.TO, but some
+		 * devices are known to get this wrong. Use the larger of the
+		 * two values.
+		 */
+		if (ctrl->ctrl_config & NVME_CC_CRIME)
+			ready_timeout = NVME_CRTO_CRIMT(crto);
+		else
+			ready_timeout = NVME_CRTO_CRWMT(crto);
+
+		if (ready_timeout < timeout)
+			dev_warn_once(ctrl->device, "bad crto:%x cap:%llx\n",
+				      crto, ctrl->cap);
+		else
+			timeout = ready_timeout;
+	}
+
  	ctrl->ctrl_config |= NVME_CC_ENABLE;
  	ret = ctrl->ops->reg_write32(ctrl, NVME_REG_CC, ctrl->ctrl_config);
  	if (ret)

Thanks, verified that it works well here.

I noticed only one very small issue: dev_warn_once seems to only print once when multiple devices are affected. It may be more ideal if it prints once for each device, but I don't know how to really achieve that...

--
Regards,
Felix Yan

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux