On Tue, Sep 12, 2023 at 02:47:33PM -0700, Keith Busch wrote: > From: Keith Busch <kbusch@xxxxxxxxxx> > > Some devices are reporting controller ready mode support, but return 0 > for CRTO. These devices require a much higher time to ready than that, > so they are failing to initialize after the driver starter preferring > that value over CAP.TO. > > The spec requires that CAP.TO match the appropritate CRTO value, or be > set to 0xff if CRTO is larger than that. This means that CAP.TO can be > used to validate if CRTO is reliable, and provides an appropriate > fallback for setting the timeout value if not. Use whichever is larger. > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217863 > Reported-by: Cláudio Sampaio <patola@xxxxxxxxx> > Reported-by: Felix Yan <felixonmars@xxxxxxxxxxxxx> > Based-on-a-patch-by: Felix Yan <felixonmars@xxxxxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Keith Busch <kbusch@xxxxxxxxxx> > --- > drivers/nvme/host/core.c | 48 ++++++++++++++++++++++++---------------- > 1 file changed, 29 insertions(+), 19 deletions(-) > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > + if (ctrl->cap & NVME_CAP_CRMS_CRWMS && ctrl->cap & NVME_CAP_CRMS_CRIMS) I don't think the NVME_CAP_CRMS_CRWMS check here makes sense, this should only need the NVME_CAP_CRMS_CRIMS one. > + timeout = NVME_CAP_TIMEOUT(ctrl->cap); > + if (ctrl->cap & NVME_CAP_CRMS_CRWMS) { > + u32 crto; > + > + ret = ctrl->ops->reg_read32(ctrl, NVME_REG_CRTO, &crto); > + if (ret) { > + dev_err(ctrl->device, "Reading CRTO failed (%d)\n", > + ret); > + return ret; > + } > + > + /* > + * CRTO should always be greater or equal to CAP.TO, but some > + * devices are known to get this wrong. Use the larger of the > + * two values. > + */ > + if (ctrl->ctrl_config & NVME_CC_CRIME) > + timeout = max(timeout, NVME_CRTO_CRIMT(crto)); > + else > + timeout = max(timeout, NVME_CRTO_CRWMT(crto)); Should we at least log a harmless one-liner warning if the new timeouts are too small?