On Mon, 22 Nov 2021 09:17:31 -0800 Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote: > On 21-11-22 15:02:27, Jonathan Cameron wrote: > > On Fri, 19 Nov 2021 16:02:31 -0800 > > Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote: > > > > > The original driver implementation used the doorbell timeout for the > > > Mailbox Interface Ready bit to piggy back off of, since the latter > > > doesn't have a defined timeout. This functionality, introduced in > > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved > > > since a timeout has been defined with an ECN to the 2.0 spec. > > > > > > While devices implemented prior to the ECN could have an arbitrarily > > > long wait and still be within spec, the max ECN value (256s) is chosen > > > as the default for all devices. All vendors in the consortium agreed to > > > this amount and so it is reasonable to assume no devices made will > > > exceed this amount. > > > > Optimistic :) > > > > Reasonable to assume is certainly not the same as "in reality". I can soften > this wording. > > > > > > > Signed-off-by: Ben Widawsky <ben.widawsky@xxxxxxxxx> > > > --- > > > This patch did not exist in RFCv2 > > > --- > > > drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++ > > > 1 file changed, 29 insertions(+) > > > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > > > index 6c8d09fb3a17..2cef9fec8599 100644 > > > --- a/drivers/cxl/pci.c > > > +++ b/drivers/cxl/pci.c > > > @@ -2,6 +2,7 @@ > > > /* Copyright(c) 2020 Intel Corporation. All rights reserved. */ > > > #include <linux/io-64-nonatomic-lo-hi.h> > > > #include <linux/module.h> > > > +#include <linux/delay.h> > > > #include <linux/sizes.h> > > > #include <linux/mutex.h> > > > #include <linux/list.h> > > > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c > > > static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds) > > > { > > > const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET); > > > + unsigned long timeout; > > > + u64 md_status; > > > + int rc; > > > + > > > + /* > > > + * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to > > > + * dictate how long to wait for the mailbox to become ready. For > > > + * simplicity, and to handle devices that might have been implemented > > > > I'm not keen on the 'for simplicity' argument here. If the device is advertising > > a lower value, then that is what we should use. It's fine to wait the max time > > if nothing is specified. It'll cost us a few lines of code at most unless > > I am missing something... > > > > Jonathan > > > > Let me pose it a different way, if a device advertises 1s, but for whatever > takes 4s to come up, should we penalize it over the device advertising 256s? Yes, because it is buggy. A compliance test should have failed on this anyway. > The > way this field is defined in the spec would [IMHO] lead vendors to simply put > the max field in there to game the driver, so why not start off with just > insisting they don't? Given reading this value and getting a big number gives the implication that the device is meant to be really slow to initialize, I'd expect that to push vendors a little in the directly of putting realistic values in). Maybe we should print the value in the log to make them look silly ;) Jonathan > > > > + * prior to the ECN, wait the max amount of time no matter what the > > > + * device says. > > > + */ > > > + timeout = jiffies + 256 * HZ; > > > + > > > + rc = check_device_status(cxlds); > > > + if (rc) > > > + return rc; > > > + > > > + do { > > > + md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET); > > > + if (md_status & CXLMDEV_MBOX_IF_READY) > > > + break; > > > + if (msleep_interruptible(100)) > > > + break; > > > + } while (!time_after(jiffies, timeout)); > > > + > > > + /* It's assumed that once the interface is ready, it will remain ready. */ > > > + if (!(md_status & CXLMDEV_MBOX_IF_READY)) > > > + return -EIO; > > > > > > cxlds->mbox_send = cxl_pci_mbox_send; > > > cxlds->payload_size = > >