On Mon, Jan 31, 2022 at 3:25 PM Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote: > > On 22-01-31 15:11:05, Dan Williams wrote: > > On Mon, Jan 31, 2022 at 2:21 PM Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote: > > > > > > On 22-01-23 16:28:49, Dan Williams wrote: > > > > From: Ben Widawsky <ben.widawsky@xxxxxxxxx> > > > > > > > > The original driver implementation used the doorbell timeout for the > > > > Mailbox Interface Ready bit to piggy back off of, since the latter does > > > > not have a defined timeout. This functionality, introduced in commit > > > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), needs improvement as > > > > the recent "Add Mailbox Ready Time" ECN timeout indicates that the > > > > mailbox ready time can be significantly longer that 2 seconds. > > > > > > > > While the specification limits the maximum timeout to 256s, the cxl_pci > > > > driver gives up on the mailbox after 60s. This value corresponds with > > > > important timeout values already present in the kernel. A module > > > > parameter is provided as an emergency override and represents the > > > > default Linux policy for all devices. > > > > > > > > Signed-off-by: Ben Widawsky <ben.widawsky@xxxxxxxxx> > > > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> > > > > [djbw: add modparam, drop check_device_status()] > > > > Co-developed-by: Dan Williams <dan.j.williams@xxxxxxxxx> > > > > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx> > > > > --- > > > > drivers/cxl/pci.c | 35 +++++++++++++++++++++++++++++++++++ > > > > 1 file changed, 35 insertions(+) > > > > > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > > > > index 8dc91fd3396a..ed8de9eac970 100644 > > > > --- a/drivers/cxl/pci.c > > > > +++ b/drivers/cxl/pci.c > > > > @@ -1,7 +1,9 @@ > > > > // SPDX-License-Identifier: GPL-2.0-only > > > > /* Copyright(c) 2020 Intel Corporation. All rights reserved. */ > > > > #include <linux/io-64-nonatomic-lo-hi.h> > > > > +#include <linux/moduleparam.h> > > > > #include <linux/module.h> > > > > +#include <linux/delay.h> > > > > #include <linux/sizes.h> > > > > #include <linux/mutex.h> > > > > #include <linux/list.h> > > > > @@ -35,6 +37,20 @@ > > > > /* CXL 2.0 - 8.2.8.4 */ > > > > #define CXL_MAILBOX_TIMEOUT_MS (2 * HZ) > > > > > > > > +/* > > > > + * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to > > > > + * dictate how long to wait for the mailbox to become ready. The new > > > > + * field allows the device to tell software the amount of time to wait > > > > + * before mailbox ready. This field per the spec theoretically allows > > > > + * for up to 255 seconds. 255 seconds is unreasonably long, its longer > > > > + * than the maximum SATA port link recovery wait. Default to 60 seconds > > > > + * until someone builds a CXL device that needs more time in practice. > > > > + */ > > > > +static unsigned short mbox_ready_timeout = 60; > > > > +module_param(mbox_ready_timeout, ushort, 0600); > > > > > > Any reason not to make it 0644? > > > > > > > Are there any tooling scenarios where this information is usable by non-root? > > Just for ease of debug. If I get a bug report with this, first thing I'm going > to do is ask for the timeout value. Perhaps it's expected the person who filed > the bug will have root access. They would have already needed to be root to change it from the default in the first instance, or the kernel command line from the dmesg would show it being overridden. That said, there's nothing security sensitive about emitting it, so 0644 it is.