Re: [PATCH v3 02/40] cxl/pci: Implement Interface Ready Timeout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 22-01-31 15:11:05, Dan Williams wrote:
> On Mon, Jan 31, 2022 at 2:21 PM Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote:
> >
> > On 22-01-23 16:28:49, Dan Williams wrote:
> > > From: Ben Widawsky <ben.widawsky@xxxxxxxxx>
> > >
> > > The original driver implementation used the doorbell timeout for the
> > > Mailbox Interface Ready bit to piggy back off of, since the latter does
> > > not have a defined timeout. This functionality, introduced in commit
> > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), needs improvement as
> > > the recent "Add Mailbox Ready Time" ECN timeout indicates that the
> > > mailbox ready time can be significantly longer that 2 seconds.
> > >
> > > While the specification limits the maximum timeout to 256s, the cxl_pci
> > > driver gives up on the mailbox after 60s. This value corresponds with
> > > important timeout values already present in the kernel. A module
> > > parameter is provided as an emergency override and represents the
> > > default Linux policy for all devices.
> > >
> > > Signed-off-by: Ben Widawsky <ben.widawsky@xxxxxxxxx>
> > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > > [djbw: add modparam, drop check_device_status()]
> > > Co-developed-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > > ---
> > >  drivers/cxl/pci.c |   35 +++++++++++++++++++++++++++++++++++
> > >  1 file changed, 35 insertions(+)
> > >
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index 8dc91fd3396a..ed8de9eac970 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -1,7 +1,9 @@
> > >  // SPDX-License-Identifier: GPL-2.0-only
> > >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > > +#include <linux/moduleparam.h>
> > >  #include <linux/module.h>
> > > +#include <linux/delay.h>
> > >  #include <linux/sizes.h>
> > >  #include <linux/mutex.h>
> > >  #include <linux/list.h>
> > > @@ -35,6 +37,20 @@
> > >  /* CXL 2.0 - 8.2.8.4 */
> > >  #define CXL_MAILBOX_TIMEOUT_MS (2 * HZ)
> > >
> > > +/*
> > > + * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > > + * dictate how long to wait for the mailbox to become ready. The new
> > > + * field allows the device to tell software the amount of time to wait
> > > + * before mailbox ready. This field per the spec theoretically allows
> > > + * for up to 255 seconds. 255 seconds is unreasonably long, its longer
> > > + * than the maximum SATA port link recovery wait. Default to 60 seconds
> > > + * until someone builds a CXL device that needs more time in practice.
> > > + */
> > > +static unsigned short mbox_ready_timeout = 60;
> > > +module_param(mbox_ready_timeout, ushort, 0600);
> >
> > Any reason not to make it 0644?
> >
> 
> Are there any tooling scenarios where this information is usable by non-root?

Just for ease of debug. If I get a bug report with this, first thing I'm going
to do is ask for the timeout value. Perhaps it's expected the person who filed
the bug will have root access.

> 
> > > +MODULE_PARM_DESC(mbox_ready_timeout,
> > > +              "seconds to wait for mailbox ready status");
> > > +
> > >  static int cxl_pci_mbox_wait_for_doorbell(struct cxl_dev_state *cxlds)
> > >  {
> > >       const unsigned long start = jiffies;
> > > @@ -281,6 +297,25 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> > >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> > >  {
> > >       const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > > +     unsigned long timeout;
> > > +     u64 md_status;
> > > +
> > > +     timeout = jiffies + mbox_ready_timeout * HZ;
> > > +     do {
> > > +             md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > > +             if (md_status & CXLMDEV_MBOX_IF_READY)
> > > +                     break;
> > > +             if (msleep_interruptible(100))
> > > +                     break;
> > > +     } while (!time_after(jiffies, timeout));
> >
> > Just pointing out the [probably] obvious. If the user specifies a zero second
> > timeout, the code will still wait 100ms.
> 
> Sure, is that going to be a problem in practice? I expect the
> overwhelming common case is that the mailbox is already ready by this
> point, so it's a zero-wait.
> 

No problem I can see in practice.

> >
> > > +
> > > +     if (!(md_status & CXLMDEV_MBOX_IF_READY)) {
> > > +             dev_err(cxlds->dev,
> > > +                     "timeout awaiting mailbox ready, device state:%s%s\n",
> > > +                     md_status & CXLMDEV_DEV_FATAL ? " fatal" : "",
> > > +                     md_status & CXLMDEV_FW_HALT ? " firmware-halt" : "");
> > > +             return -EIO;
> > > +     }
> > >
> > >       cxlds->mbox_send = cxl_pci_mbox_send;
> > >       cxlds->payload_size =
> > >



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux