Re: [PATCH v9 4/4] PCI: brcmstb: Configure HW CLKREQ# mode appropriate for downstream device

Jim Quinlan <james.quinlan@xxxxxxxxxxxx> · Wed, 8 May 2024 13:55:24 -0400

On Mon, May 6, 2024 at 7:20 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Wed, Apr 03, 2024 at 05:39:01PM -0400, Jim Quinlan wrote:
> > The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be
> > deliberately set by the PCIe RC HW into one of three mutually exclusive
> > modes:
> >
> > "safe" -- No CLKREQ# expected or required, refclk is always provided.  This
> >     mode should work for all devices but is not be capable of any refclk
> >     power savings.
>
> s/refclk is always provided/the Root Port always supplies Refclk/
>
> At least, I assume that's what this means?  The Root Port always
> supplies Refclk regardless of whether a downstream device deasserts
> CLKREQ#?
>
> The patch doesn't do anything to prevent aspm.c from setting
> PCI_EXP_LNKCTL_CLKREQ_EN, so it looks like Linux may still set the
> "Enable Clock Power Management" bit in downstream devices, but the
> Root Port just ignores the CLKREQ# signal, right?
>
> s/is not be/is not/
>
> > "no-l1ss" -- CLKREQ# is expected to be driven by the downstream device for
> >     CPM and ASPM L0s and L1.  Provides Clock Power Management, L0s, and L1,
> >     but cannot provide L1 substate (L1SS) power savings. If the downstream
> >     device connected to the RC is L1SS capable AND the OS enables L1SS, all
> >     PCIe traffic may abruptly halt, potentially hanging the system.
>
> s/CPM/Clock Power Management (CPM)/ and then you can use "CPM" for the
> *second* reference here.
>
> It *looks* like we should never see this PCIe hang because with this
> setting you don't advertise L1SS in the Root Port, so the OS should
> never enable L1SS, at least for that link.  Right?
>
> If we never enable L1SS in the case where it could cause a hang, why
> mention the possibility here?

Hello Bjorn,

I will remove this.

>
> I assume that if the downstream device is a Switch, L1SS is unsafe for
> the Root Port to Switch link, but it could still be used for the link
> between the Switch and whatever is below it?
Yes.  The "brcm,clkreq-mode" only applies to the root complex and the
device to which it is connected.

>
> > "default" -- Bidirectional CLKREQ# between the RC and downstream device.
> >     Provides ASPM L0s, L1, and L1SS, but not compliant to provide Clock
> >     Power Management; specifically, may not be able to meet the T_CLRon max
> >     timing of 400ns as specified in "Dynamic Clock Control", section
> >     3.2.5.2.2 of the PCIe Express Mini CEM 2.1 specification.  This
> >     situation is atypical and should happen only with older devices.
>
> IIUC this T_CLRon timing issue is with the STB/CM *Root Port*, but the
> last sentence refers to "older devices," which sounds like it means
> "older devices that might be plugged into the Root Port."  That would
> suggest the issue is with those devices, not iwth the STB/CM Root
> Port.
According to the PCIe HW designer, more modern chips have extra circuitry
to overcome this issue.  I really do not know if this is the case, nor am I
sure that he knows for sure.  But the spec says that T_CLRon should meet
a certain value, and this RC cannot do that in some situations.

>
> Or maybe this is meant to refer to older STB/CM Root Ports?
>
> > Previously, this driver always set the mode to "no-l1ss", as almost all
> > STB/CM boards operate in this mode.  But now there is interest in
> > activating L1SS power savings from STB/CM customers, which requires
> > "default" mode.  In addition, a bug was filed for RPi4 CM platform because
> > most devices did not work in "no-l1ss" mode (see link below).
>
> I'm having a hard time reconciling "almost all STB/CM boards operate
> in 'no-l1ss' mode" with "most devices did not work in 'no-l1ss' mode."
> They sound contradictory.

I concur, it is no longer clear to me why some device+board+connector
combos work
in "no-l1ss" mode and not in "default mode", and vice versa.  Our
existing boards
work in "no-l1ss" mode and the RPi CM HW works fine with "default"
mode (l1ss possible).

This is not just due to older devices, although I've noticed that a
lot of older devices
have no trace connected to their CLKREQ# pin, and the signal is left floating.
Another thing that has recently surfaced is that some of our board
designs are using a unidirectional level-shifter for CLKREQ#, which is
a bidirectional signal.  This may be causing mayhem.  Another issue is
that some if not a majority of the
adapters I use to test PCIe devices on a board with a socket interfere
with the CLKREQ# signal;
e.g. some adapters ground it, leading me to believe that systems are
working when
they would not if CLKREQ# was not grounded.

I have not enumerated all of the reasons for which
brcm,clkreq-mode setting will make a device+board+connector combo work or not.
But I do know that being able to configure these modes is a must-have
requirement.  I also
know that the "default" setting I am proposing is the same configuration
that is used by the RaspberryPi folks with RaspianOS.  The STB consumers have no
problem changing the DT property if required.  Similarly, a Linux
enthusiast should be
able to set  the brcm,clkreq-mode property to "safe" if they are
having PCIe issues,
just like they may configure  CONFIG_PCIE_BUS_SAFE=y.

Please keep in mind that currently the upstream Linux will not run on
an Rpi CM board until
this submission or something like it is accepted.

TL;DR  Let me rewrite this text and resubmit.

>
> > Note that the mode is specified by the DT property "brcm,clkreq-mode".  If
> > this property is omitted, then "default" mode is chosen.
>
> As a user, how do I determine which setting to use?
Using the "safe" mode will always work.  In fact I considered making
this the default mode.
As I said, I cannot enumerate all of the reasons why one mode works and one does
not for a particular device+board+connector combo.  The HW folks have not really
been forthcoming on the reasons as well.

>
> Trial and error?  If so, how do I identify the errors?
Either PCIe link-up is not happening, or it is happening but the
device driver is non-functional
and boot typically  hangs.

>
> Obviously "default" is the best, so I assume I would try that first.
> If something is flaky (whatever that means), I would fall back to
> "no-l1ss", which gets me Clock PM, L0s, and L1, right?  In what
> situation does "no-l1ss" fail, and how do I tell that it fails?

For example,"no-l1ss" fails on the Rpi CM.  Perhaps the reason for that
is that the CLKREQ# signal is left floating  and some devices do not connect
their CLKREQ# pin.  But I am not sure of that -- I do not have access to
the signals and I do not have the requisite RPi CM design info.

Regards,
Jim Quinlan
Broadcom STB/CM

>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217276
> >
> > Signed-off-by: Jim Quinlan <james.quinlan@xxxxxxxxxxxx>
> > ---
> >  drivers/pci/controller/pcie-brcmstb.c | 79 ++++++++++++++++++++++++---
> >  1 file changed, 70 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> > index 3d08b92d5bb8..3dc8511e6f58 100644
> > --- a/drivers/pci/controller/pcie-brcmstb.c
> > +++ b/drivers/pci/controller/pcie-brcmstb.c
> > @@ -48,6 +48,9 @@
> >  #define PCIE_RC_CFG_PRIV1_LINK_CAPABILITY                    0x04dc
> >  #define  PCIE_RC_CFG_PRIV1_LINK_CAPABILITY_ASPM_SUPPORT_MASK 0xc00
> >
> > +#define PCIE_RC_CFG_PRIV1_ROOT_CAP                   0x4f8
> > +#define  PCIE_RC_CFG_PRIV1_ROOT_CAP_L1SS_MODE_MASK   0xf8
> > +
> >  #define PCIE_RC_DL_MDIO_ADDR                         0x1100
> >  #define PCIE_RC_DL_MDIO_WR_DATA                              0x1104
> >  #define PCIE_RC_DL_MDIO_RD_DATA                              0x1108
> > @@ -121,9 +124,12 @@
> >
> >  #define PCIE_MISC_HARD_PCIE_HARD_DEBUG                                       0x4204
> >  #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK     0x2
> > +#define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK             0x200000
> >  #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK             0x08000000
> >  #define  PCIE_BMIPS_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK               0x00800000
> > -
> > +#define  PCIE_CLKREQ_MASK \
> > +       (PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK | \
> > +        PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK)
> >
> >  #define PCIE_INTR2_CPU_BASE          0x4300
> >  #define PCIE_MSI_INTR2_BASE          0x4500
> > @@ -1100,13 +1106,73 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
> >       return 0;
> >  }
> >
> > +static void brcm_config_clkreq(struct brcm_pcie *pcie)
> > +{
> > +     static const char err_msg[] = "invalid 'brcm,clkreq-mode' DT string\n";
> > +     const char *mode = "default";
> > +     u32 clkreq_cntl;
> > +     int ret, tmp;
> > +
> > +     ret = of_property_read_string(pcie->np, "brcm,clkreq-mode", &mode);
> > +     if (ret && ret != -EINVAL) {
> > +             dev_err(pcie->dev, err_msg);
> > +             mode = "safe";
> > +     }
> > +
> > +     /* Start out assuming safe mode (both mode bits cleared) */
> > +     clkreq_cntl = readl(pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > +     clkreq_cntl &= ~PCIE_CLKREQ_MASK;
> > +
> > +     if (strcmp(mode, "no-l1ss") == 0) {
> > +             /*
> > +              * "no-l1ss" -- Provides Clock Power Management, L0s, and
> > +              * L1, but cannot provide L1 substate (L1SS) power
> > +              * savings. If the downstream device connected to the RC is
> > +              * L1SS capable AND the OS enables L1SS, all PCIe traffic
> > +              * may abruptly halt, potentially hanging the system.
> > +              */
> > +             clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;
> > +             /*
> > +              * We want to un-advertise L1 substates because if the OS
> > +              * tries to configure the controller into using L1 substate
> > +              * power savings it may fail or hang when the RC HW is in
> > +              * "no-l1ss" mode.
> > +              */
> > +             tmp = readl(pcie->base + PCIE_RC_CFG_PRIV1_ROOT_CAP);
> > +             u32p_replace_bits(&tmp, 2, PCIE_RC_CFG_PRIV1_ROOT_CAP_L1SS_MODE_MASK);
> > +             writel(tmp, pcie->base + PCIE_RC_CFG_PRIV1_ROOT_CAP);
> > +
> > +     } else if (strcmp(mode, "default") == 0) {
> > +             /*
> > +              * "default" -- Provides L0s, L1, and L1SS, but not
> > +              * compliant to provide Clock Power Management;
> > +              * specifically, may not be able to meet the Tclron max
> > +              * timing of 400ns as specified in "Dynamic Clock Control",
> > +              * section 3.2.5.2.2 of the PCIe spec.  This situation is
> > +              * atypical and should happen only with older devices.
> > +              */
> > +             clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK;
> > +
> > +     } else {
> > +             /*
> > +              * "safe" -- No power savings; refclk is driven by RC
> > +              * unconditionally.
> > +              */
> > +             if (strcmp(mode, "safe") != 0)
> > +                     dev_err(pcie->dev, err_msg);
> > +             mode = "safe";
> > +     }
> > +     writel(clkreq_cntl, pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > +
> > +     dev_info(pcie->dev, "clkreq-mode set to %s\n", mode);
> > +}
> > +
> >  static int brcm_pcie_start_link(struct brcm_pcie *pcie)
> >  {
> >       struct device *dev = pcie->dev;
> >       void __iomem *base = pcie->base;
> >       u16 nlw, cls, lnksta;
> >       bool ssc_good = false;
> > -     u32 tmp;
> >       int ret, i;
> >
> >       /* Unassert the fundamental reset */
> > @@ -1138,6 +1204,8 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie)
> >        */
> >       brcm_extend_internal_bus_timeout(pcie, BRCM_LTR_MAX_NS + 1000);
> >
> > +     brcm_config_clkreq(pcie);
> > +
> >       if (pcie->gen)
> >               brcm_pcie_set_gen(pcie, pcie->gen);
> >
> > @@ -1156,13 +1224,6 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie)
> >                pci_speed_string(pcie_link_speed[cls]), nlw,
> >                ssc_good ? "(SSC)" : "(!SSC)");
> >
> > -     /*
> > -      * Refclk from RC should be gated with CLKREQ# input when ASPM L0s,L1
> > -      * is enabled => setting the CLKREQ_DEBUG_ENABLE field to 1.
> > -      */
> > -     tmp = readl(base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > -     tmp |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;
> > -     writel(tmp, base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> >
> >       return 0;
> >  }
> > --
> > 2.17.1
> >
>
>
Attachment:
smime.p7s

Description: S/MIME Cryptographic Signature