On Mon, May 6, 2024 at 7:20 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Wed, Apr 03, 2024 at 05:39:01PM -0400, Jim Quinlan wrote:
> > The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be
> > deliberately set by the PCIe RC HW into one of three mutually exclusive
> > modes:
> >
> > "safe" -- No CLKREQ# expected or required, refclk is always provided. This
> > mode should work for all devices but is not be capable of any refclk
> > power savings.
>
> s/refclk is always provided/the Root Port always supplies Refclk/
>
> At least, I assume that's what this means? The Root Port always
> supplies Refclk regardless of whether a downstream device deasserts
> CLKREQ#?
>
> The patch doesn't do anything to prevent aspm.c from setting
> PCI_EXP_LNKCTL_CLKREQ_EN, so it looks like Linux may still set the
> "Enable Clock Power Management" bit in downstream devices, but the
> Root Port just ignores the CLKREQ# signal, right?
>
> s/is not be/is not/
>
> > "no-l1ss" -- CLKREQ# is expected to be driven by the downstream device for
> > CPM and ASPM L0s and L1. Provides Clock Power Management, L0s, and L1,
> > but cannot provide L1 substate (L1SS) power savings. If the downstream
> > device connected to the RC is L1SS capable AND the OS enables L1SS, all
> > PCIe traffic may abruptly halt, potentially hanging the system.
>
> s/CPM/Clock Power Management (CPM)/ and then you can use "CPM" for the
> *second* reference here.
>
> It *looks* like we should never see this PCIe hang because with this
> setting you don't advertise L1SS in the Root Port, so the OS should
> never enable L1SS, at least for that link. Right?
>
> If we never enable L1SS in the case where it could cause a hang, why
> mention the possibility here?

Hello Bjorn,

I will remove this.

>
> I assume that if the downstream device is a Switch, L1SS is unsafe for
> the Root Port to Switch link, but it could still be used for the link
> between the Switch and whatever is below it?

Yes.
The "brcm,clkreq-mode" property only applies to the root complex and the
device to which it is connected.

>
> > "default" -- Bidirectional CLKREQ# between the RC and downstream device.
> > Provides ASPM L0s, L1, and L1SS, but not compliant to provide Clock
> > Power Management; specifically, may not be able to meet the T_CLRon max
> > timing of 400ns as specified in "Dynamic Clock Control", section
> > 3.2.5.2.2 of the PCIe Express Mini CEM 2.1 specification. This
> > situation is atypical and should happen only with older devices.
>
> IIUC this T_CLRon timing issue is with the STB/CM *Root Port*, but the
> last sentence refers to "older devices," which sounds like it means
> "older devices that might be plugged into the Root Port." That would
> suggest the issue is with those devices, not with the STB/CM Root
> Port.

According to the PCIe HW designer, more modern chips have extra
circuitry to overcome this issue. I really do not know if this is the
case, nor am I sure that he knows for sure. But the spec says that
T_CLRon should meet a certain value, and this RC cannot do that in
some situations.

>
> Or maybe this is meant to refer to older STB/CM Root Ports?
>
> > Previously, this driver always set the mode to "no-l1ss", as almost all
> > STB/CM boards operate in this mode. But now there is interest in
> > activating L1SS power savings from STB/CM customers, which requires
> > "default" mode. In addition, a bug was filed for RPi4 CM platform because
> > most devices did not work in "no-l1ss" mode (see link below).
>
> I'm having a hard time reconciling "almost all STB/CM boards operate
> in 'no-l1ss' mode" with "most devices did not work in 'no-l1ss' mode."
> They sound contradictory.

I concur, it is no longer clear to me why some device+board+connector
combos work in "no-l1ss" mode and not in "default" mode, and vice
versa. Our existing boards work in "no-l1ss" mode and the RPi CM HW
works fine with "default" mode (L1SS possible).
This is not just due to older devices, although I've noticed that a
lot of older devices have no trace connected to their CLKREQ# pin, and
the signal is left floating. Another thing that has recently surfaced
is that some of our board designs are using a unidirectional
level-shifter for CLKREQ#, which is a bidirectional signal. This may
be causing mayhem.

Another issue is that some, if not a majority, of the adapters I use
to test PCIe devices on a board with a socket interfere with the
CLKREQ# signal; e.g. some adapters ground it, leading me to believe
that systems are working when they would not if CLKREQ# were not
grounded.

I have not enumerated all of the reasons for which the
brcm,clkreq-mode setting will make a device+board+connector combo work
or not. But I do know that being able to configure these modes is a
must-have requirement. I also know that the "default" setting I am
proposing is the same configuration that is used by the RaspberryPi
folks with Raspbian OS.

The STB consumers have no problem changing the DT property if
required. Similarly, a Linux enthusiast should be able to set the
brcm,clkreq-mode property to "safe" if they are having PCIe issues,
just like they may configure CONFIG_PCIE_BUS_SAFE=y. Please keep in
mind that currently the upstream Linux will not run on an RPi CM board
until this submission or something like it is accepted.

TL;DR Let me rewrite this text and resubmit.

>
> > Note that the mode is specified by the DT property "brcm,clkreq-mode". If
> > this property is omitted, then "default" mode is chosen.
>
> As a user, how do I determine which setting to use?

Using the "safe" mode will always work. In fact I considered making
this the default mode. As I said, I cannot enumerate all of the
reasons why one mode works and one does not for a particular
device+board+connector combo. The HW folks have not really been
forthcoming on the reasons either.

>
> Trial and error? If so, how do I identify the errors?
Either PCIe link-up is not happening, or it is happening but the
device driver is non-functional and boot typically hangs.

>
> Obviously "default" is the best, so I assume I would try that first.
> If something is flaky (whatever that means), I would fall back to
> "no-l1ss", which gets me Clock PM, L0s, and L1, right? In what
> situation does "no-l1ss" fail, and how do I tell that it fails?

For example, "no-l1ss" fails on the RPi CM. Perhaps the reason for
that is that the CLKREQ# signal is left floating and some devices do
not connect their CLKREQ# pin. But I am not sure of that -- I do not
have access to the signals and I do not have the requisite RPi CM
design info.

Regards,
Jim Quinlan
Broadcom STB/CM

>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217276
> >
> > Signed-off-by: Jim Quinlan <james.quinlan@xxxxxxxxxxxx>
> > ---
> >  drivers/pci/controller/pcie-brcmstb.c | 79 ++++++++++++++++++++++++---
> >  1 file changed, 70 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> > index 3d08b92d5bb8..3dc8511e6f58 100644
> > --- a/drivers/pci/controller/pcie-brcmstb.c
> > +++ b/drivers/pci/controller/pcie-brcmstb.c
> > @@ -48,6 +48,9 @@
> >  #define PCIE_RC_CFG_PRIV1_LINK_CAPABILITY 0x04dc
> >  #define PCIE_RC_CFG_PRIV1_LINK_CAPABILITY_ASPM_SUPPORT_MASK 0xc00
> >
> > +#define PCIE_RC_CFG_PRIV1_ROOT_CAP 0x4f8
> > +#define PCIE_RC_CFG_PRIV1_ROOT_CAP_L1SS_MODE_MASK 0xf8
> > +
> >  #define PCIE_RC_DL_MDIO_ADDR 0x1100
> >  #define PCIE_RC_DL_MDIO_WR_DATA 0x1104
> >  #define PCIE_RC_DL_MDIO_RD_DATA 0x1108
> > @@ -121,9 +124,12 @@
> >
> >  #define PCIE_MISC_HARD_PCIE_HARD_DEBUG 0x4204
> >  #define PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK 0x2
> > +#define PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK 0x200000
> >  #define PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK 0x08000000
> >  #define PCIE_BMIPS_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK 0x00800000
> > -
> > +#define PCIE_CLKREQ_MASK \
> > +	(PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK | \
> > +	 PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK)
> >
> >  #define PCIE_INTR2_CPU_BASE 0x4300
> >  #define PCIE_MSI_INTR2_BASE 0x4500
> > @@ -1100,13 +1106,73 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
> >  	return 0;
> >  }
> >
> > +static void brcm_config_clkreq(struct brcm_pcie *pcie)
> > +{
> > +	static const char err_msg[] = "invalid 'brcm,clkreq-mode' DT string\n";
> > +	const char *mode = "default";
> > +	u32 clkreq_cntl;
> > +	int ret, tmp;
> > +
> > +	ret = of_property_read_string(pcie->np, "brcm,clkreq-mode", &mode);
> > +	if (ret && ret != -EINVAL) {
> > +		dev_err(pcie->dev, err_msg);
> > +		mode = "safe";
> > +	}
> > +
> > +	/* Start out assuming safe mode (both mode bits cleared) */
> > +	clkreq_cntl = readl(pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > +	clkreq_cntl &= ~PCIE_CLKREQ_MASK;
> > +
> > +	if (strcmp(mode, "no-l1ss") == 0) {
> > +		/*
> > +		 * "no-l1ss" -- Provides Clock Power Management, L0s, and
> > +		 * L1, but cannot provide L1 substate (L1SS) power
> > +		 * savings. If the downstream device connected to the RC is
> > +		 * L1SS capable AND the OS enables L1SS, all PCIe traffic
> > +		 * may abruptly halt, potentially hanging the system.
> > +		 */
> > +		clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;
> > +		/*
> > +		 * We want to un-advertise L1 substates because if the OS
> > +		 * tries to configure the controller into using L1 substate
> > +		 * power savings it may fail or hang when the RC HW is in
> > +		 * "no-l1ss" mode.
> > +		 */
> > +		tmp = readl(pcie->base + PCIE_RC_CFG_PRIV1_ROOT_CAP);
> > +		u32p_replace_bits(&tmp, 2, PCIE_RC_CFG_PRIV1_ROOT_CAP_L1SS_MODE_MASK);
> > +		writel(tmp, pcie->base + PCIE_RC_CFG_PRIV1_ROOT_CAP);
> > +
> > +	} else if (strcmp(mode, "default") == 0) {
> > +		/*
> > +		 * "default" -- Provides L0s, L1, and L1SS, but not
> > +		 * compliant to provide Clock Power Management;
> > +		 * specifically, may not be able to meet the Tclron max
> > +		 * timing of 400ns as specified in "Dynamic Clock Control",
> > +		 * section 3.2.5.2.2 of the PCIe spec. This situation is
> > +		 * atypical and should happen only with older devices.
> > +		 */
> > +		clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK;
> > +
> > +	} else {
> > +		/*
> > +		 * "safe" -- No power savings; refclk is driven by RC
> > +		 * unconditionally.
> > +		 */
> > +		if (strcmp(mode, "safe") != 0)
> > +			dev_err(pcie->dev, err_msg);
> > +		mode = "safe";
> > +	}
> > +	writel(clkreq_cntl, pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > +
> > +	dev_info(pcie->dev, "clkreq-mode set to %s\n", mode);
> > +}
> > +
> >  static int brcm_pcie_start_link(struct brcm_pcie *pcie)
> >  {
> >  	struct device *dev = pcie->dev;
> >  	void __iomem *base = pcie->base;
> >  	u16 nlw, cls, lnksta;
> >  	bool ssc_good = false;
> > -	u32 tmp;
> >  	int ret, i;
> >
> >  	/* Unassert the fundamental reset */
> > @@ -1138,6 +1204,8 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie)
> >  	 */
> >  	brcm_extend_internal_bus_timeout(pcie, BRCM_LTR_MAX_NS + 1000);
> >
> > +	brcm_config_clkreq(pcie);
> > +
> >  	if (pcie->gen)
> >  		brcm_pcie_set_gen(pcie, pcie->gen);
> >
> > @@ -1156,13 +1224,6 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie)
> >  		 pci_speed_string(pcie_link_speed[cls]), nlw,
> >  		 ssc_good ? "(SSC)" : "(!SSC)");
> >
> > -	/*
> > -	 * Refclk from RC should be gated with CLKREQ# input when ASPM L0s,L1
> > -	 * is enabled => setting the CLKREQ_DEBUG_ENABLE field to 1.
> > -	 */
> > -	tmp = readl(base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > -	tmp |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;
> > -	writel(tmp, base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> >
> >  	return 0;
> >  }
> > --
> > 2.17.1
> >
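
For anyone following the thread, the mode selection in the quoted
brcm_config_clkreq() comes down to clearing two bits in the HARD_DEBUG
register and then setting at most one of them. The following is a
minimal standalone sketch of that mapping, not driver code. It assumes
only the two bit values from the patch; the shortened macro names and
the helper clkreq_apply_mode() are hypothetical, it performs no MMIO,
and it leaves out the ROOT_CAP write that un-advertises L1SS in
"no-l1ss" mode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit values taken from the HARD_DEBUG mask definitions in the patch;
 * the shortened names are for illustration only. */
#define CLKREQ_DEBUG_ENABLE_MASK 0x2u
#define L1SS_ENABLE_MASK         0x200000u
#define CLKREQ_MASK (CLKREQ_DEBUG_ENABLE_MASK | L1SS_ENABLE_MASK)

/*
 * Hypothetical helper: given the current HARD_DEBUG value and a mode
 * string, return the value that would be written back. Bits outside
 * CLKREQ_MASK are preserved. Any string other than "no-l1ss" or
 * "default" is treated as "safe", i.e. both mode bits stay cleared.
 */
uint32_t clkreq_apply_mode(uint32_t hard_debug, const char *mode)
{
	assert(mode != NULL);

	/* Start from safe mode: both mode bits cleared. */
	hard_debug &= ~CLKREQ_MASK;

	if (strcmp(mode, "no-l1ss") == 0)
		hard_debug |= CLKREQ_DEBUG_ENABLE_MASK;
	else if (strcmp(mode, "default") == 0)
		hard_debug |= L1SS_ENABLE_MASK;

	return hard_debug;
}
```

In this sketch an unrecognized string silently falls back to "safe";
the quoted patch additionally logs an error in that case.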