On Tue, Jan 17, 2017 at 12:14:58PM -0600, Bjorn Helgaas wrote:
> The instrumentation has evolved a bit since then.  Latest is below (could
> still use improvement, but it does address your suggestions above):
>
> https://bugzilla.kernel.org/attachment.cgi?id=251691 (CONFIG_PCIEASPM=y)
> https://bugzilla.kernel.org/attachment.cgi?id=251701 (CONFIG_PCIEASPM not set)

Thanks.  The point at which things die is when we request a link
retrain - I've augmented the trace with the register names:

pci 0000:02:00.0: rd where=0x074 size=4 val=0x8dc1 (hw) EXP_DEVCAP

pcie_aspm_configure_common_clock():

pci 0000:02:00.0: rd where=0x082 size=2 val=0x1011 (hw) EXP_LNKSTA
pci 0000:??:??.?: rd where=0x052 size=2 val=0x1011 (sw) EXP_LNKSTA
pci 0000:02:00.0: rd where=0x080 size=2 val=0x0 (hw) EXP_LNKCTL
pci 0000:02:00.0: wr where=0x080 size=2 val=0x40 (hw) EXP_LNKCTL

Enables common clock configuration on the device.

pci 0000:??:??.?: rd where=0x050 size=2 val=0x40 (sw) EXP_LNKCTL
pci 0000:??:??.?: wr where=0x050 size=2 val=0x40 (sw) EXP_LNKCTL

Common clock configuration is already enabled on the root.

pci 0000:??:??.?: rd where=0x050 size=4 val=0x10110040 (sw) EXP_LNKCTL
pci 0000:??:??.?: wr where=0x050 size=2 val=0x60 (sw) EXP_LNKCTL

Here we request the retrain, setting bit 5 in the link control register.

pci 0000:??:??.?: rd where=0x050 size=4 val=0x110040 (sw) EXP_LNKCTL
pci 0000:??:??.?: rd where=0x052 size=2 val=0x811 (sw) EXP_LNKSTA
pci 0000:??:??.?: rd where=0x052 size=2 val=0x811 (sw) EXP_LNKSTA

Waiting for the link training bit to clear...

pci 0000:??:??.?: rd where=0x052 size=2 val=0x11 (sw) EXP_LNKSTA

and it's cleared here - but note that the link is still down.

pci 0000:??:??.?: rd where=0x04c size=4 val=0x3ac12 (sw) EXP_LNKCAP
pci 0000:??:??.?: rd where=0x050 size=2 val=0x40 (sw) EXP_LNKCTL

pcie_get_aspm_reg() for the root.

pci 0000:02:00.0: rd where=0x07c size=4 val=0xffffffff (no link)

pcie_get_aspm_reg() for the device (fails).

So, I think the question is... why does asking for a retrain cause the
link to fail and never recover?

Uwe, can you try:

	setpci -s <whatever-the-id-of-the-root-is-it's-blanked-out-in-the-above> \
		0x50.w=0x60

and see whether it remains alive (you can check by reading the root
register 0x52.w - bit 12 should be set once bit 11 clears again).

If that's successful, maybe setting the common clock bit on the PCIe
device is what's causing the problem, in which case:

	setpci -s 02:00.0 0x80.w=0x40
	setpci -s <whatever-the-id-of-the-root-is-it's-blanked-out-in-the-above> \
		0x50.w=0x60

would, I imagine, cause the link to go down.

So, the question this gives us is why the common clock setup is not
working on your platform.  Maybe we need to source the SLC bit in the
link status from DT, though I'd like to understand what's going on here
more first.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
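
For anyone following the trace by hand, here is a minimal sketch of a
decoder for the EXP_LNKCTL and EXP_LNKSTA values quoted above.  It assumes
the standard PCIe bit layout (the same mask values as the PCI_EXP_LNKCTL_*
and PCI_EXP_LNKSTA_* definitions in Linux's pci_regs.h).  The function
names and dictionary keys are made up for illustration, and it only
decodes values that have already been read; it does not touch any
hardware.

```python
# Standard PCIe Link Control / Link Status bit masks (assumed to apply
# unchanged to the emulated root port registers seen in the trace).
LNKCTL_RL = 0x0020     # bit 5: Retrain Link
LNKCTL_CCC = 0x0040    # bit 6: Common Clock Configuration

LNKSTA_CLS = 0x000F    # bits 3:0: Current Link Speed
LNKSTA_NLW = 0x03F0    # bits 9:4: Negotiated Link Width
LNKSTA_LT = 0x0800     # bit 11: Link Training in progress
LNKSTA_SLC = 0x1000    # bit 12: Slot Clock Configuration
LNKSTA_DLLLA = 0x2000  # bit 13: Data Link Layer Link Active


def decode_lnkctl(val):
    """Return the two Link Control bits that matter for this trace."""
    return {
        "retrain": bool(val & LNKCTL_RL),
        "common_clock": bool(val & LNKCTL_CCC),
    }


def decode_lnksta(val):
    """Split a 16-bit Link Status value into its standard fields."""
    return {
        "speed": val & LNKSTA_CLS,
        "width": (val & LNKSTA_NLW) >> 4,
        "training": bool(val & LNKSTA_LT),
        "slot_clock": bool(val & LNKSTA_SLC),
        "dll_active": bool(val & LNKSTA_DLLLA),
    }


if __name__ == "__main__":
    # Values copied from the trace: LNKCTL writes, then LNKSTA reads
    # before, during and after the retrain.
    for val in (0x40, 0x60):
        print(f"EXP_LNKCTL {val:#06x}: {decode_lnkctl(val)}")
    for val in (0x1011, 0x811, 0x11):
        print(f"EXP_LNKSTA {val:#06x}: {decode_lnksta(val)}")
```

Run against the trace values, this makes it easy to see that the 0x60
write sets both the retrain and common clock bits, and that bit 12 is set
in the 0x1011 read but clear in the 0x811 and 0x11 reads that follow the
retrain request.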