Re: [PATCH v2] PCI/ASPM: Disable L1 before disabling L1ss

On Thu, Oct 03, 2024 at 10:53:58PM +0530, Ajay Agarwal wrote:
> On Thu, Oct 03, 2024 at 12:01:22PM -0500, Bjorn Helgaas wrote:
> > On Thu, Oct 03, 2024 at 06:55:03PM +0530, Ajay Agarwal wrote:
> > > The current sequence in the driver for L1ss update is as follows.
> > > 
> > > Disable L1ss
> > > Disable L1
> > > Enable L1ss as required
> > > Enable L1 if required
> > > 
> > > With this sequence, a bus hang is observed during the L1ss
> > > disable sequence when the RC CPU attempts to clear the RC L1ss
> > > register after clearing the EP L1ss register.
> > 
> > Thanks for this.  What exactly does the bus hang look like to a user?
> >
> The CPU is just hung on reading the RC PCI_L1SS_CTL1 register. After
> some time, the CPU watchdog expires and the system reboots.

Wow.  Good to know that this is outside the PCIe domain.  I think this
is a good change, and since it is partly motivated by hardware
behavior that might be legal but seems somewhat unusual, can we
identify the hardware (CPU and PCIe Root Complex) involved here?

> > I guess the problem happens in pcie_config_aspm_l1ss(), where we do:
> > 
> >   pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
> >   pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
> > 
> > where clearing the child (endpoint) PCI_L1SS_CTL1_L1_2_MASK works, but
> > something goes wrong when clearing the parent (RP) mask?  The
> > clear_and_set will do a read followed by a write, and one of those
> > causes some kind of error?
> >
> During ASPM disable, in pcie_config_aspm_l1ss(), we do:
>    1. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
>    2. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
>    3. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
>    4. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
> 
> We observe that the steps 1 and 2 go through just fine. But the read of
> PCI_L1SS_CTL1 register in the step 3 hangs. I am not sure why.
> The issue is pretty difficult to reproduce, and adding prints around
> these steps masks the issue.

I guess the L1 disable is between 2 and 3, right?  And 3 and 4 may
enable L1 SS (using val, not 0)?

  1. same
  2. same
  2.5 pcie_capability_clear_word(child, PCI_EXP_LNKCTL_ASPM_L1)
  2.6 pcie_capability_clear_word(parent, PCI_EXP_LNKCTL_ASPM_L1)
  3. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ...  val)
  4. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ...  val)
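
For illustration only, the ordering above can be modeled in plain userspace
C. This is a rough sketch, not the aspm.c code: struct mock_dev,
mock_clear_and_set() and mock_config_l1ss() are hypothetical stand-ins for
the real config accessors, the register constants are copied from
include/uapi/linux/pci_regs.h, and the masks used in each step are an
assumption about what the "..." arguments are.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKCTL_ASPM_L1   0x0002
#define PCI_L1SS_CTL1_L1_2_MASK  0x00000005
#define PCI_L1SS_CTL1_L1SS_MASK  0x0000000f

/* Hypothetical model of the two registers of interest on one device. */
struct mock_dev {
	uint16_t lnkctl;     /* stands in for PCI_EXP_LNKCTL */
	uint32_t l1ss_ctl1;  /* stands in for PCI_L1SS_CTL1 */
};

/* Read, modify, write: the same shape as a clear_and_set config access. */
static void mock_clear_and_set(uint32_t *reg, uint32_t clear, uint32_t set)
{
	uint32_t val = *reg;   /* the read */

	val &= ~clear;
	val |= set;
	*reg = val;            /* the write */
}

/* Hypothetical helper walking through steps 1, 2, 2.5, 2.6, 3, 4. */
static void mock_config_l1ss(struct mock_dev *parent, struct mock_dev *child,
			     uint32_t val)
{
	/* Steps 1 and 2: disable L1.2, child first, then parent. */
	mock_clear_and_set(&child->l1ss_ctl1, PCI_L1SS_CTL1_L1_2_MASK, 0);
	mock_clear_and_set(&parent->l1ss_ctl1, PCI_L1SS_CTL1_L1_2_MASK, 0);

	/* Steps 2.5 and 2.6: disable L1 on both ends of the link. */
	child->lnkctl &= (uint16_t)~PCI_EXP_LNKCTL_ASPM_L1;
	parent->lnkctl &= (uint16_t)~PCI_EXP_LNKCTL_ASPM_L1;

	/* L1 must be off on both ends before the L1 SS bits are rewritten. */
	assert(!(child->lnkctl & PCI_EXP_LNKCTL_ASPM_L1));
	assert(!(parent->lnkctl & PCI_EXP_LNKCTL_ASPM_L1));

	/* Steps 3 and 4: program the requested L1 SS bits, parent first. */
	mock_clear_and_set(&parent->l1ss_ctl1, PCI_L1SS_CTL1_L1SS_MASK, val);
	mock_clear_and_set(&child->l1ss_ctl1, PCI_L1SS_CTL1_L1SS_MASK, val);
}
```

Calling mock_config_l1ss(&rp, &ep, 0) models a full L1 SS disable; a
non-zero val models steps 3 and 4 re-enabling substates.  Re-enabling L1
itself afterwards is left out of the sketch.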

Bjorn



