Re: [PATCH] PCI: pciehp: Include the Data Link Layer State Changed bit when clearing the Slot Status register's event bits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jul 7, 2014 at 2:26 PM, Myron Stowe <myron.stowe@xxxxxxxxx> wrote:
> On Tue, Jul 1, 2014 at 1:29 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
>> On Mon, Jun 30, 2014 at 10:49 AM, Myron Stowe <myron.stowe@xxxxxxxxx> wrote:
>>> On Tue, Jun 17, 2014 at 3:07 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
>>>> On Tue, Jun 17, 2014 at 1:27 PM, Myron Stowe <myron.stowe@xxxxxxxxxx> wrote:
>>>>> During PCIe hot-plug initialization - pciehp_probe - data structures
>>>>> related to slot capabilities are set up.  As part of this set up, ISRs are
>>>>> put in place to handle slot events and all event bits are cleared out.
>>>>>
>>>>> This patch adds the Data Link Layer State Changed (PCI_EXP_SLTSTA_DLLSC)
>>>>> Slot Status bit to the event bits that are cleared out during
>>>>> initialization.
>>>>
>>>> I think this is related to e48f1b67f668 ("PCI: pciehp: Use link change
>>>> notifications for hot-plug and removal").  Prior to that, pcie_isr()
>>>> didn't look at the PCI_EXP_SLTSTA_DLLSC bit.
>>>>
>>>> Apparently there's a non-public report of spurious messages like this
>>>> at boot-time, with no actual hotplug events:
>>>>
>>>>   pciehp 0000:82:04.0:pcie24: slot(4): Link Up event
>>>>   pciehp 0000:82:04.0:pcie24: Device 0000:83:00.0 already exists at
>>>> 0000:83:00, cannot hot-add
>>>>   pciehp 0000:82:04.0:pcie24: Cannot add device at 0000:83:00
>>>>
>>>> Device 0000:83:00.0 was enumerated previously, during the normal PCI
>>>> device enumeration.  I suspect DLLSC was set by the hardware when the
>>>> link came up after power-on, and it remained set until Linux booted.
>>>> Then when we take the first pciehp interrupt, we notice DLLSC is set
>>>> and think it's a new link state change.
>>>>
>>>> This might be system-specific, because some BIOSes might clear DLLSC
>>>> before handing off to the OS.  Or my theory might be all wet.
>>>>
>>>> But I think we should clear DLLSC in any case.  We clear all the other
>>>> RW1C bits in Slot Status, and I think we should clear DLLSC as well.
>>>>
>>>> I'd like to include a bugzilla or mailing list reference for the
>>>> spurious message, and I'd also like confirmation that this changes
>>>> actually fixes it.
>>>
>>> I just got confirmation from the reporter that the patch fixes the
>>> issue they were encountering.
>>>
>>> I've also asked them numerous time if I could make the bug public as I
>>> expect others may be hitting it and could benefit from a BZ and
>>> analysis/explanation.  I just repeated my request - if they reply
>>> positively I'll follow through with some documentation.
>>
>> I'd be glad to merge this, as soon as we get more details about the
>> problem it fixes.  I think it's good to include some specifics, e.g.,
>> "Unexpected Link Up event," etc., because it helps other people with
>> the same problem to find the solution.  It's OK if details about
>> pre-production machines and so on are removed, but it's better if
>> changes to generic code are supported by public evidence that
>> everybody can evaluate.
>>
>> If you want to pull out the public stuff into a kernel.org bugzilla,
>> that would be fine, too.  That would actually be better because then
>> we don't have to depend on Red Hat maintaining it.
>
> Agreed (but I couldn't get the vendor to agree to open up the Red Hat
> BZ).  I've opened https://bugzilla.kernel.org/show_bug.cgi?id=79611
> and included a scrubbed 'dmesg' log there along with an analysis of
> how we are hitting the issue.

Beautiful, thanks!  I marked this for stable and put this in
pci/hotplug for v3.17.

>>>>> Reference:
>>>>>   PCI-SIG.  PCI Express Base Specification Revision 4.0 Version 0.3
>>>>>   (PCI-SIG, 2014): 7.8.11. Slot Status Register (Offset 1Ah).
>>>>>
>>>>> Signed-off-by: Myron Stowe <myron.stowe@xxxxxxxxxx>
>>>>> ---
>>>>>
>>>>>  drivers/pci/hotplug/pciehp_hpc.c |    2 +-
>>>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
>>>>> index 42914e0..0568416 100644
>>>>> --- a/drivers/pci/hotplug/pciehp_hpc.c
>>>>> +++ b/drivers/pci/hotplug/pciehp_hpc.c
>>>>> @@ -794,7 +794,7 @@ struct controller *pcie_init(struct pcie_device *dev)
>>>>>         pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
>>>>>                 PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD |
>>>>>                 PCI_EXP_SLTSTA_MRLSC | PCI_EXP_SLTSTA_PDC |
>>>>> -               PCI_EXP_SLTSTA_CC);
>>>>> +               PCI_EXP_SLTSTA_CC | PCI_EXP_SLTSTA_DLLSC);
>>>>>
>>>>>         /* Disable software notification */
>>>>>         pcie_disable_notification(ctrl);
>>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux