Re: [PATCH v3 4/8] pciehp: Don't disable the link permanently, during removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Yinghai / Bjorn,


On Thu, Jan 9, 2014 at 12:58 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
>>> >
>>> > On Sun, Jan 5, 2014 at 10:53 AM, Rajat Jain <rajatxjain@xxxxxxxxx>
>>> > wrote:
>>> > > Hello Bjorn,
>>> > >
>>> > > Just checking on the fate of this patch set...
>>> > >
>>> > > On Tue, Dec 17, 2013 at 5:02 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>>> > wrote:
>>> > >> [+cc yinghai@xxxxxxxxxx (seems to be Yinghai's preferred email]
>>> > >>
>>> > >> On Tue, Dec 17, 2013 at 12:06:05PM -0800, Rajat Jain wrote:
>>> > >>> We need future link up events for hot-add, thus don't disable the
>>> > >>> link permanently during device removal. Also, remove the static
>>> > >>> functions that are now left unused.
>>> > >>
>>> > >> The changelog should mention that this reverts part of 2debd9289997
>>> > ("PCI:
>>> > >> pciehp: Disable/enable link during slot power off/on").
>>> > >
>>> > > Sure. Do you want me to submit another patch set (bumping up the
>>> > > version) with this change log, or you'd want to add this change log
>>> > > while merging?
>>> > >
>>> > >>
>>> > >> Yinghai, can you tell us whether this is an issue on your systems?
>>> > >
>>> > > As Yinghai confirms further down this thread, his issue was
>>> > > confirmed by Intel to be a bug in the repeater chip.
>>> > > ----------------------------------
>>> > > Yinghai writes:
>>> > >> According to HW guys and Intel, that should be bug of repeater.
>>> > >>
>>> > > ---------------------------------
>>> > >  I don't know about the details of his scenario, except that when
>>> > > the adapter was disabled the repeater kept on flapping the link up &
>>> > > down (and hence disabling the link solved the problem then). Yinghai
>>> > > couldn't test, but I believe with this patch even if we disable
>>> > > presence detect interrupt, the "adapter present / no present"
>>> > > messages would (rightly) convert to "Link Up / Link Down" messages
>>> > > (since the repeater keeps on flapping the link).
>>> > >
>>> > > Since it is a platform specific bug, I'm not sure what can be done
>>> > > to remove those messages except may be reduce the verbosity? If
>>> > > you'd like I could change all the INFO messages to DBG messages.
>>> >
>>> > Even if it's a defect in a particular piece of hardware, I don't want
>>> > to regress on that hardware, even if the regression is just extra
>>> > messages that we didn't see before.
>>> >
>>> > I think ideally we would add some sort of quirk for that hardware so
>>> > it works just as well as it does today.  I think extra messages will
>>> > lead to a bug reports from users.
>>>
>>> Sure, I can do that. I think what the quirk would have to do is that for
>>> that particular platform, don't enable the link-state based hotplug.
>>> (Since link-state hotplug will not work if we disable the link
>>> permanently as we do today on card removal).
>>>
>>> But the question is how to determine that the quirk has to be applied? I
>>> think the objective is to apply the quirk to the platforms that have a
>>> "PCIe repeater". Since this does not depend on a PCI device / vendor ID,
>>> and I think the PCIe repeater is probably not even visible to the pciehp
>>> or the PCI subsystem, how do I determine that the quirk has to be
>>> applied?
>>
>> Any ideas on how do I identify the platforms that may have this problem?
>
> I sure don't know.  I suspect you're right that the PCIe repeater is
> invisible to software, at least in terms of PCI config space.  Maybe
> we could use DMI to identify platforms.  That's not a very good
> solution because we have to come up with a list, but I can't think of
> a better way.  Yinghai knows more about the platform and might have
> better ideas.

Yinghai: I am trying to understand what exactly is this platform bug
and how to add a quirk such that this platform remains unaffected. Can
you please help me by suggesting how to decide if this is _the_
platform that has the bug (the pcie repeater).

Bjorn: It seems to be that identification of this platform will be out
PCI code (since the bug seems to be in a pcie repeater chip which is
not a PCI device visible to SW). So even if we find a way to identify
this platform (e.g DMI) , I doubt if you'd want me to add that in the
pciehp code (which is platform independent so far). At best, the only
way out I can see is to provide a knob from the pciehp, that can be
use by the platform code to either enable or disable the link state
hotplug. It could go back towards using a module parameter like
pciehp_use_link_events. Please suggest.

The only other way I can think of, is that I can remove the debug
message altogether (Link up / Link down). (Or the user can change the
verbosity).

Humm, when I think of it, we're trying to address a bug of a chip
which is not a PCI device, into pciehp. I'm praying it doesn't bring
this patch set to a dead end :-)

Thanks,

Rajat

>
> Bjorn
>
>>> If (hw_has_pcie_repeater)
>>>       Don't use link-state hotplug (and disable link permanently during
>>> card removal) Else
>>>       Use link-state hotplug (and don't disable the link permanently)
>>>
>>>
>>> Yinghai: Since I do not have that hardware, I will need some help in
>>> testing the patch with the quirk. I was wondering if you'd still have
>>> that hardware around and would be able to help me with testing?
>>>
>>> Thanks,
>>>
>>> Rajat
>>>   {.n +       +%  lzwm  b 맲  r  zX   \ )   w*jg         ݢj/   z ޖ  2 ޙ
>>> & )ߡ a       G   h   j:+v   w ٥
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux