Re: [PATCH 1/4] PCI: pciehp: Ignore Link Down/Up caused by error-induced Hot Reset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Oct 10, 2021 at 11:14:07AM +0200, Lukas Wunner wrote:
> On Thu, Oct 07, 2021 at 06:00:20PM -0500, Bjorn Helgaas wrote:
> > On Sat, Jul 31, 2021 at 02:39:01PM +0200, Lukas Wunner wrote:
> > > The issue isn't DPC-specific, it also occurs when an error is handled by
> > > AER through aer_root_reset().  So while the issue was noticed only now,
> > > it's been around since 2006 when AER support was first introduced.
> > > 
> > > Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver")
> [...]
> > > Cc: stable@xxxxxxxxxxxxxxx # v2.6.19+: ba952824e6c1: PCI/portdrv: Report reset for frozen channel
> > 
> > Have you tried backporting this to v2.6.19?  I bet it's tough.  Not
> > sure we should suggest that stable pick this up unless there's a
> > reasonable path to do that.
> 
> The oldest kernel.org stable release that still receives updates is v4.4.
> There may be older distribution kernels out there which continue to be
> supported.  We don't know for sure, so I think it's customary to tag
> the release that introduced an issue and leave it to the stable and
> distribution maintainers to choose what they backport.
> 
> It's true that patches often do not apply cleanly to older releases.
> There are some good folks who regularly take up the thankless task of
> backporting (Sudip Mukherjee to name but one), but I've also frequently
> backported patches myself where necessary.
> 
> 
> > > +config PCI_ERROR_RECOVERY
> > > +	def_bool PCIEAER || EEH
> > 
> > I'm having a hard time connecting this to the code.
> [...]
> > But this still seems like it's kind of dangling.  It's not obvious to
> > me why pciehp_slot_reset() should be inside that #ifdef.  Do we need
> > the #ifdef for a functional reason, or is it there because we know it
> > will never be called?  If the latter, I don't think the savings are
> > worth it.
> 
> The motivation for the #ifdef was merely to reduce code size if neither
> of PCIEAER or EEH is enabled.  Happy to remove it.

That'd be great.

> I have a different question though.  We've often discussed deprecating
> portdrv and moving port services to the core instead.  I've stumbled
> across commit a39bd851dccf ("PCI/PM: Clear PCIe PME Status bit in core,
> not PCIe port driver"), wherein you moved pcie_pme_root_status_cleanup()
> from portdrv to the core, which allowed you to drop the ->resume_noirq
> callback from portdrv.
> 
> I've been thinking what moving port services to the core would look like
> and what your vision for that might be.  If that commit is any indication,
> it seems you'd probably prefer that pciehp_slot_reset() (as introduced in
> the present patch) is called directly from report_slot_reset() in
> drivers/pci/pcie/err.c.
> 
> That would eliminate much of the plumbing contained in the present patch
> to allow each port service driver to define a ->slot_reset callback and
> iterate over those callbacks.  pciehp is probably the only port service
> which requires special handling upon ->slot_reset and the same goes for
> a lot of the error handling and power management callbacks defined by
> port services.  So all that plumbing is probably just a very roundabout
> way of doing things.
> 
> When calling into port service code from core code, one needs to find
> the port service's driver data.  For now we can resort to
> pcie_port_find_device(), but I imagine once we move all port services
> to the core, we'll be able to access their data directly from
> struct pci_dev, e.g. via pdev->hotplug or pdev->pcie->hotplug pointers.

We managed to get rid of pcie_port_find_service() recently, and I'd
love to get rid of pcie_port_find_device() as well.  I'd like to get
rid of struct pcie_device, too, but that's visible via sysfs and would
be harder.  But maybe we could at least stop using it internally.

I don't know exactly what the ->slot_reset() path would look like, and
I'm not suggesting you need to change that for this patch.

Bjorn



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux