Re: pci_bus_distribute_available_resources() is wrong?

Hi,

On Mon, Dec 12, 2022 at 04:10:16PM -0500, Alexander Motin wrote:
> On 12.12.2022 15:32, Bjorn Helgaas wrote:
> > On Mon, Dec 12, 2022 at 1:36 PM Alexander Motin <mav@xxxxxxxxxxxxx> wrote:
> > > Hi,
> > > 
> > > I am writing to you three as the authors of the Linux
> > > drivers/pci/setup-bus.c pci_bus_distribute_available_resources()
> > > function.  While trying to debug a PCI hot-plug issue on the passive
> > > side of an AMD NTB, I hit this function, whose behavior looks very
> > > suspicious to me and which, I believe, causes the resource allocation
> > > problems we observe.
> > > 
> > > As I see it, this function distributes the extra size of the parent
> > > memory window of a hot-plug PCI bridge between the memory windows of
> > > its child bridges.  It probably makes some sense, but I see a problem
> > > in the fact that the function only looks at the child bridges' memory
> > > windows, and not at any other resources (of bridges or of other
> > > devices that may be there).

Right, the idea was that we allocate the spare resources for the possible
hotplug downstream ports so that it is possible to extend that topology
without running out of resources. This is mostly used with
Thunderbolt/USB4 PCIe tunneling.

However, as many have noticed, it does not handle the more generic PCI
case well. Sorry about that.
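
To make the gap concrete, here is a toy model in Python. It is not the
kernel code: every name in it (Bridge, distribute, reserve_own_bars) is
made up for illustration, the sizes are invented, and it ignores
alignment and the separate I/O, MMIO and prefetchable windows. It only
shows what happens when all spare space goes to the hotplug bridge
windows, and what changes if the BARs sitting on the same bus are set
aside first.

```python
from dataclasses import dataclass


@dataclass
class Bridge:
    """Hypothetical stand-in for a child bridge on the bus being sized."""
    name: str
    hotplug: bool      # can devices be added below this bridge later?
    required: int      # bytes its window needs for what is already there
    own_bar: int = 0   # bytes needed by a BAR on the bridge itself


def distribute(window, bridges, reserve_own_bars):
    """Split a parent window of `window` bytes among child bridge windows.

    Returns (sizes, leftover): the window size given to each bridge and
    the bytes of the parent window left for other resources on the bus.
    """
    # Space set aside for resources on this bus that are not bridge windows.
    reserved = sum(b.own_bar for b in bridges) if reserve_own_bars else 0
    needed = sum(b.required for b in bridges)
    spare = window - needed - reserved
    if spare < 0:
        raise ValueError("parent window too small for the required resources")

    # Only hotplug bridges get a share of the spare space.
    hotplug = [b for b in bridges if b.hotplug]
    share = spare // len(hotplug) if hotplug else 0

    sizes = {b.name: b.required + (share if b.hotplug else 0) for b in bridges}
    leftover = window - sum(sizes.values())
    return sizes, leftover


if __name__ == "__main__":
    # One hotplug bridge with a 16 KiB BAR of its own under a 1 MiB window.
    bus = [Bridge("81:00.0", hotplug=True, required=0x20000, own_bar=0x4000)]
    for reserve in (False, True):
        sizes, leftover = distribute(0x100000, bus, reserve)
        fits = leftover >= bus[0].own_bar
        print(reserve, hex(sizes["81:00.0"]), hex(leftover), "BAR fits:", fits)
```

With reserve_own_bars set to False the single bridge window swallows the
whole parent window and nothing is left for the bridge's own BAR. With
it set to True the window shrinks by the BAR size and the BAR fits.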

> > > In my AMD NTB case PCI topology looks this way:
> > > 
> > > +-[0000:80]-+-00.0
> > > |           +-01.1-[81-83]----00.0-[82-83]----00.0-[83]--+-00.0 Dummy
> > > |           |                                            \-00.1 NTB
> > > 
> > > 80:01.1 is the root bridge where the hot-plug happens.  The 81:00.0
> > > bridge, in addition to its memory windows, has a small 16KB BAR.  But
> > > since it is the only bridge on the bus, the function passes all
> > > available resources down to its children.  As a result, that BAR
> > > fails to allocate.  And while that BAR does not seem to be really
> > > needed, in some cases the allocation error causes the whole memory
> > > window to be disabled, which ends in an NTB device driver attach
> > > failure.

Just out of curiosity, is this a PCIe or a PCI topology?

> > Mika is working on what sounds like the same problem.  His current
> > patch series is at
> > https://lore.kernel.org/linux-pci/20221130112221.66612-1-mika.westerberg@xxxxxxxxxxxxxxx/
> > 
> > We would appreciate your comments and testing as that series is developed.
> 
> Thank you, Bjorn.  This definitely looks related, but as you've already
> noted in your review there, the present patch does not handle BARs of the
> bridge itself, which I have in my case.  I'd be happy to test the updated
> patch.  Please keep me in the loop.
> 
> I also agree with your comment that the same should be done in the case of
> multiple bridges.  I am generally not sure that the cases of a single
> bridge, or of no hot-plug at this level, should be treated specially.

Yeah, I'm working on a new version of the patch series that should take
these into consideration. The challenge is that the code has been used
with the Thunderbolt/USB4 PCIe tunneling for some time already and we
don't want to break that either.

I'm also more than happy to test any patches regarding this if someone
else wants to work on it ;-)


