Re: [RFC PATCH 1/2] PCI: Add PCI device flag PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 18 Feb 2016 18:57:12 +0530
Jayachandran Chandrashekaran Nair
<jayachandran.chandrashekaran@xxxxxxxxxxxx> wrote:

> On Wed, Feb 17, 2016 at 8:58 PM, Alex Williamson
> <alex.williamson@xxxxxxxxxx> wrote:
> > On Wed, 17 Feb 2016 17:15:09 +0530
> > Jayachandran Chandrashekaran Nair
> > <jayachandran.chandrashekaran@xxxxxxxxxxxx> wrote:
> >  
> >> On Wed, Feb 17, 2016 at 3:55 AM, Alex Williamson
> >> <alex.williamson@xxxxxxxxxx> wrote:  
> >> > On Wed, 17 Feb 2016 02:38:24 +0530
> >> > Jayachandran Chandrashekaran Nair
> >> > <jayachandran.chandrashekaran@xxxxxxxxxxxx> wrote:
> >> >  
> >> >> On Tue, Feb 16, 2016 at 1:09 AM, Alex Williamson
> >> >> <alex.williamson@xxxxxxxxxx> wrote:  
> >> >> > On Mon, 15 Feb 2016 12:20:23 -0600
> >> >> > Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >> >> >  
> >> >> >> [+cc Alex, iommu list]
> >> >> >>
> >> >> >> On Mon, Feb 15, 2016 at 03:35:00AM +0530, Jayachandran C wrote:  
> >> >> >> > Add a new flag PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS to indicate bridges
> >> >> >> > that should not be considered during DMA alias search. This is
> >> >> >> > to support hardware (in this case Broadcom Vulcan PCIe subsystem)
> >> >> >> > that has internal bridges which have either missing or wrong PCIe
> >> >> >> > capabilities.  
> >> >> >
> >> >> > I figured this would come at some point, the right answer is of course
> >> >> > to follow the PCIe spec and implement the required PCIe capability in
> >> >> > the hardware.  
> >> >>
> >> >> There are  PCIe controllers on the chip that follows the spec, the issue is
> >> >> how it is integrated to the northbridge (equivalent) on the chip, I have
> >> >> tried to explain it below.
> >> >>  
> >> >> >> This needs more explanation, like what exactly is wrong with this
> >> >> >> device?  A missing PCIe capability might cause other problems.
> >> >> >>
> >> >> >> What problem does this fix?  Without these patches, do we merely add
> >> >> >> aliases that are unnecessary?  Do we crash because something goes
> >> >> >> wrong in the pci_pcie_type() switch because of the incorrect
> >> >> >> capability?  
> >> >>
> >> >> Here's how (for example) how the PCI enumeration of a 2 node Vulcan
> >> >> processor will look like:
> >> >>
> >> >>
> >> >> [0] +-0.0.0--[1]---+--1.a.0----[2]-----2.0.0---[3]----3.0.0
> >> >>     |              +--1.a.1----[4]-----4.0.0---[5]----5.0.0
> >> >>     |              .
> >> >>     |              ... etc...
> >> >>     |
> >> >>     +-0.0.1--[10]--+-10.a.0----[11]---11.0.0---[12]---12.0.0
> >> >>                    +-10.a.1----[13]---13.0.0---[14]---14.0.0
> >> >>                    .
> >> >>                    ... etc...  
> >> >
> >> > So we have:
> >> >
> >> >     "Glue Bridge"---"Glue Bridges"---Root Ports---Endpoints
> >> >       (no pcie)      (pci/x-pcie)  
> >>
> >> The top level is one glue bridge per chip in a multi-chip board,
> >> but otherwise this is accurate.
> >>  
> >> >>
> >> >> The devices 0.0.x and x.a.x are glue bridges that are there to
> >> >> bring the real PCIe controllers (pcie cap type 4) 2.0.0, 4.0.0,
> >> >> 11.0.0, 13.0.0 under a single PCI hierarchy. The x.a.x bridges
> >> >> have a PCIe capability (type 8) and 0.0.x does not have any pcie
> >> >> capability.
> >> >>
> >> >> In case of Vulcan, both the GICv3 ITS driver code and the SMMUv3
> >> >> driver code does dma alias walk to find the device id to use
> >> >> in ITS and SMMU. In both cases it will ignore the x.0.0 bridges
> >> >> since they are type root port, but will continue on up and end
> >> >> up with incorrect device id.
> >> >>
> >> >> The flag I have added is to make the pci_for_each_dma_alias()
> >> >> ignore the last 2 levels of glue/internal bridges.  
> >> >
> >> > Per the PCIe spec, I'm not even sure what you have is a valid
> >> > hierarchy, root ports sit on root complexes, not behind a PCI-to-PCIe
> >> > bridge.  So really you're pretending that the downstream "glue bridge"
> >> > is your host bridge and therefore root complex, but the PCI topology
> >> > would clearly dictate that the top-most bus is conventional PCI and
> >> > therefore everything is an alias of everything else and the DMA alias
> >> > code is doing exactly what it should.  
> >>
> >> Yes, I am not arguing that there is any issue in the current code. I
> >> am trying to figure out the correct way to implement the quirk. We
> >> have to support the hardware we have, not the hardware we want to
> >> have :)
> >>  
> >> > Also note that the caller of pci_for_each_dma_alias() owns the callback
> >> > function triggered for each alias, that function could choose to prune
> >> > out these "glue bridges" itself if that's the appropriate thing to do.  
> >>
> >> I had implemented this first, but moved to the current approach because
> >> it is needed in multiple places. The problem is: "On vulcan, while
> >> going up the pcie bus heirarchy for finding aliases, skip the glue bridges",
> >> So the logical place to put the fix is in pci_for_each_dma_alias()
> >>  
> >> > Where do the SMMU and ITS actually reside in the above diagram?  You
> >> > can use the callout function to stop the traversal anywhere you wish,
> >> > it's free to return a -errno to stop or positive number, which the
> >> > caller can interpret as error or non-failure stop condition.  
> >>
> >> The SMMU (for translation requests) and ITS (for MSI writes for
> >> interrupts) are connected directly to the proper PCIe controller
> >> (2/4/11/13.0.0 above)  
> >
> >
> > If the translation unit is on the root port then DMA aliases upstream of
> > the translation unit are irrelevant and the callback function should
> > stop the traversal at that point.
> >  
> >> > You could even think about changing what Linux considers to be the host
> >> > bridge.  Maybe these glue bridges shouldn't even be visible to Linux,
> >> > would that also solve the issue with the broken BAR?  
> >>
> >> The glue bridges have to seen by Linux for assigning resources like
> >> bus ranges and memory windows. So these are required bridges in
> >> the topology and will work with the standard linux code
> >> (provided we skip them for aliases, and and ignore the BAR).
> >>  
> >> >> > The change takes the same code path as it would for a real PCIe bridge
> >> >> > port (downstream/upstream/root), which means they want to skip adding
> >> >> > this bridge as an alias of the device.  So we're adding in aliases that
> >> >> > don't exist, the bridge itself.
> >> >> >
> >> >> > If anything I'd suggest a flag that actually tries to address the
> >> >> > problem rather than a symptom of the problem.  For example, maybe the
> >> >> > flag should be PCI_DEV_FLAGS_IS_PCIE.  Maybe pci_is_pcie() should even
> >> >> > take that into account.  That has some trickle through for
> >> >> > pci_pcie_type() and all the accessor functions, but maybe it's a
> >> >> > cleaner solution overall (or maybe it explodes further).  Thanks,  
> >> >>
> >> >> I didn't really want to mark the glue bridges as PCIe or have fake
> >> >> PCIe capability there, the obvious simple solution was to add
> >> >> the flag PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS
> >> >>
> >> >> Any suggestions/comments on how to do this better would be welcome.  
> >> >
> >> > I definitely don't think either flag idea is the right solution, I
> >> > think you've actually already got the tools you need to solve it by
> >> > putting the intelligence in the callback function or by going further
> >> > down the path of pretending the glue bridge is really a host bridge.  
> >>
> >> If I understand your suggestion, I need to fake the PCIe capability
> >> for the glue bridges, which may be reasonable for the second level.
> >> But for the first level, there is no pcie capability and faking that
> >> becomes more difficult.  
> >
> > That's not what I'm suggesting.
> >  
> >> If the concern is adding a flag for just one platform, I can check
> >> for the vendor ID/device ID in the pci_for_each_dma_alias()
> >> function, which would be slightly uglier, but still a simple and
> >> maintainable solution.
> >>
> >> Another option is to break the pci_for_each_dma_alias when it
> >> reaches to root port (for vulcan), this too can be done without
> >> a flag.  
> >
> > Exactly, this can be done by the callback function provided on vulcan
> > systems and requires no core changes.  The IOMMU needs to provide a
> > callback function that makes sense for collecting aliases on the
> > specific platform.  If the translation unit is not on the root bus, the
> > callback function should stop the topology walk earlier.  Thanks,  
> 
> Fixing every callback function is going to be painful and ugly.
> I will have to change drivers/irqchip/irq-gic-v3-its-pci-msi.c,
> drivers/pci/msi.c, and drivers/iommu/arm-smmu-v3.c with the
> same check for vendor/device ids and return early.
> 
> I can update the patch to check the vendor/device id in the
> case of ROOT_PORT in pci_for_each_dma_alias(). If you
> think that will not be acceptable, I don't have any reasonable
> options left.

I've never known there to necessarily be a correlation between the
easiest solution and the correct one, but I think there's probably
still an easy and clean solution not far from your original approach.
pci_for_each_dma_alias() is meant to collect all the aliases for a
given device.  It stops at the root bus both because it can't go
further and because on many platforms, that's where the translation
unit lives.  If we want to have a common and easy way to stop it above
the root bus then we could flag a device as being the DMA alias root
and exit at that point.  I think that makes more sense architecturally
than some mystery devices flagged to skip reporting an alias.  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux