Re: [PATCH v7] PCI: Try best to allocate pref mmio 64bit above 4g

On Thu, 2014-04-10 at 11:26 -0600, Bjorn Helgaas wrote:
> > The problem with the old code: when both 32-bit and 64-bit prefetchable
> > BARs are present, it favors the 32-bit one for the prefetchable window.
> > That window can then no longer be given an address above 4G, and this
> > 32-bit-address-only property propagates upward until it reaches the root
> > bridge. In the later assignment phase, the space above 4G is never
> > touched. All of this can be caused by a single 32-bit prefetchable BAR
> > (say, a ROM BAR).
> >
> > This patch helps by making better decision:
> >
> >         * Keep the old behaviour if only 32-bit or only 64-bit
> >           prefetchable BARs are present
> >
> >         * If both are present, put the 64-bit ones in the prefetchable
> >           window and the 32-bit ones in the non-prefetchable window
> 
> I thought the patch was supposed to change the way we allocate bridge
> windows.  But you are talking about the way we assign 32- and 64-bit
> device resources, i.e., changing the way we decide whether to put them
> in prefetchable or non-prefetchable windows.
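[The window-selection rule quoted in the changelog above could be sketched roughly as follows. This is purely an illustration of the described policy, not the patch's actual code; the flag names and helper are hypothetical.]

```c
#include <stdbool.h>
#include <stdint.h>

#define BAR_FLAG_PREFETCH  (1u << 0)   /* BAR is prefetchable */
#define BAR_FLAG_MEM64     (1u << 1)   /* BAR decodes 64-bit addresses */

/* Decide whether a BAR should be placed in the bridge's prefetchable
 * window. Per the quoted changelog: if both 32-bit and 64-bit
 * prefetchable BARs exist under the bridge, only the 64-bit ones go
 * into the prefetchable window (so the window can be placed above 4G);
 * 32-bit prefetchable BARs fall back to the non-prefetchable window.
 * If only one kind is present, keep the old behaviour. */
static bool bar_goes_in_pref_window(uint32_t bar_flags,
                                    bool have_pref32, bool have_pref64)
{
    if (!(bar_flags & BAR_FLAG_PREFETCH))
        return false;                       /* non-pref BARs never do */
    if (have_pref32 && have_pref64)
        return bar_flags & BAR_FLAG_MEM64;  /* 64-bit ones only */
    return true;                            /* old behaviour */
}
```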

Wait...

Are you putting "non-prefetchable" 64-bit BARs in the prefetchable
range?

That's bad. Don't do that.

Some switches or bridges *will* actively prefetch, and putting a
register BAR marked non-prefetchable in the prefetchable range will thus
cause nasty bugs if a register with side effects becomes the target of a
prefetch operation.

We can do that on powerpc *selectively*, when and only when we know for
sure that no bridge will prefetch on the path to the device. We know our
host bridges won't, and we *might* (to be verified) know that the PLX
switches we have soldered on the mobo won't, but those are the only
cases where it is legitimate.
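[The per-path safety condition described above amounts to a walk up the bridge chain. A minimal sketch, assuming a hypothetical bridge structure with a per-model "known never to prefetch" flag; none of these names are real kernel APIs.]

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bridge node: parent is NULL at the host bridge, and
 * never_prefetches is set only for bridge/switch models that have
 * been verified not to prefetch (e.g. the host bridges mentioned
 * above, and possibly the on-board PLX switches). */
struct bridge {
    struct bridge *parent;
    bool never_prefetches;
};

/* A non-prefetchable BAR may be placed in the prefetchable range only
 * if every bridge on the path from the device to the host bridge is
 * known never to prefetch. */
static bool path_is_prefetch_safe(const struct bridge *b)
{
    for (; b; b = b->parent)
        if (!b->never_prefetches)
            return false;   /* some bridge may prefetch: unsafe */
    return true;
}
```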

> > So 4G-above space can be properly used.
> >
> > Why does enabling 64-bit space matter?
> >
> > * 32-bit space is just too small. Even if a larger-than-4G BAR sounds
> >   unrealistic (we do have such devices), there is still a chance that
> >   the total MMIO size under a domain is too large, especially with
> >   SR-IOV enabled (we have hit this situation).
> 
> Please give more details about the problem you saw.  A complete dmesg
> log showing a boot failure or a device that doesn't work, and another
> log with this patch applied, showing things working as desired, would
> go a long way toward clarifying this.
> 
> This sounds like a specific case that will be fixed by the patch, and
> if you give more details, maybe I can figure out what's going on.

Well, in our case it's a very complicated story. We have a segmented
space with 256 segments covering our 32-bit space, and we need each
"partitionable endpoint" to be in a separate set of segments. PEs are
basically IOMMU domains, but they also affect MMIO; they are our unit of
assignment to guests when virtualizing, and of error handling when doing
EEH (so on error we freeze access per PE).

So with something like SR-IOV, where each VF has to be a separate PE,
the 32-bit space is problematic: we want the VF BARs to stride over
segments so that each VF lives in a separate PE, but we run out of
segments, and the 32-bit space has a fixed segment size that may not
match the VF BAR stride.

So we want our BARs to go into 64-bit space, where we have additional
windows of 256 segments each that we can use; additionally, we can
resize these to make the segment size match the VF BAR stride.
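[The sizing constraint described above is simple arithmetic: pick a segment size matching the VF BAR stride so each VF lands in its own segment (its own PE), then size the window to cover all 256 segments. A sketch, assuming segment sizes must be powers of two; the helper names are illustrative, not real code.]

```c
#include <stdint.h>

#define NUM_SEGMENTS 256u   /* each window has 256 segments, as above */

/* Round up to the next power of two (assumed hardware constraint on
 * segment sizes). */
static uint64_t roundup_pow2(uint64_t x)
{
    uint64_t p = 1;
    while (p < x)
        p <<= 1;
    return p;
}

/* Size a resizable 64-bit window so that the per-segment size equals
 * the (rounded) VF BAR stride, giving one VF per segment/PE. */
static uint64_t window_size_for_vf_stride(uint64_t vf_bar_size)
{
    return roundup_pow2(vf_bar_size) * NUM_SEGMENTS;
}
```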

Cheers,
Ben.

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
