On Tue, May 20, 2014 at 11:22:22AM -0600, Bjorn Helgaas wrote:
> On Tue, May 20, 2014 at 11:02 AM, Jason Gunthorpe
> <jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Fri, May 16, 2014 at 08:29:56PM +0000, Karicheri, Muralidharan wrote:
> >
> >> But pcie_bus_configure_settings just make sure the mrrs for a device
> >> is not greater than the max payload size.
> >
> > Not quite, it first scans the network checking the Maximum Payload Size
> > Supported (MPSS) for each device, and chooses the highest supported by
> > all as the MPS for all.
>
> The "performance" setting, e.g., "pci=pcie_bus_perf", adds a few
> wrinkles by setting MRRS in ways that allow some devices to have
> larger MPS than others.  I don't think this is exactly what was
> envisioned in the spec, and it is not guaranteed to work if there is
> peer-to-peer DMA.  This isn't documented very well; the best I know of
> is the changelogs for:
>
>     a1c473aa11e6 pci: Clamp pcie_set_readrq() when using "performance" settings
>     b03e7495a862 PCI: Set PCI-E Max Payload Size on fabric

Neat, and that pretty much confirms that setting the host bridge MPSS
properly is necessary to support all drivers, particularly the ones
that call pcie_set_readrq with any old value.

The 'performance' setting is a bit scary: it isn't just peer-to-peer
DMA that would be impacted, but also CPU-initiated burst writes.  E.g.,
InfiniBand drivers burst a WQE to the NIC via the CPU, and the MPS on
the root port bridge is used to segment the write.

Probably not a problem in practice, because I think this is rare, and
it is even rarer that a burst would be > 128 bytes - but as you say,
not really what the spec intended.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html