> On Wed, Jun 29, 2011 at 1:07 AM, Benjamin Herrenschmidt > <benh@xxxxxxxxxxxxxxxxxxx> wrote: >> On Wed, 2011-06-29 at 00:28 -0500, Jon Mason wrote: >>> There is a sizable performance boost for having the largest possible >>> maximum payload size on each PCI-E device. However, the maximum payload >>> size must be uniform on a given PCI-E fabric, and each device, bridge, >>> and root port can have a different max size. To find and configure the >>> optimal MPS settings, one must walk the fabric and determine the largest >>> MPS available on all the devices on the given fabric. >>> >>> -/* Program PCIE MaxPayload setting on device: ensure parent maxpayload <= device */ >> >> It is interesting to see how busted that comment was in retrospect ;-) >> >>> -static int pci_set_payload(struct pci_dev *dev) >>> -{ >>> - int pos, ppos; >>> - u16 pctl, psz; >>> - u16 dctl, dsz, dcap, dmax; >>> - struct pci_dev *parent; >>> - >>> - parent = dev->bus->self; >>> - pos = pci_find_capability(dev, PCI_CAP_ID_EXP); >>> - if (!pos) >>> - return 0; >>> - >>> - /* Read Device MaxPayload capability and setting */ >>> - pci_read_config_word(dev, pos + PCI_EXP_DEVCTL, &dctl); >>> - pci_read_config_word(dev, pos + PCI_EXP_DEVCAP, &dcap); >>> - dsz = (dctl & PCI_EXP_DEVCTL_PAYLOAD) >> 5; >>> - dmax = (dcap & PCI_EXP_DEVCAP_PAYLOAD); >>> - >>> - /* Read Parent MaxPayload setting */ >>> - ppos = pci_find_capability(parent, PCI_CAP_ID_EXP); >>> - if (!ppos) >>> - return 0; >>> - pci_read_config_word(parent, ppos + PCI_EXP_DEVCTL, &pctl); >>> - psz = (pctl & PCI_EXP_DEVCTL_PAYLOAD) >> 5; >>> - >>> - /* If parent payload > device max payload -> error >>> - * If parent payload > device payload -> set speed >>> - * If parent payload <= device payload -> do nothing >>> - */ >> >> And this one too, it would be useful to know the rationale of whoever >> came up with that code in the first place ... oh well, let's not dwelve >> on it now... > > I believe Jordan (who has been Cc'ed on these e-mails) was the > originator of this code. Perhaps he can shed some light. > This was required for hotplugging a device whose parent MPS was set to something other than the minimum 128. When a PCIE device was inserted into the hotplug bridge, the Linux pci subsystem would probe the device, but it would only have a DEVCTL MPS=128 by default (DEVCAP MPS = 512). On this particular system, the parent bridge MPS was set to DEVCTL MPS=256 (DEVCAP MPS=2048). So whenever a transaction to the new device was attempted, it would cause PCI errors as the transaction was too large. So what the old code did was this: pctl=Read Parent DevCtl MPS dcap=Read Device DevCap MPS dctl=Read Device DevCtl MPS If the parent MPS setting (pctl) was greater than the device capability (dcap) -> error else If the parent MPS setting (pctl) was greater than the device setting (dctl) -> Set device MPS to that of the parent else do nothing >>> +/** >>> + * pcie_set_mps - set PCI Express maximum payload size >>> + * @dev: PCI device to query >>> + * @rq: maximum payload size in bytes >>> + * valid values are 128, 256, 512, 1024, 2048, 4096 >>> + * >>> + * If possible sets maximum payload size >>> + */ >>> +int pcie_set_mps(struct pci_dev *dev, int mps) >>> +{ >>> + int cap, err = -EINVAL; >>> + u16 ctl, v; >>> + >>> + if (mps < 128 || mps > 4096 || !is_power_of_2(mps)) >>> + goto out; >>> + >>> + v = (ffs(mps) - 8) << 5; >>> + if (v > dev->pcie_mpss) >>> + goto out; >>> + >>> + cap = pci_pcie_cap(dev); >>> + if (!cap) >>> + goto out; >>> + >>> + err = pci_read_config_word(dev, cap + PCI_EXP_DEVCTL, &ctl); >>> + if (err) >>> + goto out; >>> + >>> + if ((ctl & PCI_EXP_DEVCTL_PAYLOAD) != v) { >>> + ctl &= ~PCI_EXP_DEVCTL_PAYLOAD; >>> + ctl |= v; >>> + err = pci_write_config_word(dev, cap + PCI_EXP_DEVCTL, ctl); >>> + } >>> +out: >>> + return err; >>> +} >>> + This code wasn't working on my system.. it would generate an error when trying to set the device MPS. the issue was that v = 128,256,... while pcie->mpss is 0,1,2,... The conditional would always be true and the new MPS would not be set during hotplug. --- one Wed Jul 13 20:25:53 2011 +++ two Wed Jul 13 20:25:34 2011 @@ -7,7 +7,7 @@ int pcie_set_mps(struct pci_dev *dev, int mps) goto out; v = (ffs(mps) - 8) << 5; - if (v > dev->pcie_mpss) + if (v > (128 << dev->pcie_mpss)) goto out; cap = pci_pcie_cap(dev); -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html