Re: [RFC v6] PCI: Set PCI-E Max Payload Size on fabric

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Wed, Jun 29, 2011 at 1:07 AM, Benjamin Herrenschmidt
> <benh@xxxxxxxxxxxxxxxxxxx> wrote:
>> On Wed, 2011-06-29 at 00:28 -0500, Jon Mason wrote:
>>> There is a sizable performance boost for having the largest possible
>>> maximum payload size on each PCI-E device.  However, the maximum payload
>>> size must be uniform on a given PCI-E fabric, and each device, bridge,
>>> and root port can have a different max size.  To find and configure the
>>> optimal MPS settings, one must walk the fabric and determine the largest
>>> MPS available on all the devices on the given fabric.
>>>

>>> -/* Program PCIE MaxPayload setting on device: ensure parent maxpayload <= device */
>>
>> It is interesting to see how busted that comment was in retrospect ;-)
>>
>>> -static int pci_set_payload(struct pci_dev *dev)
>>> -{
>>> -       int pos, ppos;
>>> -       u16 pctl, psz;
>>> -       u16 dctl, dsz, dcap, dmax;
>>> -       struct pci_dev *parent;
>>> -
>>> -       parent = dev->bus->self;
>>> -       pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
>>> -       if (!pos)
>>> -               return 0;
>>> -
>>> -       /* Read Device MaxPayload capability and setting */
>>> -       pci_read_config_word(dev, pos + PCI_EXP_DEVCTL, &dctl);
>>> -       pci_read_config_word(dev, pos + PCI_EXP_DEVCAP, &dcap);
>>> -       dsz = (dctl & PCI_EXP_DEVCTL_PAYLOAD) >> 5;
>>> -       dmax = (dcap & PCI_EXP_DEVCAP_PAYLOAD);
>>> -
>>> -       /* Read Parent MaxPayload setting */
>>> -       ppos = pci_find_capability(parent, PCI_CAP_ID_EXP);
>>> -       if (!ppos)
>>> -               return 0;
>>> -       pci_read_config_word(parent, ppos + PCI_EXP_DEVCTL, &pctl);
>>> -       psz = (pctl &  PCI_EXP_DEVCTL_PAYLOAD) >> 5;
>>> -
>>> -       /* If parent payload > device max payload -> error
>>> -        * If parent payload > device payload -> set speed
>>> -        * If parent payload <= device payload -> do nothing
>>> -        */
>>
>> And this one too, it would be useful to know the rationale of whoever
>> came up with that code in the first place ... oh well, let's not dwelve
>> on it now...
>
> I believe Jordan (who has been Cc'ed on these e-mails) was the
> originator of this code.  Perhaps he can shed some light.
>

This was required for hotplugging a device whose parent MPS was set to
something other than the minimum 128.  When a PCIE device was inserted
into the hotplug bridge, the Linux pci subsystem would probe the
device, but it would only have a DEVCTL MPS=128 by default (DEVCAP MPS
= 512).  On this particular system, the parent bridge MPS was set to
DEVCTL MPS=256 (DEVCAP MPS=2048).  So whenever a transaction to the
new device was attempted, it would cause PCI errors as the transaction
was too large.

So what the old code did was this:
pctl=Read Parent DevCtl MPS
dcap=Read Device DevCap MPS
dctl=Read Device DevCtl MPS

If the parent MPS setting (pctl) was greater than the device
capability (dcap) -> error
else If the parent MPS setting (pctl) was greater than the device
setting (dctl) -> Set device MPS to that of the parent
else do nothing

>>> +/**
>>> + * pcie_set_mps - set PCI Express maximum payload size
>>> + * @dev: PCI device to query
>>> + * @rq: maximum payload size in bytes
>>> + *    valid values are 128, 256, 512, 1024, 2048, 4096
>>> + *
>>> + * If possible sets maximum payload size
>>> + */
>>> +int pcie_set_mps(struct pci_dev *dev, int mps)
>>> +{
>>> +     int cap, err = -EINVAL;
>>> +     u16 ctl, v;
>>> +
>>> +     if (mps < 128 || mps > 4096 || !is_power_of_2(mps))
>>> +             goto out;
>>> +
>>> +     v = (ffs(mps) - 8) << 5;
>>> +     if (v > dev->pcie_mpss)
>>> +             goto out;
>>> +
>>> +     cap = pci_pcie_cap(dev);
>>> +     if (!cap)
>>> +             goto out;
>>> +
>>> +     err = pci_read_config_word(dev, cap + PCI_EXP_DEVCTL, &ctl);
>>> +     if (err)
>>> +             goto out;
>>> +
>>> +     if ((ctl & PCI_EXP_DEVCTL_PAYLOAD) != v) {
>>> +             ctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
>>> +             ctl |= v;
>>> +             err = pci_write_config_word(dev, cap + PCI_EXP_DEVCTL, ctl);
>>> +     }
>>> +out:
>>> +     return err;
>>> +}
>>> +
This code wasn't working on my system.. it would generate an error
when trying to set the device MPS.  the issue was that v = 128,256,...
while pcie->mpss is 0,1,2,...  The conditional would always be true
and the new MPS would not be set during hotplug.

--- one Wed Jul 13 20:25:53 2011
+++ two Wed Jul 13 20:25:34 2011
@@ -7,7 +7,7 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
              goto out;

      v = (ffs(mps) - 8) << 5;
-     if (v > dev->pcie_mpss)
+     if (v > (128 << dev->pcie_mpss))
              goto out;

      cap = pci_pcie_cap(dev);
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux