Re: [RFC v6] PCI: Set PCI-E Max Payload Size on fabric

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 13, 2011 at 8:30 PM, jordan hargrave <jharg93@xxxxxxxxx> wrote:
>> On Wed, Jun 29, 2011 at 1:07 AM, Benjamin Herrenschmidt
>> <benh@xxxxxxxxxxxxxxxxxxx> wrote:
>>> On Wed, 2011-06-29 at 00:28 -0500, Jon Mason wrote:
>>>> There is a sizable performance boost for having the largest possible
>>>> maximum payload size on each PCI-E device.  However, the maximum payload
>>>> size must be uniform on a given PCI-E fabric, and each device, bridge,
>>>> and root port can have a different max size.  To find and configure the
>>>> optimal MPS settings, one must walk the fabric and determine the largest
>>>> MPS available on all the devices on the given fabric.
>>>>
>
>>>> -/* Program PCIE MaxPayload setting on device: ensure parent maxpayload <= device */
>>>
>>> It is interesting to see how busted that comment was in retrospect ;-)
>>>
>>>> -static int pci_set_payload(struct pci_dev *dev)
>>>> -{
>>>> -       int pos, ppos;
>>>> -       u16 pctl, psz;
>>>> -       u16 dctl, dsz, dcap, dmax;
>>>> -       struct pci_dev *parent;
>>>> -
>>>> -       parent = dev->bus->self;
>>>> -       pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
>>>> -       if (!pos)
>>>> -               return 0;
>>>> -
>>>> -       /* Read Device MaxPayload capability and setting */
>>>> -       pci_read_config_word(dev, pos + PCI_EXP_DEVCTL, &dctl);
>>>> -       pci_read_config_word(dev, pos + PCI_EXP_DEVCAP, &dcap);
>>>> -       dsz = (dctl & PCI_EXP_DEVCTL_PAYLOAD) >> 5;
>>>> -       dmax = (dcap & PCI_EXP_DEVCAP_PAYLOAD);
>>>> -
>>>> -       /* Read Parent MaxPayload setting */
>>>> -       ppos = pci_find_capability(parent, PCI_CAP_ID_EXP);
>>>> -       if (!ppos)
>>>> -               return 0;
>>>> -       pci_read_config_word(parent, ppos + PCI_EXP_DEVCTL, &pctl);
>>>> -       psz = (pctl &  PCI_EXP_DEVCTL_PAYLOAD) >> 5;
>>>> -
>>>> -       /* If parent payload > device max payload -> error
>>>> -        * If parent payload > device payload -> set speed
>>>> -        * If parent payload <= device payload -> do nothing
>>>> -        */
>>>
>>> And this one too, it would be useful to know the rationale of whoever
>>> came up with that code in the first place ... oh well, let's not dwelve
>>> on it now...
>>
>> I believe Jordan (who has been Cc'ed on these e-mails) was the
>> originator of this code.  Perhaps he can shed some light.
>>
>
> This was required for hotplugging a device whose parent MPS was set to
> something other than the minimum 128.  When a PCIE device was inserted
> into the hotplug bridge, the Linux pci subsystem would probe the
> device, but it would only have a DEVCTL MPS=128 by default (DEVCAP MPS
> = 512).  On this particular system, the parent bridge MPS was set to
> DEVCTL MPS=256 (DEVCAP MPS=2048).  So whenever a transaction to the
> new device was attempted, it would cause PCI errors as the transaction
> was too large.
>
> So what the old code did was this:
> pctl=Read Parent DevCtl MPS
> dcap=Read Device DevCap MPS
> dctl=Read Device DevCtl MPS
>
> If the parent MPS setting (pctl) was greater than the device
> capability (dcap) -> error
> else If the parent MPS setting (pctl) was greater than the device
> setting (dctl) -> Set device MPS to that of the parent
> else do nothing
>
>>>> +/**
>>>> + * pcie_set_mps - set PCI Express maximum payload size
>>>> + * @dev: PCI device to query
>>>> + * @rq: maximum payload size in bytes
>>>> + *    valid values are 128, 256, 512, 1024, 2048, 4096
>>>> + *
>>>> + * If possible sets maximum payload size
>>>> + */
>>>> +int pcie_set_mps(struct pci_dev *dev, int mps)
>>>> +{
>>>> +     int cap, err = -EINVAL;
>>>> +     u16 ctl, v;
>>>> +
>>>> +     if (mps < 128 || mps > 4096 || !is_power_of_2(mps))
>>>> +             goto out;
>>>> +
>>>> +     v = (ffs(mps) - 8) << 5;
>>>> +     if (v > dev->pcie_mpss)
>>>> +             goto out;
>>>> +
>>>> +     cap = pci_pcie_cap(dev);
>>>> +     if (!cap)
>>>> +             goto out;
>>>> +
>>>> +     err = pci_read_config_word(dev, cap + PCI_EXP_DEVCTL, &ctl);
>>>> +     if (err)
>>>> +             goto out;
>>>> +
>>>> +     if ((ctl & PCI_EXP_DEVCTL_PAYLOAD) != v) {
>>>> +             ctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
>>>> +             ctl |= v;
>>>> +             err = pci_write_config_word(dev, cap + PCI_EXP_DEVCTL, ctl);
>>>> +     }
>>>> +out:
>>>> +     return err;
>>>> +}
>>>> +
> This code wasn't working on my system.. it would generate an error
> when trying to set the device MPS.  the issue was that v = 128,256,...
> while pcie->mpss is 0,1,2,...  The conditional would always be true
> and the new MPS would not be set during hotplug.

I see the bug.  I'll have a new version coming out shortly, as I want
to verify it first on the faulty HW we have internally.

>
> --- one Wed Jul 13 20:25:53 2011
> +++ two Wed Jul 13 20:25:34 2011
> @@ -7,7 +7,7 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
>              goto out;
>
>      v = (ffs(mps) - 8) << 5;
> -     if (v > dev->pcie_mpss)
> +     if (v > (128 << dev->pcie_mpss))
>              goto out;
>
>      cap = pci_pcie_cap(dev);
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux