Re: [PATCH V3 3/4] PCI: Improve the MRRS quirk for LS7A

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, Bjorn,

On Sat, Jun 26, 2021 at 11:48 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Fri, Jun 25, 2021 at 05:22:04PM -0500, Bjorn Helgaas wrote:
> > On Fri, Jun 25, 2021 at 05:30:29PM +0800, Huacai Chen wrote:
> > > In new revision of LS7A, some PCIe ports support larger value than 256,
> > > but their maximum supported MRRS values are not detectable. Moreover,
> > > the current loongson_mrrs_quirk() cannot avoid devices increasing its
> > > MRRS after pci_enable_device(), and some devices (e.g. Realtek 8169)
> > > will actually set a big value in its driver. So the only possible way is
> > > configure MRRS of all devices in BIOS, and add a PCI device flag (i.e.,
> > > PCI_DEV_FLAGS_NO_INCREASE_MRRS) to stop the increasing MRRS operations.
> > >
> > > However, according to PCIe Spec, it is legal for an OS to program any
> > > value for MRRS, and it is also legal for an endpoint to generate a Read
> > > Request with any size up to its MRRS. As the hardware engineers says,
> > > the root cause here is LS7A doesn't break up large read requests (Yes,
> > > that is a problem in the LS7A design).
> >
> > "LS7A doesn't break up large read requests" claims to be a root cause,
> > but you haven't yet said what the actual *problem* is.
> >
> > Is the problem that an endpoint reports a malformed TLP because it
> > received a completion bigger than it can handle?  Is it that the LS7A
> > root port reports some kind of error if it receives a Memory Read
> > request with a size that's "too big"?  Maybe the LS7A doesn't know
> > what to do when it receives a Memory Read request with MRRS > MPS?
> > What exactly happens when the problem occurs?
> >
> > MRRS applies only to the read request.  It is not directly related to
> > the size of the completions that carry the data back to the device
> > (except that obviously you shouldn't get a completion larger than the
> > read you requested).
> >
> > The setting that directly controls the size of completions is MPS
> > (Max_Payload_Size).  One reason to break up read requests is because
> > the endpoint's buffers can't accommodate big TLPs.  One way to deal
> > with that is to set MPS in the hierarchy to a smaller value.  Then the
> > root port must ensure that no TLP exceeds the MPS size, regardless of
> > what the MRRS in the read request was.
> >
> > For example, if the endpoint's MRRS=4096 and the hierarchy's MPS=128,
> > it's up to the root port to break up completions into 128-byte chunks.
> >
> > It's also possible to set the endpoint's MRRS=128, which means reads
> > to main memory will never receive completions larger than 128 bytes.
> > But it does NOT guarantee that a peer-to-peer DMA from another device
> > will be limited to 128 bytes.  The other device is allowed to generate
> > Memory Write TLPs with payloads up to its MPS size, and MRRS is not
> > involved at all.
> >
> > It's not clear yet whether the LS7A problem is with MRRS, with MPS, or
> > with some combination.  It's important to understand exactly what is
> > broken here so the quirk doesn't get in the way of future changes to
> > the generic MRRS and MPS configuration.
> >
> > Here's a good overview:
> >
> >   https://www.xilinx.com/support/documentation/white_papers/wp350.pdf
> >
> > > Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx>
> > > ---
> > >  drivers/pci/pci.c    | 5 +++++
> > >  drivers/pci/quirks.c | 8 +++++++-
> > >  include/linux/pci.h  | 2 ++
> > >  3 files changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index b717680377a9..6f0d2f5b6f30 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -5802,6 +5802,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
> > >
> > >     v = (ffs(rq) - 8) << 12;
> > >
> > > +   if (dev->dev_flags & PCI_DEV_FLAGS_NO_INCREASE_MRRS) {
> > > +           if (rq > pcie_get_readrq(dev))
> > > +                   return -EINVAL;
> > > +   }
> > > +
> > >     ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
> > >                                               PCI_EXP_DEVCTL_READRQ, v);
> > >
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index dee4798a49fc..8284480dc7e4 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -263,7 +263,13 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
> > >              * anything larger than this. So force this limit on
> > >              * any devices attached under these ports.
> > >              */
> > > -           if (pci_match_id(bridge_devids, bridge)) {
> > > +           if (bridge && pci_match_id(bridge_devids, bridge)) {
> > > +                   dev->dev_flags |= PCI_DEV_FLAGS_NO_INCREASE_MRRS;
> > > +
> > > +                   if (pcie_bus_config == PCIE_BUS_DEFAULT ||
> > > +                       pcie_bus_config == PCIE_BUS_TUNE_OFF)
> > > +                           break;
>
> Another approach might be to make a quirk that prevents Linux from
> touching MPS and MRRS at all under any circumstances.
>
> You'd have to do this without reference to pcie_bus_config so future
> MPS/MRRS algorithm changes wouldn't be affected.  And the quirk bit
> would have to be in struct pci_host_bridge, similar to no_ext_tags.
I will change the quirk bit to be a bus flag, but still allow to decrease MRRS.

Huacai
>
> > >                     if (pcie_get_readrq(dev) > 256) {
> > >                             pci_info(dev, "limiting MRRS to 256\n");
> > >                             pcie_set_readrq(dev, 256);
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index 24306504226a..5e0ec3e4318b 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -227,6 +227,8 @@ enum pci_dev_flags {
> > >     PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
> > >     /* Don't use Relaxed Ordering for TLPs directed at this device */
> > >     PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> > > +   /* Don't increase BIOS's MRRS configuration */
> > > +   PCI_DEV_FLAGS_NO_INCREASE_MRRS = (__force pci_dev_flags_t) (1 << 12),
> > >  };
> > >
> > >  enum pci_irq_reroute_variant {
> > > --
> > > 2.27.0
> > >



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux