On Tue, Jun 29, 2021 at 10:00:20AM +0800, Huacai Chen wrote:
> On Tue, Jun 29, 2021 at 4:51 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > On Sun, Jun 27, 2021 at 06:25:04PM +0800, Huacai Chen wrote:
> > > On Sat, Jun 26, 2021 at 6:22 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > > On Fri, Jun 25, 2021 at 05:30:29PM +0800, Huacai Chen wrote:
> > > > > In new revision of LS7A, some PCIe ports support larger value than 256,
> > > > > but their maximum supported MRRS values are not detectable. Moreover,
> > > > > the current loongson_mrrs_quirk() cannot avoid devices increasing its
> > > > > MRRS after pci_enable_device(), and some devices (e.g. Realtek 8169)
> > > > > will actually set a big value in its driver. So the only possible way is
> > > > > configure MRRS of all devices in BIOS, and add a PCI device flag (i.e.,
> > > > > PCI_DEV_FLAGS_NO_INCREASE_MRRS) to stop the increasing MRRS operations.
> > > > >
> > > > > However, according to PCIe Spec, it is legal for an OS to program any
> > > > > value for MRRS, and it is also legal for an endpoint to generate a Read
> > > > > Request with any size up to its MRRS. As the hardware engineers says,
> > > > > the root cause here is LS7A doesn't break up large read requests (Yes,
> > > > > that is a problem in the LS7A design).
> > > >
> > > > "LS7A doesn't break up large read requests" claims to be a root cause,
> > > > but you haven't yet said what the actual *problem* is.
> > > >
> > > > Is the problem that an endpoint reports a malformed TLP because it
> > > > received a completion bigger than it can handle? Is it that the LS7A
> > > > root port reports some kind of error if it receives a Memory Read
> > > > request with a size that's "too big"? Maybe the LS7A doesn't know
> > > > what to do when it receives a Memory Read request with MRRS > MPS?
> > > > What exactly happens when the problem occurs?
> > >
> > > The hardware engineer said that the problem is: LS7A PCIe port reports
> > > CA (Completer Abort) if it receives a Memory Read
> > > request with a size that's "too big".
> >
> > What is "too big"?
> >
> "Too big" means bigger than the port can handle, PCIe SPEC allows any
> MRRS value, but, but, LS7A surely violates the protocol here.

Right, I just wanted to know what the number is. That is, what values
we can write to MRRS safely. But ISTR you saying that it's not
actually fixed, and that's why you wanted to rely on what firmware put
there.

This is important to know for the question about hot-added devices
below, because a hot-added device should power up with MRRS=512 bytes,
and if that's too big for LS7A, then we have a problem and the quirk
needs to be more extensive.

> > I'm trying to figure out how to make this work with hot-added devices.
> > Per spec (PCIe r5.0, sec 7.5.3.4), devices should power up with
> > MRRS=010b (512 bytes).
> >
> > If Linux does not touch MRRS at all in hierarchices under LS7A, will a
> > hot-added device with MRRS=010b work? Or does Linux need to actively
> > write MRRS to 000b (128 bytes) or 001b (256 bytes)?
> >
> > Bjorn
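
For reference, the MRRS encodings mentioned above (000b = 128 bytes,
001b = 256 bytes, 010b = 512 bytes) can be sketched in plain C as
below. This is only an illustration, not the quirk under discussion:
mrrs_bytes() and mrrs_clamp() are hypothetical helper names, and the
256-byte limit in the usage note is an assumption, since the thread has
not established what value LS7A can actually handle. The field position
(bits 14:12 of the Device Control register) matches the mask Linux
calls PCI_EXP_DEVCTL_READRQ.

```c
#include <stdint.h>

/* Max_Read_Request_Size occupies bits 14:12 of the PCIe Device Control
 * register (Linux calls this mask PCI_EXP_DEVCTL_READRQ). */
#define DEVCTL_READRQ_MASK  0x7000u
#define DEVCTL_READRQ_SHIFT 12

/* Decode the MRRS field into bytes: 000b -> 128, 001b -> 256, 010b -> 512. */
static inline unsigned int mrrs_bytes(uint16_t devctl)
{
	return 128u << ((devctl & DEVCTL_READRQ_MASK) >> DEVCTL_READRQ_SHIFT);
}

/* Hypothetical helper: return a Device Control value whose MRRS does not
 * exceed limit_bytes. The value is returned unchanged if it is already
 * within the limit; the result never goes below the 128-byte minimum. */
static inline uint16_t mrrs_clamp(uint16_t devctl, unsigned int limit_bytes)
{
	unsigned int field = 0;

	if (mrrs_bytes(devctl) <= limit_bytes)
		return devctl;

	/* Pick the largest encoding whose size still fits under the limit. */
	while ((128u << (field + 1)) <= limit_bytes)
		field++;

	return (uint16_t)((devctl & ~DEVCTL_READRQ_MASK) |
			  (field << DEVCTL_READRQ_SHIFT));
}
```

With these helpers, a hot-added device that powers up with MRRS=010b
and, say, an assumed 256-byte limit would go from 512 to 256 bytes via
mrrs_clamp(devctl, 256), while a device already at 000b or 001b would
be left alone.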