Hi, Bjorn,

On Tue, Jun 29, 2021 at 10:12 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Tue, Jun 29, 2021 at 10:00:20AM +0800, Huacai Chen wrote:
> > On Tue, Jun 29, 2021 at 4:51 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > On Sun, Jun 27, 2021 at 06:25:04PM +0800, Huacai Chen wrote:
> > > > On Sat, Jun 26, 2021 at 6:22 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > > > On Fri, Jun 25, 2021 at 05:30:29PM +0800, Huacai Chen wrote:
> > > > > > In the new revision of LS7A, some PCIe ports support MRRS values larger
> > > > > > than 256, but their maximum supported MRRS values are not detectable.
> > > > > > Moreover, the current loongson_mrrs_quirk() cannot prevent devices from
> > > > > > increasing their MRRS after pci_enable_device(), and some devices (e.g.
> > > > > > Realtek 8169) will actually set a big value in their drivers. So the
> > > > > > only possible way is to configure the MRRS of all devices in BIOS, and
> > > > > > add a PCI device flag (i.e., PCI_DEV_FLAGS_NO_INCREASE_MRRS) to stop
> > > > > > the MRRS-increasing operations.
> > > > > >
> > > > > > However, according to the PCIe spec, it is legal for an OS to program
> > > > > > any value for MRRS, and it is also legal for an endpoint to generate a
> > > > > > Read Request with any size up to its MRRS. As the hardware engineers
> > > > > > say, the root cause here is that LS7A doesn't break up large read
> > > > > > requests (yes, that is a problem in the LS7A design).
> > > > >
> > > > > "LS7A doesn't break up large read requests" claims to be a root cause,
> > > > > but you haven't yet said what the actual *problem* is.
> > > > >
> > > > > Is the problem that an endpoint reports a malformed TLP because it
> > > > > received a completion bigger than it can handle?  Is it that the LS7A
> > > > > root port reports some kind of error if it receives a Memory Read
> > > > > request with a size that's "too big"?  Maybe the LS7A doesn't know
> > > > > what to do when it receives a Memory Read request with MRRS > MPS?
> > > > > What exactly happens when the problem occurs?
> > > >
> > > > The hardware engineer said that the problem is: the LS7A PCIe port
> > > > reports CA (Completer Abort) if it receives a Memory Read request
> > > > with a size that's "too big".
> > >
> > > What is "too big"?
> > >
> > "Too big" means bigger than the port can handle. The PCIe spec allows
> > any MRRS value, but LS7A surely violates the protocol here.
>
> Right, I just wanted to know what the number is.  That is, what values
> we can write to MRRS safely.
>
> But ISTR you saying that it's not actually fixed, and that's why you
> wanted to rely on what firmware put there.
Yes, the limit is not fixed (256 bytes on some ports and 4096 bytes on
others), so we have to rely on what the firmware programs.

Huacai
>
> This is important to know for the question about hot-added devices
> below, because a hot-added device should power up with MRRS=512 bytes,
> and if that's too big for LS7A, then we have a problem and the quirk
> needs to be more extensive.
>
> > > I'm trying to figure out how to make this work with hot-added devices.
> > > Per spec (PCIe r5.0, sec 7.5.3.4), devices should power up with
> > > MRRS=010b (512 bytes).
> > >
> > > If Linux does not touch MRRS at all in hierarchies under LS7A, will a
> > > hot-added device with MRRS=010b work?  Or does Linux need to actively
> > > write MRRS to 000b (128 bytes) or 001b (256 bytes)?
Hmm, hot-plug is a problem. Maybe the only option is to disable hot-plug
in the board design...

Huacai
> > >
> > > Bjorn
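
For reference, here is a minimal sketch of the MRRS encoding discussed above (000b = 128 bytes, 001b = 256 bytes, 010b = 512 bytes, i.e. 128 << field, held in bits 14:12 of the Device Control register), together with a "never increase" policy of the kind PCI_DEV_FLAGS_NO_INCREASE_MRRS is meant to enforce. This is plain userspace C and not the actual patch: the macro names and the helpers mrrs_bytes_from_devctl(), mrrs_field_from_bytes() and devctl_set_mrrs_no_increase() are hypothetical names chosen for illustration only.

```c
#include <stdint.h>

/* Device Control register: Max_Read_Request_Size occupies bits 14:12. */
#define DEVCTL_READRQ_MASK  0x7000u
#define DEVCTL_READRQ_SHIFT 12

/* Decode MRRS in bytes: 000b -> 128, 001b -> 256, 010b -> 512, ... */
static uint32_t mrrs_bytes_from_devctl(uint16_t devctl)
{
    return 128u << ((devctl & DEVCTL_READRQ_MASK) >> DEVCTL_READRQ_SHIFT);
}

/* Encode a power-of-two size in 128..4096 bytes; return -1 if invalid. */
static int mrrs_field_from_bytes(uint32_t bytes)
{
    int field = 0;

    if (bytes < 128 || bytes > 4096 || (bytes & (bytes - 1)))
        return -1;
    while ((128u << field) < bytes)
        field++;
    return field;
}

/*
 * Hypothetical "no increase" policy: treat the value already in devctl
 * (assumed to have been programmed by firmware) as the upper bound.
 * Requests to raise MRRS, or invalid sizes, leave devctl unchanged;
 * requests to lower it are applied.
 */
static uint16_t devctl_set_mrrs_no_increase(uint16_t devctl,
                                            uint32_t requested_bytes)
{
    int field = mrrs_field_from_bytes(requested_bytes);

    if (field < 0 || requested_bytes > mrrs_bytes_from_devctl(devctl))
        return devctl;
    return (uint16_t)((devctl & ~DEVCTL_READRQ_MASK) |
                      ((uint32_t)field << DEVCTL_READRQ_SHIFT));
}
```

With such a policy, a driver asking for 4096 bytes on a port that firmware set to 256 bytes keeps 256. It does nothing for a hot-added device, which powers up at 010b (512 bytes) without firmware ever having touched it, and that is exactly the open question above.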