> On Wed, 2011-09-07 at 06:30 -0300, Benjamin Herrenschmidt wrote:
>
>> Unfortunately, I didn't manage to get a good TLP capture of the problem
>> packets in AER. But basically what happens is:
>>
>> - Host bridge has a large MPS (For example 4096)

Out of curiosity: what kind of board is this? The only thing we ever
found for a reasonable price that supports more than 128 bytes here is
the Intel X58/Tylersburg (and an older NVidia one). This is not really
my business anymore, but I guess I could make some people happy if I
tell them what to look for.

>> - Device has a smaller MPS (for example 128)
>> - Device has a large MRRS (for example 512)
>
> Just double checked on the actual machine. The device has a MPS of 256
> and the bridge can go up to 4096. We were letting it up and observed the
> problem with an MRRS of 512 (apparently the power-on default of that
> adapter).
>
> So either we clamp the bridge to 256 and penalize everybody, or we clamp
> the e1000's MRRS to 256 and things work.

We need to change the MPS of the bridge anyway, as it could otherwise
send e.g. a write request of 4k. That is completely orthogonal to the
MRRS and would cause the same breakage. As far as I understand, the
patches make exactly this change for exactly this reason: to avoid
too-large packets hitting the device. The MRRS only covers data that
was originally requested by the target device, but that is by far not
the only way such packets may arise. Maybe it is the most likely way,
but nothing more.

It would still be interesting to get a list of devices that break when
the MRRS is changed, and to kick the vendors hard to fix that mess.

Eike
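
To make the argument above concrete, here is a toy Python model of the
two traffic paths discussed in this thread. It is only a sketch, not
kernel code: the function names are made up for illustration, and it
assumes the simplified rule that a sender may emit a payload up to its
own MPS and that a receiver rejects anything larger than its own MPS.
The numbers (bridge 4096, device 256, MRRS 512, 4k write) are the ones
quoted in the thread.

```python
def completion_payload(completer_mps: int, requester_mrrs: int) -> int:
    """Upper bound on a single completion payload for a device read.

    The device asks for at most MRRS bytes; the completer may answer
    with TLPs as large as its own MPS allows.
    """
    return min(completer_mps, requester_mrrs)


def write_payload(sender_mps: int, write_size: int) -> int:
    """Upper bound on a single write payload sent toward the device.

    The MRRS plays no role here: the device never requested this data.
    """
    return min(sender_mps, write_size)


def device_accepts(payload: int, device_mps: int) -> bool:
    """A receiver treats a payload larger than its MPS as an error."""
    return payload <= device_mps


DEVICE_MPS = 256

# Reads by the device: bridge MPS 4096, MRRS at the power-on default 512.
print(device_accepts(completion_payload(4096, 512), DEVICE_MPS))
# Same, but with the device's MRRS clamped to 256.
print(device_accepts(completion_payload(4096, 256), DEVICE_MPS))
# A 4k write from the host with the bridge MPS still at 4096.
print(device_accepts(write_payload(4096, 4096), DEVICE_MPS))
# The same write with the bridge MPS clamped to 256.
print(device_accepts(write_payload(256, 4096), DEVICE_MPS))
```

In this model, clamping only the MRRS fixes the read case but leaves
the 4k write case broken, while clamping the bridge MPS to 256 covers
both. Real hardware may split completions into smaller pieces than
this upper bound suggests, so treat it as an illustration of the
reasoning only.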