On Wed, 2011-09-07 at 06:30 -0300, Benjamin Herrenschmidt wrote:
> Unfortunately, I didn't manage to get a good TLP capture of the problem
> packets in AER. But basically what happens is:
>
>  - Host bridge has a large MPS (for example 4096)
>  - Device has a smaller MPS (for example 128)
>  - Device has a large MRRS (for example 512)

Just double checked on the actual machine. The device has an MPS of 256
and the bridge can go up to 4096. We were letting the bridge go up to
that and observed the problem with an MRRS of 512 (apparently the
power-on default of that adapter).

So either we clamp the bridge to 256 and penalize everybody, or we clamp
the e1000's MRRS to 256 and things work.

Cheers,
Ben.

> What I observed is when receiving network packets larger than roughly
> 128 bytes (I didn't get a precise packet size threshold, I wasn't doing
> the tests myself), the device appears to get read -responses- larger
> than its MPS (up to its MRRS, ie, the size it specified in the read
> request), and shoots a UE upstream.
>
> This happens with e1000e's so I doubt it's a broken PCIe implementation
> in the device, and it makes sense all things considered.
>
> The host bridge having an MPS larger than 128, it is allowed to send a
> read response using a large TLP, which will be rejected by the device.
>
> The "safe" approach of course is to clamp all MPS to the minimum, but
> that leads to way too many situations where everybody gets down to 128
> bytes because -one- device in the system has 128 bytes, and that means
> that anything that has a hotplug slot must clamp everybody as well.
>
> Cheers,
> Ben.
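
For illustration, the failure and the two possible clamps discussed in
this message can be modelled in a few lines of Python. This is only a
rough sketch under simplifying assumptions: it treats the worst-case
read completion as min(completer MPS, requester MRRS) and ignores the
read completion boundary and any splitting the completer may choose to
do. The function names are hypothetical and do not correspond to kernel
or driver APIs.

```python
# Simplified model of the MPS/MRRS mismatch (hypothetical helper names).

def max_completion_payload(completer_mps, requester_mrrs):
    """Worst-case payload of a single read completion TLP.

    Assumption: the completer may return as much data as was requested
    in one TLP, limited only by its own MPS.
    """
    return min(completer_mps, requester_mrrs)


def device_accepts(device_mps, completer_mps, device_mrrs):
    """True if the worst-case completion fits within the device's MPS."""
    return max_completion_payload(completer_mps, device_mrrs) <= device_mps


def clamp_mrrs(device_mps, device_mrrs):
    """Clamp the device's MRRS so it never exceeds its own MPS."""
    return min(device_mps, device_mrrs)


if __name__ == "__main__":
    bridge_mps, dev_mps, dev_mrrs = 4096, 256, 512  # values from this mail

    # As observed: completions may exceed the device MPS.
    print(device_accepts(dev_mps, bridge_mps, dev_mrrs))

    # Option 1: clamp the bridge MPS to the device MPS (penalizes everybody).
    print(device_accepts(dev_mps, dev_mps, dev_mrrs))

    # Option 2: clamp only the device's MRRS and leave the bridge alone.
    print(device_accepts(dev_mps, bridge_mps, clamp_mrrs(dev_mps, dev_mrrs)))
```

In this model both clamps make the worst-case completion fit, but only
the MRRS clamp leaves the bridge's larger MPS available to other
devices.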