Re: One Question About PCIe BUS Config Type with pcie_bus_safe or pcie_bus_perf On NVMe Device

Bjorn Helgaas <helgaas@xxxxxxxxxx> · Tue, 23 Jan 2018 08:38:39 -0600

On Tue, Jan 23, 2018 at 01:25:56PM +0000, Ron Yuan wrote:

I'm reproducing Sinan's picture here so we can see what you're talking
about:

> >>>>                root (MPS=256)
> >>>>                  |
> >>>>          ------------------
> >>>>         /                  \
> >>>>    bridge0 (MPS=256)      bridge1 (MPS=128)
> >>>>       /                       \
> >>>>     EP0 (MPS=256)            EP1 (MPS=128)
> >>>>

> > PERFORMANCE mode reduces MRRS not because of a starvation issue,
> > but because reducing EP1's MRRS allows EP0 to use a larger MPS.

> Looks like this case is talking about EP1 requests data directly
> from EP0, using MRRS to control the return data payload, while still
> keeping the traffic from EP0 to RC in 256B.

No, this is not talking about EP1 requesting data from EP0.  That
would be peer-to-peer DMA, and PERFORMANCE mode explicitly assumes
there is no peer-to-peer DMA.  It reduces EP1's MRRS to allow EP0 to
use a larger MPS.

We must guarantee that no device receives a TLP larger than its MPS
setting.  The simple and obvious configuration is to set MPS=128 for
everything in Sinan's picture.  That works correctly but limits EP0's
performance.
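
As a rough illustration of that simple configuration, here is a small
sketch that takes the MPS values from Sinan's picture and sets every
device to the smallest one.  The dictionary and the uniform_mps() helper
are hypothetical names made up for this example; this is not how the
kernel implements it.

```python
# MPS values taken from Sinan's picture (bytes). Names are illustrative only.
supported_mps = {
    "root": 256,
    "bridge0": 256,
    "EP0": 256,
    "bridge1": 128,
    "EP1": 128,
}

def uniform_mps(caps):
    """Give every device the smallest MPS found anywhere in the hierarchy."""
    smallest = min(caps.values())
    return {name: smallest for name in caps}

simple_config = uniform_mps(supported_mps)
for name, mps in simple_config.items():
    print(f"{name}: MPS={mps}")
```

Every device ends up at 128, including EP0, which is why this is
correct but costs EP0 performance.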

What PERFORMANCE mode does is set MPS as shown in the picture and set
EP1's MRRS=128.  We're assuming no peer-to-peer DMA, but of course EP1
may still need to do DMA reads from system memory, and setting its
MRRS=128 means each of those read requests asks for 128 bytes or less,
so no completion sent back to EP1 carries more than 128 bytes.
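
A minimal sketch of that idea, assuming each endpoint's MRRS is simply
clamped to its own MPS.  The perf_mode_mrrs() helper and the data
layout are hypothetical and only model the picture; they are not the
kernel code.

```python
# MPS as shown in Sinan's picture (bytes). Names are illustrative only.
mps = {"root": 256, "bridge0": 256, "EP0": 256, "bridge1": 128, "EP1": 128}
endpoints = ["EP0", "EP1"]

def perf_mode_mrrs(mps_by_device, endpoint_names):
    """Assumption: clamp each endpoint's MRRS to that endpoint's own MPS."""
    return {ep: mps_by_device[ep] for ep in endpoint_names}

mrrs = perf_mode_mrrs(mps, endpoints)
for ep in endpoints:
    print(f"{ep}: MPS={mps[ep]} MRRS={mrrs[ep]}")
```

EP0 keeps MPS=256 and can read in 256-byte requests, while EP1 never
asks for more than 128 bytes in a single read.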

If we set EP1's MRRS=256, it could do a 256-byte DMA read from system
memory, the root port could send a 256-byte completion, and bridge1
would treat that as a malformed TLP because its MPS=128.
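
The failure case can be modelled with a short sketch.  It assumes the
worst case where the root port answers a read with a single completion
as large as its own MPS allows, and it walks the path from the root
down to the requester.  All names here (MPS, PATH_FROM_ROOT,
first_rejecting_device) are hypothetical and exist only for this
example.

```python
# MPS as shown in Sinan's picture (bytes). Names are illustrative only.
MPS = {"root": 256, "bridge0": 256, "EP0": 256, "bridge1": 128, "EP1": 128}

# Devices a completion passes through on its way from the root port.
PATH_FROM_ROOT = {
    "EP0": ["bridge0", "EP0"],
    "EP1": ["bridge1", "EP1"],
}

def largest_completion(mrrs, sender_mps):
    """Worst case: one completion limited by the request size and sender MPS."""
    return min(mrrs, sender_mps)

def first_rejecting_device(endpoint, mrrs):
    """Return the first device that would see a TLP larger than its MPS."""
    payload = largest_completion(mrrs, MPS["root"])
    for dev in PATH_FROM_ROOT[endpoint]:
        if payload > MPS[dev]:
            return dev
    return None

print(first_rejecting_device("EP1", 256))  # bridge1
print(first_rejecting_device("EP1", 128))  # None
print(first_rejecting_device("EP0", 256))  # None
```

With EP1's MRRS=256 the oversized completion is rejected at bridge1;
with MRRS=128 nothing on either path sees a TLP above its MPS.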