RE: One Question About PCIe BUS Config Type with pcie_bus_safe or pcie_bus_perf On NVMe Device

Hi Bjorn,

Capping the MRRS at the MPS value in order to guarantee interoperability in pcie_bus_perf mode does not make sense. A device can issue a MemRd request sized according to its MRRS setting (which can be higher than its MPS), but the completer has to respect the MPS and split its completions accordingly. For example, a system can configure MPS=128B and MRRS=4K; an endpoint can then make a 4K MemRd request, but the completer has to return the data as 128B completion TLPs, respecting the MPS setting. MRRS does not force a device to use a higher payload size than its configured MPS.
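
To make the arithmetic concrete (a standalone illustration, not from the spec text; the values are the ones above):

/*
 * Illustrative arithmetic only: a single MemRd request is split into
 * completions by the completer, whose payloads never exceed MPS.
 */
#include <stdio.h>

int main(void)
{
	unsigned int mrrs = 4096;   /* Max Read Request Size, bytes */
	unsigned int mps  = 128;    /* Max Payload Size, bytes      */

	/* Number of completion TLPs the completer returns for one read. */
	unsigned int completions = (mrrs + mps - 1) / mps;

	printf("One %uB MemRd -> %u completion TLPs of <= %uB each\n",
	       mrrs, completions, mps);
	return 0;
}

So the 4K request comes back as 32 separate 128B completions, and no link ever carries a TLP larger than MPS.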

Another factor that needs to be considered for storage devices is support for T10 Protection Information (PI, also known as DIF). For every 512B or 4KB of data, an 8B PI field is computed and inserted or verified, which requires the 512B of data to arrive in sequence. If the MRRS is < 512B and the endpoint has to submit multiple outstanding read requests in order to achieve higher performance, completions may arrive at the storage device out of order. This is a challenge for storage endpoints that process the T10 PI inline with the transfer: they now have to buffer each 512B sector and process it only once they have received all the TLPs for that sector.
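
A quick sketch of the sector math (my own illustration; the 256B MRRS is an assumed example value):

/*
 * With MRRS below the 512B PI sector size, each sector needs several
 * read requests, and their completions can interleave with those of
 * other outstanding requests.
 */
#include <stdio.h>

int main(void)
{
	unsigned int sector = 512;  /* T10 PI covers each 512B (or 4KB) of data */
	unsigned int mrrs   = 256;  /* assumed MRRS smaller than the sector     */

	unsigned int reads_per_sector = (sector + mrrs - 1) / mrrs;

	printf("MRRS=%uB: %u read requests per %uB sector; inline PI\n"
	       "processing must buffer until the whole sector has arrived\n",
	       mrrs, reads_per_sector, sector);
	return 0;
}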

So it is better to decouple MRRS and MPS in pcie_bus_perf mode. As stated earlier in the thread, provide an option to configure MRRS separately in pcie_bus_perf mode.
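
For what it's worth, the kernel already has helpers that program MRRS independently of MPS. A minimal sketch of how a policy could use them (the function name and the hard-coded 4K value are hypothetical; pcie_get_mps() and pcie_set_readrq() are real helpers):

#include <linux/pci.h>

/* Hypothetical policy hook: raise MRRS without touching MPS. */
static void example_decouple_mrrs(struct pci_dev *dev)
{
	int mps  = pcie_get_mps(dev);	/* current Max Payload Size     */
	int mrrs = 4096;		/* desired MRRS, e.g. taken from
					 * a boot or module parameter   */

	/* MRRS may legitimately exceed MPS: the completer still splits
	 * completions per its MPS, so interoperability is preserved. */
	if (pcie_set_readrq(dev, mrrs))
		pci_warn(dev, "failed to set MRRS=%d (MPS stays %d)\n",
			 mrrs, mps);
}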

Regards,
Radj.

-----Original Message-----
From: Bjorn Helgaas [mailto:helgaas@xxxxxxxxxx] 
Sent: Tuesday, January 23, 2018 6:39 AM
To: Ron Yuan <ron.yuan@xxxxxxxxxxxx>
Cc: Sinan Kaya <okaya@xxxxxxxxxxxxxx>; Bjorn Helgaas <bhelgaas@xxxxxxxxxx>; Bo Chen <bo.chen@xxxxxxxxxxxx>; William Huang <william.huang@xxxxxxxxxxxx>; Fengming Wu <fengming.wu@xxxxxxxxxxxx>; Jason Jiang <jason.jiang@xxxxxxxxxxxxx>; Radjendirane Codandaramane <radjendirane.codanda@xxxxxxxxxxxxx>; Ramyakanth Edupuganti <Ramyakanth.Edupuganti@xxxxxxxxxxxxx>; William Cheng <william.cheng@xxxxxxxxxxxxx>; Kim Helper (khelper) <khelper@xxxxxxxxxx>; Linux PCI <linux-pci@xxxxxxxxxxxxxxx>
Subject: Re: One Question About PCIe BUS Config Type with pcie_bus_safe or pcie_bus_perf On NVMe Device

On Tue, Jan 23, 2018 at 01:25:56PM +0000, Ron Yuan wrote:

I'm reproducing Sinan's picture here so we can see what you're talking
about:

> >>>>                root (MPS=256)
> >>>>                  |
> >>>>          ------------------
> >>>>         /                  \
> >>>>    bridge0 (MPS=256)      bridge1 (MPS=128)
> >>>>       /                       \
> >>>>     EP0 (MPS=256)            EP1 (MPS=128)
> >>>>

> > PERFORMANCE mode reduces MRRS not because of a starvation issue, but 
> > because reducing EP1's MRRS allows EP0 to use a larger MPS.

> Looks like this case is talking about EP1 requesting data directly
> from EP0, using MRRS to control the return data payload, while still
> keeping the traffic from EP0 to RC in 256B.

No, this is not talking about EP1 requesting data from EP0.  That would be peer-to-peer DMA, and PERFORMANCE mode explicitly assumes there is no peer-to-peer DMA.  It reduces MRRS to allow EP0 to use a larger MPS.

We must guarantee that no device receives a TLP larger than its MPS setting.  The simple and obvious configuration is to set MPS=128 for everything in Sinan's picture.  That works correctly but limits EP0's performance.

What PERFORMANCE mode does is set MPS as shown in the picture and set EP1's MRRS=128.  We're assuming no peer-to-peer DMA, but of course EP1 may still need to do DMA reads from system memory, and setting its
MRRS=128 means those reads will be 128 bytes or less.

If we set EP1's MRRS=256, it could do a 256-byte DMA read from system memory, the root port could send a 256-byte completion, and bridge1 would treat that as a malformed TLP because its MPS=128.
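
A minimal sketch of that invariant (my own restatement, not the actual Linux probe code): clamp each device's MRRS to the smallest MPS along its path to the root, so no completion can exceed any intermediate link's MPS.

/* PERFORMANCE-mode rule, restated: a completion for our read is sized
 * by the completer's MPS (the root port's 256B here), so it can be as
 * large as the read itself; clamping the read to the narrowest MPS on
 * the path (bridge1's 128B) keeps every TLP legal. */
static int safe_mrrs(int requested_mrrs, int path_min_mps)
{
	return requested_mrrs < path_min_mps ? requested_mrrs
					     : path_min_mps;
}

For EP1, safe_mrrs(256, 128) == 128, which is exactly the MRRS=128 setting above.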



