Re: [PATCH v2 10/10] nvmet: Optionally use PCI P2P memory

Logan Gunthorpe <logang@xxxxxxxxxxxx> · Thu, 1 Mar 2018 16:29:19 -0700

On 01/03/18 04:20 PM, Jason Gunthorpe wrote:
On Thu, Mar 01, 2018 at 11:00:51PM +0000, Stephen  Bates wrote:

No, locality matters. If you have a bunch of NICs and bunch of drives
and the allocator chooses to put all P2P memory on a single drive your
performance will suck horribly even if all the traffic is offloaded.

Performance will suck if you have speed differences between the PCI-E
devices. Eg a bunch of Gen 3 x8 NVMe cards paired with a Gen 4 x16 NIC
will not reach full performance unless the p2p buffers are properly
balanced between all cards.

This would be solved better by choosing the closest devices in the 
hierarchy in the p2pmem_find function (ie. preferring devices involved 
in the transaction). Also, based on the current code, it's a bit of a 
moot point seeing it requires all devices to be on the same switch. So 
unless you have a giant switch hidden somewhere you're not going to get 
too many NIC/NVME pairs.

In any case, we are looking at changing both those things so distance in 
the topology is preferred and the ability to span multiple switches is 
allowed. We're also discussing changing it to "user picks" which 
simplifies all of this.

Logan