On 2015-04-23 10:25, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
>
>> They are via MMIO space. The big differences here are that via CAPI the
>> memory can be fully cachable and thus have the same characteristics as
>> normal memory from the processor point of view, and the device shares
>> the MMU with the host. Practically what that means is that the device
>> memory *is* just some normal system memory with a larger distance. The
>> NUMA model is an excellent representation of it.
>
> I sure wish you would be working on using these features to increase
> performance and the speed of communication to devices. Device memory is
> inherently different from main memory (otherwise the device would be
> using main memory) and thus not really NUMA. NUMA at least assumes that
> the basic characteristics of memory are the same while just the access
> speeds vary. GPU memory has very different performance characteristics
> and the various assumptions on memory that the kernel makes for the
> regular processors may not hold anymore.

You are restricting your definition of NUMA to what the industry constrains it to mean. Based solely on the academic definition of a NUMA system, this _is_ NUMA. In fact, based on the academic definition, all modern systems could be considered to be NUMA systems, with each level of cache representing a memory-only node.
Looking at this whole conversation, all I see is two different views on how to present to userspace the asymmetric multiprocessing arrangements that have become commonplace in today's systems: your model favors performance, while the CAPI approach favors simplicity for userspace.
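For what it's worth, here is a minimal sketch (my own, not anything from this thread or patch set) of what Ben's "normal memory at a larger distance" would look like from userspace, assuming the coherent device memory is exposed as an ordinary NUMA node on a Linux box with libnuma. The node number 1 and the 1 MiB size are made up for illustration:

    /* Sketch: coherent device memory presented as just another NUMA
     * node, visible through the standard libnuma calls but with a
     * larger reported distance. */
    #include <stdio.h>
    #include <numa.h>

    int main(void)
    {
            if (numa_available() < 0) {
                    fprintf(stderr, "NUMA not supported on this system\n");
                    return 1;
            }

            int nodes = numa_max_node() + 1;

            /* Print the distance from node 0 to every node; a CAPI/GPU
             * memory node would simply show up with a bigger number. */
            for (int n = 0; n < nodes; n++)
                    printf("distance(0, %d) = %d\n", n, numa_distance(0, n));

            /* Hypothetical: assume the device memory is node 1.
             * Allocation uses the same call as for any other remote node. */
            void *buf = numa_alloc_onnode(1 << 20, 1);
            if (buf) {
                    /* ... use buf like ordinary, cachable memory ... */
                    numa_free(buf, 1 << 20);
            }
            return 0;
    }

Compile with -lnuma; on a machine without such a node the loop just prints the usual local/remote distances, which is exactly the point of the simplicity argument.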