On 05/02/2017 12:23 AM, Balbir Singh wrote:
On Mon, 2017-05-01 at 22:47 -0700, John Hubbard wrote:
On 05/01/2017 06:29 PM, Balbir Singh wrote:
On Mon, 2017-05-01 at 13:41 -0700, John Hubbard wrote:
On 04/19/2017 12:52 AM, Balbir Singh wrote:
[...]
1. Enable hotplug of CDM nodes
2. Isolation of CDM memory
3. Migration to/from CDM memory
4. Performance enhancements for migration
So, there is a little more required than the above, which is why I made that short list. In particular, I'm concerned about the various system calls that user space can make to control NUMA memory; the device drivers will need notification (probably via mmu_notifiers, I guess), and once they get that notification, in many cases they'll need some way to do reverse-mapping (rmap) lookups.
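To make the notification path concrete, here is a rough kernel-side sketch (not a standalone program) of what a CDM driver might register. The callback signature follows the mmu_notifier API of this era; names like cdm_dev_flush() and cdm_mirror_mm() are hypothetical placeholders, not anything in the posted patches:

```c
#include <linux/mmu_notifier.h>

/* Hypothetical driver hook: flush/invalidate device-side mappings. */
void cdm_dev_flush(struct mmu_notifier *mn,
		   unsigned long start, unsigned long end);

static void cdm_invalidate_range_start(struct mmu_notifier *mn,
				       struct mm_struct *mm,
				       unsigned long start,
				       unsigned long end)
{
	/*
	 * The CPU page tables for [start, end) are about to change;
	 * tear down any device mappings that mirror that range.
	 */
	cdm_dev_flush(mn, start, end);
}

static const struct mmu_notifier_ops cdm_mn_ops = {
	.invalidate_range_start = cdm_invalidate_range_start,
};

/*
 * Called from the driver (e.g. at device-file open) with a
 * driver-embedded notifier, to start mirroring this mm.
 */
int cdm_mirror_mm(struct mmu_notifier *mn, struct mm_struct *mm)
{
	mn->ops = &cdm_mn_ops;
	return mmu_notifier_register(mn, mm);
}
```

This covers the "driver hears about CPU-side unmaps" direction; the rmap question below is the other direction (finding all mappings of a given page).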
Are you suggesting that the system calls user space makes should be audited to check whether they can be used with a CDM device? I would think a whole lot of this should be transparent to user space, unless it opts in to using CDM and explicitly wants to allocate and free memory -- the whole isolation premise. With regard to device drivers, are you suggesting that the device driver needs to know the state of each page -- free or in use? And reverse mapping for migration?
Interesting question. No, I was not going in that direction (auditing the various system calls) at all, actually. Rather, I was expecting this system to interact as normally as possible with all of the system calls, and that is what led me to expect that some combination of "device driver + enhanced NUMA subsystem" would need to do rmap lookups.
Going through and special-casing CDM for various system calls would probably not be well-received,
because it would be an indication of force-fitting this into the NUMA model before it's ready, right?
HMM provides all of that support, so it needs to happen here, too.
The RFC here is for (2) above. (3) is handled by HMM, and (4) is being discussed in the community. I think the larger goals are the same as HMM's, except that we don't need unaddressable memory, since the memory is cache coherent.
So, I'd suggest putting together something more complete, so that it can be fairly
compared against the HMM-for-hardware-coherent-nodes approach.
Since I intend to reuse bits of HMM, I am not sure if I want to repost those patches as a part of my RFC. I hope my answers make sense; the goal is to reuse as much of what is available. From a user perspective:
1. We see no new interface being added in either case, though the programming model would differ.
2. We expect the programming model to be abstracted behind a user space framework, potentially like CUDA or CXL.
It's hard to keep track of what the plan is, so explaining exactly what you're doing helps.
Fair enough, I hope I answered the questions?
Yes, thanks.
Jerome posted HMM-CDM at https://lwn.net/Articles/713035/.
The patches do a great deal to enable CDM with HMM, but we
still believe that HMM with CDM is not a natural way to
represent coherent device memory and the mm will need
to be audited and enhanced for it to even work.
That is also true for the CDM approach. Specifically, in order for this to be of any
use to device drivers, we'll need the following:
Since Reza answered these questions, I'll skip them in this email
Yes, but he skipped over the rmap question, which I think is an important one.
If it is for migration, then we are going to rely on changes from HMM-CDM.
How does HMM deal with the rmap case? I presume it is not required for
unaddressable memory?
Balbir Singh.
That's correct, we don't need rmap access for device drivers in the "pure HMM" case, because the HMM
core handles it.
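For reference, the reason the driver needs no rmap access of its own in the pure-HMM case: the driver hands a virtual range to the migrate core, which collects the pages and tears down the CPU mappings (the rmap walk) internally, calling back into the driver only for the device-specific steps. A rough kernel-side sketch against the migrate_vma() interface from Jerome's patchset of this era (callback bodies, and names like cdm_alloc_and_copy(), are placeholders):

```c
#include <linux/migrate.h>

/*
 * Driver-supplied callbacks: by the time these run, the core has
 * already isolated the pages and unmapped them from the CPU -- the
 * rmap work happens inside migrate_vma(), not in the driver.
 */
static void cdm_alloc_and_copy(struct vm_area_struct *vma,
			       const unsigned long *src, unsigned long *dst,
			       unsigned long start, unsigned long end,
			       void *private)
{
	/* Allocate device pages and copy the source data over. */
}

static void cdm_finalize_and_map(struct vm_area_struct *vma,
				 const unsigned long *src,
				 const unsigned long *dst,
				 unsigned long start, unsigned long end,
				 void *private)
{
	/* Point device page tables at the newly placed pages. */
}

static const struct migrate_vma_ops cdm_migrate_ops = {
	.alloc_and_copy = cdm_alloc_and_copy,
	.finalize_and_map = cdm_finalize_and_map,
};

/* Migrate one page-aligned range of @vma toward device memory. */
int cdm_migrate_range(struct vm_area_struct *vma,
		      unsigned long start, unsigned long end,
		      unsigned long *src, unsigned long *dst)
{
	return migrate_vma(&cdm_migrate_ops, vma, start, end,
			   src, dst, NULL);
}
```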
thanks
john h
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx. For more info on Linux MM,
see: http://www.linux-mm.org/ .