Re: [RFC 0/4] RFC - Coherent Device Memory (Not for inclusion)

On 05/01/2017 06:29 PM, Balbir Singh wrote:
On Mon, 2017-05-01 at 13:41 -0700, John Hubbard wrote:
On 04/19/2017 12:52 AM, Balbir Singh wrote:
This is a request for comments on the discussed approaches
for coherent memory at mm-summit (some of the details are at
https://lwn.net/Articles/717601/). The latest posted patch
series is at https://lwn.net/Articles/713035/. I am reposting
this as an RFC. Michal Hocko suggested using HMM for CDM, but
we believe there are stronger reasons to use the NUMA approach.
The earlier patches for Coherent Device memory were implemented
and designed by Anshuman Khandual.


Hi Balbir,

Although I think everyone agrees that in the [very] long term, these
hardware-coherent nodes probably want to be NUMA nodes, in order to decide what to
code up over the next few years, we need to get a clear idea of what has to be done
for each possible approach.

Here, the CDM discussion is falling just a bit short, because it does not yet
include the whole story of what we would need to do. Earlier threads pointed this
out: the idea started as a large patchset RFC, but then, "for ease of review", it
got turned into a smaller RFC, which loses too much context.

Hi, John

I thought I explained the context, but I'll try again. I see the whole solution
as a composite of the following primitives:

1. Enable hotplug of CDM nodes
2. Isolation of CDM memory
3. Migration to/from CDM memory
4. Performance enhancements for migration


So, a little more than the above is required, which is why I made that short list.
In particular, I'm concerned about the various system calls that userspace can make
to control NUMA memory. The device drivers will need notification (probably via
mmu_notifiers, I guess), and once they get that notification, in many cases they'll
need some way to deal with reverse mapping.

HMM provides all of that support, so it needs to happen here, too.
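
To make the userspace concern concrete, here is a minimal sketch of one such
call, mbind(2), binding an address range to a single NUMA node. It assumes a
Linux system. The sketch_* names and the 1024-node mask size are made up for
illustration, and nothing here is taken from the CDM or HMM patches.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value of MPOL_BIND from the kernel UAPI header <linux/mempolicy.h>.
 * Defined locally so the sketch does not depend on libnuma's <numaif.h>. */
#define SKETCH_MPOL_BIND 2

/* Illustrative mask size (an assumption, not a kernel constant). */
#define SKETCH_MAX_NODES 1024
#define SKETCH_BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MASK_WORDS (SKETCH_MAX_NODES / SKETCH_BITS_PER_ULONG)

struct sketch_nodemask {
	unsigned long bits[SKETCH_MASK_WORDS];
};

/* Clear the mask and set only the bit for @node. Returns 0, or -1 if the
 * node number does not fit in the mask. */
int sketch_nodemask_single(struct sketch_nodemask *mask, unsigned int node)
{
	if (node >= SKETCH_MAX_NODES)
		return -1;
	memset(mask, 0, sizeof(*mask));
	mask->bits[node / SKETCH_BITS_PER_ULONG] |=
		1UL << (node % SKETCH_BITS_PER_ULONG);
	return 0;
}

/* Ask the kernel to allocate pages for [addr, addr + len) only from @node.
 * @addr must be page aligned. Returns 0 on success, -1 with errno set on
 * failure (for example when the node does not exist). */
long sketch_bind_range_to_node(void *addr, unsigned long len, unsigned int node)
{
	struct sketch_nodemask mask;

	if (sketch_nodemask_single(&mask, node) != 0)
		return -1;

	/* maxnode is passed as "mask size + 1", following what libnuma does. */
	return syscall(SYS_mbind, addr, len, SKETCH_MPOL_BIND, mask.bits,
		       (unsigned long)SKETCH_MAX_NODES + 1, 0U);
}
```

A caller would pass a page-aligned buffer, e.g.
sketch_bind_range_to_node(buf, len, node), and check the result. If coherent
device memory shows up as an ordinary NUMA node, a call like this can steer
pages onto that node without the driver being involved, which is presumably
where the notification and reverse-mapping questions come from.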



The RFC here is for (2) above. (3) is handled by HMM and (4) is being discussed
in the community. I think the larger goals are the same as HMM's, except that we
don't need unaddressable memory, since the memory is cache coherent.


So, I'd suggest putting together something more complete, so that it can be fairly
compared against the HMM-for-hardware-coherent-nodes approach.


Since I intend to reuse bits of HMM, I am not sure if I want to repost those
patches as a part of my RFC. I hope my answers make sense; the goal is to
reuse as much of what is available as possible. From a user perspective:

It's hard to keep track of what the plan is, so explaining exactly what you're doing helps.


1. We see no new interface being added in either case, though the programming
model would differ.
2. We expect the programming model to be abstracted behind a user space
framework, potentially like CUDA or CXL.


Jerome posted HMM-CDM at https://lwn.net/Articles/713035/.
The patches do a great deal to enable CDM with HMM, but we
still believe that HMM with CDM is not a natural way to
represent coherent device memory and the mm will need
to be audited and enhanced for it to even work.

That is also true for the CDM approach. Specifically, in order for this to be of any
use to device drivers, we'll need the following:


Since Reza answered these questions, I'll skip them in this email.

Yes, but he skipped over the rmap question, which I think is an important one.

thanks
john h


Thanks for the review!
Balbir Singh

