Re: [RFC 0/4] RFC - Coherent Device Memory (Not for inclusion)

On Mon, 2017-05-01 at 22:47 -0700, John Hubbard wrote:
> 
> On 05/01/2017 06:29 PM, Balbir Singh wrote:
> > On Mon, 2017-05-01 at 13:41 -0700, John Hubbard wrote:
> > > On 04/19/2017 12:52 AM, Balbir Singh wrote:
> > > > This is a request for comments on the discussed approaches
> > > > for coherent memory at mm-summit (some of the details are at
> > > > https://lwn.net/Articles/717601/). The latest posted patch
> > > > series is at https://lwn.net/Articles/713035/. I am reposting
> > > > this as RFC, Michal Hocko suggested using HMM for CDM, but
> > > > we believe there are stronger reasons to use the NUMA approach.
> > > > The earlier patches for Coherent Device memory were implemented
> > > > and designed by Anshuman Khandual.
> > > > 
> > > 
> > > Hi Balbir,
> > > 
> > > Although I think everyone agrees that in the [very] long term, these
> > > hardware-coherent nodes probably want to be NUMA nodes, in order to decide what to
> > > code up over the next few years, we need to get a clear idea of what has to be done
> > > for each possible approach.
> > > 
> > > Here, the CDM discussion is falling just a bit short, because it does not yet
> > > include the whole story of what we would need to do. Earlier threads pointed this
> > > out: the idea started as a large patchset RFC, but then, "for ease of review", it
> > > got turned into a smaller RFC, which loses too much context.
> > 
> > Hi, John
> > 
> > I thought I explained the context, but I'll try again. I see the whole solution
> > as a composite of the following primitives:
> > 
> > 1. Enable hotplug of CDM nodes
> > 2. Isolation of CDM memory
> > 3. Migration to/from CDM memory
> > 4. Performance enhancements for migration
> > 
> 
> So, there is a little more than the above required, which is why I made that short 
> list. I'm in particular concerned about the various system calls that userspace can 
> make to control NUMA memory, and the device drivers will need notification (probably 
> mmu_notifiers, I guess), and once they get notification, in many cases they'll need 
> some way to deal with reverse mapping.

Are you suggesting that the system calls available to user space should be
audited to check whether they should be usable with a CDM device? I would
think a whole lot of this should be transparent to user space, unless it opts
in to using CDM and explicitly wants to allocate and free memory; that is the
whole isolation premise. With respect to device drivers, are you suggesting
that the device driver needs to know the state of each page (free or in use)?
Is the reverse mapping for migration?

> 
> HMM provides all of that support, so it needs to happen here, too.
> 
> 
> 
> > The RFC here is for (2) above. (3) is handled by HMM and (4) is being discussed
> > in the community. I think the larger goals are same as HMM, except that we
> > don't need unaddressable memory, since the memory is cache coherent.
> > 
> > > 
> > > So, I'd suggest putting together something more complete, so that it can be fairly
> > > compared against the HMM-for-hardware-coherent-nodes approach.
> > > 
> > 
> > Since I intend to reuse bits of HMM, I am not sure if I want to repost those
> > patches as a part of my RFC. I hope my answers make sense, the goal is to
> > reuse as much of what is available. From a user perspective
> 
> It's hard to keep track of what the plan is, so explaining exactly what you're doing 
> helps.
> 

Fair enough. I hope that answers the questions?

> > 
> > 1. We see no new interface being added in either case, the programming model
> > would differ though
> > 2. We expect the programming model to be abstracted behind a user space
> > framework, potentially like CUDA or CXL
> > 
> >   
> > > 
> > > > Jerome posted HMM-CDM at https://lwn.net/Articles/713035/.
> > > > The patches do a great deal to enable CDM with HMM, but we
> > > > still believe that HMM with CDM is not a natural way to
> > > > represent coherent device memory and the mm will need
> > > > to be audited and enhanced for it to even work.
> > > 
> > > That is also true for the CDM approach. Specifically, in order for this to be of any
> > > use to device drivers, we'll need the following:
> > > 
> > 
> > Since Reza answered these questions, I'll skip them in this email
> 
> Yes, but he skipped over the rmap question, which I think is an important one.
>

If it is for migration, then we are going to rely on changes from HMM-CDM.
How does HMM deal with the rmap case? I presume it is not required for
unaddressable memory?

Balbir Singh. 
