Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5

On Tue, Jul 18, 2017 at 11:26:51AM +0800, Bob Liu wrote:
> On 2017/7/14 5:15, Jérôme Glisse wrote:
> > Sorry, I made a horrible mistake with the names in v4; I completely
> > misunderstood the suggestion. So here I repost with proper naming.
> > This is the only change since v3. Again, sorry about the noise
> > with v4.
> > 
> > Changes since v4:
> >   - s/DEVICE_HOST/DEVICE_PUBLIC
> > 
> > Git tree:
> > https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-cdm-v5
> > 
> > 
> > Cache coherent device memory applies to architectures with a system
> > bus like CAPI or CCIX. Devices connected to such a system bus can
> > expose their memory to the system and allow cache coherent access to
> > it from the CPU.
> > 
> > Even if for all intents and purposes device memory behaves like
> > regular memory, we still want to manage it in isolation from regular
> > memory. There are several reasons for that. First and foremost, this
> > memory is less reliable than regular memory: if the device hangs
> > because of invalid commands, we can lose access to device memory.
> > Second, CPU access to this memory is expected to be slower than to
> > regular memory. Third, having random memory placed in device memory
> > means that some of the bus bandwidth wouldn't be available to the
> > device but would be used by CPU access.
> > 
> > This is why we want to manage such memory in isolation from regular
> > memory. The kernel should not try to use this memory, even as a last
> > resort when running out of memory, at least for now.
> >
> 
> I think setting a very large node distance for "Cache Coherent Device
> Memory" may be an easier way to address these concerns.

Such an approach was discussed at length in the past; see the links
below. Outcome of the discussion:
  - CPU-less nodes are bad
  - device memory can be unreliable (device hang) and there is no way
    for an application to understand that
  - application and driver NUMA madvise/mbind/mempolicy ... can conflict
    with each other, and there is no way the kernel can figure out which
    should apply
  - NUMA as it is now would not work, as we need further isolation than
    what a large node distance would provide

There are probably a few other arguments I forget.
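
To illustrate the last point in the list, here is a toy model (not kernel
code) of allocation fallback ordered by node distance. The node ids,
distances, free-page counts and the helper names fallback_order() and
allocate() are all made up for illustration; the real allocator is far
more involved. The sketch only shows that a very large distance pushes a
node to the end of the fallback order without excluding it:

```python
# Toy model, NOT kernel code: allocation fallback ordered by NUMA distance.
# Node ids, distances and free-page counts are hypothetical values.

def fallback_order(distance_from_local):
    """Return node ids sorted from nearest to farthest."""
    return sorted(distance_from_local, key=distance_from_local.get)

def allocate(pages, free_pages, order):
    """Take pages from the first node in the order with enough free pages."""
    for node in order:
        if free_pages[node] >= pages:
            free_pages[node] -= pages
            return node
    return None  # every node is exhausted

# Nodes 0 and 1 stand in for regular memory; node 2 stands in for device
# memory that was given a very large distance.
distance = {0: 10, 1: 20, 2: 255}
free_pages = {0: 1, 1: 1, 2: 100}
order = fallback_order(distance)

# Three single-page allocations: regular memory runs out after two.
placements = [allocate(1, free_pages, order) for _ in range(3)]
print(placements)
```

In this model the third allocation lands on node 2 once nodes 0 and 1 are
exhausted, which is the "last resort" use the cover letter wants to avoid.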

https://lists.gt.net/linux/kernel/2551369
https://groups.google.com/forum/#!topic/linux.kernel/Za_e8C3XnRs%5B1-25%5D
https://lwn.net/Articles/720380/

Cheers,
Jérôme
