CC'ing : linux-accelerators@xxxxxxxxxxxxxxx

On Wed, Nov 15, 2017 at 6:44 PM, Jerome Glisse <jglisse@xxxxxxxxxx> wrote:
> On Wed, Nov 15, 2017 at 06:10:08PM -0800, chet l wrote:
>> >> You may think it as a CCIX device or CAPI device.
>> >> The requirement is eliminate any extra copy.
>> >> A typical usecase/requirement is malloc() and madvise() allocate from
>> >> device memory, then CPU write data to device memory directly and
>> >> trigger device to read the data/do calculation.
>> >
>> > I suggest you rely on the device driver userspace API to do a migration after malloc
>> > then. Something like:
>> > ptr = malloc(size);
>> > my_device_migrate(ptr, size);
>> >
>> > Which would call an ioctl of the device driver which itself would migrate memory or
>> > allocate device memory for the range if pointer return by malloc is not yet back by
>> > any pages.
>> >
>>
>> So for CCIX, I don't think there is going to be an inline device
>> driver that would allocate any memory for you. The expansion memory
>> will become part of the system memory as part of the boot process. So,
>> if the host DDR is 256GB and the CCIX expansion memory is 4GB, the
>> total system mem will be 260GB.
>>
>> Assume that the 'mm' is taught to mark/anoint the ZONE_DEVICE(or
>> ZONE_XXX) range from 256 to 260 GB. Then, for kmalloc it(mm) won't use
>> the ZONE_DEV range. But for a malloc, it will/can use that range.
>
> HMM zone device memory would work with that, you just need to teach the
> platform to identify this memory zone and not hotplug it. Again you
> should rely on specific device driver API to allocate this memory.
>

@Jerome - a new linux-accelerator's list has just been created. I have
CC'd that list since we have overlapping interests w.r.t CCIX.

I cannot comment on surprise add/remove as of now ... will cross the
bridge later.
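
For anyone on the new list without the earlier context, the malloc-then-migrate flow quoted above could look roughly like the sketch below from userspace. Everything device-specific here is made up for illustration: the /dev/my_device node, the MY_DEVICE_IOC_MIGRATE number and the argument struct are assumptions, not an existing driver interface. Only malloc(), open(), ioctl() and close() are real calls.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical argument block for the made-up migrate ioctl. */
struct my_device_migrate_args {
	uint64_t addr;	/* start of the user virtual range */
	uint64_t size;	/* length of the range in bytes */
};

/* Hypothetical ioctl number; a real driver would export its own in a uapi header. */
#define MY_DEVICE_IOC_MIGRATE _IOW('M', 0x01, struct my_device_migrate_args)

/* Hypothetical device node name. */
#define MY_DEVICE_PATH "/dev/my_device"

/*
 * Ask the (hypothetical) driver to migrate [ptr, ptr + size) to device
 * memory, or to back the range with device memory if no pages exist yet.
 * Returns 0 on success, -1 with errno set on failure.
 */
int my_device_migrate(void *ptr, size_t size)
{
	struct my_device_migrate_args args = {
		.addr = (uint64_t)(uintptr_t)ptr,
		.size = (uint64_t)size,
	};
	int fd, ret, saved_errno;

	if (ptr == NULL || size == 0) {
		errno = EINVAL;
		return -1;
	}

	fd = open(MY_DEVICE_PATH, O_RDWR);
	if (fd < 0)
		return -1;	/* errno was set by open() */

	ret = ioctl(fd, MY_DEVICE_IOC_MIGRATE, &args);

	/* close() may overwrite errno, so keep the ioctl's value. */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return ret;
}
```

A caller would do ptr = malloc(size) followed by my_device_migrate(ptr, size). If the call fails (for example because the device node is absent), the buffer is still ordinary malloc memory and stays usable from the CPU.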

>> > There has been several discussions already about madvise/mbind/set_mempolicy/
>> > move_pages and at this time i don't think we want to add or change any of them to
>> > understand device memory. My personal opinion is that we first need to have enough
>>
>> We will visit these APIs when we are more closer to building exotic
>> CCIX devices. And the plan is to present/express the CCIX proximity
>> attributes just like a NUMA node-proximity attribute today. That way
>> there would be minimal disruptions to the existing OS ecosystem.
>
> NUMA have been rejected previously see CDM/CAPI threads. So i don't see
> it being accepted for CCIX either. My belief is that we want to hide this
> inside device driver and only once we see multiple devices all doing the
> same kind of thing we should move toward building something generic that
> catter to CCIX devices.

Thanks for pointing out the NUMA thingy. I will visit the CDM/CAPI
threads to understand what was discussed before commenting further.

Chetan