> Well, per-address-range operation is a completely different beast, I
> would say. An external tool would need to a) understand what that
> range is used for (e.g. stack/heap ranges, mmapped shared files like
> libraries, or private mappings) and b) be in sync with memory layout
> modifications done by the application (e.g. know that an mmap has been
> issued to back a malloc request). That is quite a lot of understanding
> about the specific process. I would say that with that intimate
> knowledge it is better to be part of the process and make those
> changes from within the process itself.

Sorry, this may be a digression, but I just wanted to mention a
particular use case from a project I recently collaborated on (to
appear next month at IISWC 2022:
http://www.iiswc.org/iiswc2022/index.html).

We carried out a performance analysis of the latest Linux AutoNUMA
memory tiering on graph processing applications. We noticed that hot
pages cannot be properly identified by AutoNUMA's reactive approach
because of the irregular/random memory access patterns of these
workloads.

Thus, as a proof of concept, we implemented and evaluated a simple
idea: an external user-level process/agent that, based on prior
profiling of memory regions, places memory chunks/objects in advance on
either DRAM or CXL/PMEM via mbind calls, instead of relying on
page-level allocation/migration. This kind of tiering solution could
deliver up to 2x more performance for graph analytics workloads. We
plan to evaluate other workloads as well.

Having a feature like "pidfd/process_mbind" would really simplify our
user-level agent implementation moving forward; right now we are adding
an LD_PRELOAD wrapper (with a signal handler) to the target application
so it can listen for and execute "mbind" requests from another process
(rough sketch below, after my signature).

If there is already an alternative solution for this (via ptrace?),
please let me know.

Thank you!

Vinicius Petrucci
Principal Performance Engineer
Micron Technology
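
P.S. For concreteness, here is a minimal sketch of the LD_PRELOAD shim
approach mentioned above. It is illustrative only, not our actual
implementation: the request file path and signal number are arbitrary
placeholders, error handling and synchronization are omitted, and
strtoul is not guaranteed to be async-signal-safe.

/* preload_mbind.c -- illustrative sketch only.
 * Build:  gcc -shared -fPIC -o preload_mbind.so preload_mbind.c
 * Run:    LD_PRELOAD=./preload_mbind.so ./graph_app
 * An external agent writes "addr len node\n" to REQ_PATH and then
 * sends SIGUSR1 to the target pid.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>    /* MPOL_BIND, MPOL_MF_MOVE */

#define REQ_PATH "/tmp/mbind_req"    /* placeholder request channel */

static void handle_req(int sig)
{
        char buf[128];
        char *end;
        unsigned long addr, len, node, nodemask;
        ssize_t n;

        (void)sig;
        int fd = open(REQ_PATH, O_RDONLY);    /* async-signal-safe */
        if (fd < 0)
                return;
        n = read(fd, buf, sizeof(buf) - 1);
        close(fd);
        if (n <= 0)
                return;
        buf[n] = '\0';

        /* Parse "addr len node" (strtoul keeps the sketch short but is
         * not strictly async-signal-safe). */
        addr = strtoul(buf, &end, 0);
        len  = strtoul(end, &end, 0);
        node = strtoul(end, &end, 0);

        /* Rebind (and migrate) the requested range to the target node. */
        nodemask = 1UL << node;
        syscall(SYS_mbind, addr, len, MPOL_BIND,
                &nodemask, sizeof(nodemask) * 8, MPOL_MF_MOVE);
}

/* Runs when the shim is preloaded into the target application. */
__attribute__((constructor))
static void install_handler(void)
{
        struct sigaction sa = { 0 };

        sa.sa_handler = handle_req;
        sigaction(SIGUSR1, &sa, NULL);
}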