On Thu, 29 Feb 2024 09:31:02 +0530 Bharata B Rao <bharata@xxxxxxx> wrote:

> On 29-Feb-24 7:34 AM, Davidlohr Bueso wrote:
> > On Thu, 25 Jan 2024, David Rientjes wrote:
> >
> >> Some recent discussions have proven that there is widespread interest in
> >> some very foundational topics for this technology such as:
> >>
> >>  - Decoupling CPU balancing from memory balancing (or obsoleting CPU
> >>    balancing entirely)
> >>
> >>    + John Hubbard notes this would be useful for GPUs:
> >>
> >>      a) GPUs have their own processors that are invisible to the kernel's
> >>         NUMA "which tasks are active on which NUMA nodes" calculations,
> >>         and
> >>
> >>      b) Similar to where CXL is generally going, we have already built
> >>         fully memory-coherent hardware, which include memory-only NUMA
> >>         nodes.
> >>
> >>  - In-kernel hot memory abstraction, informed by hardware hinting drivers
> >>    (incl some architectures like Power10), usable as a NUMA Balancing
> >>    backend for promotion and other areas of the kernel like transparent
> >>    hugepage utilization
> >
> > Regarding the hardware counters, can/will CPU vendors provide something
> > better for what is currently there for PEBS/IBS - which needs a lot of
> > stat crunching to make it useful for hot page detection.
>
> IBS works independent of PMCs and reports useful information like the
> virtual and physical address of the access, precise IP, Data Source
> info (like cache, DRAM, External memory/CXL etc), remote node indication
> etc. Hence it doesn't really need stat crunching.
>
> However it captures and reports access information based on sampling
> and I have seen that the best sampling interval isn't always good enough
> to match the number of accesses captured by software based mechanism
> like NUMA balancing.

The same applies to sampling methods that use a time interval, such as
DAMON. I'm trying to make a sort of automatic tuning of such intervals
for DAMON, based on real-time monitoring results and real requests from
users.
For example, if the user wants to find the hottest memory regions
amounting to X % of the total memory, we could draw a hotness histogram
of the memory and get some clue about whether the current sampling
interval is too large or too small. Just a vague idea; I haven't spent
time on this topic so far.

Since DAMON is designed to be easily extended with multiple access check
methods, including hardware features like IBS, I think DAMON-level
auto-tuning might help in this case.


Thanks,
SJ

>
> Regards,
> Bharata.
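
To make the hotness-histogram idea above a bit more concrete, here is a
rough userspace sketch. It is written in Python only for brevity and is
not DAMON code or a DAMON interface. Everything in it is an assumption
made for illustration: the function names, the (size, nr_accesses) input
format, the max_nr_accesses parameter (the largest count a region can
reach in one aggregation window), and in particular the two heuristics
used to guess whether the sampling interval is too large or too small.

```python
from collections import Counter


def hotness_histogram(regions):
    """Map each access count to the total bytes observed with that count.

    `regions` is a hypothetical list of (size_bytes, nr_accesses) tuples,
    one per monitored region.
    """
    hist = Counter()
    for size_bytes, nr_accesses in regions:
        hist[nr_accesses] += size_bytes
    return hist


def suggest_interval_change(regions, max_nr_accesses, target_percent):
    """Guess how to adjust the sampling interval (illustrative heuristic).

    Returns "decrease", "increase", or "keep".
    """
    hist = hotness_histogram(regions)
    total = sum(hist.values())
    if total == 0:
        return "keep"

    # Amount of memory the user wants identified as "hottest".
    target_bytes = total * target_percent / 100.0

    # Memory whose counter hit the ceiling cannot be ranked any further.
    saturated = sum(b for count, b in hist.items() if count >= max_nr_accesses)
    # Memory that was seen accessed at least once.
    nonzero = total - hist.get(0, 0)

    if saturated > target_bytes:
        # Assumption: too much memory looks equally hot, so the interval
        # is too coarse to tell the hottest X % apart.
        return "decrease"
    if nonzero < target_bytes:
        # Assumption: too little memory shows any access at all, so the
        # interval is too short to catch enough accesses.
        return "increase"
    return "keep"


if __name__ == "__main__":
    sample = [(100, 20), (100, 20), (800, 0)]
    print(suggest_interval_change(sample, max_nr_accesses=20, target_percent=10))
```

With the sample input, 200 of 1000 bytes sit at the maximum count while
only 100 bytes were requested, so the sketch suggests decreasing the
interval. A real implementation would need to validate these heuristics
against actual monitoring results.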