SeongJae Park <sj@xxxxxxxxxx> writes:

> Hi Aneesh,
>
> On Fri, 17 Feb 2023 17:28:09 +0530 Aneesh Kumar K V <aneesh.kumar@xxxxxxxxxxxxx> wrote:
>
>> PowerPC architecture (POWER10) supports a Hot/Cold page tracking
>> facility that provides access counter and access affinity details at
>> configurable page size granularity [1]. I have been looking at using
>> this counter in different areas of the kernel such as
>>
>> 1) Page reclaim/demotion
>> 2) THP utilization
>> 3) Page promotion.
>>
>> I have done some MGLRU integration and would like to discuss the
>> observation with the rest of the community. It is still not clear what
>> are the best ways to integrate these hardware counters in the Linux
>> kernel.
>
> Sounds very interesting. I think DAMON might be one another option, because it
> is designed to be easy to extended with various source of access
> information[1], and provides an abstraction layer for access temparature based
> memory management[2], namely Data Access Monitoring-based Operation Schemes
> (DAMOS).
>
>> Attached is the performance graph showing how the mongodb/ycsb
>> benchmark performs when using hardware counters with MGLRU aging. An
>> early RFC version of the code can be found at
>> https://github.com/kvaneesh/linux/commit/b472e2c8080823bb4114c286270aea3e18ffe221
>> . I also expect we can get some numbers w.r.t THP usage before the
>> conference.
>
> I also have experimented a DAMON-based THP optimization[3], which shown
> interesting results.
>
> Hope to discuss about this with you at LSF/MM. FYI, I also proposed an LSF/MM
> topic for DAMON[4].
>
> [1] https://docs.kernel.org/mm/damon/design.html#configurable-layers
> [2] https://docs.kernel.org/mm/damon/api.html#c.damos
> [3] https://www.amazon.science/publications/daos-data-access-aware-operating-system
> [4] https://lore.kernel.org/damon/20230214003328.55285-1-sj@xxxxxxxxxx/
>

The hardware counters that are supported in the case of POWER10 are based on physical addresses.
The hardware facility counts accesses across a physical address range, and there is a counter for each page that gives the access count along with information about which node accessed the page.

I haven't spent much time studying DAMON, so I might be wrong here. The reason I avoided using DAMON for the POC was that my goal was to evaluate how the hardware counters measure up against the PTE reference bit, and I was not sure I could evaluate that using the DAMON action facility. I do agree that we could add a layer similar to DAMON_PADDR and expose the details to userspace, but I was not sure we could take action based on that.

In most cases what I wanted was to move the coldest page in the NUMA node to swap. That is relative hotness, as opposed to moving any page with a hotness value less than 10 to swap, even though we could probably figure out a way to make the latter behave like the former.

I will look at DAMON and see if that is the best framework for things like this.

-aneesh
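
P.S. To make the relative-versus-absolute distinction above concrete, here is a rough userspace sketch in Python. It is only an illustration: the page frame numbers, the access counts, and the helper names (coldest_pages, pages_below_threshold) are all made up, and nothing here reflects the actual POWER10 counter format or any DAMON or MGLRU interface.

```python
from typing import Dict, List

# Hypothetical per-page access counts for one NUMA node, keyed by page
# frame number. These values are invented for illustration only.
node_counts: Dict[int, int] = {
    0x100: 250,
    0x101: 40,
    0x102: 975,
    0x103: 12,
    0x104: 63,
}


def coldest_pages(counts: Dict[int, int], nr_to_reclaim: int) -> List[int]:
    """Relative hotness: return the nr_to_reclaim least-accessed pages.

    Ties are broken by page frame number so the result is deterministic.
    """
    ordered = sorted(counts, key=lambda pfn: (counts[pfn], pfn))
    return ordered[:nr_to_reclaim]


def pages_below_threshold(counts: Dict[int, int], threshold: int) -> List[int]:
    """Absolute hotness: return every page whose count is below threshold."""
    return sorted(pfn for pfn, count in counts.items() if count < threshold)


# Relative selection always yields candidates, however busy the node is.
relative = coldest_pages(node_counts, 2)

# A fixed cutoff of 10 finds nothing here, because even the coldest page
# on this node was accessed 12 times.
absolute = pages_below_threshold(node_counts, 10)
```

With these sample counts, the relative selection picks pages 0x103 and 0x101, while the fixed cutoff of 10 selects nothing. The threshold would have to be retuned per workload to behave like the relative selection.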