Hi SeongJae, Thanks very much for your comments in details. On Tue, 16 Jan 2024 12:31:59 -0800 SeongJae Park <sj@xxxxxxxxxx> wrote: > Thank you so much for this great patches and the above nice test results. I > believe the test setup and results make sense, and merging a revised version of > this patchset would provide real benefits to the users. Glad to hear that! > In a high level, I think it might better to separate DAMON internal changes > from DAMON external changes. I agree. I can't guarantee but I can move all the external changes inside mm/damon, but will try that as much as possible. > For DAMON part changes, I have no big concern other than trivial coding style > level comments. Sure. I will fix those. > For DAMON-external changes that implementing demote_pages() and > promote_pages(), I'm unsure if the implementation is reusing appropriate > functions, and if those are placee in right source file. Especially, I'm > unsure if vmscan.c is the right place for promotion code. Also I don't know if > there is a good agreement on the promotion/demotion target node decision. That > should be because I'm not that familiar with the areas and the files, but I > feel this might because our discussions on the promotion and the demotion > operations are having rooms for being more matured. Because I'm not very > faimiliar with the part, I'd like to hear others' comments, too. I would also like to hear others' comments, but this might not be needed if most of external code can be moved to mm/damon. > To this end, I feel the problem might be able te be simpler, because this > patchset is trying to provide two sophisticated operations, while I think a > simpler approach might be possible. My humble simpler idea is adding a DAMOS > operation for moving pages to a given node (like sys_move_phy_pages RFC[1]), > instead of the promote/demote. Because the general pages migration can handle > multiple cases including the promote/demote in my humble assumption. My initial implementation was similar but I found that it's not accurate enough due to the nature of inaccuracy of DAMON regions. I saw that many pages were demoted and promoted back and forth because migration target regions include both hot and cold pages together. So I have implemented the demotion and promotion logics based on the shrink_folio_list, which contains many corner case handling logics for reclaim. Having the current demotion and promotion logics makes the hot/cold migration pretty accurate as expected. We made a simple program called "hot_cold" and it receives 2 arguments for hot size and cold size in MB. For example, "hot_cold 200 500" allocates 200MB of hot memory and 500MB of cold memory. It basically allocates 2 large blocks of memory with mmap, then repeat memset for the initial 200MB to make it accessed in an infinite loop. Let's say there are 3 nodes in the system and the first node0 and node1 are the first tier, and node2 is the second tier. $ cat /sys/devices/virtual/memory_tiering/memory_tier4/nodelist 0-1 $ cat /sys/devices/virtual/memory_tiering/memory_tier22/nodelist 2 Here is the result of partitioning hot/cold memory and I put execution command at the right side of numastat result. I initially ran each hot_cold program with preferred setting so that they initially allocate memory on one of node0 or node2, but they gradually migrated based on their access frequencies. $ numastat -c -p hot_cold Per-node process memory usage (in MBs) PID Node 0 Node 1 Node 2 Total --------------- ------ ------ ------ ----- 754 (hot_cold) 1800 0 2000 3800 <- hot_cold 1800 2000 1184 (hot_cold) 300 0 500 800 <- hot_cold 300 500 1818 (hot_cold) 801 0 3199 4000 <- hot_cold 800 3200 30289 (hot_cold) 4 0 5 10 <- hot_cold 3 5 30325 (hot_cold) 31 0 51 81 <- hot_cold 30 50 --------------- ------ ------ ------ ----- Total 2938 0 5756 8695 The final node placement result shows that DAMON accurately migrated pages by their hotness for multiple processes. > In more detail, users could decide which is the appropriate node for promotion > or demotion and use the new DAMOS action to do promotion and demotion. Users > would requested to decide which node is the proper promotion/demotion target > nodes, but that decision wouldn't be that hard in my opinion. > > For this, 'struct damos' would need to be updated for such argument-dependent > actions, like 'struct damos_filter' is haing a union. That might be a better solution. I will think about it. > In future, we could extend the operation to the promotion and the demotion > after the dicussion around the promotion and demotion is matured, if required. > And assuming DAMON be extended for originating CPU-aware access monitoring, the > new DAMOS action would also cover more use cases such as general NUMA nodes > balancing (extending DAMON for CPU-aware monitoring would required), and some > complex configurations where having both CPU affinity and tiered memory. I > also think that may well fit with my RFC idea[2] for tiered memory management. > > Looking forward to opinions from you and others. I admig I miss many things, > and more than happy to be enlightened. > > [1] https://lwn.net/Articles/944007/ > [2] https://lore.kernel.org/damon/20231112195602.61525-1-sj@xxxxxxxxxx/ Thanks very much for your comments. I will need a few more days for the update but will try to address your concerns as much as possible. Thanks, Honggyu