Hi Honggyu,

On Thu, 13 Jun 2024 22:20:47 +0900 Honggyu Kim <honggyu.kim@xxxxxx> wrote:

> There was an RFC IDEA "DAMOS-based Tiered-Memory Management" previously
> posted at [1].
>
> It says there is no implementation of the demote/promote DAMOS action
> are made. This patch series is about its implementation for physical
> address space so that this scheme can be applied in system wide level.
>
> Changes from RFC v4:
> https://lore.kernel.org/20240512175447.75943-1-sj@xxxxxxxxxx
>   1. Add usage and design documents
>   2. Rename alloc_demote_folio to alloc_migrate_folio
>   3. Add evaluation results with "demotion_enabled" true
>   4. Rebase based on v6.10-rc3

I left comments on the new patches for the documentation.

[...]
>
> Evaluation Results
> ==================
>
> All the result values are normalized to DRAM-only execution time because
> the workload cannot be faster than DRAM-only unless the workload hits
> the peak bandwidth but our redis test doesn't go beyond the bandwidth
> limit.
>
> So the DRAM-only execution time is the ideal result without affected by
> the gap between DRAM and CXL performance difference. The NUMA node
> environment is as follows.
>
>   node0 - local DRAM, 512GB with a CPU socket (fast tier)
>   node1 - disabled
>   node2 - CXL DRAM, 96GB, no CPU attached (slow tier)
>
> The following is the result of generating zipfian distribution to
> redis-server and the numbers are averaged by 50 times of execution.
>
> 1. YCSB zipfian distribution read only workload
>    memory pressure with cold memory on node0 with 512GB of local DRAM.
> ====================+================================================+=========
>                     |     cold memory occupied by mmap and memset    |
>                     |   0G  440G  450G  460G  470G  480G  490G  500G |
> ====================+================================================+=========
> Execution time normalized to DRAM-only values                        | GEOMEAN
> --------------------+------------------------------------------------+---------
> DRAM-only           | 1.00     -     -     -     -     -     -     - | 1.00
> CXL-only            | 1.19     -     -     -     -     -     -     - | 1.19
> default             |    -  1.00  1.05  1.08  1.12  1.14  1.18  1.18 | 1.11
> DAMON tiered        |    -  1.03  1.03  1.03  1.03  1.03  1.07 *1.05 | 1.04
> DAMON lazy          |    -  1.04  1.03  1.04  1.05  1.06  1.06 *1.06 | 1.05
> ====================+================================================+=========
> CXL usage of redis-server in GB                                      | AVERAGE
> --------------------+------------------------------------------------+---------
> DRAM-only           |  0.0     -     -     -     -     -     -     - |  0.0
> CXL-only            | 51.4     -     -     -     -     -     -     - | 51.4
> default             |    -   0.6  10.6  20.5  30.5  40.5  47.6  50.4 | 28.7
> DAMON tiered        |    -   0.6   0.5   0.4   0.7   0.8   7.1   5.6 |  2.2
> DAMON lazy          |    -   0.5   3.0   4.5   5.4   6.4   9.4   9.1 |  5.5
> ====================+================================================+=========
>
> Each test result is based on the exeuction environment as follows.

Nit. s/exeuction/execution/

[...]
> In summary, the evaluation results show that DAMON memory management
> with DAMOS_MIGRATE_{HOT,COLD} actions reduces the performance slowdown
> compared to the "default" memory policy from 11% to 3~5% when the system
> runs with high memory pressure on its fast tier DRAM nodes.
>
> Having these DAMOS_MIGRATE_HOT and DAMOS_MIGRATE_COLD actions can make
> tiered memory systems run more efficiently under high memory pressures.

Thank you very much for continuing this great work. Other than trivial
comments on documentation patches and the above typo, I have no particular
concern on this patchset. I'm looking forward to the next version.


Thanks,
SJ

[...]