Donet Tom <donettom@xxxxxxxxxxxxx> writes:

> This patchset is to optimize the cross-socket memory access with
> MPOL_PREFERRED_MANY policy.
>
> To test this patch we ran the following test on a 3 node system.
>  Node 0 - 2GB   - Tier 1
>  Node 1 - 11GB  - Tier 1
>  Node 6 - 10GB  - Tier 2
>
> The changes below are made to memcached to set the memory policy.
> They select Node 0 and Node 1 as the preferred nodes.
>
>    #include <numaif.h>
>    #include <numa.h>
>
>    unsigned long nodemask;
>    int ret;
>
>    nodemask = 0x03;
>    ret = set_mempolicy(MPOL_PREFERRED_MANY | MPOL_F_NUMA_BALANCING,
>                        &nodemask, 10);
>    /* If MPOL_F_NUMA_BALANCING isn't supported,
>     * fall back to MPOL_PREFERRED_MANY */
>    if (ret < 0 && errno == EINVAL) {
>        printf("set mem policy normal\n");
>        ret = set_mempolicy(MPOL_PREFERRED_MANY, &nodemask, 10);
>    }
>    if (ret < 0) {
>        perror("Failed to call set_mempolicy");
>        exit(-1);
>    }
>
> Test Procedure:
> ===============
> 1. Make sure memory tiering and demotion are enabled.
> 2. Start memcached.
>
>    # ./memcached -b 100000 -m 204800 -u root -c 1000000 -t 7
>        -d -s "/tmp/memcached.sock"
>
> 3. Run memtier_benchmark to store 3200000 keys.
>
>    #./memtier_benchmark -S "/tmp/memcached.sock" --protocol=memcache_binary
>      --threads=1 --pipeline=1 --ratio=1:0 --key-pattern=S:S --key-minimum=1
>      --key-maximum=3200000 -n allkeys -c 1 -R -x 1 -d 1024
>
> 4. Start a memory eater on node 0 and 1. This will demote all memcached
>    pages to node 6.
> 5. Make sure all the memcached pages got demoted to the lower tier by
>    reading /proc/<memcached PID>/numa_maps.
>
>    # cat /proc/2771/numa_maps
>     ---
>     default anon=1009 dirty=1009 active=0 N6=1009 kernelpagesize_kB=64
>     default anon=1009 dirty=1009 active=0 N6=1009 kernelpagesize_kB=64
>     ---
>
> 6. Kill memory eater.
> 7. Read the pgpromote_success counter.
> 8. Start reading the keys by running memtier_benchmark.
>
>    #./memtier_benchmark -S "/tmp/memcached.sock" --protocol=memcache_binary
>      --pipeline=1 --distinct-client-seed --ratio=0:3 --key-pattern=R:R
>      --key-minimum=1 --key-maximum=3200000 -n allkeys
>      --threads=64 -c 1 -R -x 6
>
> 9. Read the pgpromote_success counter.
>
> Test Results:
> =============
> Without Patch
> ------------------
> 1. pgpromote_success before test
>    Node 0:  pgpromote_success 11
>    Node 1:  pgpromote_success 140974
>
>    pgpromote_success after test
>    Node 0:  pgpromote_success 11
>    Node 1:  pgpromote_success 140974
>
> 2. Memtier-benchmark result.
>    AGGREGATED AVERAGE RESULTS (6 runs)
>    ==================================================================
>    Type      Ops/sec    Hits/sec  Misses/sec  Avg. Latency  p50 Latency
>    ------------------------------------------------------------------
>    Sets         0.00         ---         ---           ---          ---
>    Gets    305792.03   305791.93        0.10       0.18949      0.16700
>    Waits        0.00         ---         ---           ---          ---
>    Totals  305792.03   305791.93        0.10       0.18949      0.16700
>
>    ======================================
>    p99 Latency  p99.9 Latency    KB/sec
>    -------------------------------------
>            ---            ---      0.00
>        0.44700        1.71100  11542.69
>            ---            ---       ---
>        0.44700        1.71100  11542.69
>
> With Patch
> ---------------
> 1. pgpromote_success before test
>    Node 0:  pgpromote_success 5
>    Node 1:  pgpromote_success 89386
>
>    pgpromote_success after test
>    Node 0:  pgpromote_success 57895
>    Node 1:  pgpromote_success 141463
>
> 2. Memtier-benchmark result.
>    AGGREGATED AVERAGE RESULTS (6 runs)
>    ====================================================================
>    Type      Ops/sec    Hits/sec  Misses/sec  Avg. Latency  p50 Latency
>    --------------------------------------------------------------------
>    Sets         0.00         ---         ---           ---          ---
>    Gets    521942.24   521942.07        0.17       0.11459      0.10300
>    Waits        0.00         ---         ---           ---          ---
>    Totals  521942.24   521942.07        0.17       0.11459      0.10300
>
>    =======================================
>    p99 Latency  p99.9 Latency    KB/sec
>    ---------------------------------------
>            ---            ---      0.00
>        0.23100        0.31900  19701.68
>            ---            ---       ---
>        0.23100        0.31900  19701.68
>
>
> Test Result Analysis:
> =====================
> 1. With the patch we could observe that pages are getting promoted.
> 2. Memtier-benchmark results show that, with the patch,
>    performance has increased by more than 50%.
>
>    Ops/sec without fix -  305792.03
>    Ops/sec with fix    -  521942.24
>
> Changes:
> V4:
> - Added an example in the "PATCH 2/2" commit message as per the discussion
>   from V3.
> V3:
> - Added "* @vmf: structure describing the fault" comment for
>   mpol_misplaced() to fix the warning.
>   https://lore.kernel.org/oe-kbuild-all/202403202229.WZeAnUuO-lkp@xxxxxxxxx/
>   -https://lore.kernel.org/lkml/cover.1711002865.git.donettom@xxxxxxxxxxxxx/
> v2:
> - Rebased on latest upstream (v6.8-rc7)
> - Used 'numa_node_id()' to get the current execution node ID, Added
>   'lockdep_assert_held' to make sure that the 'mpol_misplaced()' is
>   called with ptl held.
> - The migration condition has been updated; now, migration will only
>   occur if the execution node is present in the policy nodemask.
>   -https://lore.kernel.org/lkml/cover.1709909210.git.donettom@xxxxxxxxxxxxx/
>
> -v1: https://lore.kernel.org/linux-mm/9c3f7b743477560d1c5b12b8c111a584a2cc92ee.1708097962.git.donettom@xxxxxxxxxxxxx/#t
>
>
> Donet Tom (2):
>   mm/mempolicy: Use numa_node_id() instead of cpu_to_node()
>   mm/numa_balancing:Allow migrate on protnone reference with
>     MPOL_PREFERRED_MANY policy
>
>  include/linux/mempolicy.h |  5 +++--
>  mm/huge_memory.c          |  2 +-
>  mm/internal.h             |  2 +-
>  mm/memory.c               |  8 +++++---
>  mm/mempolicy.c            | 36 +++++++++++++++++++++++++++---------
>  5 files changed, 37 insertions(+), 16 deletions(-)

LGTM, Thanks!

Feel free to add

Reviewed-by: "Huang, Ying" <ying.huang@xxxxxxxxx>

in the future version.

--
Best Regards,
Huang, Ying
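
Note for anyone reproducing steps 7 and 9 of the quoted procedure: the cover letter does not say how the per-node pgpromote_success values were read. The sketch below is one possible way, assuming the counter is exposed as a "name value" line in /sys/devices/system/node/node<N>/vmstat. The helper names vmstat_counter() and node_counter() are hypothetical and not part of the patchset.

```c
#include <stdio.h>
#include <string.h>

/* Scan "name value" lines (vmstat text format) and return the value of
 * `name`, or -1 if the counter is missing or malformed. */
long long vmstat_counter(const char *text, const char *name)
{
    size_t name_len = strlen(name);
    const char *line = text;

    while (line && *line) {
        /* Require the full counter name followed by a space, so that
         * a longer name sharing the same prefix does not match. */
        if (strncmp(line, name, name_len) == 0 && line[name_len] == ' ') {
            long long value;

            if (sscanf(line + name_len, "%lld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Read one counter for one NUMA node. The sysfs path is an assumption
 * (see the note above). Returns -1 on any error. */
long long node_counter(int node, const char *name)
{
    char path[64];
    char buf[16384];   /* assumed large enough for one node's vmstat */
    size_t n;
    FILE *f;

    snprintf(path, sizeof(path),
             "/sys/devices/system/node/node%d/vmstat", node);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return vmstat_counter(buf, name);
}
```

Calling node_counter(0, "pgpromote_success") and node_counter(1, "pgpromote_success") before and after the read benchmark and subtracting gives the pages promoted during the run. With the quoted with-patch numbers that is 57895 - 5 = 57890 for node 0 and 141463 - 89386 = 52077 for node 1; without the patch both deltas are 0.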