On Thu, Apr 21, 2022 at 04:34:09PM +0800, ying.huang@xxxxxxxxx wrote:
> On Thu, 2022-04-21 at 16:17 +0800, Aaron Lu wrote:
> > On Thu, Apr 21, 2022 at 03:49:21PM +0800, ying.huang@xxxxxxxxx wrote:
... ...
> > > For swap-in latency, we can use pmbench, which can output latency
> > > information.
> > >
> >
> > OK, I'll give pmbench a run, thanks for the suggestion.
>
> Better to construct a scenario with more swapin than swapout. For
> example, start a memory eater, then kill it later.

What about vm-scalability/case-swapin?
https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-swapin

I think you are pretty familiar with it, but to recap:
1) it starts $nr_task processes, each of which mmaps a $size/$nr_task
area and consumes the memory; after that, each process waits for a
signal;
2) another process is started to consume $size memory, pushing the
memory from step 1) out to the swap device;
3) the processes from step 1) are kicked to start accessing their
memory again, triggering swap-ins. The metric of this test case is
the swap-in throughput. (A rough sketch of this flow is in the P.S.
below.)

I plan to restrict the cgroup's memory limit to $size.

Considering there is only one NVMe drive, attached to node 0, I will
run the test as described before:
1) bind processes to run on node 0 and allocate on node 1, to test
the performance when the reclaimer's node id is the same as the swap
device's;
2) bind processes to run on node 1 and allocate on node 0, to test
the performance when the page's node id is the same as the swap
device's.
(The command lines for both runs are sketched in the P.S. as well.)

Ying and Yang,
Let me know what you think about the test case used and the way the
test is conducted.
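
P.S. To make the flow in steps 1)-3) concrete, below is a rough
hand-rolled approximation of what case-swapin does. It uses memhog
(from the numactl package) in place of vm-scalability's usemem, the
signal-based kick is emulated with SIGSTOP/SIGCONT, and the sizes are
made up for illustration, so treat it as a sketch rather than the
actual script:

    #!/bin/sh
    size_gb=16
    nr_task=8
    chunk_gb=$((size_gb / nr_task))

    # 1) start $nr_task workers; each keeps re-touching its chunk of
    #    memory, then is stopped so it holds the pages while idle
    pids=
    for i in $(seq $nr_task); do
            memhog -r1000000 ${chunk_gb}g >/dev/null &
            pids="$pids $!"
    done
    sleep 30                # let the workers fault everything in
    kill -STOP $pids

    # 2) a memory eater consumes $size; with the memcg capped at
    #    $size, this pushes the workers' pages out to the swap device
    memhog ${size_gb}g >/dev/null

    # 3) kick the workers; their next round of touches triggers the
    #    swap-ins being measured (here via pswpin deltas from
    #    /proc/vmstat)
    grep pswpin /proc/vmstat
    kill -CONT $pids
    sleep 10
    grep pswpin /proc/vmstat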
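
And the two bindings I described would be run roughly like this
(assuming cgroup v2 mounted at /sys/fs/cgroup; on v1 the knob is
memory.limit_in_bytes instead of memory.max, and "./case-swapin"
stands for however the case is actually invoked in your
vm-scalability setup):

    # cap the test cgroup at $size so the eater in step 2) actually
    # forces reclaim instead of just growing the cgroup
    mkdir -p /sys/fs/cgroup/swapin-test
    echo $((16 << 30)) > /sys/fs/cgroup/swapin-test/memory.max
    echo $$ > /sys/fs/cgroup/swapin-test/cgroup.procs

    # run 1: CPUs on node 0 (same node as the NVMe swap device),
    # memory allocated from node 1
    numactl --cpunodebind=0 --membind=1 ./case-swapin

    # run 2: CPUs on node 1, memory allocated from node 0 (same node
    # as the swap device)
    numactl --cpunodebind=1 --membind=0 ./case-swapin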