To optimize page placement in a memory tiering system with NUMA
balancing, the hot pages in the slow memory node need to be
identified. Essentially, the original NUMA balancing implementation
selects and promotes the most recently accessed (MRU) pages. But
recently accessed pages may still be cold. So in this patchset, we
implement a new hot page identification algorithm based on the latency
between NUMA balancing page table scanning and the hint page fault.

Hot page promotion can incur some overhead in the system. To control
the overhead, a simple promotion rate limit mechanism is implemented.

The hot threshold used to identify hot pages is usually workload
dependent. So we also implement a hot threshold automatic adjustment
algorithm. The basic idea is to increase/decrease the hot threshold so
that the number of pages that pass the hot threshold (promotion
candidates) stays near the rate limit.

We tested the patchset with the pmbench memory accessing benchmark on
a 2-socket server system with DRAM and PMEM installed. The test
results are as follows,

                   pmbench score    promote rate
                    (accesses/s)            MB/s
                   -------------    ------------
base                 146887704.1           725.6
hot selection        165695601.2           544.0
rate limit           162814569.8           165.2
auto adjustment      170495294.0           136.9
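
As an illustration only, the three mechanisms described above (latency
based hot page identification, promotion rate limiting, and hot
threshold auto adjustment) can be modeled in user space as follows.
This is a minimal Python sketch, not the kernel code from this
patchset. All names (PromotionState, is_hot, try_promote,
end_of_period), the millisecond and page units, the fixed adjustment
step, and the 10% tolerance band around the rate limit are assumptions
made for the example.

```python
from dataclasses import dataclass


@dataclass
class PromotionState:
    """Promotion bookkeeping for one slow memory node (hypothetical model)."""
    hot_threshold_ms: float   # current hot threshold
    max_threshold_ms: float   # assumed upper bound for the threshold
    step_ms: float            # assumed adjustment step, also the lower bound
    rate_limit_pages: int     # pages allowed to be promoted per period
    candidates: int = 0       # pages that passed the threshold this period


def is_hot(scan_time_ms: float, fault_time_ms: float,
           state: PromotionState) -> bool:
    """Treat a page as hot if the hint page fault follows the scan quickly."""
    latency_ms = fault_time_ms - scan_time_ms
    return latency_ms < state.hot_threshold_ms


def try_promote(scan_time_ms: float, fault_time_ms: float,
                state: PromotionState) -> bool:
    """Count a hot page as a candidate; promote only within the rate limit."""
    if not is_hot(scan_time_ms, fault_time_ms, state):
        return False
    state.candidates += 1
    return state.candidates <= state.rate_limit_pages


def end_of_period(state: PromotionState) -> None:
    """Nudge the threshold so the candidate count moves toward the rate limit."""
    if state.candidates * 10 > state.rate_limit_pages * 11:
        # Too many candidates: be stricter about what counts as hot.
        state.hot_threshold_ms = max(state.hot_threshold_ms - state.step_ms,
                                     state.step_ms)
    elif state.candidates * 10 < state.rate_limit_pages * 9:
        # Too few candidates: be more lenient.
        state.hot_threshold_ms = min(state.hot_threshold_ms + state.step_ms,
                                     state.max_threshold_ms)
    state.candidates = 0  # start a new accounting period
```

In this model, try_promote() would be called once per hint page fault
and end_of_period() once per adjustment period. Candidates are counted
even when the rate limit rejects them, so the threshold adjustment
sees the real demand rather than the capped promotion count.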