On Thu, Jan 23, 2025 at 11:46:49AM +0800, Huang, Ying wrote:
> Gregory Price <gourry@xxxxxxxxxx> writes:
> > Test 2 shows overhead of TPP on + pagecache promo off
> > Test 3 shows overhead of TPP+Promo on, but all the memory is on top tier
> >
> > This shows the check as to whether the folio is in the top tier is
> > actually somewhat expensive (~5% compared to baseline, ~2.7% compared to
> > TPP-on Promo-off).
>
> This is unexpected.  Can we try to optimize it?  For example, via using
> a nodemask?  node_is_toptier() is used in the mapped pages promotion
> too (1 vs. 2 above).  I guess that the optimization can reduce the
> overhead there with measurable difference too.
>

Agreed it surprised me a bit as well.  But more surprising is the fact
that test 2 was also 2-3% slower given that it's a simple boolean check
against whether tiering is turned on.

I suppose that since the test is blowing up the cache/tlb by design,
multiple additional cache/tlb misses could cause a non-trivial slowdown,
but it is certainly a small puzzle I haven't dug into yet.

~Gregory
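
For reference, the nodemask idea suggested above could look roughly like the sketch below. This is a userspace illustration only, not kernel code: struct node_info, top_tier_adistance as used here, toptier_mask, rebuild_toptier_mask() and both node_is_toptier_*() helpers are hypothetical stand-ins, and it assumes at most 64 nodes so the mask fits in one word. The point is only that the per-call check becomes a single bit test on a mask that is rebuilt when the tier layout changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 64	/* assumption: node ids fit in one 64-bit mask */

/* Hypothetical per-node data; stands in for a per-node tier lookup. */
struct node_info {
	int adistance;	/* smaller value means a faster tier */
};

static struct node_info nodes[MAX_NODES];
static int top_tier_adistance;	/* nodes at or below this are top tier */
static uint64_t toptier_mask;	/* cached answer, one bit per node */

/* Reference check: reads per-node data on every call. */
static bool node_is_toptier_lookup(int nid)
{
	return nodes[nid].adistance <= top_tier_adistance;
}

/* Recompute the cached mask; call whenever the tier layout changes. */
static void rebuild_toptier_mask(int nr_nodes)
{
	uint64_t mask = 0;
	int nid;

	assert(nr_nodes >= 0 && nr_nodes <= MAX_NODES);
	for (nid = 0; nid < nr_nodes; nid++) {
		if (node_is_toptier_lookup(nid))
			mask |= (uint64_t)1 << nid;
	}
	toptier_mask = mask;
}

/* Hot-path check: a single shift and bit test on the cached mask. */
static bool node_is_toptier_mask(int nid)
{
	return (toptier_mask >> nid) & 1;
}
```

Whether this helps in practice would have to be measured; given that the plain boolean check in test 2 also cost 2-3%, the overhead may come from somewhere other than the tier lookup itself.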