On 9/3/20 10:40 AM, Alex Shi wrote:
>
>
> On 2020/9/3 4:32 PM, Alex Shi wrote:
>>>
>> I have run thpscale with 'always' defrag setting of THP. The Amean stddev is much
>> larger than a very little average run time reducing.
>>
>> But the left patch 4 could show the cmpxchg retry reduce from thousands to hundreds
>> or less.
>>
>> Subject: [PATCH v4 4/4] add cmpxchg tracing
>
>
> It's a typical result with the patchset:
>
>  Performance counter stats for './run-mmtests.sh -c configs/config-workload-thpscale pageblock-c':
>
>              9,564      compaction:mm_compaction_isolate_migratepages
>              6,430      compaction:mm_compaction_isolate_freepages
>              5,287      compaction:mm_compaction_migratepages
>             45,299      compaction:mm_compaction_begin
>             45,299      compaction:mm_compaction_end
>             30,557      compaction:mm_compaction_try_to_compact_pages
>             95,540      compaction:mm_compaction_finished
>            149,379      compaction:mm_compaction_suitable
>                  0      compaction:mm_compaction_deferred
>                  0      compaction:mm_compaction_defer_compaction
>              3,949      compaction:mm_compaction_defer_reset
>                  0      compaction:mm_compaction_kcompactd_sleep
>                  0      compaction:mm_compaction_wakeup_kcompactd
>                  0      compaction:mm_compaction_kcompactd_wake
>                 68      pageblock:hit_cmpxchg
>
>      113.570974583 seconds time elapsed
>
>       14.664451000 seconds user
>       96.847116000 seconds sys
>
> It's 5.9-rc2 base kernel result:
>
>  Performance counter stats for './run-mmtests.sh -c configs/config-workload-thpscale rc2-e':
>
>             15,920      compaction:mm_compaction_isolate_migratepages
>             20,523      compaction:mm_compaction_isolate_freepages
>              9,752      compaction:mm_compaction_migratepages
>             27,773      compaction:mm_compaction_begin
>             27,773      compaction:mm_compaction_end
>             16,391      compaction:mm_compaction_try_to_compact_pages
>             62,809      compaction:mm_compaction_finished
>             69,821      compaction:mm_compaction_suitable
>                  0      compaction:mm_compaction_deferred
>                  0      compaction:mm_compaction_defer_compaction
>              7,875      compaction:mm_compaction_defer_reset
>                  0      compaction:mm_compaction_kcompactd_sleep
>                  0      compaction:mm_compaction_wakeup_kcompactd
>                  0      compaction:mm_compaction_kcompactd_wake
>              1,208      pageblock:hit_cmpxchg
>
>      116.440414591 seconds time elapsed
>
>       15.326913000 seconds user
>      103.752758000 seconds sys

The runs differ wildly in many of the other stats, so I'm not sure they are
really comparable. I guess you could show the fraction of hit_cmpxchg to all
cmpxchg. But there's also a danger of the tracepoints widening the race window.

In the end what matters is how these 1208 retries contribute to runtime. I
doubt they could be really visible in a 100+ second run though.
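To put rough numbers on both points, here is a minimal Python sketch. The counter values and elapsed times are copied from the two perf outputs quoted above. The per-retry cost of 1 us is purely an assumed, deliberately pessimistic placeholder and not a measurement, and the helper names (ratios, retry_time_share) are made up for illustration.

```python
# Counter values copied from the two perf stat outputs quoted above
# (only the non-zero counters are listed).
BASE = {  # 5.9-rc2 base kernel
    "mm_compaction_isolate_migratepages": 15920,
    "mm_compaction_isolate_freepages": 20523,
    "mm_compaction_migratepages": 9752,
    "mm_compaction_begin": 27773,
    "mm_compaction_try_to_compact_pages": 16391,
    "mm_compaction_finished": 62809,
    "mm_compaction_suitable": 69821,
    "mm_compaction_defer_reset": 7875,
    "hit_cmpxchg": 1208,
}
PATCHED = {  # with the patchset
    "mm_compaction_isolate_migratepages": 9564,
    "mm_compaction_isolate_freepages": 6430,
    "mm_compaction_migratepages": 5287,
    "mm_compaction_begin": 45299,
    "mm_compaction_try_to_compact_pages": 30557,
    "mm_compaction_finished": 95540,
    "mm_compaction_suitable": 149379,
    "mm_compaction_defer_reset": 3949,
    "hit_cmpxchg": 68,
}
BASE_ELAPSED_S = 116.440414591

# ASSUMPTION: cost of one failed cmpxchg retry. This is a pessimistic guess
# chosen for illustration, not something measured in this thread.
ASSUMED_RETRY_COST_S = 1e-6


def ratios(base, patched):
    """Return patched/base for every counter with a non-zero base value."""
    return {name: patched[name] / base[name] for name in base if base[name]}


def retry_time_share(retries, elapsed_s, cost_s=ASSUMED_RETRY_COST_S):
    """Upper-bound share of elapsed time spent in retries, given an assumed cost."""
    return retries * cost_s / elapsed_s


if __name__ == "__main__":
    for name, ratio in ratios(BASE, PATCHED).items():
        print(f"{name:40s} patched/base = {ratio:.2f}")
    share = retry_time_share(BASE["hit_cmpxchg"], BASE_ELAPSED_S)
    print(f"retry share of base runtime (assumed cost): {share:.2e}")
```

Under that assumed cost, the 1208 retries add up to about 1.2 ms, i.e. on the order of 0.001% of the 116 s base run. The ratios also show how far apart the runs are elsewhere: mm_compaction_suitable, for instance, is about 2.1x higher in the patched run.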