Hi Dave, On 7/22/2022 6:01 AM, Dave Chinner wrote: > A huge amount of spinlock contention in the xlog_commit_cil() path > went away. The commit identified doesn't remove/change any > spinlocks, it actually adds more overhead to the critical section of > the above spinlock in preparation for removing said spinlocks. > > That removal happens in the next commit in that series - c0fb4765c508 ("xfs: > convert CIL to unordered per cpu lists") - so I'd be expecting a > bisect to demonstrate that the spinlock contention goes away with > the commit that removed the spinlocks (as it does in all the testing > of this I've done over the past 2 years), not the commit this bisect > identified. Hence I think the bisect went wrong somewhere... We did some investigation and got: commit: df7a4a2134b0a ("xfs: convert CIL busy extents to per-cpu") 016a23388cdcb ("xfs: Add order IDs to log items in CIL") c0fb4765c5086 ("xfs: convert CIL to unordered per cpu lists") df7a4a2134b0a201 016a23388cdcb2740deb1379dc4 c0fb4765c5086cfd00f1158f5f4 ---------------- --------------------------- --------------------------- %stddev %change %stddev %change %stddev \ | \ | \ 62.07 +0.0% 62.09 -0.0% 62.06 stress-ng.time.elapsed_time 62.07 +0.0% 62.09 -0.0% 62.06 stress-ng.time.elapsed_time.max 2237 +0.0% 2237 +0.0% 2237 stress-ng.time.file_system_inputs 1842 ± 4% +16.9% 2152 ± 3% +17.6% 2166 stress-ng.time.involuntary_context_switches 551.00 -0.3% 549.10 -0.3% 549.40 stress-ng.time.major_page_faults 6376 -1.1% 6305 ± 2% +0.6% 6416 stress-ng.time.maximum_resident_set_size 9704 -0.3% 9676 -0.1% 9691 stress-ng.time.minor_page_faults 4096 +0.0% 4096 +0.0% 4096 stress-ng.time.page_size 841.90 -2.4% 821.70 -2.4% 821.90 stress-ng.time.percent_of_cpu_this_job_got 512.83 -3.4% 495.24 -3.6% 494.18 stress-ng.time.system_time 10.05 ± 8% +52.3% 15.30 ± 3% +61.1% 16.19 ± 2% stress-ng.time.user_time 2325 ± 16% +66.5% 3873 ± 7% +70.3% 3962 ± 6% stress-ng.time.voluntary_context_switches 1544 ± 4% +54.4% 2385 +63.9% 2531 stress-ng.xattr.ops Yes. commit c0fb4765c5086 ("xfs: convert CIL to unordered per cpu lists") could bring performance gain also. But the most performance gain (54.4%) is from commit 016a23388cdcb ("xfs: Add order IDs to log items in CIL"). Based on commit 016a23388cdcb and add following change: diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c index 6bc540898e3a..7c6c91a0a12d 100644 --- a/fs/xfs/xfs_log_cil.c +++ b/fs/xfs/xfs_log_cil.c @@ -659,9 +659,14 @@ xlog_cil_insert_items( continue; lip->li_order_id = order; - if (!list_empty(&lip->li_cil)) - continue; - list_add_tail(&lip->li_cil, &cil->xc_cil); + + /* + * Only move the item if it isn't already at the tail. This is + * to prevent a transient list_empty() state when reinserting + * an item that is already the only item in the CIL. + */ + if (!list_is_last(&lip->li_cil, &cil->xc_cil)) + list_move_tail(&lip->li_cil, &cil->xc_cil); } The performance will drop to the same level as commit df7a4a2134b0a ("xfs: convert CIL busy extents to per-cpu"): commit: 016a23388cdcb2740deb1379dc408f21c84efb11 a8bef09e7d8e65207c8403e030a0965db43ce3de 016a23388cdcb274 a8bef09e7d8e65207c8403e030a ---------------- --------------------------- %stddev %change %stddev \ | \ 62.06 -0.0% 62.05 stress-ng.time.elapsed_time 62.06 -0.0% 62.05 stress-ng.time.elapsed_time.max 2237 +0.0% 2237 stress-ng.time.file_system_inputs 2226 -16.7% 1855 ± 4% stress-ng.time.involuntary_context_switches 549.00 +0.5% 551.67 stress-ng.time.major_page_faults 6286 +0.1% 6292 stress-ng.time.maximum_resident_set_size 9636 +0.1% 9641 stress-ng.time.minor_page_faults 4096 +0.0% 4096 stress-ng.time.page_size 823.00 +3.0% 847.33 stress-ng.time.percent_of_cpu_this_job_got 496.02 +4.2% 516.61 stress-ng.time.system_time 15.08 -38.1% 9.33 ± 4% stress-ng.time.user_time 4034 ± 3% -43.0% 2299 ± 6% stress-ng.time.voluntary_context_switches 2368 -37.4% 1482 ± 4% stress-ng.xattr.ops Regards Yin, Fengwei