Hi Dave, On Sat, May 13, 2023 at 09:05:04AM +1000, Dave Chinner wrote: > On Fri, May 12, 2023 at 03:44:29PM +0800, Oliver Sang wrote: [...] > > Thanks a lot for guidance! > > > > we plan to disable XFS_DEBUG (as well as XFS_WARN) in our performance tests. > > want to consult with you if this is the correct thing to do? > > You can use XFS_WARN=y with performance tests - that elides all the > debug specific code that changes behaviour but leaves all the > ASSERT-based correctness checks in the code. > > > and I guess we should still keep them in functional tests, am I right? > > Yes. > > > BTW, regarding this case, we tested again with disabling XFS_DEBUG (as well as > > XFS_WARN), kconfig is attached, only diff with last time is: > > -CONFIG_XFS_DEBUG=y > > -CONFIG_XFS_ASSERT_FATAL=y > > +# CONFIG_XFS_WARN is not set > > +# CONFIG_XFS_DEBUG is not set > > > > but we still observed similar regression: > > > > ecd788a92460eef4 2edf06a50f5bbe664283f3c55c4 > > ---------------- --------------------------- > > %stddev %change %stddev > > \ | \ > > 8176057 ± 15% +4.7% 8558110 fsmark.app_overhead > > 14484 -6.3% 13568 fsmark.files_per_sec > > So the application spent 5% more CPU time in userspace, and the rate > the kernel processed IO went down by 6%. Seems to me like > everything is running slower, not just the kernel code.... > > > 100.50 ± 5% +0.3% 100.83 turbostat.Avg_MHz > > 5.54 ± 11% +0.3 5.82 turbostat.Busy% > > 1863 ± 19% -6.9% 1733 turbostat.Bzy_MHz > > Evidence that the CPU is running at a 7% lower clock rate when the > results are 6% slower is a bit suspicious to me. Shouldn't the CPU > clock rate be fixed to the same value for A-B performance regression > testing? For commit 2edf06a50f5, it seems to change the semantics a little about handling of 'flags' for xfs_alloc_fix_freelist(). With the debug below, the performance is restored. ecd788a92460eef4 2edf06a50f5bbe664283f3c55c4 68721405630744da1c07c9c1c3c ---------------- --------------------------- --------------------------- 14349 -5.7% 13527 +0.6% 14437 fsmark.files_per_sec 486.29 +5.8% 514.28 -0.5% 483.70 fsmark.time.elapsed_time Please help to review if the debug patch miss anything as I don't know the internals of xfs, thanks. --- diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 98defd19e09e..8c85cc68c5f4 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -3246,12 +3246,12 @@ xfs_alloc_vextent_set_fsbno( */ static int __xfs_alloc_vextent_this_ag( - struct xfs_alloc_arg *args) + struct xfs_alloc_arg *args, int flag) { struct xfs_mount *mp = args->mp; int error; - error = xfs_alloc_fix_freelist(args, 0); + error = xfs_alloc_fix_freelist(args, flag); if (error) { trace_xfs_alloc_vextent_nofix(args); return error; @@ -3289,7 +3289,7 @@ xfs_alloc_vextent_this_ag( } args->pag = xfs_perag_get(mp, args->agno); - error = __xfs_alloc_vextent_this_ag(args); + error = __xfs_alloc_vextent_this_ag(args, 0); xfs_alloc_vextent_set_fsbno(args, minimum_agno); xfs_perag_put(args->pag); @@ -3329,7 +3329,7 @@ xfs_alloc_vextent_iterate_ags( args->agno = start_agno; for (;;) { args->pag = xfs_perag_get(mp, args->agno); - error = __xfs_alloc_vextent_this_ag(args); + error = __xfs_alloc_vextent_this_ag(args, flags); if (error) { args->agbno = NULLAGBLOCK; break; Also for the turbostat.Bzy_MHz diff, IIUC, 0Day always uses 'performance' cpufreq governor. And as the test case is running 32 thread in a platform with 96 CPUs, there are many CPUs in idle state in average, and I suspect the Bzy_MHz may be calculated considering those cpufreq and cpuidle factors. Thanks, Feng > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx