On Sun, May 14, 2023 at 10:36:48PM +0800, Feng Tang wrote: > Hi Dave, > > On Sat, May 13, 2023 at 09:05:04AM +1000, Dave Chinner wrote: > > On Fri, May 12, 2023 at 03:44:29PM +0800, Oliver Sang wrote: > [...] > > > Thanks a lot for guidance! > > > > > > we plan to disable XFS_DEBUG (as well as XFS_WARN) in our performance tests. > > > want to consult with you if this is the correct thing to do? > > > > You can use XFS_WARN=y with performance tests - that elides all the > > debug specific code that changes behaviour but leaves all the > > ASSERT-based correctness checks in the code. > > > > > and I guess we should still keep them in functional tests, am I right? > > > > Yes. > > > > > BTW, regarding this case, we tested again with disabling XFS_DEBUG (as well as > > > XFS_WARN), kconfig is attached, only diff with last time is: > > > -CONFIG_XFS_DEBUG=y > > > -CONFIG_XFS_ASSERT_FATAL=y > > > +# CONFIG_XFS_WARN is not set > > > +# CONFIG_XFS_DEBUG is not set > > > > > > but we still observed similar regression: > > > > > > ecd788a92460eef4 2edf06a50f5bbe664283f3c55c4 > > > ---------------- --------------------------- > > > %stddev %change %stddev > > > \ | \ > > > 8176057 ± 15% +4.7% 8558110 fsmark.app_overhead > > > 14484 -6.3% 13568 fsmark.files_per_sec > > > > So the application spent 5% more CPU time in userspace, and the rate > > the kernel processed IO went down by 6%. Seems to me like > > everything is running slower, not just the kernel code.... > > > > > 100.50 ± 5% +0.3% 100.83 turbostat.Avg_MHz > > > 5.54 ± 11% +0.3 5.82 turbostat.Busy% > > > 1863 ± 19% -6.9% 1733 turbostat.Bzy_MHz > > > > Evidence that the CPU is running at a 7% lower clock rate when the > > results are 6% slower is a bit suspicious to me. Shouldn't the CPU > > clock rate be fixed to the same value for A-B performance regression > > testing? > > For commit 2edf06a50f5, it seems to change the semantics a little > about handling of 'flags' for xfs_alloc_fix_freelist(). With the debug > below, the performance is restored. > > > ecd788a92460eef4 2edf06a50f5bbe664283f3c55c4 68721405630744da1c07c9c1c3c > ---------------- --------------------------- --------------------------- > > 14349 -5.7% 13527 +0.6% 14437 fsmark.files_per_sec > 486.29 +5.8% 514.28 -0.5% 483.70 fsmark.time.elapsed_time > > Please help to review if the debug patch miss anything as I don't > know the internals of xfs, thanks. Well spotted. :) The relevant commit dropped the trylock flag, so the perf regression and change of allocator behaviour is due to it blocking on a busy AG instead of skipping to the next uncontended on and so all allocations came from extents in the last block of the free space btree in that AG. > > --- > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c > index 98defd19e09e..8c85cc68c5f4 100644 > --- a/fs/xfs/libxfs/xfs_alloc.c > +++ b/fs/xfs/libxfs/xfs_alloc.c > @@ -3246,12 +3246,12 @@ xfs_alloc_vextent_set_fsbno( > */ > static int > __xfs_alloc_vextent_this_ag( > - struct xfs_alloc_arg *args) > + struct xfs_alloc_arg *args, int flag) > { > struct xfs_mount *mp = args->mp; > int error; > > - error = xfs_alloc_fix_freelist(args, 0); > + error = xfs_alloc_fix_freelist(args, flag); > if (error) { > trace_xfs_alloc_vextent_nofix(args); > return error; > @@ -3289,7 +3289,7 @@ xfs_alloc_vextent_this_ag( > } > > args->pag = xfs_perag_get(mp, args->agno); > - error = __xfs_alloc_vextent_this_ag(args); > + error = __xfs_alloc_vextent_this_ag(args, 0); > > xfs_alloc_vextent_set_fsbno(args, minimum_agno); > xfs_perag_put(args->pag); > @@ -3329,7 +3329,7 @@ xfs_alloc_vextent_iterate_ags( > args->agno = start_agno; > for (;;) { > args->pag = xfs_perag_get(mp, args->agno); > - error = __xfs_alloc_vextent_this_ag(args); > + error = __xfs_alloc_vextent_this_ag(args, flags); > if (error) { > args->agbno = NULLAGBLOCK; > break; I don't think this is the right way to fix this. The code is -very- different at the end of the series that this is in the middle of, and I need to check what callers now use the trylock behaviour to determine how the trylock flag shold be passed around... > Also for the turbostat.Bzy_MHz diff, IIUC, 0Day always uses > 'performance' cpufreq governor. And as the test case is running > 32 thread in a platform with 96 CPUs, there are many CPUs in idle > state in average, and I suspect the Bzy_MHz may be calculated > considering those cpufreq and cpuidle factors. If "busy MHz" includes the speed of idle CPUs, then it's not really a measure of the speed of "busy" CPUs. If what you say is true, then it is, at best, badly names - it would just be the "average Mhz", right? -Dave. -- Dave Chinner david@xxxxxxxxxxxxx