Hi Oleg,

On Mon, Jan 20, 2025 at 04:50:10PM +0100, Oleg Nesterov wrote:
> Again, I'll try to take another look tomorrow. Not sure I will find the
> explanation though...
>
> But can you help? I know nothing about stress-ng.
>
> Google finds a lot of stress-ng repositories, I've clone the 1st one
> https://github.com/ColinIanKing/stress-ng/blob/master/stress-poll.c
> hopefully this is what you used.

Yes, this is the one we used.
(You can see it in https://github.com/intel/lkp-tests/blob/master/programs/stress-ng/pkg/PKGBUILD)

>
> On 01/20, kernel test robot wrote:
> >
> >       9.45            -6.3        3.13 ±  9%  perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > ...
> >      10.00            -6.5        3.53 ±  9%  perf-profile.children.cycles-pp.pipe_read
> >       2.34            -1.3        1.07 ±  9%  perf-profile.children.cycles-pp.pipe_poll
>
> Could you explain what do these numbers mean and how there are calculated?
> "git-grep cycles-pp" find nothing in stress-ng/ and tools/perf/

We use perf; 'perf-profile' is the name of one of our so-called 'monitors':
https://github.com/intel/lkp-tests/blob/master/monitors/no-stdout/perf-profile
This link shows how the monitor runs. I attached the perf-profile.gz from one
run FYI; within it you can see something like "-e cycles:pp".

Then, for each run, all the data is parsed by
https://github.com/intel/lkp-tests/blob/master/programs/perf-profile/parse

For each commit we run the test at least 6 times, and the comparison list then
gives the avg and %stddev, like:
      2.34            -1.3        1.07 ±  9%  perf-profile****
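In case it helps to read those columns, here is a minimal sketch in plain
Python of how one such comparison line can be reproduced from the per-run
values. This is not the actual lkp-tests compare code; the helper name
comparison_line is made up, and the treatment of the middle column (absolute
delta for perf-profile percentages, relative change for metrics like
ops_per_sec) is inferred from the numbers in the report:

    import statistics

    # Hypothetical helper, only for illustration of the report format.
    def comparison_line(base_runs, test_runs, relative_change=False):
        # base_runs / test_runs: values of one metric from the >=6 runs of
        # the parent commit and of the tested commit.
        base_avg = statistics.mean(base_runs)
        test_avg = statistics.mean(test_runs)

        # Middle column: absolute delta for perf-profile percentages,
        # relative change for metrics such as stress-ng.poll.ops_per_sec.
        if relative_change:
            change = "%.1f%%" % ((test_avg - base_avg) / base_avg * 100)
        else:
            change = "%.1f" % (test_avg - base_avg)

        def fmt(avg, runs):
            # "± N%" (relative stddev) is only printed when it is >= 3%,
            # i.e. when the metric is not considered stable.
            rel_stddev = statistics.stdev(runs) / avg * 100
            suffix = (" ± %2d%%" % round(rel_stddev)) if rel_stddev >= 3 else ""
            return "%10.2f%s" % (avg, suffix)

        return "%s %10s %s" % (fmt(base_avg, base_runs), change, fmt(test_avg, test_runs))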
>
> > kernel test robot noticed a 11.1% regression of stress-ng.poll.ops_per_sec on:
>
> same for ops_per_sec

This is the KPI for this stress-ng.poll test. The raw output looks like:

2025-01-17 00:02:55 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --poll 224
stress-ng: info:  [6458] setting to a 1 min run per stressor
stress-ng: info:  [6458] dispatching hogs: 224 poll
stress-ng: info:  [6458] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
stress-ng: metrc: [6458] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
stress-ng: metrc: [6458]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
stress-ng: metrc: [6458] poll          776040080     60.00    451.84   1488.52  12933899.40      399947.20        14.44          1844
stress-ng: info:  [6458] for a 60.08s run time:
stress-ng: info:  [6458]   13457.69s available CPU time
stress-ng: info:  [6458]     451.85s user time   (  3.36%)
stress-ng: info:  [6458]    1488.58s system time ( 11.06%)
stress-ng: info:  [6458]    1940.43s total time  ( 14.42%)
stress-ng: info:  [6458] load average: 44.46 13.42 4.66
stress-ng: info:  [6458] skipped: 0
stress-ng: info:  [6458] passed: 224: poll (224)
stress-ng: info:  [6458] failed: 0
stress-ng: info:  [6458] metrics untrustworthy: 0
stress-ng: info:  [6458] successful run completed in 1 min

We parse this data with
https://github.com/intel/lkp-tests/blob/master/programs/stress-ng/parse
and take stress-ng.poll.ops_per_sec for this run from "bogo ops/s (real time)"
-- 12933899.40 here.

Again, the data below is the avg over multiple runs:
  14516970           -11.1%   12907569        stress-ng.poll.ops_per_sec
(no %stddev, since we don't show it when it is < 3%, which means the metric is
stable enough)

>
> > 6150 -47.8% 3208 stress-ng.time.percent_of_cpu_this_job_got
>
> same for percent_of_cpu_this_job_got
>
> > 2993 -50.6% 1477 stress-ng.time.system_time
> > 711.20 -36.0% 454.85 stress-ng.time.user_time
>
> Is that what I think it is?? Does it run faster?

These time data are collected by
https://github.com/intel/lkp-tests/blob/master/tests/wrapper#L38

Below is the raw data FYI.

        Command being timed: "/lkp/lkp/src/programs/stress-ng/run"
        User time (seconds): 451.86
        System time (seconds): 1488.79
        Percent of CPU this job got: 3222%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:00.21    <---- (1)
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 222324
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 87235
        Voluntary context switches: 194087227
        Involuntary context switches: 78364
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

>
> Or it exits after some timeout and the decrease in system/user_time can be
> explained by the change in the mysterious 'percent_of_cpu_this_job_got' above?

(1) above looks expected; we run the test for 60s.
From the data you pointed out, it seems the stress-ng test itself gets less CPU
time to run, which may explain why it becomes slower.
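To make the relation explicit (just an illustration using the numbers from the
raw time(1)-style output above): "Percent of CPU this job got" in that output
is (user + sys) / elapsed, so with the wall clock fixed at ~60s it simply moves
together with the user/system time:

    # 'Percent of CPU this job got' = (user + sys) / elapsed * 100
    user, system, elapsed = 451.86, 1488.79, 60.21
    print("%d%%" % round((user + system) / elapsed * 100))  # ~3223%, close to the 3222% above

So a lower percent_of_cpu_this_job_got over the same 60s run is just another
view of the reduced user+system time, not an independent effect.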
This also made us wonder whether your fs/pipe commit could have some impact on
our 'monitors', so we reran the tests with all monitors disabled, but we still
see a similar regression.
(Below is the full comparison list; as you can see, it is much shorter than in
our original report, since the monitors are disabled and there is no monitor
data.)

=========================================================================================
compiler/cpufreq_governor/debug-setup/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/no-monitor/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/poll/stress-ng/60s

d2fc0ed52a284a13 aaec5a95d59615523db03dd53c2
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    105111 ± 25%     -58.9%      43248 ± 10%  time.involuntary_context_switches
      6036           -46.1%       3253        time.percent_of_cpu_this_job_got
      2901           -48.7%       1489        time.system_time
    734.52           -36.0%     469.77        time.user_time
 4.472e+08           -55.5%   1.99e+08        time.voluntary_context_switches
 8.817e+08            -9.8%  7.957e+08        stress-ng.poll.ops
  14694617            -9.8%   13261019        stress-ng.poll.ops_per_sec
    105111 ± 25%     -58.9%      43248 ± 10%  stress-ng.time.involuntary_context_switches
      6036           -46.1%       3253        stress-ng.time.percent_of_cpu_this_job_got
      2901           -48.7%       1489        stress-ng.time.system_time
    734.52           -36.0%     469.77        stress-ng.time.user_time
 4.472e+08           -55.5%   1.99e+08        stress-ng.time.voluntary_context_switches

>
> Oleg.
>
Attachment:
perf-profile.gz
Description: application/gzip