On Fri, 10 Sep 2021 12:27:47 -0700
Atish Patra <atish.patra@xxxxxxx> wrote:

Hello Atish,

> Perf stat:
> =========
>
> [root@fedora-riscv riscv]# perf stat -e r8000000000000005 -e
> r8000000000000007 -e r8000000000000006 -e r0000000000020002 -e
> r0000000000020004 -e branch-misses -e cache-misses -e
> dTLB-load-misses -e dTLB-store-misses -e iTLB-load-misses -e cycles
> -e instructions ./hackbench -pipe 15 process
> Running with 15*40 (== 600) tasks.
> Time: 6.578
>
>  Performance counter stats for './hackbench -pipe 15 process':
>
>              6,491      r8000000000000005      (52.59%) --> SBI_PMU_FW_SET_TIMER
>             20,433      r8000000000000007      (60.74%) --> SBI_PMU_FW_IPI_RECVD
>             21,271      r8000000000000006      (68.71%) --> SBI_PMU_FW_IPI_SENT
>                  0      r0000000000020002      (76.55%)
>      <not counted>      r0000000000020004      (0.00%)
>      <not counted>      branch-misses          (0.00%)
>      <not counted>      cache-misses           (0.00%)
>         57,537,853      dTLB-load-misses       (9.49%)
>          2,821,147      dTLB-store-misses      (18.64%)
>         52,928,130      iTLB-load-misses       (27.53%)
>     89,521,791,110      cycles                 (36.08%)
>     90,678,132,464      instructions  # 1.01 insn per cycle  (44.44%)
>
>        6.975908032 seconds time elapsed
>
>        3.130950000 seconds user
>       24.353310000 seconds sys
>

Tested your patch series with qemu and got results as expected:

perf stat -e r8000000000000005 -e r8000000000000007 \
-e r8000000000000006 -e r0000000000020002 -e r0000000000020004 -e branch-misses \
-e cache-misses -e dTLB-load-misses -e dTLB-store-misses -e iTLB-load-misses \
-e cycles -e instructions ./hackbench -pipe 15 process
Running with 15*40 (== 600) tasks.
Time: 20.027

 Performance counter stats for './hackbench -pipe 15 process':

              4896      r8000000000000005      (53.34%)
                 0      r8000000000000007      (61.20%)
                 0      r8000000000000006      (68.88%)
                 0      r0000000000020002      (76.53%)
     <not counted>      r0000000000020004      (0.00%)
     <not counted>      branch-misses          (0.00%)
     <not counted>      cache-misses           (0.00%)
          48414917      dTLB-load-misses       (9.87%)
           2427413      dTLB-store-misses      (19.43%)
          46958092      iTLB-load-misses       (28.58%)
       69245163600      cycles                 (37.09%)
       70334279943      instructions  # 1.02 insn per cycle  (45.24%)

      20.895871900 seconds time elapsed

       2.724942000 seconds user
      18.126277000 seconds sys

perf top/record also works.

Tested-by: Nikita Shubin <n.shubin@xxxxxxxxx>

Yours,
Nikita Shubin