Hi Ilpo,

> On Thu, 1 Jun 2023, Shaopeng Tan (Fujitsu) wrote:
>
> > > > > When reading memory in order, HW prefetching optimizations will
> > > > > interfere with measuring how caches and memory are being accessed.
> > > > > This adds noise into the results.
> > > > >
> > > > > Change the fill_buf reading loop to not use an obvious in-order
> > > > > access using multiply by a prime and modulo.
> > > > >
> > > > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> > > > > ---
> > > > >  tools/testing/selftests/resctrl/fill_buf.c | 17 ++++++++++-------
> > > > >  1 file changed, 10 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
> > > > > index 7e0d3a1ea555..049a520498a9 100644
> > > > > --- a/tools/testing/selftests/resctrl/fill_buf.c
> > > > > +++ b/tools/testing/selftests/resctrl/fill_buf.c
> > > > > @@ -88,14 +88,17 @@ static void *malloc_and_init_memory(size_t s)
> > > > >
> > > > >  static int fill_one_span_read(unsigned char *start_ptr, unsigned char *end_ptr)
> > > > >  {
> > > > > -	unsigned char sum, *p;
> > > > > -
> > > > > +	unsigned int size = (end_ptr - start_ptr) / (CL_SIZE / 2);
> > > > > +	unsigned int count = size;
> > > > > +	unsigned char sum;
> > > > > +
> > > > > +	/*
> > > > > +	 * Read the buffer in an order that is unexpected by HW prefetching
> > > > > +	 * optimizations to prevent them interfering with the caching pattern.
> > > > > +	 */
> > > > >  	sum = 0;
> > > > > -	p = start_ptr;
> > > > > -	while (p < end_ptr) {
> > > > > -		sum += *p;
> > > > > -		p += (CL_SIZE / 2);
> > > > > -	}
> > > > > +	while (count--)
> > > > > +		sum += start_ptr[((count * 59) % size) * CL_SIZE / 2];
> > > >
> > > > Could you please elaborate why 59 is used?
> > >
> > > The main reason is that it's a prime number ensuring the whole buffer gets read.
> > > I picked something that doesn't make it to wrap on almost every iteration.
> >
> > Thanks for your explanation. It seems there is no problem.
> >
> > Perhaps you have already tested this patch in your environment and got a test
> > result of "ok".
> > Because HW prefetching does not work well, the IMC counter fluctuates
> > a lot in my environment, and the test result is "not ok".
> >
> > In order to ensure this test set runs in any environments and gets
> > "ok", would you consider changing the value of MAX_DIFF_PERCENT of each test?
> > or changing something else?
> >
> > ```
> > Environment:
> >  Kernel: 6.4.0-rc2
> >  CPU: Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz
> >
> > Test result(MBM as an example):
> > # # Starting MBM BW change ...
> > # # Mounting resctrl to "/sys/fs/resctrl"
> > # # Benchmark PID: 8671
> > # # Writing benchmark parameters to resctrl FS
> > # # Write schema "MB:0=100" to resctrl FS
> > # # Checking for pass/fail
> > # # Fail: Check MBM diff within 5%
> > # # avg_diff_per: 9%
> > # # Span in bytes: 262144000
> > # # avg_bw_imc: 6202
> > # # avg_bw_resc: 5585
> > # not ok 1 MBM: bw change
> > ```
>
> Could you try if the approach below works better (I think it should apply cleanly
> on top of the fixes+cleanups v3 series which you recently tested, no need to
> have the other CAT test changes).

I ran the test set several times.
MBA and MBM seem fine, but CAT is always "not ok".
The result is below.
---
$ sudo make -C tools/testing/selftests/resctrl run_tests
make: Entering directory '/**/tools/testing/selftests/resctrl'
TAP version 13
1..1
# selftests: resctrl: resctrl_tests
# TAP version 13
# # Pass: Check kernel supports resctrl filesystem
# # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists
# # resctrl filesystem not mounted
# # dmesg: [ 3.658029] resctrl: L3 allocation detected
# # dmesg: [ 3.658420] resctrl: MB allocation detected
# # dmesg: [ 3.658604] resctrl: L3 monitoring detected
# 1..4
# # Starting MBM BW change ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Benchmark PID: 9555
# # Writing benchmark parameters to resctrl FS
# # Write schema "MB:0=100" to resctrl FS
# # Checking for pass/fail
# # Pass: Check MBM diff within 5%
# # avg_diff_per: 0%
# # Span (MB): 250
# # avg_bw_imc: 6880
# # avg_bw_resc: 6895
# ok 1 MBM: bw change
# # Starting MBA Schemata change ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Benchmark PID: 9561
# # Writing benchmark parameters to resctrl FS
# # Write schema "MB:0=100" to resctrl FS
# # Write schema "MB:0=90" to resctrl FS
# # Write schema "MB:0=80" to resctrl FS
# # Write schema "MB:0=70" to resctrl FS
# # Write schema "MB:0=60" to resctrl FS
# # Write schema "MB:0=50" to resctrl FS
# # Write schema "MB:0=40" to resctrl FS
# # Write schema "MB:0=30" to resctrl FS
# # Write schema "MB:0=20" to resctrl FS
# # Write schema "MB:0=10" to resctrl FS
# # Results are displayed in (MB)
# # Pass: Check MBA diff within 5% for schemata 100
# # avg_diff_per: 0%
# # avg_bw_imc: 6874
# # avg_bw_resc: 6904
# # Pass: Check MBA diff within 5% for schemata 90
# # avg_diff_per: 1%
# # avg_bw_imc: 6738
# # avg_bw_resc: 6807
# # Pass: Check MBA diff within 5% for schemata 80
# # avg_diff_per: 1%
# # avg_bw_imc: 6735
# # avg_bw_resc: 6803
# # Pass: Check MBA diff within 5% for schemata 70
# # avg_diff_per: 1%
# # avg_bw_imc: 6702
# # avg_bw_resc: 6770
# # Pass: Check MBA diff within 5% for schemata 60
# # avg_diff_per: 1%
# # avg_bw_imc: 6632
# # avg_bw_resc: 6725
# # Pass: Check MBA diff within 5% for schemata 50
# # avg_diff_per: 1%
# # avg_bw_imc: 6510
# # avg_bw_resc: 6635
# # Pass: Check MBA diff within 5% for schemata 40
# # avg_diff_per: 2%
# # avg_bw_imc: 6206
# # avg_bw_resc: 6342
# # Pass: Check MBA diff within 5% for schemata 30
# # avg_diff_per: 1%
# # avg_bw_imc: 3826
# # avg_bw_resc: 3896
# # Pass: Check MBA diff within 5% for schemata 20
# # avg_diff_per: 1%
# # avg_bw_imc: 2820
# # avg_bw_resc: 2862
# # Pass: Check MBA diff within 5% for schemata 10
# # avg_diff_per: 1%
# # avg_bw_imc: 1876
# # avg_bw_resc: 1898
# # Pass: Check schemata change using MBA
# ok 2 MBA: schemata change
# # Starting CMT test ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Cache size :25952256
# # Benchmark PID: 9573
# # Writing benchmark parameters to resctrl FS
# # Checking for pass/fail
# # Pass: Check cache miss rate within 15%
# # Percent diff=10
# # Number of bits: 5
# # Average LLC val: 12994560
# # Cache span (bytes): 11796480
# ok 3 CMT: test
# # Starting CAT test ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Cache size :25952256
# # Writing benchmark parameters to resctrl FS
# # Write schema "L3:0=3f" to resctrl FS
# # Checking for pass/fail
# # Fail: Check cache miss rate within 4%
# # Percent diff=24
# # Number of bits: 6
# # Average LLC val: 275418
# # Cache span (lines): 221184
# not ok 4 CAT: test
# # Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
not ok 1 selftests: resctrl: resctrl_tests # exit=1
make: Leaving directory '/**/tools/testing/selftests/resctrl'
---

Best regards,
Shaopeng TAN
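
As a side note on the "why 59" question quoted above, the coverage claim can be checked in isolation. The sketch below is not part of fill_buf.c: distinct_slots() is a hypothetical helper that replays the index expression (count * 59) % size from the patch and counts how many different slots it touches. The multiplication is widened to 64 bits here so the check stays exact for any size; that widening is a choice made for this sketch and differs from the unsigned int arithmetic in the patch.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the resctrl selftests: count how many
 * distinct slots the expression (count * stride) % size visits while
 * count runs from size - 1 down to 0, as in the patched read loop.
 * Returns 0 if the bookkeeping array cannot be allocated.
 */
static unsigned int distinct_slots(unsigned int size, unsigned int stride)
{
	unsigned char *seen;
	unsigned int count = size;
	unsigned int hits = 0;

	assert(size > 0);

	seen = calloc(size, 1);
	if (!seen)
		return 0;

	while (count--) {
		/* 64-bit product: an assumption of this sketch, see lead-in */
		unsigned int idx =
			(unsigned int)(((unsigned long long)count * stride) % size);

		if (!seen[idx]) {
			seen[idx] = 1;
			hits++;
		}
	}

	free(seen);
	return hits;
}
```

Assuming CL_SIZE is 64, the 262144000-byte span from the earlier report gives size = 8192000, and distinct_slots(8192000, 59) returns 8192000, so every slot is read once. The property depends on size not being a multiple of 59: distinct_slots(118, 59) returns only 2.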