On Tue, 28 Nov 2023, Reinette Chatre wrote:

> On 11/20/2023 3:13 AM, Ilpo Järvinen wrote:
> > CAT test spawns two processes into two different control groups with
> > exclusive schemata. Both the processes alloc a buffer from memory
> > matching their allocated LLC block size and flush the entire buffer out
> > of caches. Since the processes are reading through the buffer only once
> > during the measurement and initially all the buffer was flushed, the
> > test isn't testing CAT.
> >
> > Rewrite the CAT test to allocate a buffer sized to half of LLC. Then
> > perform a sequence of tests with different LLC alloc sizes starting
> > from half of the CBM bits down to 1-bit CBM. Flush the buffer before
> > each test and read the buffer twice. Observe the LLC misses on the
> > second read through the buffer. As the allocated LLC block gets smaller
> > and smaller, the LLC misses will become larger and larger giving a
> > strong signal on CAT working properly.
> >
> > The new CAT test is using only a single process because it relies on
> > measured effect against another run of itself rather than another
> > process adding noise. The rest of the system is set to use the CBM bits
> > not used by the CAT test to keep the test isolated.
> >
> > Replace count_bits() with count_contiguous_bits() to get the first bit
> > position in order to be able to calculate masks based on it.
> >
> > This change has been tested with a number of systems from different
> > generations.
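As a rough illustration of the mask arithmetic the commit message above
describes (find the first bit of a contiguous CBM, then shrink the test's
mask from half of the bits down to a single bit while the rest of the
system gets the remaining bits), here is a minimal sketch. It is not the
code from the patch: sketch_count_contiguous_bits(), sketch_mask() and
sketch_print_masks() are hypothetical names, and taking the test's bits
from the low end of the CBM is an assumption made only for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the patch's count_contiguous_bits(): count the
 * 1-bits in the contiguous run that begins at the lowest set bit of val
 * and report that lowest bit position through *start.
 */
static unsigned int sketch_count_contiguous_bits(unsigned long val,
						 unsigned int *start)
{
	unsigned int first = 0, count = 0;

	if (!val) {
		*start = 0;
		return 0;
	}

	/* Skip the zero bits below the run. */
	while (!(val & 1)) {
		val >>= 1;
		first++;
	}

	/* Count the run itself. */
	while (val & 1) {
		val >>= 1;
		count++;
	}

	*start = first;
	return count;
}

/* Mask of n contiguous bits with the lowest bit at position start. */
static unsigned long sketch_mask(unsigned int start, unsigned int n)
{
	const unsigned int bits = sizeof(unsigned long) * CHAR_BIT;

	assert(n > 0 && n < bits && start + n <= bits);

	return ((1UL << n) - 1) << start;
}

/*
 * Walk from half of the CBM bits down to a 1-bit CBM. For each step,
 * print the mask the test would use and the complementary mask that
 * would be left for the rest of the system (assumption: low bits go to
 * the test).
 */
static void sketch_print_masks(unsigned long full_mask)
{
	unsigned int start, total, n;

	total = sketch_count_contiguous_bits(full_mask, &start);

	for (n = total / 2; n >= 1; n--) {
		unsigned long test_mask = sketch_mask(start, n);
		unsigned long rest_mask = full_mask & ~test_mask;

		printf("bits=%u test=%lx rest=%lx\n", n, test_mask, rest_mask);
	}
}
```

Calling sketch_print_masks() with a full CBM such as 0xfff would print one
line per step, which makes it easy to check by eye that the two masks never
overlap.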
> >
> > Suggested-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> > ---
> >  tools/testing/selftests/resctrl/cat_test.c  | 282 +++++++++----------
> >  tools/testing/selftests/resctrl/fill_buf.c  |   6 +-
> >  tools/testing/selftests/resctrl/resctrl.h   |   5 +-
> >  tools/testing/selftests/resctrl/resctrlfs.c |  44 +--
> >  4 files changed, 138 insertions(+), 199 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> > index cfda87667b46..4169b17b8f91 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -11,65 +11,69 @@
> >  #include "resctrl.h"
> >  #include <unistd.h>
> >
> > -#define RESULT_FILE_NAME1 "result_cat1"
> > -#define RESULT_FILE_NAME2 "result_cat2"
> > +#define RESULT_FILE_NAME "result_cat"
> >  #define NUM_OF_RUNS 5
> > -#define MAX_DIFF_PERCENT 4
> > -#define MAX_DIFF 1000000
> >
> >  /*
> > - * Change schemata. Write schemata to specified
> > - * con_mon grp, mon_grp in resctrl FS.
> > - * Run 5 times in order to get average values.
> > + * Minimum difference in LLC misses between a test with n+1 bits CBM to the
> > + * test with n bits is MIN_DIFF_PERCENT_PER_BIT * (n - 1). With e.g. 5 vs 4
> > + * bits in the CBM mask, the minimum difference must be at least
> > + * MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.
> > + *
> > + * The relationship between number of used CBM bits and difference in LLC
> > + * misses is not expected to be linear. With a small number of bits, the
> > + * margin is smaller than with larger number of bits. For selftest purposes,
> > + * however, linear approach is enough because ultimately only pass/fail
> > + * decision has to be made and distinction between strong and stronger
> > + * signal is irrelevant.
> >  */
> > -static int cat_setup(struct resctrl_val_param *p)
> > -{
> > -	char schemata[64];
> > -	int ret = 0;
> > -
> > -	/* Run NUM_OF_RUNS times */
> > -	if (p->num_of_runs >= NUM_OF_RUNS)
> > -		return END_OF_TESTS;
> > -
> > -	if (p->num_of_runs == 0) {
> > -		sprintf(schemata, "%lx", p->mask);
> > -		ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no,
> > -				     p->resctrl_val);
> > -	}
> > -	p->num_of_runs++;
> > -
> > -	return ret;
> > -}
> > +#define MIN_DIFF_PERCENT_PER_BIT 1
> >
> >  static int show_results_info(__u64 sum_llc_val, int no_of_bits,
> > -			     unsigned long cache_span, unsigned long max_diff,
> > -			     unsigned long max_diff_percent, unsigned long num_of_runs,
> > -			     bool platform)
> > +			     unsigned long cache_span, long min_diff_percent,
>
> With all care taken in unsigned use I wonder why min_diff_percent is
> just long?

This was a leftover from the time when I still wasn't using floats, so
the compare typing was easier. But that's long gone now, so I'll make it
unsigned long.

> It looks to me as though this test impacts the affinity of main program
> since it is only one process, changes its affinity, but never change it back.

Ah, right. It looks like a pre-existing problem, though: despite there
being more than one process in the old CAT test, it altered the affinity
of both of them. I'll need to look into fixing this.

-- 
 i.
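
On the affinity point above, one possible direction is to save the
process's CPU mask before pinning it and to restore the mask once the
measurement is done. The sketch below only illustrates that idea with the
glibc sched_getaffinity()/sched_setaffinity() wrappers. The helpers
sketch_pin_to_cpu() and sketch_restore_affinity() are hypothetical names
and are not part of the resctrl selftests, and the sketch assumes the
machine's CPUs fit into a fixed-size cpu_set_t.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: remember the calling process's current CPU
 * affinity in *old_set, then restrict the process to a single CPU.
 * Returns 0 on success and -1 on failure (errno is set by the failing
 * call).
 */
static int sketch_pin_to_cpu(int cpu, cpu_set_t *old_set)
{
	cpu_set_t new_set;

	assert(old_set != NULL);

	/* Save the original mask so that it can be restored later. */
	if (sched_getaffinity(0, sizeof(*old_set), old_set))
		return -1;

	CPU_ZERO(&new_set);
	CPU_SET(cpu, &new_set);

	return sched_setaffinity(0, sizeof(new_set), &new_set);
}

/* Hypothetical helper: put back the mask saved by sketch_pin_to_cpu(). */
static int sketch_restore_affinity(const cpu_set_t *old_set)
{
	assert(old_set != NULL);

	return sched_setaffinity(0, sizeof(*old_set), old_set);
}
```

A test would call sketch_pin_to_cpu() before starting the measurement and
sketch_restore_affinity() on every exit path afterwards, including the
error paths, so that the main program keeps the affinity it started with.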