On Fri, 21 Apr 2023, Reinette Chatre wrote: > Hi Ilpo, > > On 4/18/2023 4:45 AM, Ilpo Järvinen wrote: > > CAT test spawns two processes into two different control groups with > > exclusive schemata. Both the processes alloc a buffer from memory > > matching their allocated LLC block size and flush the entire buffer out > > of caches. Since the processes are reading through the buffer only once > > during the measurement and initially all the buffer was flushed, the > > test isn't testing CAT. > > > > Rewrite the CAT test to allocated a buffer sized to half of LLC. Then > > "allocated a buffer" -> "allocate a buffer" ? > > > perform a sequence of tests with different LLC alloc sizes starting > > from half of the CBM bits down to 1-bit CBM. Flush the buffer before > > each test and read the buffer twice. Observe the LLC misses on the > > second read through the buffer. As the allocated LLC block gets smaller > > and smaller, the LLC misses will become larger and larger giving a > > strong signal on CAT working properly. > > Since the changelog starts by describing the CAT test needing two > processes I think it would help to highlight that this test uses a > single process. I think it would also help to describing how the cache > is used by the rest while this test is running. Sure, good points, I'll add the info. > > Suggested-by: Reinette Chatre <reinette.chatre@xxxxxxxxx> > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> > > --- > > tools/testing/selftests/resctrl/cache.c | 20 +- > > tools/testing/selftests/resctrl/cat_test.c | 204 +++++++++------------ > > 2 files changed, 97 insertions(+), 127 deletions(-) > > > > diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c > > index 7970239413da..64f08ba5edc2 100644 > > --- a/tools/testing/selftests/resctrl/cache.c > > +++ b/tools/testing/selftests/resctrl/cache.c > > @@ -224,10 +224,10 @@ int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid) > > */ > > int cat_val(struct resctrl_val_param *param) > > { > > - int memflush = 1, operation = 0, ret = 0; > > char *resctrl_val = param->resctrl_val; > > unsigned long llc_perf_miss = 0; > > pid_t bm_pid; > > + int ret; > > > > if (strcmp(param->filename, "") == 0) > > sprintf(param->filename, "stdio"); > > @@ -245,6 +245,10 @@ int cat_val(struct resctrl_val_param *param) > > if (ret) > > return ret; > > > > + ret = alloc_buffer(param->span, 1); > > + if (ret) > > + return ret; > > + > > initialize_llc_perf(); > > > > /* Test runs until the callback setup() tells the test to stop. */ > > @@ -256,17 +260,15 @@ int cat_val(struct resctrl_val_param *param) > > } > > if (ret < 0) > > break; > > + > > + flush_buffer(param->span); > > + use_buffer(param->span, 0, true); > > + > > ret = reset_enable_llc_perf(bm_pid, param->cpu_no); > > if (ret) > > break; > > > > - if (run_fill_buf(param->span, memflush, operation, true)) { > > - fprintf(stderr, "Error-running fill buffer\n"); > > - ret = -1; > > - break; > > - } > > - > > - sleep(1); > > + use_buffer(param->span, 0, true); > > > > /* Measure cache miss from perf */ > > ret = get_llc_perf(&llc_perf_miss); > > @@ -279,6 +281,8 @@ int cat_val(struct resctrl_val_param *param) > > break; > > } > > > > + free_buffer(); > > + > > return ret; > > } > > > > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c > > index 4b505fdb35d7..85053829b9c5 100644 > > --- a/tools/testing/selftests/resctrl/cat_test.c > > +++ b/tools/testing/selftests/resctrl/cat_test.c > > @@ -11,11 +11,12 @@ > > #include "resctrl.h" > > #include <unistd.h> > > > > -#define RESULT_FILE_NAME1 "result_cat1" > > -#define RESULT_FILE_NAME2 "result_cat2" > > -#define NUM_OF_RUNS 5 > > -#define MAX_DIFF_PERCENT 4 > > -#define MAX_DIFF 1000000 > > +#define RESULT_FILE_NAME "result_cat" > > +#define NUM_OF_RUNS 5 > > +#define MIN_DIFF_PERCENT_PER_BIT 2 > > Could you please start a new trend that adds documentation > that explains what this constant means and how it was chosen? I can try although that particular 2 was a bit handwavy that just seems to work with the tests I performed. > > +static unsigned long current_mask; > > +static long prev_avg_llc_val; > > > > /* > > * Change schemata. Write schemata to specified > > @@ -28,13 +29,24 @@ static int cat_setup(struct resctrl_val_param *p) > > int ret = 0; > > > > /* Run NUM_OF_RUNS times */ > > - if (p->num_of_runs >= NUM_OF_RUNS) > > - return END_OF_TESTS; > > + if (p->num_of_runs >= NUM_OF_RUNS) { > > + /* Remove one bit from the consecutive block */ > > + current_mask &= current_mask >> 1; > > + if (!current_mask) > > + return END_OF_TESTS; > > + > > + p->num_of_runs = 0; > > This seems like a workaround to get the schemata to be written. It is > problematic since now p->num_of_runs no longer accurately reflects the > number of test runs. This is already the case. MBA test works around this very same problem by using a custom static variable (runs_per_allocation) which is reset to 0 every NUM_OF_RUNS tests and not keeping ->num_of_runs at all. If MBA test would replace runs_per_allocation with use of ->num_of_runs, it would match what the new CAT test does. Nothing currently relies on ->num_of_runs counting across the different "tests" that are run inside CAT and MBA tests. And I don't have anything immediately around the corner that would require ->num_of_runs to count total number of repetitions that were ran. I guess it would be possible to attempt to consolidate that second layer MBA and the rewritten CAT tests need somehow into resctrl_val_param. But IMHO that too is low-prio refactor as nothing is broken as is. > I was expecting this mask manipulation to be > in cat_val() so that it is clear how test works instead of part > of the logic handled here. That seems to be moving into opposite direction from how things are currently handled. Doing it in cat_val() would be relying less on ->setup(). If that's the preferred direction, then the question becomes, should CAT test do anything in ->setup() because also the schemata writing could be done in directly cat_val(). What I would prefer not to do is to have a rule which says: if there's a test-specific function, don't use ->setup() but do any setup directly in the test-specific function but, otherwise use ->setup(). Such an inconsistency would make things hard to track. > > + } > > > > if (p->num_of_runs == 0) { > > - sprintf(schemata, "%lx", p->mask); > > - ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no, > > - p->resctrl_val); > > + snprintf(schemata, sizeof(schemata), "%lx", p->mask & ~current_mask); > > + ret = write_schemata("", schemata, p->cpu_no, p->resctrl_val); > > + if (ret) > > + return ret; > > + snprintf(schemata, sizeof(schemata), "%lx", current_mask); > > + ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no, p->resctrl_val); > > + if (ret) > > + return ret; > > } > > p->num_of_runs++; > > > > ... > > > @@ -126,7 +162,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) > > ret = get_mask_no_shareable(cache_type, &long_mask); > > if (ret) > > return ret; > > - count_of_bits = count_consecutive_bits(long_mask, NULL); > > + count_of_bits = count_consecutive_bits(long_mask, &start); > > > > /* Get L3/L2 cache size */ > > ret = get_cache_size(cpu_no, cache_type, &cache_size); > > @@ -143,99 +179,29 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) > > count_of_bits - 1); > > return -1; > > } > > - > > - /* Get core id from same socket for running another thread */ > > - sibling_cpu_no = get_core_sibling(cpu_no); > > Do any users of get_core_sibling() remain after this? Correct observation, there seems to be no other users after this is removed. -- i.