Re: [PATCH 16/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

Reinette Chatre <reinette.chatre@xxxxxxxxx> · Fri, 3 Nov 2023 15:50:13 -0700

Hi Ilpo,

On 11/3/2023 3:57 AM, Ilpo Järvinen wrote:
> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:

...

>>>  /*
>>> - * Change schemata. Write schemata to specified
>>> - * con_mon grp, mon_grp in resctrl FS.
>>> - * Run 5 times in order to get average values.
>>> + * Minimum difference in LLC misses between a test with n+1 bits CBM mask to
>>> + * the test with n bits. With e.g. 5 vs 4 bits in the CBM mask, the minimum
>>> + * difference must be at least MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.
>>
>> This formula is not clear to me. In the code the formula is always:
>> MIN_DIFF_PERCENT_PER_BIT * (bits - 1) ... is the "-1" because it always
>> decrements the number of bits tested by one?
> 
> No, -1 is not related to decrementing bits by one, but setting a boundary 
> which workds for 1 bit masks. In general,  the smaller the number of bits 
> in the allocation mask is, less change there will be between n+1 -> n bits
> results. When n is 1, the result with some platforms is close to zero so I 
> just had to make the min diff to allow it. Thus, n-1 to set the failure 
> threshold at 0%.
> 
>> So, for example, if testing
>> 5 then 3 bits it would be  MIN_DIFF_PERCENT_PER_BIT * (3 - 2)?
>> Would above example thus be:
>> MIN_DIFF_PERCENT_PER_BIT * (4 - (5 - 4)) = 3 ?
> 
> I suspect you're overthinking it here. The CAT selftest currently doesn't 
> jump from 5 to 3 bits so I don't know what you're trying to calculate 
> here.

I was trying to understand the formula. The example starts with "e.g 5 vs 4 bits
in the CBM mask ..." and then jumps to "MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3"
It is not obvious to me where the "5" and "4" from the example produces the
resulting formula.

Perhaps it would be simpler (to me) to say something like 

	Minimum difference in LLC misses between a test with n+1 bits CBM mask to
	the test with n bits is MIN_DIFF_PERCENT_PER_BIT * (n - 1). With e.g. 5 vs 4
	bits in the CBM mask, the minimum difference must be at least
	MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.

> 
>>> - * Return:	0 on success. non-zero on failure.
>>> + * Return:		0 on success. non-zero on failure.
>>
>> Is non-zero specific enough? Does that mean that <0 and >0 are failure?
> 
> I suspect it is, after all the cleanups and fixes that have been done.
> The wording is from the original though.
> 
>>>   */
>>> -static int cat_test(struct resctrl_val_param *param, size_t span)
>>> +static int cat_test(struct resctrl_val_param *param, size_t span, unsigned long current_mask)
>>>  {
>>> -	int memflush = 1, operation = 0, ret = 0;
>>>  	char *resctrl_val = param->resctrl_val;
>>>  	static struct perf_event_read pe_read;
>>>  	struct perf_event_attr pea;
>>> +	unsigned char *buf;
>>> +	char schemata[64];
>>> +	int ret, i, pe_fd;
>>>  	pid_t bm_pid;
>>> -	int pe_fd;
>>>  
>>>  	if (strcmp(param->filename, "") == 0)
>>>  		sprintf(param->filename, "stdio");
>>> @@ -143,54 +168,64 @@ static int cat_test(struct resctrl_val_param *param, size_t span)
>>>  	if (ret)
>>>  		return ret;
>>>  
>>> +	buf = alloc_buffer(span, 1);
>>> +	if (buf == NULL)
>>> +		return -1;
>>> +
>>>  	perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
>>>  	perf_event_initialize_read_format(&pe_read);
>>>  
>>> -	/* Test runs until the callback setup() tells the test to stop. */
>>> -	while (1) {
>>> -		ret = param->setup(param);
>>> -		if (ret == END_OF_TESTS) {
>>> -			ret = 0;
>>> -			break;
>>> -		}
>>> -		if (ret < 0)
>>> -			break;
>>> -		pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
>>> -		if (pe_fd < 0) {
>>> -			ret = -1;
>>> -			break;
>>> -		}
>>> +	while (current_mask) {
>>> +		snprintf(schemata, sizeof(schemata), "%lx", param->mask & ~current_mask);
>>> +		ret = write_schemata("", schemata, param->cpu_no, param->resctrl_val);
>>> +		if (ret)
>>> +			goto free_buf;
>>> +		snprintf(schemata, sizeof(schemata), "%lx", current_mask);
>>> +		ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, param->resctrl_val);
>>> +		if (ret)
>>> +			goto free_buf;
>>> +
>>> +		for (i = 0; i < NUM_OF_RUNS; i++) {
>>> +			mem_flush(buf, span);
>>> +			ret = fill_cache_read(buf, span, true);
>>> +			if (ret)
>>> +				goto free_buf;
>>> +
>>> +			pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
>>> +			if (pe_fd < 0) {
>>> +				ret = -1;
>>> +				goto free_buf;
>>> +			}
>>
>> It seems to me that the perf counters are reconfigured at every iteration.
>> Can it not just be configured once and then the counters just reset and
>> enabled at each iteration? I'd expect this additional work to impact
>> values measured.
> 
> So you suggest I undo one of the changes made in 10/24 and just call the 
> function which does only the ioctl() calls? I don't know why it has been 
> done the way it has been, I can try to change it and see what happens.

The relationships between the functions are not as obvious to me but as
far as execution is concerned I am indeed suggesting that only the ioctl()s
to reset and enable counters are repeated in the loop. From what I understand
the counters need only be configured once.

Reinette