Re: [PATCH v4 1/2] selftests/resctrl: Adjust effective L3 cache size with SNC enabled

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, thanks for the review,

On 2024-08-12 at 16:40:00 -0700, Reinette Chatre wrote:
>Hi Maciej,
>
>On 7/12/24 2:04 AM, Maciej Wieczor-Retman wrote:
>> Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA
>> nodes. Systems may support splitting into either two, three or four
>> nodes.
>> 
>> When SNC mode is enabled the effective amount of L3 cache available
>> for allocation is divided by the number of nodes per L3.
>> 
>> Detect which SNC mode is active by comparing the number of CPUs
>> that share a cache with CPU0, with the number of CPUs on node0.
>> 
>> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
>> Co-developed-by: Maciej Wieczor-Retman <maciej.wieczor-retman@xxxxxxxxx>
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@xxxxxxxxx>
>
>Since you are "From:" author there is no need for a "Co-developed-by"
>for you. Tony does need one. Please check: "When to use Acked-by:, Cc:,
>and Co-developed-by:" in Documentation/process/submitting-patches.rst
>(checkpatch.pl also warns about this).

Thanks, I changed patch's author at some point and I think I forgot to change
the tags.

>
>> ---
>> Changelog v4:
>> - Make returned value a static local variable so the function only runs
>>    the logic once. (Reinette)
>> 
>> Changelog v3:
>> - Add comparison between present and online cpus to test if the
>>    calculated SNC mode is credible. (Reinette)
>> - Added comment to cache size modification to better explain why it is
>>    needed there. (Reinette)
>> - Fix facts in patch message. (Reinette)
>> - Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)
>> 
>>   tools/testing/selftests/resctrl/resctrl.h   |  4 ++
>>   tools/testing/selftests/resctrl/resctrlfs.c | 73 +++++++++++++++++++++
>>   2 files changed, 77 insertions(+)
>> 
>> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
>> index 2dda56084588..851b37c9c38a 100644
>> --- a/tools/testing/selftests/resctrl/resctrl.h
>> +++ b/tools/testing/selftests/resctrl/resctrl.h
>> @@ -11,6 +11,7 @@
>>   #include <signal.h>
>>   #include <dirent.h>
>>   #include <stdbool.h>
>> +#include <ctype.h>
>>   #include <sys/stat.h>
>>   #include <sys/ioctl.h>
>>   #include <sys/mount.h>
>> @@ -43,6 +44,8 @@
>>   #define DEFAULT_SPAN		(250 * MB)
>> +#define MAX_SNC		4
>> +
>>   /*
>>    * user_params:		User supplied parameters
>>    * @cpu:		CPU number to which the benchmark will be bound to
>> @@ -120,6 +123,7 @@ extern volatile int *value_sink;
>>   extern char llc_occup_path[1024];
>> +int snc_nodes_per_l3_cache(void);
>>   int get_vendor(void);
>>   bool check_resctrlfs_support(void);
>>   int filter_dmesg(void);
>> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
>> index 250c320349a7..803dd415984c 100644
>> --- a/tools/testing/selftests/resctrl/resctrlfs.c
>> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
>> @@ -156,6 +156,68 @@ int get_domain_id(const char *resource, int cpu_no, int *domain_id)
>>   	return 0;
>>   }
>> +/*
>> + * Count number of CPUs in a /sys bit map
>
>"bit map" -> "bitmap"

Will do.

>
>> + */
>> +static unsigned int count_sys_bitmap_bits(char *name)
>> +{
>> +	FILE *fp = fopen(name, "r");
>> +	int count = 0, c;
>> +
>> +	if (!fp)
>> +		return 0;
>> +
>> +	while ((c = fgetc(fp)) != EOF) {
>> +		if (!isxdigit(c))
>> +			continue;
>> +		switch (c) {
>> +		case 'f':
>> +			count++;
>> +		case '7': case 'b': case 'd': case 'e':
>> +			count++;
>> +		case '3': case '5': case '6': case '9': case 'a': case 'c':
>> +			count++;
>> +		case '1': case '2': case '4': case '8':
>> +			count++;
>> +		}
>> +	}
>> +	fclose(fp);
>> +
>> +	return count;
>> +}
>> +
>> +/*
>> + * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0.
>> + * If some CPUs are offline the numbers may not be exact multiples of each
>> + * other. Any offline CPUs on node0 will be also gone from shared_cpu_map of
>> + * CPU0 but offline CPUs from other nodes will only make the cache_cpus value
>> + * lower. Still try to get the ratio right by preventing the second possibility.
>
>This all seems unnecessary since the next patch sets snc_mode to 1 if there
>are any offline CPUs.
>Seems more appropriate to move the offline CPU handling to this patch.

Okay, I'll move it here.

>
>> + */
>> +int snc_nodes_per_l3_cache(void)
>> +{
>> +	int node_cpus, cache_cpus, i;
>> +	static int snc_mode;
>> +
>> +	if (!snc_mode) {
>> +		node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
>> +		cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
>> +
>> +		if (!node_cpus || !cache_cpus) {
>> +			ksft_print_msg("Could not determine Sub-NUMA Cluster mode.\n");
>> +			return 1;
>> +		}
>> +
>> +		for (i = 1; i <= MAX_SNC ; i++) {
>
>(nit: unnecessary space)

Will fix.

>
>> +			if (i * node_cpus >= cache_cpus) {
>> +				snc_mode = i;
>> +				break;
>> +			}
>
>This is irrelevant after the subsequent patch but note that there are scenarios
>where above loop cannot set snc_mode and the function will thus return 0 since
>snc_mode is not initialized. This is a problem in division done by following hunk.

Oh right, I'll set initial value to 1.

>
>> +		}
>> +	}
>> +
>> +	return snc_mode;
>> +}
>> +
>>   /*
>>    * get_cache_size - Get cache size for a specified CPU
>>    * @cpu_no:	CPU number
>> @@ -211,6 +273,17 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size
>>   			break;
>>   	}
>> +	/*
>> +	 * The amount of cache represented by each bit in the masks
>> +	 * in the schemata file is reduced by a factor equal to SNC
>> +	 * nodes per L3 cache.
>> +	 * E.g. on a SNC-2 system with a 100MB L3 cache a test that
>> +	 * allocates memory from its local SNC node (default behavior
>> +	 * without using libnuma) will only see 50 MB llc_occupancy
>> +	 * with a fully populated L3 mask in the schemata file.
>> +	 */
>> +	if (cache_num == 3)
>> +		*cache_size /= snc_nodes_per_l3_cache();
>>   	return 0;
>>   }
>
>Reinette

-- 
Kind regards
Maciej Wieczór-Retman




[Index of Archives]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux