Hi Reinette, On 2/6/25 12:05, Reinette Chatre wrote: > Hi Babu, > > On 1/22/25 12:20 PM, Babu Moger wrote: >> Resctrl subsystem can support two monitoring modes, 'mbm_cntr_assign' or >> 'default'. In mbm_cntr_assign, monitoring event can only accumulate data >> while it is backed by a hardware counter. In 'default' mode, resctrl >> assumes there is a hardware counter for each event within every CTRL_MON >> and MON group. >> >> Introduce interface to switch between mbm_cntr_assign and default modes. >> >> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> [mbm_cntr_assign] >> default >> >> To enable the "mbm_cntr_assign" mode: >> $ echo "mbm_cntr_assign" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> >> To enable the default monitoring mode: >> $ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> >> MBM event counters are automatically reset as part of changing the mode. >> Clear both architectural and non-architectural event states to prevent >> overflow conditions during the next event read. >> >> Signed-off-by: Babu Moger <babu.moger@xxxxxxx> >> --- > > >> --- >> Documentation/arch/x86/resctrl.rst | 25 ++++++++++++- >> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 50 +++++++++++++++++++++++++- >> 2 files changed, 73 insertions(+), 2 deletions(-) >> >> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst >> index 072b15550ff7..5d18c4c8bc48 100644 >> --- a/Documentation/arch/x86/resctrl.rst >> +++ b/Documentation/arch/x86/resctrl.rst >> @@ -259,7 +259,10 @@ with the following files: >> >> "mbm_assign_mode": >> Reports the list of monitoring modes supported. The enclosed brackets >> - indicate which mode is enabled. >> + indicate which mode is enabled. The MBM events (mbm_total_bytes and/or >> + mbm_local_bytes) associated with counters may reset when "mbm_assign_mode" >> + is changed. >> + >> :: >> >> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> @@ -275,6 +278,16 @@ with the following files: >> available is described in the "num_mbm_cntrs" file. Changing the mode >> may cause all counters on a resource to reset. >> >> + Moving to mbm_cntr_assign mode require users to assign the counters to >> + the events. Otherwise, the MBM event counters will return "Unassigned" >> + when read. > > Again ... please be consistent in using single or double quotes for information > returned from file. ok. Will change it to 'Unassigned'. > >> + >> + The mode is beneficial for AMD platforms that support more CTRL_MON >> + and MON groups than available hardware counters. By default, this >> + feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth >> + Monitoring Counters) capability, ensuring counters remain assigned even >> + when the corresponding RMID is not actively used by any processor. >> + >> "default": >> >> In default mode, resctrl assumes there is a hardware counter for each >> @@ -283,6 +296,16 @@ with the following files: >> "mbm_total_bytes" or "mbm_local_bytes" will report 'Unavailable' if >> there is no counter associated with that event. >> >> + * To enable "mbm_cntr_assign" mode: >> + :: >> + >> + # echo "mbm_cntr_assign" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> + >> + * To enable default monitoring mode: >> + :: >> + >> + # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> + > > Please be consistent in the documentation. > > To enable "mbm_cntr_assign" mode: > To enable "default" mode: > or > To enable "mbm_cntr_assign" monitoring mode: > To enable "default" monitoring mode: This sounds (monitoring mode) better. > or > ...? > > > >> "num_mbm_cntrs": >> The number of monitoring counters available for assignment when the >> system supports mbm_cntr_assign mode. >> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> index f61f0cd032ef..6922173c4f8f 100644 >> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> @@ -928,6 +928,53 @@ static int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of, >> return ret; >> } >> >> +static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of, >> + char *buf, size_t nbytes, loff_t off) >> +{ >> + struct rdt_resource *r = of->kn->parent->priv; >> + int ret = 0; >> + bool enable; >> + >> + /* Valid input requires a trailing newline */ >> + if (nbytes == 0 || buf[nbytes - 1] != '\n') >> + return -EINVAL; >> + >> + buf[nbytes - 1] = '\0'; >> + >> + cpus_read_lock(); >> + mutex_lock(&rdtgroup_mutex); >> + >> + rdt_last_cmd_clear(); >> + >> + if (!strcmp(buf, "default")) { >> + enable = 0; >> + } else if (!strcmp(buf, "mbm_cntr_assign")) { >> + if (r->mon.mbm_cntr_assignable) { >> + enable = 1; >> + } else { >> + ret = -EINVAL; >> + rdt_last_cmd_puts("mbm_cntr_assign mode is not supported\n"); >> + goto write_exit; >> + } >> + } else { >> + ret = -EINVAL; >> + rdt_last_cmd_puts("Unsupported assign mode\n"); >> + goto write_exit; >> + } >> + >> + if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) { >> + ret = resctrl_arch_mbm_cntr_assign_set(r, enable); >> + if (!ret) >> + mbm_cntr_reset(r); > > The following APIs interact with the MBM assignable counters: > > mbm_cntr_alloc() > mbm_cntr_get() > mbm_cntr_free() > > mbm_cntr_reset() appears to be related but does significantly more > than interact with the MBM assignable counters and that creates a > confusing API. > > How about introducing mbm_cntr_free_all() that _only_ releases all > MBM assignable counters and match with mbm_cntr_free() that releases > a single MBM assignable counter? mbm_cntr_free_all() lives with the > other functions operating on MBM assignable counters, thus not > hiding its functionality in other parts of resctrl. > > This series open codes reset of non-architectural state in two places, > within mbm_cntr_reset() and within mbm_config_write_domain(). That > can be turned into a new helper that only resets architectural state, > for example resctrl_reset_rmid_all() to match existing > resctrl_arch_reset_rmid_all(). > > resctrl_arch_mbm_cntr_assign_set() can also reset any architectural > state leaving mbm_cntr_free_all() and resctrl_reset_rmid_all() to be called > here and from within mbm_config_write_domain(). > > What do you think? Sounds like a good code separation. It should be fine. Will let you know if there are any issues. > >> + } >> + >> +write_exit: >> + mutex_unlock(&rdtgroup_mutex); >> + cpus_read_unlock(); >> + >> + return ret ?: nbytes; >> +} >> + >> #ifdef CONFIG_PROC_CPU_RESCTRL >> >> /* >> @@ -1945,9 +1992,10 @@ static struct rftype res_common_files[] = { >> }, >> { >> .name = "mbm_assign_mode", >> - .mode = 0444, >> + .mode = 0644, >> .kf_ops = &rdtgroup_kf_single_ops, >> .seq_show = resctrl_mbm_assign_mode_show, >> + .write = resctrl_mbm_assign_mode_write, >> .fflags = RFTYPE_MON_INFO, >> }, >> { > > Reinette > -- Thanks Babu Moger