Hi Babu, On 10/29/24 4:21 PM, Babu Moger wrote: > The mbm_cntr_assign mode offers several hardware counters that can be > assigned to an RMID, event pair and monitor the bandwidth as long as it > is assigned. > > Counters are managed at two levels. The global assignment is tracked > using the mbm_cntr_free_map field in the struct resctrl_mon, while > domain-specific assignments are tracked using the mbm_cntr_map field > in the struct rdt_mon_domain. Allocation begins at the global level > and is then applied individually to each domain. > > Introduce an interface to allocate these counters and update the > corresponding domains accordingly. > > Signed-off-by: Babu Moger <babu.moger@xxxxxxx> > --- ... > diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h > index 00f7bf60e16a..cb496bd97007 100644 > --- a/arch/x86/kernel/cpu/resctrl/internal.h > +++ b/arch/x86/kernel/cpu/resctrl/internal.h > @@ -717,6 +717,8 @@ unsigned int mon_event_config_index_get(u32 evtid); > int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d, > enum resctrl_event_id evtid, u32 rmid, u32 closid, > u32 cntr_id, bool assign); > +int rdtgroup_assign_cntr_event(struct rdt_resource *r, struct rdtgroup *rdtgrp, > + struct rdt_mon_domain *d, enum resctrl_event_id evtid); > void rdt_staged_configs_clear(void); > bool closid_allocated(unsigned int closid); > int resctrl_find_cleanest_closid(void); > diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c > index 1b5529c212f5..bc3752967c44 100644 > --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c > +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c > @@ -1924,6 +1924,93 @@ int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d, > return 0; > } > > +/* > + * Configure the counter for the event, RMID pair for the domain. > + * Update the bitmap and reset the architectural state. > + */ > +static int resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d, > + enum resctrl_event_id evtid, u32 rmid, u32 closid, > + u32 cntr_id, bool assign) > +{ > + int ret; > + > + ret = resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, assign); > + if (ret) > + return ret; > + > + if (assign) > + __set_bit(cntr_id, d->mbm_cntr_map); > + else > + __clear_bit(cntr_id, d->mbm_cntr_map); > + > + /* > + * Reset the architectural state so that reading of hardware > + * counter is not considered as an overflow in next update. > + */ > + resctrl_arch_reset_rmid(r, d, closid, rmid, evtid); resctrl_arch_reset_rmid() expects to be run on a CPU that is in the domain @d ... note that after the architectural state is reset it initializes the state by reading the event on the current CPU. By running it here it is run on a random CPU that may not be in the right domain. > + > + return ret; > +} > + > +static bool mbm_cntr_assigned_to_domain(struct rdt_resource *r, u32 cntr_id) > +{ > + struct rdt_mon_domain *d; > + > + list_for_each_entry(d, &r->mon_domains, hdr.list) > + if (test_bit(cntr_id, d->mbm_cntr_map)) > + return 1; > + > + return 0; > +} > + > +/* > + * Assign a hardware counter to event @evtid of group @rdtgrp. > + * Counter will be assigned to all the domains if rdt_mon_domain is NULL > + * else the counter will be assigned to specific domain. > + */ > +int rdtgroup_assign_cntr_event(struct rdt_resource *r, struct rdtgroup *rdtgrp, > + struct rdt_mon_domain *d, enum resctrl_event_id evtid) > +{ > + int index = MBM_EVENT_ARRAY_INDEX(evtid); > + int cntr_id = rdtgrp->mon.cntr_id[index]; > + int ret; > + > + /* > + * Allocate a new counter id to the event if the counter is not > + * assigned already. > + */ > + if (cntr_id == MON_CNTR_UNSET) { > + cntr_id = mbm_cntr_alloc(r); > + if (cntr_id < 0) { > + rdt_last_cmd_puts("Out of MBM assignable counters\n"); > + return -ENOSPC; > + } > + rdtgrp->mon.cntr_id[index] = cntr_id; > + } > + > + if (!d) { > + list_for_each_entry(d, &r->mon_domains, hdr.list) { > + ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid, > + rdtgrp->closid, cntr_id, true); > + if (ret) > + goto out_done_assign; This may not be what users expect. What if, for example, domain #1 has a counter assigned to "total" event and then user wants to change that to assign a counter to "total" event of all domains. Would this not reconfigure the counter associated with domain #1 and unnecessarily reset it? Could this be made a bit smarter to only configure a counter on a domain if it is not already configured? This could perhaps form part of resctrl_config_cntr() to not scatter the duplicate check everywhere. What do you think? Also, looks like this can do partial assignment. For example, if one of the domains encounter a failure then domains already configured are not undone. This matches other similar flows but is not documented and left to reader to decipher. > + } > + } else { > + ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid, > + rdtgrp->closid, cntr_id, true); Looking at flows calling rdtgroup_assign_cntr_event() I do not see a check if counter is already assigned. So, if a user makes a loop of assigning a counter to the same event over and over it will result in an IPI every time. This seems unnecessary, what do you think? > + if (ret) > + goto out_done_assign; > + } > + > +out_done_assign: > + if (ret && !mbm_cntr_assigned_to_domain(r, cntr_id)) { > + mbm_cntr_free(r, cntr_id); > + rdtgroup_cntr_id_init(rdtgrp, evtid); > + } > + > + return ret; > +} > + > /* rdtgroup information files for one cache resource. */ > static struct rftype res_common_files[] = { > { Reinette