On 8/24/22 16:16, Reinette Chatre wrote: > Hi Babu, > > On 8/22/2022 6:44 AM, Babu Moger wrote: >> Add the sysfs interface to write the event configuration for the >> MBM configurable events. The event configuration can be changed by >> writing to the sysfs file for that specific event. >> >> Following are the types of events supported. >> ================================================================== >> Bits Description >> 6 Dirty Victims from the QOS domain to all types of memory >> 5 Reads to slow memory in the non-local NUMA domain >> 4 Reads to slow memory in the local NUMA domain >> 3 Non-temporal writes to non-local NUMA domain >> 2 Non-temporal writes to local NUMA domain >> 1 Reads to memory in the non-local NUMA domain >> 0 Reads to memory in the local NUMA domain >> >> By default the mbm_total_bytes configuration is set to 0x7f to count >> all the types of events and mbm_local_bytes configuration is set to >> 0x15 to count all the local memory events. >> >> For example: >> To change the mbm_total_bytes to count all the reads, run the command. >> $echo 0x33 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config >> >> To change the mbm_local_bytes to count all the slow memory reads, run >> the command. >> $echo 0x30 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config >> >> Signed-off-by: Babu Moger <babu.moger@xxxxxxx> >> Reviewed-by: Ingo Molnar <mingo@xxxxxxxxxx> >> --- >> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 109 ++++++++++++++++++++++++++++++++ >> 1 file changed, 109 insertions(+) >> >> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> index e1847d49fa15..83c8780726ff 100644 >> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> @@ -323,9 +323,118 @@ int rdtgroup_mondata_config_show(struct seq_file *m, void *arg) >> return ret; >> } >> >> +/* >> + * This is called via IPI to read the CQM/MBM counters >> + * in a domain. >> + */ >> +void mon_event_config_write(void *info) >> +{ >> + union mon_data_bits *md = info; >> + u32 evtid = md->u.evtid; >> + u32 msr_index; >> + >> + switch (evtid) { >> + case QOS_L3_MBM_TOTAL_EVENT_ID: >> + msr_index = 0; >> + break; >> + case QOS_L3_MBM_LOCAL_EVENT_ID: >> + msr_index = 1; >> + break; >> + default: >> + return; /* Not expected to come here */ > Please no inline comments. Ok sure. > >> + } >> + >> + wrmsr(MSR_IA32_EVT_CFG_BASE + msr_index, md->u.mon_config, 0); >> +} >> + >> +ssize_t rdtgroup_mondata_config_write(struct kernfs_open_file *of, >> + char *buf, size_t nbytes, loff_t off) >> +{ >> + struct rdt_hw_resource *hw_res; >> + struct rdtgroup *rdtgrp; >> + struct rdt_resource *r; >> + unsigned int mon_config; >> + cpumask_var_t cpu_mask; >> + union mon_data_bits md; >> + struct rdt_domain *d; >> + u32 resid, domid; >> + int ret = 0, cpu; >> + >> + ret = kstrtouint(buf, 0, &mon_config); >> + if (ret) >> + return ret; >> + >> + rdt_last_cmd_clear(); >> + >> + /* mon_config cannot be more than the supported set of events */ >> + if (mon_config > GENMASK(6, 0)) { > Could this "GENMASK()" be given a name and be moved closer to where > the bits are defined in internal.h? This will be easier to read and if any > new bits are added it would hopefully be noticed more easily to get > an update also. Ok. Sure.. > >> + rdt_last_cmd_puts("Invalid event configuration\n"); >> + return -EINVAL; >> + } >> + >> + cpus_read_lock(); >> + rdtgrp = rdtgroup_kn_lock_live(of->kn); >> + if (!rdtgrp) { >> + return -ENOENT; >> + goto e_unlock; >> + } >> + >> + if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) { >> + ret = -ENOMEM; >> + goto e_unlock; >> + } >> + >> + >> + md.priv = of->kn->priv; >> + resid = md.u.rid; >> + domid = md.u.domid; >> + >> + hw_res = &rdt_resources_all[resid]; >> + r = &hw_res->r_resctrl; >> + d = rdt_find_domain(r, domid, NULL); >> + if (IS_ERR_OR_NULL(d)) { >> + ret = -ENOENT; >> + goto e_cpumask; >> + } >> + >> + md.u.mon_config = mon_config & 0xFF; > Same question as previous patch. I do not see this internal > value used in the code, is storing it necessary? Storing is not necessary. As I commented before we need mon_config as part of mon_data_bits structure because, it need to be passed to mon_event_config_read and rdtgroup_mondata_config_write. In these functions we need evtid and also config value (mon_config). > >> + >> + /* Pick all the CPUs in the domain instance */ >> + for_each_cpu(cpu, &d->cpu_mask) >> + cpumask_set_cpu(cpu, cpu_mask); >> + >> + cpu = get_cpu(); >> + /* Update MSR_IA32_EVT_CFG_BASE MSR on this cpu if it's in cpu_mask */ > Please always use caps for CPU. Sure > >> + if (cpumask_test_cpu(cpu, cpu_mask)) >> + mon_event_config_write(&md); >> + >> + /* Update MSR_IA32_EVT_CFG_BASE MSR on all other cpus in cpu_mask */ >> + smp_call_function_many(cpu_mask, mon_event_config_write, &md, 1); >> + put_cpu(); > I do not think we need to propagate this pattern more in the resctrl code. > How about on_each_cpu_mask()? Yea. We can do that. It probably better to replace all the current smp_call_function_many in resctrl code. > >> + >> + /* >> + * When an Event Configuration is changed, the bandwidth counters >> + * for all RMIDs and Events will be cleared, and the U-bit for every >> + * RMID will be set on the next read to any BwEvent for every RMID. >> + * Clear the mbm_local and mbm_total counts for all the RMIDs. >> + */ >> + memset(d->mbm_local, 0, sizeof(struct mbm_state) * r->num_rmid); >> + memset(d->mbm_total, 0, sizeof(struct mbm_state) * r->num_rmid); > Could you please check if this is sufficient? Note how "mon_event_read()" > is called with "first = true" right after the mon sysfs files are created to > clear the state _and_ initialize the state to set the "prev" MSRs correctly. I see that it is set m->prev_bw_msr = m->prev_msr = tval; With above code we are setting m->prev_bw_msr = m->prev_msr = 0; I cannot think of any problems this could cause. Please let me know if you see a problem. Thanks Babu Moger > > >> + >> +e_cpumask: >> + free_cpumask_var(cpu_mask); >> + >> +e_unlock: >> + rdtgroup_kn_unlock(of->kn); >> + cpus_read_unlock(); >> + >> + return ret ?: nbytes; >> +} >> + >> static const struct kernfs_ops kf_mondata_config_ops = { >> .atomic_write_len = PAGE_SIZE, >> .seq_show = rdtgroup_mondata_config_show, >> + .write = rdtgroup_mondata_config_write, >> }; >> >> static bool is_cpu_list(struct kernfs_open_file *of) >> >> > Reinette -- Thanks Babu Moger