Hi James,

On 2/21/25 12:06, James Morse wrote:
> Hi Babu,
>
> On 22/01/2025 20:20, Babu Moger wrote:
>> Configure mbm_cntr_assign mode on AMD platforms. On AMD platforms, it
>> is recommended to use mbm_cntr_assign mode if supported, because
>> reading "mbm_total_bytes" or "mbm_local_bytes" will report 'Unavailable'
>> if there is no counter associated with that event.
>
> (If you agree with my comment on patch 7, it would be good to update this
> wording to match.)

Sure.

>
>
>> The mbm_cntr_assign mode, referred to as ABMC (Assignable Bandwidth
>> Monitoring Counters) on AMD, is enabled by default when supported by the
>> system.
>>
>> Update ABMC across all logical processors within the resctrl domain to
>> ensure proper functionality.
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index c006c4d8d6ff..2480698b643d 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -734,4 +734,5 @@ int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d
>>  void mbm_cntr_reset(struct rdt_resource *r);
>>  int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>>  		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
>> +void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
>>  #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
>
> Could this be put in include/linux/resctrl.h, its where it needs to end up eventually.
>

As Reinette already mentioned in [1], Boris wanted this moved when the
arch/fs code separation is integrated. Let's keep it in resctrl/internal.h
for now.
[1] https://lore.kernel.org/lkml/e524c376-9ef8-488e-8053-b49ccafd306d@xxxxxxxxx/

>
>
> This sequence has me confused:
>
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 3d748fdbcb5f..a9a5dc626a1e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1233,6 +1233,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>>  		r->mon.mbm_cntr_assignable = true;
>>  		cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
>>  		r->mon.num_mbm_cntrs = (ebx & GENMASK(15, 0)) + 1;
>
>> +		hw_res->mbm_cntr_assign_enabled = true;
>
> Here the arch code sets ABMC to be enabled by default at boot.
>
>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 6922173c4f8f..515969c5f64f 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -4302,9 +4302,13 @@ int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d)
>>
>>  void resctrl_online_cpu(unsigned int cpu)
>>  {
>> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +
>>  	mutex_lock(&rdtgroup_mutex);
>>  	/* The CPU is set in default rdtgroup after online. */
>>  	cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
>> +	if (r->mon_capable && r->mon.mbm_cntr_assignable)
>> +		resctrl_arch_mbm_cntr_assign_set_one(r);
>>  	mutex_unlock(&rdtgroup_mutex);
>>  }
>
> But here, resctrl has to call back to the arch code to make sure the hardware is in the
> same state as hw_res->mbm_cntr_assign_enabled.
>
> Could this be done in resctrl_arch_online_cpu() instead? That way resctrl doesn't get CPUs
> in an inconsistent state that it has to fix up...
>

Sure. Here is the diff.

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 22399f19810f..f48b298413bc 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -771,6 +771,12 @@ static int resctrl_arch_online_cpu(unsigned int cpu)
 		domain_add_cpu(cpu, r);
 	mutex_unlock(&domain_list_lock);

+	r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	mutex_lock(&rdtgroup_mutex);
+	if (r->mon_capable && r->mon.mbm_cntr_assignable)
+		resctrl_arch_mbm_cntr_assign_set_one(r);
+	mutex_unlock(&rdtgroup_mutex);
+
 	clear_closid_rmid(cpu);
 	resctrl_online_cpu(cpu);

--
Thanks
Babu Moger