Hi Babu,

On 2/10/25 9:27 AM, Moger, Babu wrote:
> On 2/6/25 12:03, Reinette Chatre wrote:
>> On 1/22/25 12:20 PM, Babu Moger wrote:
>>
>>> +	 * of hardware counter is not considered as an overflow in the
>>> +	 * next update.
>>> +	 */
>>> +	if (is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
>>> +		list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>> +			memset(dom->cntr_cfg, 0,
>>> +			       sizeof(*dom->cntr_cfg) * r->mon.num_mbm_cntrs);
>>> +			if (is_mbm_total_enabled())
>>> +				memset(dom->mbm_total, 0,
>>> +				       sizeof(struct mbm_state) * idx_limit);
>>> +			if (is_mbm_local_enabled())
>>> +				memset(dom->mbm_local, 0,
>>> +				       sizeof(struct mbm_state) * idx_limit);
>>> +			resctrl_arch_reset_rmid_all(r, dom);
>>> +		}
>>> +	}
>>> +}
>>
>> I looked back at the previous versions to better understand how this function
>> came about and I do not think it actually solves the problem it aims to solve.
>>
>> rdtgroup_unassign_cntrs() can fail and when it does the counter is not free'd. That
>> leaves a monitoring domain's array with an entry that points to a resource group
>> that no longer exists (unless it is the default resource group) since
>> rdtgroup_unassign_cntrs() does not check the return and proceeds to remove the
>> resource group. mbm_cntr_reset() is called on umount of resctrl but
>> rdtgroup_unassign_cntrs() is called on every group remove and those scenarios
>> are not handled.
>>
>> To address this I believe that I need to go back on a previous request to have
>> resctrl_arch_config_cntr() return an error code. AMD does not need this and
>> it is difficult to predict what will work for MPAM. I originally wanted to be
>> flexible here but this appears to be impractical. With a new requirement that
>> resctrl_arch_config_cntr() always succeeds the counter will in turn always
>> be free'd and not leave dangling pointers. I believe doing so eliminates
>> the need for mbm_cntr_reset() as used in this patch. My apologies for the
>> misdirection. We can re-evaluate these flows if MPAM needs anything different.
>
> So, new requirement is to free the counter even if the
> resctrl_arch_config_cntr() call fails. That way after calling

No. Quoting above: "new requirement that resctrl_arch_config_cntr() always
succeeds". As I see it this will eliminate a lot of error checking on the
calling path, not ignore errors.

Reinette
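
For illustration only, a minimal sketch of what "always succeeds" could look
like on the calling path. Everything here is a hypothetical, simplified
stand-in and not the actual resctrl code: struct group, struct cntr_cfg,
struct domain, NUM_CNTRS, arch_config_cntr(), unassign_cntrs() and
domain_references_group() are assumed names. The point is only that a void
arch hook leaves the caller nothing to check, so the counter entry is always
cleared and no entry can keep pointing at a removed group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_CNTRS 4	/* hypothetical number of counters per domain */

/* Hypothetical stand-in for a resource group. */
struct group {
	int closid;
	int rmid;
};

/* Hypothetical per-counter state: grp == NULL means the counter is free. */
struct cntr_cfg {
	struct group *grp;
};

/* Hypothetical stand-in for a monitoring domain. */
struct domain {
	struct cntr_cfg cntr_cfg[NUM_CNTRS];
};

/*
 * Assumed arch hook that cannot fail: it returns void, so there is no
 * error for the caller to check or propagate.
 */
static void arch_config_cntr(struct domain *d, int cntr_id, bool assign)
{
	(void)d;
	(void)cntr_id;
	(void)assign;
	/* Hardware programming would go here. */
}

/*
 * Unassign every counter in @d that belongs to @grp. Because the arch
 * hook always succeeds, the entry is always cleared afterwards.
 */
static void unassign_cntrs(struct domain *d, struct group *grp)
{
	assert(d && grp);

	for (int i = 0; i < NUM_CNTRS; i++) {
		if (d->cntr_cfg[i].grp != grp)
			continue;
		arch_config_cntr(d, i, false);
		memset(&d->cntr_cfg[i], 0, sizeof(d->cntr_cfg[i]));
	}
}

/* Return true if any counter in @d still points at @grp. */
static bool domain_references_group(const struct domain *d,
				    const struct group *grp)
{
	for (int i = 0; i < NUM_CNTRS; i++) {
		if (d->cntr_cfg[i].grp == grp)
			return true;
	}
	return false;
}
```

With this shape, group removal can call unassign_cntrs() and then free the
group without any return value to handle in between.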