Hi Reinette,

On 12/2/24 12:33, Reinette Chatre wrote:
> Hi Babu and Peter,
>
> On 11/29/24 9:06 AM, Moger, Babu wrote:
>> Hi Peter, Reinette,
>>
>> On 11/29/2024 3:59 AM, Peter Newman wrote:
>>> Hi Babu,
>>>
>>> On Thu, Nov 28, 2024 at 8:35 PM Moger, Babu <bmoger@xxxxxxx> wrote:
>>>>
>>>> Hi Peter,
>>>>
>>>> On 11/28/2024 5:10 AM, Peter Newman wrote:
>>>>> Hi Babu, Reinette,
>>>>>
>>>>> On Wed, Nov 27, 2024 at 8:05 PM Reinette Chatre
>>>>> <reinette.chatre@xxxxxxxxx> wrote:
>>>>>>
>>>>>> Hi Babu,
>>>>>>
>>>>>> On 11/27/24 6:57 AM, Moger, Babu wrote:
>>>
>>>>>>> 1. Each group needs to remember counter ids in each domain for each event.
>>>>>>>    For example:
>>>>>>>    Resctrl group mon1
>>>>>>>      Total event
>>>>>>>        dom 0 cntr_id 1,
>>>>>>>        dom 1 cntr_id 10
>>>>>>>        dom 2 cntr_id 11
>>>>>>>
>>>>>>>      Local event
>>>>>>>        dom 0 cntr_id 2,
>>>>>>>        dom 1 cntr_id 15
>>>>>>>        dom 2 cntr_id 10
>>>>>>
>>>>>> Indeed. The challenge here is that domains may come and go so it cannot be a simple
>>>>>> static array. As an alternative it can be an xarray indexed by the domain ID with
>>>>>> pointers to a struct like below to contain the counters associated with the monitor
>>>>>> group:
>>>>>>
>>>>>> 	struct cntr_id {
>>>>>> 		u32 mbm_total;
>>>>>> 		u32 mbm_local;
>>>>>> 	}
>>>>>>
>>>>>> Thinking more about how this array needs to be managed made me wonder how the
>>>>>> current implementation deals with domains that come and go. I do not think
>>>>>> this is currently handled. For example, if a new domain comes online and
>>>>>> monitoring groups had counters dynamically assigned, then these counters are
>>>>>> not configured to the newly online domain.
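Just to check my understanding of the xarray idea, it would be something
like this? (Untested sketch; the member and helper names here are made up,
and the xarray hanging off struct rdtgroup is indexed by the domain id.)

	/* Hypothetical member in struct rdtgroup: */
	struct xarray cntr_ids;

	/* Remember a group's counter ids for one domain: */
	static int cntr_id_store(struct rdtgroup *rdtgrp, u32 dom_id,
				 struct cntr_id *cid)
	{
		/* xa_store() returns an xa_err() encoded pointer on failure */
		return xa_err(xa_store(&rdtgrp->cntr_ids, dom_id, cid,
				       GFP_KERNEL));
	}

	/* When the domain goes offline, the tracking goes with it: */
	static void cntr_id_domain_offline(struct rdtgroup *rdtgrp, u32 dom_id)
	{
		kfree(xa_erase(&rdtgrp->cntr_ids, dom_id));
	}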
>>>> I am trying to understand the details of your approach here.
>>>>>
>>>>> In my prototype, I allocated a counter id-indexed array to each
>>>>> monitoring domain structure for tracking the counter allocations,
>>>>> because the hardware counters are all domain-scoped. That way the
>>>>> tracking data goes away when the hardware does.
>>>>>
>>>>> I was focused on allowing all pending counter updates to a domain
>>>>> resulting from a single mbm_assign_control write to be batched and
>>>>> processed in a single IPI, so I structured the counter tracker
>>>>> something like this:
>>>>
>>>> Not sure what you meant here. How are you batching two IPIs for two domains?
>>>>
>>>> # echo "//0=t;1=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>>>
>>>> This is still a single write. Two IPIs are sent separately, one for each
>>>> domain.
>>>>
>>>> Are you doing something different?
>>>
>>> I said "all pending counter updates to a domain", whereby I meant
>>> targeting a single domain.
>>>
>>> Depending on the CPU of the caller, your example write requires 1 or 2 IPIs.
>>>
>>> What is important is that the following write also requires 1 or 2 IPIs:
>>>
>>> (assuming /sys/fs/resctrl/mon_groups/[g1-g31] exist, line breaks added
>>> for readability)
>>>
>>> echo $'//0=t;1=t\n
>>> /g1/0=t;1=t\n
>>> /g2/0=t;1=t\n
>>> /g3/0=t;1=t\n
>>> /g4/0=t;1=t\n
>>> /g5/0=t;1=t\n
>>> /g6/0=t;1=t\n
>>> /g7/0=t;1=t\n
>>> /g8/0=t;1=t\n
>>> /g9/0=t;1=t\n
>>> /g10/0=t;1=t\n
>>> /g11/0=t;1=t\n
>>> /g12/0=t;1=t\n
>>> /g13/0=t;1=t\n
>>> /g14/0=t;1=t\n
>>> /g15/0=t;1=t\n
>>> /g16/0=t;1=t\n
>>> /g17/0=t;1=t\n
>>> /g18/0=t;1=t\n
>>> /g19/0=t;1=t\n
>>> /g20/0=t;1=t\n
>>> /g21/0=t;1=t\n
>>> /g22/0=t;1=t\n
>>> /g23/0=t;1=t\n
>>> /g24/0=t;1=t\n
>>> /g25/0=t;1=t\n
>>> /g26/0=t;1=t\n
>>> /g27/0=t;1=t\n
>>> /g28/0=t;1=t\n
>>> /g29/0=t;1=t\n
>>> /g30/0=t;1=t\n
>>> /g31/0=t;1=t\n'
>>>
>>> My ultimate goal is for a thread bound to a particular domain to be
>>> able to unassign and reassign the local domain's 32 counters in a
>>> single write() with no IPIs at all. And when IPIs are required, then
>>> no more than one per domain, regardless of the number of groups
>>> updated.
>>
>> Yes. I think I got the idea. Thanks.
>>
>>>>>
>>>>> struct resctrl_monitor_cfg {
>>>>> 	int closid;
>>>>> 	int rmid;
>>>>> 	int evtid;
>>>>> 	bool dirty;
>>>>> };
>>>>>
>>>>> This mirrors the info needed in whatever register configures the
>>>>> counter, plus a dirty flag to skip over the ones that don't need to be
>>>>> updated.
>>>>
>>>> This is my understanding of your implementation:
>>>>
>>>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>>>> index d94abba1c716..9cebf065cc97 100644
>>>> --- a/include/linux/resctrl.h
>>>> +++ b/include/linux/resctrl.h
>>>> @@ -94,6 +94,13 @@ struct rdt_ctrl_domain {
>>>>  	u32 *mbps_val;
>>>>  };
>>>>
>>>> +struct resctrl_monitor_cfg {
>>>> +	int closid;
>>>> +	int rmid;
>>>> +	int evtid;
>>>> +	bool dirty;
>>>> +};
>>>> +
>>>>  /**
>>>>   * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor resource
>>>>   * @hdr: common header for different domain types
>>>> @@ -116,6 +123,7 @@ struct rdt_mon_domain {
>>>>  	struct delayed_work cqm_limbo;
>>>>  	int mbm_work_cpu;
>>>>  	int cqm_work_cpu;
>>>> +	/* Allocate num_mbm_cntrs entries in each domain */
>>>> +	struct resctrl_monitor_cfg *mon_cfg;
>>>>  };
>>>>
>>>> When a user requests an assignment of the total event to the default group
>>>> for domain 0, you search rdt_mon_domain (dom 0) for an empty mon_cfg
>>>> entry.
>>>>
>>>> If there is an empty entry, then use that entry for the assignment and
>>>> update closid, rmid, evtid and dirty = 1. We can get all this
>>>> information from the default group here.
>>>>
>>>> Does this make sense?
>>>
>>> Yes, sounds correct.
>>
>> I will probably add cntr_id to the resctrl_monitor_cfg structure and
>> initialize it during the allocation. And rename the field 'dirty' to
>> 'active' (or something similar) to hold the assign state for that
>> entry. That way we have all the information required for assignment
>> in one place. We don't need to update the rdtgroup structure.
>>
>> Reinette, what do you think about this approach?
>
> I think this approach is in the right direction. Thanks to Peter for
> the guidance here.
> I do not think that it is necessary to add cntr_id to resctrl_monitor_cfg
> though, I think the cntr_id would be the index to the array instead?

Yes. I think we can use the index as cntr_id. Will let you know otherwise.
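For the assignment path I am thinking along these lines (untested sketch;
mbm_cntr_alloc() and num_mbm_cntrs are placeholder names, and 'active' is
the renamed assign-state field from above):

	static int mbm_cntr_alloc(struct rdt_mon_domain *d,
				  struct rdtgroup *rdtgrp,
				  enum resctrl_event_id evtid)
	{
		int cntr_id;

		for (cntr_id = 0; cntr_id < num_mbm_cntrs; cntr_id++) {
			struct resctrl_monitor_cfg *cfg = &d->mon_cfg[cntr_id];

			if (cfg->active)
				continue;
			cfg->closid = rdtgrp->closid;
			cfg->rmid = rdtgrp->mon.rmid;
			cfg->evtid = evtid;
			cfg->active = true;
			/* The array index doubles as the counter id: */
			return cntr_id;
		}

		/* All assignable counters in this domain are in use */
		return -ENOSPC;
	}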
> It may also be worthwhile to consider using a pointer to the resource
> group instead of storing closid and rmid directly. If used to indicate
> initialization then an initialized pointer is easier to distinguish than
> the closid/rmid that may have zero as valid values.

Sure. Sounds good.

> I expect evtid will be enum resctrl_event_id and that raises the question
> of whether "0" can indeed be used as an "uninitialized" value since doing
> so would change the meaning of the enum. It may indeed keep things
> separated by maintaining evtid as an enum resctrl_event_id and noting the
> initialization differently ... either via a pointer to a resource group
> or entirely separately as Babu indicates later.

Sure. Will add evtid as enum resctrl_event_id and use the "state" field to
indicate the assign/unassign/dirty status.

>>>>> For the benefit of displaying mbm_assign_control, I put a pointer back
>>>>> to any counter array entry allocated in the mbm_state struct only
>>>>> because it's an existing structure that exists for every rmid-domain
>>>>> combination.
>>>>
>>>> A pointer in mbm_state may not be required here.
>>>>
>>>> We are going to loop over the resctrl groups. We can search the
>>>> rdt_mon_domain to see if a specific closid, rmid, evtid is already
>>>> assigned or not in that domain.
>>>
>>> No, not required I guess. High-performance CPUs can probably search a
>>> 32-entry array very quickly.
>>
>> Ok.
>
> This is not so clear to me. I am wondering about the scenario when a resource
> group is removed and its counters need to be freed. Searching which counters
> need to be freed would then require a search of the array within every domain,
> of which I understand there can be many? Having a pointer from the mbm_state
> may help here.

Sure. Will add the allocated entry pointer in mbm_state.

> Reinette

--
Thanks
Babu Moger
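P.S. For the mbm_state pointer, I am picturing something like this
(untested sketch, names made up):

	/* Hypothetical new member in struct mbm_state: */
	struct resctrl_monitor_cfg *cntr_cfg;	/* NULL if no counter assigned */

	/*
	 * When a resource group is removed, the counter assigned in each
	 * domain can then be freed directly, without searching the
	 * per-domain array:
	 */
	static void mbm_cntr_free(struct mbm_state *m)
	{
		if (!m->cntr_cfg)
			return;
		m->cntr_cfg->active = false;
		m->cntr_cfg = NULL;
	}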