Hi Reinette,

On 12/2/24 15:47, Reinette Chatre wrote:
> Hi Babu,
>
> On 12/2/24 1:28 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 12/2/24 15:09, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 12/2/24 12:42 PM, Moger, Babu wrote:
>>>> Hi Reinette,
>>>>
>>>> On 12/2/24 14:15, Reinette Chatre wrote:
>>>>> Hi Babu,
>>>>>
>>>>> On 12/2/24 11:48 AM, Moger, Babu wrote:
>>>>>> On 12/2/24 12:33, Reinette Chatre wrote:
>>>>>>> On 11/29/24 9:06 AM, Moger, Babu wrote:
>>>>>>>> On 11/29/2024 3:59 AM, Peter Newman wrote:
>>>>>>>>> On Thu, Nov 28, 2024 at 8:35 PM Moger, Babu <bmoger@xxxxxxx> wrote:
>>>>>>>>>> On 11/28/2024 5:10 AM, Peter Newman wrote:
>>>>>>>>>>> On Wed, Nov 27, 2024 at 8:05 PM Reinette Chatre
>>>>>>>>>>> <reinette.chatre@xxxxxxxxx> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Babu,
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/27/24 6:57 AM, Moger, Babu wrote:
>>>>>>>>>
>>>>>>>>>>>>> 1. Each group needs to remember counter ids in each domain for each event.
>>>>>>>>>>>>> For example:
>>>>>>>>>>>>> Resctrl group mon1
>>>>>>>>>>>>> Total event
>>>>>>>>>>>>> dom 0 cntr_id 1
>>>>>>>>>>>>> dom 1 cntr_id 10
>>>>>>>>>>>>> dom 2 cntr_id 11
>>>>>>>>>>>>>
>>>>>>>>>>>>> Local event
>>>>>>>>>>>>> dom 0 cntr_id 2
>>>>>>>>>>>>> dom 1 cntr_id 15
>>>>>>>>>>>>> dom 2 cntr_id 10
>>>>>>>>>>>>
>>>>>>>>>>>> Indeed. The challenge here is that domains may come and go so it cannot
>>>>>>>>>>>> be a simple static array. As an alternative it can be an xarray indexed
>>>>>>>>>>>> by the domain ID with pointers to a struct like below to contain the
>>>>>>>>>>>> counters associated with the monitor group:
>>>>>>>>>>>>
>>>>>>>>>>>> struct cntr_id {
>>>>>>>>>>>> 	u32 mbm_total;
>>>>>>>>>>>> 	u32 mbm_local;
>>>>>>>>>>>> };
>>>>>>>>>>>>
>>>>>>>>>>>> Thinking more about how this array needs to be managed made me wonder
>>>>>>>>>>>> how the current implementation deals with domains that come and go. I
>>>>>>>>>>>> do not think this is currently handled. For example, if a new domain
>>>>>>>>>>>> comes online and monitoring groups had counters dynamically assigned,
>>>>>>>>>>>> then these counters are not configured to the newly online domain.
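(For reference, a minimal sketch of the xarray bookkeeping described above.
The helper names and the location of the xarray are made up for illustration
only, untested:)

#include <linux/types.h>
#include <linux/xarray.h>
#include <linux/slab.h>

struct cntr_id {
	u32 mbm_total;
	u32 mbm_local;
};

/*
 * One xarray per monitor group, indexed by monitoring domain ID.
 * It would live in the monitor group's structure, e.g.:
 *	struct xarray cntr_ids;
 */

/* Domain coming online: add counter tracking for this group. */
static int cntr_id_domain_online(struct xarray *cntr_ids, unsigned long dom_id)
{
	struct cntr_id *ids;

	ids = kzalloc(sizeof(*ids), GFP_KERNEL);
	if (!ids)
		return -ENOMEM;

	/* xa_store() returns the old entry or an xa_err()-encoded error. */
	return xa_err(xa_store(cntr_ids, dom_id, ids, GFP_KERNEL));
}

/* Domain going offline: the tracking data goes away with the hardware. */
static void cntr_id_domain_offline(struct xarray *cntr_ids, unsigned long dom_id)
{
	/* xa_erase() removes and returns the entry so it can be freed. */
	kfree(xa_erase(cntr_ids, dom_id));
}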
>>>>>>>>>>
>>>>>>>>>> I am trying to understand the details of your approach here.
>>>>>>>>>>>
>>>>>>>>>>> In my prototype, I allocated a counter id-indexed array to each
>>>>>>>>>>> monitoring domain structure for tracking the counter allocations,
>>>>>>>>>>> because the hardware counters are all domain-scoped. That way the
>>>>>>>>>>> tracking data goes away when the hardware does.
>>>>>>>>>>>
>>>>>>>>>>> I was focused on allowing all pending counter updates to a domain
>>>>>>>>>>> resulting from a single mbm_assign_control write to be batched and
>>>>>>>>>>> processed in a single IPI, so I structured the counter tracker
>>>>>>>>>>> something like this:
>>>>>>>>>>
>>>>>>>>>> Not sure what you meant here. How are you batching two IPIs for two
>>>>>>>>>> domains?
>>>>>>>>>>
>>>>>>>>>> #echo "//0=t;1=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>>>>>>>>>
>>>>>>>>>> This is still a single write. Two IPIs are sent separately, one for
>>>>>>>>>> each domain.
>>>>>>>>>>
>>>>>>>>>> Are you doing something different?
>>>>>>>>>
>>>>>>>>> I said "all pending counter updates to a domain", whereby I meant
>>>>>>>>> targeting a single domain.
>>>>>>>>>
>>>>>>>>> Depending on the CPU of the caller, your example write requires 1 or 2
>>>>>>>>> IPIs.
>>>>>>>>>
>>>>>>>>> What is important is that the following write also requires 1 or 2 IPIs:
>>>>>>>>>
>>>>>>>>> (assuming /sys/fs/resctrl/mon_groups/[g1-g31] exist, line breaks added
>>>>>>>>> for readability)
>>>>>>>>>
>>>>>>>>> echo $'//0=t;1=t\n
>>>>>>>>> /g1/0=t;1=t\n
>>>>>>>>> /g2/0=t;1=t\n
>>>>>>>>> /g3/0=t;1=t\n
>>>>>>>>> /g4/0=t;1=t\n
>>>>>>>>> /g5/0=t;1=t\n
>>>>>>>>> /g6/0=t;1=t\n
>>>>>>>>> /g7/0=t;1=t\n
>>>>>>>>> /g8/0=t;1=t\n
>>>>>>>>> /g9/0=t;1=t\n
>>>>>>>>> /g10/0=t;1=t\n
>>>>>>>>> /g11/0=t;1=t\n
>>>>>>>>> /g12/0=t;1=t\n
>>>>>>>>> /g13/0=t;1=t\n
>>>>>>>>> /g14/0=t;1=t\n
>>>>>>>>> /g15/0=t;1=t\n
>>>>>>>>> /g16/0=t;1=t\n
>>>>>>>>> /g17/0=t;1=t\n
>>>>>>>>> /g18/0=t;1=t\n
>>>>>>>>> /g19/0=t;1=t\n
>>>>>>>>> /g20/0=t;1=t\n
>>>>>>>>> /g21/0=t;1=t\n
>>>>>>>>> /g22/0=t;1=t\n
>>>>>>>>> /g23/0=t;1=t\n
>>>>>>>>> /g24/0=t;1=t\n
>>>>>>>>> /g25/0=t;1=t\n
>>>>>>>>> /g26/0=t;1=t\n
>>>>>>>>> /g27/0=t;1=t\n
>>>>>>>>> /g28/0=t;1=t\n
>>>>>>>>> /g29/0=t;1=t\n
>>>>>>>>> /g30/0=t;1=t\n
>>>>>>>>> /g31/0=t;1=t\n'
>>>>>>>>>
>>>>>>>>> My ultimate goal is for a thread bound to a particular domain to be
>>>>>>>>> able to unassign and reassign the local domain's 32 counters in a
>>>>>>>>> single write() with no IPIs at all. And when IPIs are required, then
>>>>>>>>> no more than one per domain, regardless of the number of groups
>>>>>>>>> updated.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes. I think I got the idea. Thanks.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> struct resctrl_monitor_cfg {
>>>>>>>>>>> 	int closid;
>>>>>>>>>>> 	int rmid;
>>>>>>>>>>> 	int evtid;
>>>>>>>>>>> 	bool dirty;
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> This mirrors the info needed in whatever register configures the
>>>>>>>>>>> counter, plus a dirty flag to skip over the ones that don't need to
>>>>>>>>>>> be updated.
>>>>>>>>>>
>>>>>>>>>> This is my understanding of your implementation:
>>>>>>>>>>
>>>>>>>>>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>>>>>>>>>> index d94abba1c716..9cebf065cc97 100644
>>>>>>>>>> --- a/include/linux/resctrl.h
>>>>>>>>>> +++ b/include/linux/resctrl.h
>>>>>>>>>> @@ -94,6 +94,13 @@ struct rdt_ctrl_domain {
>>>>>>>>>>  	u32 *mbps_val;
>>>>>>>>>>  };
>>>>>>>>>>
>>>>>>>>>> +struct resctrl_monitor_cfg {
>>>>>>>>>> +	int closid;
>>>>>>>>>> +	int rmid;
>>>>>>>>>> +	int evtid;
>>>>>>>>>> +	bool dirty;
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>>>>>  /**
>>>>>>>>>>   * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor resource
>>>>>>>>>>   * @hdr: common header for different domain types
>>>>>>>>>> @@ -116,6 +123,7 @@ struct rdt_mon_domain {
>>>>>>>>>>  	struct delayed_work cqm_limbo;
>>>>>>>>>>  	int mbm_work_cpu;
>>>>>>>>>>  	int cqm_work_cpu;
>>>>>>>>>> +	/* Allocate num_mbm_cntrs entries in each domain */
>>>>>>>>>> +	struct resctrl_monitor_cfg *mon_cfg;
>>>>>>>>>>  };
>>>>>>>>>>
>>>>>>>>>> When a user requests an assignment of the total event to the default
>>>>>>>>>> group for domain 0, you search rdt_mon_domain (dom 0) for an empty
>>>>>>>>>> mon_cfg entry.
>>>>>>>>>>
>>>>>>>>>> If there is an empty entry, then use that entry for the assignment and
>>>>>>>>>> update closid, rmid, evtid and set dirty = 1. We can get all this
>>>>>>>>>> information from the default group here.
>>>>>>>>>>
>>>>>>>>>> Does this make sense?
>>>>>>>>>
>>>>>>>>> Yes, sounds correct.
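(A rough sketch of that empty-slot search, folding in the refinements agreed
on further down: the rdtgroup pointer marks an assigned slot and the array
index serves as the cntr_id, while the dirty flag from Peter's version is
kept. The helper name and the num_mbm_cntrs parameter are illustrative only,
untested:)

static int mon_cfg_assign(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
			  enum resctrl_event_id evtid, u32 num_mbm_cntrs)
{
	u32 i;

	for (i = 0; i < num_mbm_cntrs; i++) {
		struct resctrl_monitor_cfg *cfg = &d->mon_cfg[i];

		if (cfg->rdtgrp)	/* slot already assigned */
			continue;

		cfg->rdtgrp = rdtgrp;
		cfg->evtid = evtid;
		cfg->dirty = true;	/* flush to hardware on the next IPI */
		return i;		/* the index doubles as the cntr_id */
	}

	return -ENOSPC;			/* all hardware counters in use */
}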
>>>>>>>>
>>>>>>>> I will probably add cntr_id to the resctrl_monitor_cfg structure and
>>>>>>>> initialize it during the allocation. And rename the field 'dirty' to
>>>>>>>> 'active' (or something similar) to hold the assign state for that
>>>>>>>> entry. That way we have all the information required for assignment
>>>>>>>> in one place. We don't need to update the rdtgroup structure.
>>>>>>>>
>>>>>>>> Reinette, what do you think about this approach?
>>>>>>>
>>>>>>> I think this approach is in the right direction. Thanks to Peter for
>>>>>>> the guidance here.
>>>>>>> I do not think that it is necessary to add cntr_id to
>>>>>>> resctrl_monitor_cfg though; I think the cntr_id would be the index
>>>>>>> into the array instead?
>>>>>>
>>>>>> Yes. I think we can use the index as the cntr_id. Will let you know
>>>>>> otherwise.
>>>>>>
>>>>>>>
>>>>>>> It may also be worthwhile to consider using a pointer to the resource
>>>>>>> group instead of storing closid and rmid directly. If used to indicate
>>>>>>> initialization then an initialized pointer is easier to distinguish than
>>>>>>> the closid/rmid that may have zero as valid values.
>>>>>>
>>>>>> Sure. Sounds good.
>>>>>>
>>>>>>>
>>>>>>> I expect evtid will be enum resctrl_event_id and that raises the question
>>>>>>> of whether "0" can indeed be used as an "uninitialized" value since doing
>>>>>>> so would change the meaning of the enum. It may indeed keep things
>>>>>>> separated by maintaining evtid as an enum resctrl_event_id and noting the
>>>>>>> initialization differently ... either via a pointer to a resource group
>>>>>>> or entirely separately as Babu indicates later.
>>>>>>
>>>>>> Sure. Will add evtid as enum resctrl_event_id and use the "state" to
>>>>>> indicate assign/unassign/dirty status.
>>>>>
>>>>> Is "assign/unassign" state needed? If resctrl_monitor_cfg contains a
>>>>> pointer to the resource group to which the counter has been assigned
>>>>> then I expect NULL means unassigned and a value means assigned?
>>>>
>>>> Yes. We use the rdtgroup pointer to check the assign/unassign state.
>>>>
>>>> I will drop the 'state' field. Peter can add state when he wants to use
>>>> it for optimization later.
>>>>
>>>> I think we need to have the 'cntr_id' field here in resctrl_monitor_cfg.
>>>> When we access the pointer from mbm_state, we won't know which cntr_id
>>>> index it came from.
>>>>
>>>
>>> Oh, good point. I wonder how Peter addressed this in his PoC. As an
>>> alternative, could the cntr_id be used in mbm_state instead of a pointer?
>>>
>>
>> Yes. It can be done.
>>
>> I thought it would be better to have everything in one place.
>>
>> struct resctrl_monitor_cfg {
>> 	unsigned int cntr_id;
>> 	enum resctrl_event_id evtid;
>> 	struct rdtgroup *rdtgrp;
>> };
>>
>> This will have everything required to assign/unassign the event.
>>
>
> The "everything in one place" argument is not clear to me since the cntr_id
> is indeed present already as the index into the array that stores this
> structure. Including the cntr_id seems redundant to me. This is similar to
> several other data structures in resctrl that are indexed either by closid
> or rmid, without needing to store closid/rmid in these data structures
> themselves.
>

Ok, that is fine. Will remove the cntr_id field from resctrl_monitor_cfg and
add it to mbm_state instead. That should be good.
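To summarize, here is roughly what I expect the end result to look like (a
sketch only, untested; only the new fields are shown next to the existing
ones):

struct resctrl_monitor_cfg {
	enum resctrl_event_id evtid;
	struct rdtgroup *rdtgrp;		/* NULL: counter unassigned */
};

struct rdt_mon_domain {
	/* ... existing fields ... */
	struct resctrl_monitor_cfg *mon_cfg;	/* num_mbm_cntrs entries; the
						 * array index is the cntr_id
						 */
};

struct mbm_state {
	/* ... existing fields ... */
	u32 cntr_id;				/* index into the domain's
						 * mon_cfg array
						 */
};

--
Thanks
Babu Moger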