Hi Babu, On 12/2/24 11:48 AM, Moger, Babu wrote: > On 12/2/24 12:33, Reinette Chatre wrote: >> On 11/29/24 9:06 AM, Moger, Babu wrote: >>> On 11/29/2024 3:59 AM, Peter Newman wrote: >>>> On Thu, Nov 28, 2024 at 8:35 PM Moger, Babu <bmoger@xxxxxxx> wrote: >>>>> On 11/28/2024 5:10 AM, Peter Newman wrote: >>>>>> On Wed, Nov 27, 2024 at 8:05 PM Reinette Chatre >>>>>> <reinette.chatre@xxxxxxxxx> wrote: >>>>>>> >>>>>>> Hi Babu, >>>>>>> >>>>>>> On 11/27/24 6:57 AM, Moger, Babu wrote: >>>> >>>>>>>> 1. Each group needs to remember counter ids in each domain for each event. >>>>>>>> For example: >>>>>>>> Resctrl group mon1 >>>>>>>> Total event >>>>>>>> dom 0 cntr_id 1, >>>>>>>> dom 1 cntr_id 10 >>>>>>>> dom 2 cntr_id 11 >>>>>>>> >>>>>>>> Local event >>>>>>>> dom 0 cntr_id 2, >>>>>>>> dom 1 cntr_id 15 >>>>>>>> dom 2 cntr_id 10 >>>>>>> >>>>>>> Indeed. The challenge here is that domains may come and go so it cannot be a simple >>>>>>> static array. As an alternative it can be an xarray indexed by the domain ID with >>>>>>> pointers to a struct like below to contain the counters associated with the monitor >>>>>>> group: >>>>>>> struct cntr_id { >>>>>>> u32 mbm_total; >>>>>>> u32 mbm_local; >>>>>>> } >>>>>>> >>>>>>> >>>>>>> Thinking more about how this array needs to be managed made me wonder how the >>>>>>> current implementation deals with domains that come and go. I do not think >>>>>>> this is currently handled. For example, if a new domain comes online and >>>>>>> monitoring groups had counters dynamically assigned, then these counters are >>>>>>> not configured to the newly online domain. >>>>> >>>>> I am trying to understand the details of your approach here. >>>>>> >>>>>> In my prototype, I allocated a counter id-indexed array to each >>>>>> monitoring domain structure for tracking the counter allocations, >>>>>> because the hardware counters are all domain-scoped. That way the >>>>>> tracking data goes away when the hardware does. >>>>>> >>>>>> I was focused on allowing all pending counter updates to a domain >>>>>> resulting from a single mbm_assign_control write to be batched and >>>>>> processed in a single IPI, so I structured the counter tracker >>>>>> something like this: >>>>> >>>>> Not sure what you meant here. How are you batching two IPIs for two domains? >>>>> >>>>> #echo "//0=t;1=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control >>>>> >>>>> This is still a single write. Two IPIs are sent separately, one for each >>>>> domain. >>>>> >>>>> Are you doing something different? >>>> >>>> I said "all pending counter updates to a domain", whereby I meant >>>> targeting a single domain. >>>> >>>> Depending on the CPU of the caller, your example write requires 1 or 2 IPIs. >>>> >>>> What is important is that the following write also requires 1 or 2 IPIs: >>>> >>>> (assuming /sys/fs/resctrl/mon_groups/[g1-g31] exist, line breaks added >>>> for readability) >>>> >>>> echo $'//0=t;1=t\n >>>> /g1/0=t;1=t\n >>>> /g2/0=t;1=t\n >>>> /g3/0=t;1=t\n >>>> /g4/0=t;1=t\n >>>> /g5/0=t;1=t\n >>>> /g6/0=t;1=t\n >>>> /g7/0=t;1=t\n >>>> /g8/0=t;1=t\n >>>> /g9/0=t;1=t\n >>>> /g10/0=t;1=t\n >>>> /g11/0=t;1=t\n >>>> /g12/0=t;1=t\n >>>> /g13/0=t;1=t\n >>>> /g14/0=t;1=t\n >>>> /g15/0=t;1=t\n >>>> /g16/0=t;1=t\n >>>> /g17/0=t;1=t\n >>>> /g18/0=t;1=t\n >>>> /g19/0=t;1=t\n >>>> /g20/0=t;1=t\n >>>> /g21/0=t;1=t\n >>>> /g22/0=t;1=t\n >>>> /g23/0=t;1=t\n >>>> /g24/0=t;1=t\n >>>> /g25/0=t;1=t\n >>>> /g26/0=t;1=t\n >>>> /g27/0=t;1=t\n >>>> /g28/0=t;1=t\n >>>> /g29/0=t;1=t\n >>>> /g30/0=t;1=t\n >>>> /g31/0=t;1=t\n' >>>> >>>> My ultimate goal is for a thread bound to a particular domain to be >>>> able to unassign and reassign the local domain's 32 counters in a >>>> single write() with no IPIs at all. And when IPIs are required, then >>>> no more than one per domain, regardless of the number of groups >>>> updated. >>>> >>> >>> Yes. I think I got the idea. Thanks. >>> >>>> >>>>> >>>>>> >>>>>> struct resctrl_monitor_cfg { >>>>>> int closid; >>>>>> int rmid; >>>>>> int evtid; >>>>>> bool dirty; >>>>>> }; >>>>>> >>>>>> This mirrors the info needed in whatever register configures the >>>>>> counter, plus a dirty flag to skip over the ones that don't need to be >>>>>> updated. >>>>> >>>>> This is what my understanding of your implementation. >>>>> >>>>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h >>>>> index d94abba1c716..9cebf065cc97 100644 >>>>> --- a/include/linux/resctrl.h >>>>> +++ b/include/linux/resctrl.h >>>>> @@ -94,6 +94,13 @@ struct rdt_ctrl_domain { >>>>> u32 *mbps_val; >>>>> }; >>>>> >>>>> +struct resctrl_monitor_cfg { >>>>> + int closid; >>>>> + int rmid; >>>>> + int evtid; >>>>> + bool dirty; >>>>> +}; >>>>> + >>>>> /** >>>>> * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor >>>>> resource >>>>> * @hdr: common header for different domain types >>>>> @@ -116,6 +123,7 @@ struct rdt_mon_domain { >>>>> struct delayed_work cqm_limbo; >>>>> int mbm_work_cpu; >>>>> int cqm_work_cpu; >>>>> + /* Allocate num_mbm_cntrs entries in each domain */ >>>>> + struct resctrl_monitor_cfg *mon_cfg; >>>>> }; >>>>> >>>>> >>>>> When a user requests an assignment for total event to the default group >>>>> for domain 0, you go search in rdt_mon_domain(dom 0) for empty mon_cfg >>>>> entry. >>>>> >>>>> If there is an empty entry, then use that entry for assignment and >>>>> update closid, rmid, evtid and dirty = 1. We can get all these >>>>> information from default group here. >>>>> >>>>> Does this make sense? >>>> >>>> Yes, sounds correct. >>> >>> I will probably add cntr_id in resctrl_monitor_cfg structure and >>> initialize during the allocation. And rename the field 'dirty' to >>> 'active'(or something similar) to hold the assign state for that >>> entry. That way we have all the information required for assignment >>> at one place. We don't need to update the rdtgroup structure. >>> >>> Reinette, What do you think about this approach? >> >> I think this approach is in the right direction. Thanks to Peter for >> the guidance here. >> I do not think that it is necessary to add cntr_id to resctrl_monitor_cfg >> though, I think the cntr_id would be the index to the array instead? > > Yes. I think We can use the index as cntn_id. Will let you know otherwise. > > >> >> It may also be worthwhile to consider using a pointer to the resource >> group instead of storing closid and rmid directly. If used to indicate >> initialization then an initialized pointer is easier to distinguish than >> the closid/rmid that may have zero as valid values. > > Sure. Sounds good. > >> >> I expect evtid will be enum resctrl_event_id and that raises the question >> of whether "0" can indeed be used as an "uninitialized" value since doing >> so would change the meaning of the enum. It may indeed keep things >> separated by maintaining evtid as an enum resctrl_event_id and note the >> initialization differently ... either via a pointer to a resource group >> or entirely separately as Babu indicates later. > > Sure. Will add evtid as enum resctrl_event_id and use the "state" to > indicate assign/unassign/dirty status. Is "assign/unassign" state needed? If resctrl_monitor_cfg contains a pointer to the resource group to which the counter has been assigned then I expect NULL means unassigned and a value means assigned? Reinette