Hi Babu,

On 2/28/2024 9:59 AM, Moger, Babu wrote:
> On 2/27/24 17:50, Reinette Chatre wrote:
>> On 2/27/2024 10:12 AM, Moger, Babu wrote:
>>> On 2/26/24 15:20, Reinette Chatre wrote:
>>>> On 2/26/2024 9:59 AM, Moger, Babu wrote:
>>>>> On 2/23/24 16:21, Reinette Chatre wrote:
>>
>>>> For example, if I understand correctly, theoretically, when ABMC is enabled then
>>>> "num_rmids" can be U32_MAX (after a quick look it is not clear to me why r->num_rmid
>>>> is not unsigned, tbd if number of directories may also be limited by kernfs).
>>>> User space could theoretically create more monitor groups than the number of
>>>> rmids that a resource claims to support using current upstream enumeration.
>>>
>>> CPU or task association still uses PQR_ASSOC(MSR C8Fh). There are only 11
>>> bits(depends on specific h/w) to represent RMIDs. So, we cannot create
>>> more than this limit(r->num_rmid).
>>>
>>> In case of ABMC, h/w uses another counter(mbm_assignable_counters) with
>>> RMID to assign the monitoring. So, assignment limit is
>>> mbm_assignable_counters. The number of mon groups limit is still r->num_rmid.
>>
>> I see. Thank you for clarifying. This does make enabling simpler and one
>> less user interface item that needs changing.
>>
>> ...
>>
>>>>> 2. /sys/fs/resctrl/monitor_state.
>>>>> This can used to individually assign or unassign the counters in each group.
>>>>>
>>>>> When assigned:
>>>>> #cat /sys/fs/resctrl/monitor_state
>>>>> 0=total-assign,local-assign;1=total-assign,local-assign
>>>>>
>>>>> When unassigned:
>>>>> #cat /sys/fs/resctrl/monitor_state
>>>>> 0=total-unassign,local-unassign;1=total-unassign,local-unassign
>>>>>
>>>>>
>>>>> Thoughts?
>>>>
>>>> How do you expect this interface to be used?
>>>> I understand the mechanics
>>>> of this interface but on a higher level, do you expect user space to
>>>> once in a while assign a new counter to a single event or monitor group
>>>> (for which a fine grained interface works) or do you expect user space to
>>>> shift multiple counters across several monitor events at intervals?
>>>
>>> I think we should provide both the options. I was thinking of providing
>>> fine grained interface first.
>>
>> Could you please provide a motivation for why two interfaces, one inefficient
>> and one not, should be created and maintained? Users can still do fine grained
>> assignment with a global assignment interface.
>
> Lets consider one by one.
>
> 1. Fine grained assignment.
>
> It will be part of the mongroup(or control mongroup). User has the access
> to the group and can query the group's current status before assigning or
> unassigning.
>
> $cd /sys/fs/resctrl/ctrl_mon1
> $cat /sys/fs/resctrl/ctrl_mon1/monitor_state
> 0=total-unassign,local-unassign;1=total-unassign,local-unassign;
>
> Assign the total event
>
> $echo 0=total-assign > /sys/fs/resctrl/ctrl_mon1/monitor_state
>
> Assign the local event
>
> $echo 0=local-assign > /sys/fs/resctrl/ctrl_mon1/monitor_state
>
> Assign both events:
>
> $echo 0=total-assign,local-assign > /sys/fs/resctrl/ctrl_mon1/monitor_state
>
> Check the assignment status.
>
> $cat /sys/fs/resctrl/ctrl_mon1/monitor_state
> 0=total-assign,local-assign;1=total-unassign,local-unassign;
>
> -User interface is simple.

This should not be the only motivation. Please do not sacrifice efficiency
and usability just to have a simple interface. One can also argue that this
interface can only be considered simple from the kernel implementation
perspective; from user space it seems complicated. For example, as James
pointed out earlier [1], user space would need to walk the entire resctrl
hierarchy to find out where counters are assigned.
Peter also pointed out that the multiple syscalls needed when adjusting
hundreds of monitor groups are inefficient. Please take all feedback into
account. You consider "simple interface" as a motivation, yet there seem to
be at least two arguments against this interface. Please consider these in
your comparison between interfaces. These are things that should be noted
and make their way to the cover letter.

>
> -Assignment will fail if all the h/w counters are exhausted. User needs to
> unassign a counter from another group and use that counter here. This can
> be done just querying the monitor state of another group.

Right ... and as you state there can be hundreds of monitor groups that
user space would need to walk and query to get this information.

>
> -Monitor group's details(cpus, tasks) are part of the group. So, it is
> better to have assignment state inside the group.

The assignment state should be clear from the event file.

> Note: Used interface names here just to give example.
>
>
> 2. global assignment:
>
> I would assume the interface file will be in /sys/fs/resctrl/info/L3_MON/
> directory.
>
> In case there are 100 mongroups, we need to have a way to list current
> assignment status for these groups. I am not sure how to list status of
> these 100 groups.

The kernel has many examples of interfaces that manage the status of a
large number of entities. I am thinking, for example, we can learn a lot
from how dynamic debug works. On my system I see:

$ wc -l /sys/kernel/debug/dynamic_debug/control
5359 /sys/kernel/debug/dynamic_debug/control

>
> If user is wants to assign the local event(or total) in a specific group
> in this list of 100 groups, I am not sure how to provide interface for
> that. Should we pass the name of mongroup? That will involve looping
> through using the call kernfs_walk_and_get. This may be ok if we are
> dealing with very small number of groups.
>

What is your concern when needing to modify a large number of groups?
Are you concerned about the size of the writes needing to be parsed? It
looks like kernfs does support writes larger than PAGE_SIZE, but it is
not clear to me that such large sizes will be required.

There is also kernfs_find_and_get() that may be more convenient to use.
I believe user space needs to provide the control group name for a global
interface (the same name can be used by monitor groups belonging to
different control groups), and that can be used to narrow the search.

Reading your message I do not find any motivation _against_ a global
interface, except that it is not obvious to you how such an interface may
look or work. That is fair. Peter seems to have ideas and a working
implementation that can be used as reference.

So far I have only seen one comment [2] from James that was skeptical
about the global interface, but the reason given notes that MPAM allocates
counters per domain, which is the same as ABMC, so we will need more
information from James here on what is required since he did not respond
to Peter.

Below is a *hypothetical* interface to start a discussion that explores
how to support fine grained assignment in an interface that aims to be
easy to use by user space. Obviously Peter is also working on something
so there are many viewpoints to consider.

File info/L3_MON/mbm_assign_control:

#control_group/mon_group/flags
ctrl_a/mon_a/00=_;01=_
ctrl_a/mon_b/00=l;01=t
ctrl_b/mon_c/00=lt;01=lt

Above file displays to user:
* No counters are assigned to monitor group mon_a within control
  group ctrl_a
* Counter for local MBM is assigned to domain 0 of monitor group
  mon_b within control group ctrl_a
* Counter for total MBM is assigned to domain 1 of monitor group
  mon_b within control group ctrl_a
* Counters for local and total MBM are assigned to both domains of
  monitor group mon_c within control group ctrl_b

With above interface user space can, with a single read, get insight
into how counters are assigned across all monitor groups.
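
To make the single-read point concrete, here is a minimal user space
sketch of parsing such a listing. It assumes the hypothetical format
shown above ("l" for local, "t" for total, "_" for nothing assigned, "#"
starting a header line). The function names and the SAMPLE text are
made up purely for illustration and nothing here is an existing resctrl
interface.

```python
from typing import Dict, Set, Tuple

# Hypothetical flag letters, taken from the example listing above.
FLAG_EVENTS = {"l": "local", "t": "total"}

# Sample contents of the hypothetical info/L3_MON/mbm_assign_control file.
SAMPLE = """#control_group/mon_group/flags
ctrl_a/mon_a/00=_;01=_
ctrl_a/mon_b/00=l;01=t
ctrl_b/mon_c/00=lt;01=lt
"""

Groups = Dict[Tuple[str, str], Dict[int, Set[str]]]


def parse_assign_control(text: str) -> Groups:
    """Map (control group, monitor group) -> {domain id: assigned events}."""
    groups: Groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header line
        ctrl, mon, flags = line.split("/", 2)
        domains: Dict[int, Set[str]] = {}
        for item in flags.split(";"):
            if not item:
                continue  # tolerate a trailing ';'
            dom, letters = item.split("=", 1)
            # "_" means no counter is assigned in this domain.
            domains[int(dom)] = {FLAG_EVENTS[c] for c in letters if c != "_"}
        groups[(ctrl, mon)] = domains
    return groups


def count_assignments(groups: Groups) -> int:
    """Count the (group, domain, event) entries that have a counter."""
    return sum(len(events)
               for domains in groups.values()
               for events in domains.values())
```

Calling parse_assign_control() on the text of one read gives user space
the assignment state of every monitor group without walking the group
directories, which is the property a per-group file cannot offer.
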
User space can write to the file to modify the flags. If a new counter is
assigned when no more counters are available then the write will fail.
Potentially, if changes are made in the order provided by the user then
the user will be able to unassign counters from one group and re-assign
them to another group with a single write.

I provide this purely to generate some ideas and gather more thoughts on
a global interface.

Reinette

[1] https://lore.kernel.org/lkml/2f373abf-f0c0-4f5d-9e22-1039a40a57f0@xxxxxxx/
[2] https://lore.kernel.org/lkml/1a8c1cd6-a1ce-47a2-bc87-d4cccc84519b@xxxxxxx/