Hi Reinette, On 3/12/25 12:14, Reinette Chatre wrote: > Hi Babu, > > On 3/12/25 9:03 AM, Moger, Babu wrote: >> Hi Reinette, >> >> On 3/12/25 10:07, Reinette Chatre wrote: >>> Hi Babu, >>> >>> On 3/11/25 1:35 PM, Moger, Babu wrote: >>>> Hi All, >>>> >>>> On 3/10/25 22:51, Reinette Chatre wrote: >>>>> >>>>> >>>>> On 3/10/25 6:44 PM, Moger, Babu wrote: >>>>>> Hi Tony, >>>>>> >>>>>> On 3/10/2025 6:22 PM, Luck, Tony wrote: >>>>>>> On Mon, Mar 10, 2025 at 05:48:44PM -0500, Moger, Babu wrote: >>>>>>>> Hi All, >>>>>>>> >>>>>>>> On 3/5/2025 1:34 PM, Moger, Babu wrote: >>>>>>>>> Hi Peter, >>>>>>>>> >>>>>>>>> On 3/5/25 04:40, Peter Newman wrote: >>>>>>>>>> Hi Babu, >>>>>>>>>> >>>>>>>>>> On Tue, Mar 4, 2025 at 10:49 PM Moger, Babu <babu.moger@xxxxxxx> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Peter, >>>>>>>>>>> >>>>>>>>>>> On 3/4/25 10:44, Peter Newman wrote: >>>>>>>>>>>> On Mon, Mar 3, 2025 at 8:16 PM Moger, Babu <babu.moger@xxxxxxx> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Peter/Reinette, >>>>>>>>>>>>> >>>>>>>>>>>>> On 2/26/25 07:27, Peter Newman wrote: >>>>>>>>>>>>>> Hi Babu, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Feb 25, 2025 at 10:31 PM Moger, Babu <babu.moger@xxxxxxx> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Peter, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2/25/25 11:11, Peter Newman wrote: >>>>>>>>>>>>>>>> Hi Reinette, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Feb 21, 2025 at 11:43 PM Reinette Chatre >>>>>>>>>>>>>>>> <reinette.chatre@xxxxxxxxx> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Peter, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2/21/25 5:12 AM, Peter Newman wrote: >>>>>>>>>>>>>>>>>> On Thu, Feb 20, 2025 at 7:36 PM Reinette Chatre >>>>>>>>>>>>>>>>>> <reinette.chatre@xxxxxxxxx> wrote: >>>>>>>>>>>>>>>>>>> On 2/20/25 6:53 AM, Peter Newman wrote: >>>>>>>>>>>>>>>>>>>> On Wed, Feb 19, 2025 at 7:21 PM Reinette Chatre >>>>>>>>>>>>>>>>>>>> <reinette.chatre@xxxxxxxxx> wrote: >>>>>>>>>>>>>>>>>>>>> On 2/19/25 3:28 AM, Peter Newman wrote: >>>>>>>>>>>>>>>>>>>>>> On Tue, Feb 18, 2025 at 6:50 PM Reinette Chatre >>>>>>>>>>>>>>>>>>>>>> <reinette.chatre@xxxxxxxxx> wrote: >>>>>>>>>>>>>>>>>>>>>>> On 2/17/25 2:26 AM, Peter Newman wrote: >>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 14, 2025 at 8:18 PM Reinette Chatre >>>>>>>>>>>>>>>>>>>>>>>> <reinette.chatre@xxxxxxxxx> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> On 2/14/25 10:31 AM, Moger, Babu wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> On 2/14/2025 12:26 AM, Reinette Chatre wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> On 2/13/25 9:37 AM, Dave Martin wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 12, 2025 at 03:33:31PM -0800, Reinette Chatre wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 2/12/25 9:46 AM, Dave Martin wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jan 22, 2025 at 02:20:08PM -0600, Babu Moger wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> (quoting relevant parts with goal to focus discussion on new possible syntax) >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I see the support for MPAM events distinct from the support of assignable counters. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Once the MPAM events are sorted, I think that they can be assigned with existing interface. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please help me understand if you see it differently. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Doing so would need to come up with alphabetical letters for these events, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> which seems to be needed for your proposal also? If we use possible flags of: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mbm_local_read_bytes a >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mbm_local_write_bytes b >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Then mbm_assign_control can be used as: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> # echo '//0=ab;1=b' >/sys/fs/resctrl/info/L3_MON/mbm_assign_control >>>>>>>>>>>>>>>>>>>>>>>>>>>>> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_read_bytes >>>>>>>>>>>>>>>>>>>>>>>>>>>>> <value> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes >>>>>>>>>>>>>>>>>>>>>>>>>>>>> <sum of mbm_local_read_bytes and mbm_local_write_bytes> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> One issue would be when resctrl needs to support more than 26 events (no more flags available), >>>>>>>>>>>>>>>>>>>>>>>>>>>>> assuming that upper case would be used for "shared" counters (unless this interface is defined >>>>>>>>>>>>>>>>>>>>>>>>>>>>> differently and only few uppercase letters used for it). Would this be too low of a limit? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> As mentioned above, one possible issue with existing interface is that >>>>>>>>>>>>>>>>>>>>>>>>> it is limited to 26 events (assuming only lower case letters are used). The limit >>>>>>>>>>>>>>>>>>>>>>>>> is low enough to be of concern. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> The events which can be monitored by a single counter on ABMC and MPAM >>>>>>>>>>>>>>>>>>>>>>>> so far are combinable, so 26 counters per group today means it limits >>>>>>>>>>>>>>>>>>>>>>>> breaking down MBM traffic for each group 26 ways. If a user complained >>>>>>>>>>>>>>>>>>>>>>>> that a 26-way breakdown of a group's MBM traffic was limiting their >>>>>>>>>>>>>>>>>>>>>>>> investigation, I would question whether they know what they're looking >>>>>>>>>>>>>>>>>>>>>>>> for. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> The key here is "so far" as well as the focus on MBM only. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> It is impossible for me to predict what we will see in a couple of years >>>>>>>>>>>>>>>>>>>>>>> from Intel RDT, AMD PQoS, and Arm MPAM that now all rely on resctrl interface >>>>>>>>>>>>>>>>>>>>>>> to support their users. Just looking at the Intel RDT spec the event register >>>>>>>>>>>>>>>>>>>>>>> has space for 32 events for each "CPU agent" resource. That does not take into >>>>>>>>>>>>>>>>>>>>>>> account the "non-CPU agents" that are enumerated via ACPI. Tony already mentioned >>>>>>>>>>>>>>>>>>>>>>> that he is working on patches [1] that will add new events and shared the idea >>>>>>>>>>>>>>>>>>>>>>> that we may be trending to support "perf" like events associated with RMID. I >>>>>>>>>>>>>>>>>>>>>>> expect AMD PQoS and Arm MPAM to provide related enhancements to support their >>>>>>>>>>>>>>>>>>>>>>> customers. >>>>>>>>>>>>>>>>>>>>>>> This all makes me think that resctrl should be ready to support more events than 26. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I was thinking of the letters as representing a reusable, user-defined >>>>>>>>>>>>>>>>>>>>>> event-set for applying to a single counter rather than as individual >>>>>>>>>>>>>>>>>>>>>> events, since MPAM and ABMC allow us to choose the set of events each >>>>>>>>>>>>>>>>>>>>>> one counts. Wherever we define the letters, we could use more symbolic >>>>>>>>>>>>>>>>>>>>>> event names. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thank you for clarifying. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> In the letters as events model, choosing the events assigned to a >>>>>>>>>>>>>>>>>>>>>> group wouldn't be enough information, since we would want to control >>>>>>>>>>>>>>>>>>>>>> which events should share a counter and which should be counted by >>>>>>>>>>>>>>>>>>>>>> separate counters. I think the amount of information that would need >>>>>>>>>>>>>>>>>>>>>> to be encoded into mbm_assign_control to represent the level of >>>>>>>>>>>>>>>>>>>>>> configurability supported by hardware would quickly get out of hand. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Maybe as an example, one counter for all reads, one counter for all >>>>>>>>>>>>>>>>>>>>>> writes in ABMC would look like... >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> (L3_QOS_ABMC_CFG.BwType field names below) >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> (per domain) >>>>>>>>>>>>>>>>>>>>>> group 0: >>>>>>>>>>>>>>>>>>>>>> counter 0: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>>>>> counter 1: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>>>>> group 1: >>>>>>>>>>>>>>>>>>>>>> counter 2: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>>>>> counter 3: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I think this may also be what Dave was heading towards in [2] but in that >>>>>>>>>>>>>>>>>>>>> example and above the counter configuration appears to be global. You do mention >>>>>>>>>>>>>>>>>>>>> "configurability supported by hardware" so I wonder if per-domain counter >>>>>>>>>>>>>>>>>>>>> configuration is a requirement? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> If it's global and we want a particular group to be watched by more >>>>>>>>>>>>>>>>>>>> counters, I wouldn't want this to result in allocating more counters >>>>>>>>>>>>>>>>>>>> for that group in all domains, or allocating counters in domains where >>>>>>>>>>>>>>>>>>>> they're not needed. I want to encourage my users to avoid allocating >>>>>>>>>>>>>>>>>>>> monitoring resources in domains where a job is not allowed to run so >>>>>>>>>>>>>>>>>>>> there's less pressure on the counters. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> In Dave's proposal it looks like global configuration means >>>>>>>>>>>>>>>>>>>> globally-defined "named counter configurations", which works because >>>>>>>>>>>>>>>>>>>> it's really per-domain assignment of the configurations to however >>>>>>>>>>>>>>>>>>>> many counters the group needs in each domain. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I think I am becoming lost. Would a global configuration not break your >>>>>>>>>>>>>>>>>>> view of "event-set applied to a single counter"? If a counter is configured >>>>>>>>>>>>>>>>>>> globally then it would not make it possible to support the full configurability >>>>>>>>>>>>>>>>>>> of the hardware. >>>>>>>>>>>>>>>>>>> Before I add more confusion, let me try with an example that builds on your >>>>>>>>>>>>>>>>>>> earlier example copied below: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> (per domain) >>>>>>>>>>>>>>>>>>>>>> group 0: >>>>>>>>>>>>>>>>>>>>>> counter 0: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>>>>> counter 1: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>>>>> group 1: >>>>>>>>>>>>>>>>>>>>>> counter 2: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>>>>> counter 3: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Since the above states "per domain" I rewrite the example to highlight that as >>>>>>>>>>>>>>>>>>> I understand it: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> group 0: >>>>>>>>>>>>>>>>>>> domain 0: >>>>>>>>>>>>>>>>>>> counter 0: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 1: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> domain 1: >>>>>>>>>>>>>>>>>>> counter 0: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 1: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> group 1: >>>>>>>>>>>>>>>>>>> domain 0: >>>>>>>>>>>>>>>>>>> counter 2: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 3: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> domain 1: >>>>>>>>>>>>>>>>>>> counter 2: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 3: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> You mention that you do not want counters to be allocated in domains that they >>>>>>>>>>>>>>>>>>> are not needed in. So, let's say group 0 does not need counter 0 and counter 1 >>>>>>>>>>>>>>>>>>> in domain 1, resulting in: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> group 0: >>>>>>>>>>>>>>>>>>> domain 0: >>>>>>>>>>>>>>>>>>> counter 0: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 1: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> group 1: >>>>>>>>>>>>>>>>>>> domain 0: >>>>>>>>>>>>>>>>>>> counter 2: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 3: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> domain 1: >>>>>>>>>>>>>>>>>>> counter 2: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 3: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> With counter 0 and counter 1 available in domain 1, these counters could >>>>>>>>>>>>>>>>>>> theoretically be configured to give group 1 more data in domain 1: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> group 0: >>>>>>>>>>>>>>>>>>> domain 0: >>>>>>>>>>>>>>>>>>> counter 0: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 1: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> group 1: >>>>>>>>>>>>>>>>>>> domain 0: >>>>>>>>>>>>>>>>>>> counter 2: LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 3: VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> domain 1: >>>>>>>>>>>>>>>>>>> counter 0: LclFill,RmtFill >>>>>>>>>>>>>>>>>>> counter 1: LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>>> counter 2: LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>>> counter 3: VictimBW >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The counters are shown with different per-domain configurations that seems to >>>>>>>>>>>>>>>>>>> match with earlier goals of (a) choose events counted by each counter and >>>>>>>>>>>>>>>>>>> (b) do not allocate counters in domains where they are not needed. As I >>>>>>>>>>>>>>>>>>> understand the above does contradict global counter configuration though. >>>>>>>>>>>>>>>>>>> Or do you mean that only the *name* of the counter is global and then >>>>>>>>>>>>>>>>>>> that it is reconfigured as part of every assignment? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yes, I meant only the *name* is global. I assume based on a particular >>>>>>>>>>>>>>>>>> system configuration, the user will settle on a handful of useful >>>>>>>>>>>>>>>>>> groupings to count. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Perhaps mbm_assign_control syntax is the clearest way to express an example... >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> # define global configurations (in ABMC terms), not necessarily in this >>>>>>>>>>>>>>>>>> # syntax and probably not in the mbm_assign_control file. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> r=LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>>>>>>> w=VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> # legacy "total" configuration, effectively r+w >>>>>>>>>>>>>>>>>> t=LclFill,RmtFill,LclSlowFill,RmtSlowFill,VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> /group0/0=t;1=t >>>>>>>>>>>>>>>>>> /group1/0=t;1=t >>>>>>>>>>>>>>>>>> /group2/0=_;1=t >>>>>>>>>>>>>>>>>> /group3/0=rw;1=_ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - group2 is restricted to domain 0 >>>>>>>>>>>>>>>>>> - group3 is restricted to domain 1 >>>>>>>>>>>>>>>>>> - the rest are unrestricted >>>>>>>>>>>>>>>>>> - In group3, we decided we need to separate read and write traffic >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This consumes 4 counters in domain 0 and 3 counters in domain 1. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I see. Thank you for the example. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> resctrl supports per-domain configurations with the following possible when >>>>>>>>>>>>>>>>> using mbm_total_bytes_config and mbm_local_bytes_config: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> t(domain 0)=LclFill,RmtFill,LclSlowFill,RmtSlowFill,VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>> t(domain 1)=LclFill,RmtFill,VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /group0/0=t;1=t >>>>>>>>>>>>>>>>> /group1/0=t;1=t >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Even though the flags are identical in all domains, the assigned counters will >>>>>>>>>>>>>>>>> be configured differently in each domain. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> With this supported by hardware and currently also supported by resctrl it seems >>>>>>>>>>>>>>>>> reasonable to carry this forward to what will be supported next. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The hardware supports both a per-domain mode, where all groups in a >>>>>>>>>>>>>>>> domain use the same configurations and are limited to two events per >>>>>>>>>>>>>>>> group and a per-group mode where every group can be configured and >>>>>>>>>>>>>>>> assigned freely. This series is using the legacy counter access mode >>>>>>>>>>>>>>>> where only counters whose BwType matches an instance of QOS_EVT_CFG_n >>>>>>>>>>>>>>>> in the domain can be read. If we chose to read the assigned counter >>>>>>>>>>>>>>>> directly (QM_EVTSEL[ExtendedEvtID]=1, QM_EVTSEL[EvtID]=L3CacheABMC) >>>>>>>>>>>>>>>> rather than asking the hardware to find the counter by RMID, we would >>>>>>>>>>>>>>>> not be limited to 2 counters per group/domain and the hardware would >>>>>>>>>>>>>>>> have the same flexibility as on MPAM. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In extended mode, the contents of a specific counter can be read by >>>>>>>>>>>>>>> setting the following fields in QM_EVTSEL: [ExtendedEvtID]=1, >>>>>>>>>>>>>>> [EvtID]=L3CacheABMC and setting [RMID] to the desired counter ID. Reading >>>>>>>>>>>>>>> QM_CTR will then return the contents of the specified counter. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It is documented below. >>>>>>>>>>>>>>> https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf >>>>>>>>>>>>>>> Section: 19.3.3.3 Assignable Bandwidth Monitoring (ABMC) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We previously discussed this with you (off the public list) and I >>>>>>>>>>>>>>> initially proposed the extended assignment mode. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yes, the extended mode allows greater flexibility by enabling multiple >>>>>>>>>>>>>>> counters to be assigned to the same group, rather than being limited to >>>>>>>>>>>>>>> just two. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> However, the challenge is that we currently lack the necessary interfaces >>>>>>>>>>>>>>> to configure multiple events per group. Without these interfaces, the >>>>>>>>>>>>>>> extended mode is not practical at this time. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Therefore, we ultimately agreed to use the legacy mode, as it does not >>>>>>>>>>>>>>> require modifications to the existing interface, allowing us to continue >>>>>>>>>>>>>>> using it as is. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (I might have said something confusing in my last messages because I >>>>>>>>>>>>>>>> had forgotten that I switched to the extended assignment mode when >>>>>>>>>>>>>>>> prototyping with soft-ABMC and MPAM.) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Forcing all groups on a domain to share the same 2 counter >>>>>>>>>>>>>>>> configurations would not be acceptable for us, as the example I gave >>>>>>>>>>>>>>>> earlier is one I've already been asked about. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I don’t see this as a blocker. It should be considered an extension to the >>>>>>>>>>>>>>> current ABMC series. We can easily build on top of this series once we >>>>>>>>>>>>>>> finalize how to configure the multiple event interface for each group. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I don't think it is, either. Only being able to use ABMC to assign >>>>>>>>>>>>>> counters is fine for our use as an incremental step. My longer-term >>>>>>>>>>>>>> concern is the domain-scoped mbm_total_bytes_config and >>>>>>>>>>>>>> mbm_local_bytes_config files, but they were introduced with BMEC, so >>>>>>>>>>>>>> there's already an expectation that the files are present when BMEC is >>>>>>>>>>>>>> supported. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On ABMC hardware that also supports BMEC, I'm concerned about enabling >>>>>>>>>>>>>> ABMC when only the BMEC-style event configuration interface exists. >>>>>>>>>>>>>> The scope of my issue is just whether enabling "full" ABMC support >>>>>>>>>>>>>> will require an additional opt-in, since that could remove the BMEC >>>>>>>>>>>>>> interface. If it does, it's something we can live with. >>>>>>>>>>>>> >>>>>>>>>>>>> As you know, this series is currently blocked without further feedback. >>>>>>>>>>>>> >>>>>>>>>>>>> I’d like to begin reworking these patches to incorporate Peter’s feedback. >>>>>>>>>>>>> Any input or suggestions would be appreciated. >>>>>>>>>>>>> >>>>>>>>>>>>> Here’s what we’ve learned so far: >>>>>>>>>>>>> >>>>>>>>>>>>> 1. Assignments should be independent of BMEC. >>>>>>>>>>>>> 2. We should be able to specify multiple event types to a counter (e.g., >>>>>>>>>>>>> read, write, victimBM, etc.). This is also called shared counter >>>>>>>>>>>>> 3. There should be an option to assign events per domain. >>>>>>>>>>>>> 4. Currently, only two counters can be assigned per group, but the design >>>>>>>>>>>>> should allow flexibility to assign more in the future as the interface >>>>>>>>>>>>> evolves. >>>>>>>>>>>>> 5. Utilize the extended RMID read mode. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Here is my proposal using Peter's earlier example: >>>>>>>>>>>>> >>>>>>>>>>>>> # define event configurations >>>>>>>>>>>>> >>>>>>>>>>>>> ======================================================== >>>>>>>>>>>>> Bits Mnemonics Description >>>>>>>>>>>>> ==== ======================================================== >>>>>>>>>>>>> 6 VictimBW Dirty Victims from all types of memory >>>>>>>>>>>>> 5 RmtSlowFill Reads to slow memory in the non-local NUMA domain >>>>>>>>>>>>> 4 LclSlowFill Reads to slow memory in the local NUMA domain >>>>>>>>>>>>> 3 RmtNTWr Non-temporal writes to non-local NUMA domain >>>>>>>>>>>>> 2 LclNTWr Non-temporal writes to local NUMA domain >>>>>>>>>>>>> 1 mtFill Reads to memory in the non-local NUMA domain >>>>>>>>>>>>> 0 LclFill Reads to memory in the local NUMA domain >>>>>>>>>>>>> ==== ======================================================== >>>>>>>>>>>>> >>>>>>>>>>>>> #Define flags based on combination of above event types. >>>>>>>>>>>>> >>>>>>>>>>>>> t = LclFill,RmtFill,LclSlowFill,RmtSlowFill,VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>> l = LclFill, LclNTWr, LclSlowFill >>>>>>>>>>>>> r = LclFill,RmtFill,LclSlowFill,RmtSlowFill >>>>>>>>>>>>> w = VictimBW,LclNTWr,RmtNTWr >>>>>>>>>>>>> v = VictimBW >>>>>>>>>>>>> >>>>>>>>>>>>> Peter suggested the following format earlier : >>>>>>>>>>>>> >>>>>>>>>>>>> /group0/0=t;1=t >>>>>>>>>>>>> /group1/0=t;1=t >>>>>>>>>>>>> /group2/0=_;1=t >>>>>>>>>>>>> /group3/0=rw;1=_ >>>>>>>>>>>> >>>>>>>>>>>> After some inquiries within Google, it sounds like nobody has invested >>>>>>>>>>>> much into the current mbm_assign_control format yet, so it would be >>>>>>>>>>>> best to drop it and distribute the configuration around the filesystem >>>>>>>>>>>> hierarchy[1], which should allow us to produce something more flexible >>>>>>>>>>>> and cleaner to implement. >>>>>>>>>>>> >>>>>>>>>>>> Roughly what I had in mind: >>>>>>>>>>>> >>>>>>>>>>>> Use mkdir in a info/<resource>_MON subdirectory to create free-form >>>>>>>>>>>> names for the assignable configurations rather than being restricted >>>>>>>>>>>> to single letters. In the resulting directory, populate a file where >>>>>>>>>>>> we can specify the set of events the config should represent. I think >>>>>>>>>>>> we should use symbolic names for the events rather than raw BMEC field >>>>>>>>>>>> values. Moving forward we could come up with portable names for common >>>>>>>>>>>> events and only support the BMEC names on AMD machines for users who >>>>>>>>>>>> want specific events and don't care about portability. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I’m still processing this. Let me start with some initial questions. >>>>>>>>>>> >>>>>>>>>>> So, we are creating event configurations here, which seems reasonable. >>>>>>>>>>> >>>>>>>>>>> Yes, we should use portable names and are not limited to BMEC names. >>>>>>>>>>> >>>>>>>>>>> How many configurations should we allow? Do we know? >>>>>>>>>> >>>>>>>>>> Do we need an upper limit? >>>>>>>>> >>>>>>>>> I think so. This needs to be maintained in some data structure. We can >>>>>>>>> start with 2 default configurations for now. >>>>> >>>>> There is a big difference between no upper limit and 2. The hardware is >>>>> capable of supporting per-domain configurations so more flexibility is >>>>> certainly possible. Consider the example presented by Peter in: >>>>> https://lore.kernel.org/lkml/CALPaoCi0mFZ9TycyNs+SCR+2tuRJovQ2809jYMun4HtC64hJmA@xxxxxxxxxxxxxx/ >>>>> >>>>>>>>>>>> Next, put assignment-control file nodes in per-domain directories >>>>>>>>>>>> (i.e., mon_data/mon_L3_00/assign_{exclusive,shared}). Writing a >>>>>>>>>>>> counter-configuration name into the file would then allocate a counter >>>>>>>>>>>> in the domain, apply the named configuration, and monitor the parent >>>>>>>>>>>> group-directory. We can also put a group/resource-scoped assign_* file >>>>>>>>>>>> higher in the hierarchy to make it easier for users who want to >>>>>>>>>>>> configure all domains the same for a group. >>>>>>>>>>> >>>>>>>>>>> What is the difference between shared and exclusive? >>>>>>>>>> >>>>>>>>>> Shared assignment[1] means that non-exclusively-assigned counters in >>>>>>>>>> each domain will be scheduled round-robin to the groups requesting >>>>>>>>>> shared access to a counter. In my tests, I assigned the counters long >>>>>>>>>> enough to produce a single 1-second MB/s sample for the per-domain >>>>>>>>>> aggregation files[2]. >>>>>>>>>> >>>>>>>>>> These do not need to be implemented immediately, but knowing that they >>>>>>>>>> work addresses the overhead and scalability concerns of reassigning >>>>>>>>>> counters and reading their values. >>>>>>>>> >>>>>>>>> Ok. Lets focus on exclusive assignments for now. >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Having three files—assign_shared, assign_exclusive, and unassign—for each >>>>>>>>>>> domain seems excessive. In a system with 32 groups and 12 domains, this >>>>>>>>>>> results in 32 × 12 × 3 files, which is quite large. >>>>>>>>>>> >>>>>>>>>>> There should be a more efficient way to handle this. >>>>>>>>>>> >>>>>>>>>>> Initially, we started with a group-level file for this interface, but it >>>>>>>>>>> was rejected due to the high number of sysfs calls, making it inefficient. >>>>>>>>>> >>>>>>>>>> I had rejected it due to the high-frequency of access of a large >>>>>>>>>> number of files, which has since been addressed by shared assignment >>>>>>>>>> (or automatic reassignment) and aggregated mbps files. >>>>>>>>> >>>>>>>>> I think we should address this as well. Creating three extra files for >>>>>>>>> each group isn’t ideal when there are more efficient alternatives. >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Additionally, how can we list all assignments with a single sysfs call? >>>>>>>>>>> >>>>>>>>>>> That was another problem we need to address. >>>>>>>>>> >>>>>>>>>> This is not a requirement I was aware of. If the user forgot where >>>>>>>>>> they assigned counters (or forgot to disable auto-assignment), they >>>>>>>>>> can read multiple sysfs nodes to remind themselves. >>>>>>>>> >>>>>>>>> I suggest, we should provide users with an option to list the assignments >>>>>>>>> of all groups in a single command. As the number of groups increases, it >>>>>>>>> becomes cumbersome to query each group individually. >>>>>>>>> >>>>>>>>> To achieve this, we can reuse our existing mbm_assign_control interface >>>>>>>>> for this purpose. More details on this below. >>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> The configuration names listed in assign_* would result in files of >>>>>>>>>>>> the same name in the appropriate mon_data domain directories from >>>>>>>>>>>> which the count values can be read. >>>>>>>>>>>> >>>>>>>>>>>> # mkdir info/L3_MON/counter_configs/mbm_local_bytes >>>>>>>>>>>> # echo LclFill > info/L3_MON/counter_configs/mbm_local_bytes/event_filter >>>>>>>>>>>> # echo LclNTWr > info/L3_MON/counter_configs/mbm_local_bytes/event_filter >>>>>>>>>>>> # echo LclSlowFill > info/L3_MON/counter_configs/mbm_local_bytes/event_filter >>>>>>>>>>>> # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter >>>>>>>>>>>> LclFill >>>>>>>>>>>> LclNTWr >>>>>>>>>>>> LclSlowFill >>>>>>>>>>> >>>>>>>>>>> I feel we can just have the configs. event_filter file is not required. >>>>>>>>>> >>>>>>>>>> That's right, I forgot that we can implement kernfs_ops::open(). I was >>>>>>>>>> only looking at struct kernfs_syscall_ops >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> #cat info/L3_MON/counter_configs/mbm_local_bytes >>>>>>>>>>> LclFill <-rename these to generic names. >>>>>>>>>>> LclNTWr >>>>>>>>>>> LclSlowFill >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I think portable and non-portable event names should both be available >>>>>>>>>> as options. There are simple bandwidth measurement mechanisms that >>>>>>>>>> will be applied in general, but when they turn up an issue, it can >>>>>>>>>> often lead to a more focused investigation, requiring more precise >>>>>>>>>> events. >>>>>>>>> >>>>>>>>> I aggree. We should provide both portable and non-portable event names. >>>>>>>>> >>>>>>>>> Here is my draft proposal based on the discussion so far and reusing some >>>>>>>>> of the current interface. Idea here is to start with basic assigment >>>>>>>>> feature with options to enhance it in the future. Feel free to >>>>>>>>> comment/suggest. >>>>>>>>> >>>>>>>>> 1. Event configurations will be in >>>>>>>>> /sys/fs/resctrl/info/L3_MON/counter_configs/. >>>>>>>>> >>>>>>>>> There will be two pre-defined configurations by default. >>>>>>>>> >>>>>>>>> #cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes >>>>>>>>> LclFill, LclNTWr,LclSlowFill,VictimBM,RmtSlowFill,LclSlowFill,RmtFill >>>>>>>>> >>>>>>>>> #cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes >>>>>>>>> LclFill, LclNTWr, LclSlowFill >>>>>>>>> >>>>>>>>> 2. Users will have options to update these configurations. >>>>>>>>> >>>>>>>>> #echo "LclFill, LclNTWr, RmtFill" > >>>>>>>>> /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes >>>>>>> >>>>>>> This part seems odd to me. Now the "mbm_local_bytes" files aren't >>>>>>> reporting "local_bytes" any more. They report something different, >>>>>>> and users only know if they come to check the options currently >>>>>>> configured in this file. Changing the contents without changing >>>>>>> the name seems confusing to me. >>>>>> >>>>>> It is the same behaviour right now with BMEC. It is configurable. >>>>>> By default it is mbm_local_bytes, but users can configure whatever they want to monitor using /info/L3_MON/mbm_local_bytes_config. >>>>>> >>>>>> We can continue the same behaviour with ABMC, but the configuration will be in /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes. >>>>> >>>>> This could be supported by following Peter's original proposal where the name >>>>> of the counter configuration is provided by the user via a mkdir: >>>>> https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@xxxxxxxxxxxxxx/ >>>>> >>>>> As he mentioned there could be pre-populated mbm_local_bytes/mbm_total_bytes. >>>> >>>> Sure. We can do that. I was thinking in the first phase, just provide the >>>> default pre-defined configuration and option to update the configuration. >>>> >>>> We can add the mkdir support later. That way we can provide basic ABMC >>>> support without too much code complexity with mkdir support. >>> >>> This is not clear to me how you envision the "first phase". Is it what you >>> proposed above, for example: >>> #echo "LclFill, LclNTWr, RmtFill" > >>> /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes >>> >>> In above the counter configuration name is a file. >> >> Yes. That is correct. >> >> There will be two configuration files by default when resctrl is mounted >> when ABMC is enabled. >> /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes >> /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes >> >>> >>> How could mkdir support be added to this later if there are already files present? >> >> We already have these directories when resctrl is mounted. >> /sys/fs/resctrl/test/mon_data/mon_L3_00/mbm_total_bytes >> /sys/fs/resctrl/test/mon_data/mon_L3_00/mbm_local_bytes >> /sys/fs/resctrl/test/mon_data/mon_L3_01/mbm_total_bytes >> /sys/fs/resctrl/test/mon_data/mon_L3_01/mbm_local_bytes >> >> We dont need "mkdir" support for default configurations. > > I was referring to the "mkdir" support for additional configurations that > I understood you are thinking about adding later. For example, > (copied from Peter's message > https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@xxxxxxxxxxxxxx/): > > > # mkdir info/L3_MON/counter_configs/mbm_local_bytes > # echo LclFill > info/L3_MON/counter_configs/mbm_local_bytes/event_filter > # echo LclNTWr > info/L3_MON/counter_configs/mbm_local_bytes/event_filter > # echo LclSlowFill > info/L3_MON/counter_configs/mbm_local_bytes/event_filter > # cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter > LclFill > LclNTWr > LclSlowFill > > Any "later" work needs to be backward compatible with the first phase. Actually, we dont need extra file "event_filter". This was discussed here. https://lore.kernel.org/lkml/CALPaoChLL8p49eANYgQ0dJiFs7G=223fGae+LJyx3DwEhNeR8A@xxxxxxxxxxxxxx/ # echo LclFill > info/L3_MON/counter_configs/mbm_local_bytes # echo LclNTWr > info/L3_MON/counter_configs/mbm_local_bytes # echo LclSlowFill > info/L3_MON/counter_configs/mbm_local_bytes # cat info/L3_MON/counter_configs/mbm_local_bytes LclFill LclNTWr LclSlowFill In the future, we can add mkdir support. # mkdir info/L3_MON/counter_configs/mbm_read_only # echo LclFill > info/L3_MON/counter_configs/mbm_read_only # cat info/L3_MON/counter_configs/mbm_read_only LclFill #echo mbm_read_only > test/mon_data/mon_L3_00/assign_exclusive Which would result in the creation of test/mon_data/mon_L3_*/mbm_read_only So, there is not breakage of backword compatibility. > > If the first phase starts with a file: > /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes > ... I do not see how second phase can be backward compatible when that work > needs a directory with the same name that contains a file for configuration: > /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter > > sidenote: I think interactions with the "event_filter" file needs more > descriptions since it is not clear with the provided example how user space > may want to interact with the file when adding vs replacing event configurations. > >> >> My plan was to support only the default configurations in the first phase. >> That way there is no difference in the usage model with ABMC when mounted. >> >> >>> >>>> >>>>> >>>>>> >>>>>>> >>>>>>>>> >>>>>>>>> # #cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes >>>>>>>>> LclFill, LclNTWr, RmtFill >>>>>>>>> >>>>>>>>> 3. The default configurations will be used when user mounts the resctrl. >>>>>>>>> >>>>>>>>> mount -t resctrl resctrl /sys/fs/resctrl/ >>>>>>>>> mkdir /sys/fs/resctrl/test/ >>>>>>>>> >>>>>>>>> 4. The resctrl group/domains can be in one of these assingnment states. >>>>>>>>> e: Exclusive >>>>>>>>> s: Shared >>>>>>>>> u: Unassigned >>>>>>>>> >>>>>>>>> Exclusive mode is supported now. Shared mode will be supported in the >>>>>>>>> future. >>>>>>>>> >>>>>>>>> 5. We can use the current /sys/fs/resctrl/info/L3_MON/mbm_assign_control >>>>>>>>> to list the assignment state of all the groups. >>>>>>>>> >>>>>>>>> Format: >>>>>>>>> "<CTRL_MON group>/<MON group>/<confguration>:<domain_id>=<assign state>" >>>>>>>>> >>>>>>>>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control >>>>>>>>> test//mbm_total_bytes:0=e;1=e >>>>>>>>> test//mbm_local_bytes:0=e;1=e >>>>>>>>> //mbm_total_bytes:0=e;1=e >>>>>>>>> //mbm_local_bytes:0=e;1=e >>>>> >>>>> This would make mbm_assign_control even more unwieldy and quicker to exceed a >>>>> page of data (these examples never seem to reflect those AMD systems with the many >>>>> L3 domains). How to handle resctrl files larger than 4KB needs to be well understood >>>>> and solved when/if going this route. >>>> >>>> This problem is not specific this series. I feel it is a generic problem >>>> to many of the semilar interfaces. I dont know how it is addressed. May >>>> have to investigate on this. Any pointers would be helpful. >>> >>> Dave Martin already did a lot of analysis here. What other pointers do you need? Yea. He did. I still need little more details on implementation of that. Will come back to that when we decide which way to go. >>> >>>> >>>> >>>>> >>>>> There seems to be two opinions about this file at moment. Would it be possible to >>>>> summarize the discussion with pros/cons raised to make an informed selection? >>>>> I understand that Google as represented by Peter no longer requires/requests this >>>>> file but the motivation for this change seems new and does not seem to reduce the >>>>> original motivation for this file. We may also want to separate requirements for reading >>>>> from and writing to this file. >>>> >>>> Yea. We can just use mbm_assign_control for reading the assignment states. >>>> >>>> Summary: We have two proposals. >>>> >>>> First one from Peter: >>>> >>>> https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@xxxxxxxxxxxxxx/ >>>> >>>> >>>> Pros >>>> a. Allows flexible creation of free-form names for assignable >>>> configurations, stored in info/L3_MON/counter_configs/. >>>> >>>> b. Events can be accessed using corresponding free-form names in the >>>> mon_data directory, making it clear to users what each event represents. >>>> >>>> >>>> Cons: >>>> a. Requires three separate files for assignment in each group >>>> (assign_exclusive, assign_shared, unassign), which might be excessive. >>>> >>>> b. No built-in listing support, meaning users must query each group >>>> individually to check assignment states. >>>> >>>> >>>> Second Proposal (Mine) >>>> >>>> https://lore.kernel.org/lkml/a4ab53b5-03be-4299-8853-e86270d46f2e@xxxxxxx/ >>>> >>>> Pros: >>>> >>>> a. Maintains the flexibility of free-form names for assignable >>>> configurations (info/L3_MON/counter_configs/). >>>> >>>> b. Events remain accessible via free-form names in mon_data, ensuring >>>> clarity on their purpose. >>>> >>>> c. Adds the ability to list assignment states for all groups in a single >>>> command. >>>> >>>> Cons: >>>> a. Potential buffer overflow issues when handling a large number of >>>> groups and domains and code complexity to fix the issue. >>>> >>>> >>>> Third Option: A Hybrid Approach >>>> >>>> We could combine elements from both proposals: >>>> >>>> a. Retain the free-form naming approach for assignable configurations in >>>> info/L3_MON/counter_configs/. >>>> >>>> b. Use the assignment method from the first proposal: >>>> $mkdir test >>>> $echo mbm_local_bytes > test/mon_data/mon_L3_00/assign_exclusive >>>> >>>> c. Introduce listing support via the info/L3_MON/mbm_assign_control >>>> interface, enabling users to read assignment states for all groups in one >>>> place. Only reading support. >>>> >>>> >>>>> >>>>>>>>> >>>>>>>>> 6. Users can modify the assignment state by writing to mbm_assign_control. >>>>>>>>> >>>>>>>>> Format: >>>>>>>>> “<CTRL_MON group>/<MON group>/<configuration>:<domain_id>=<assign state>” >>>>>>>>> >>>>>>>>> #echo "test//mbm_local_bytes:0=e;1=e" > >>>>>>>>> /sys/fs/resctrl/info/L3_MON/mbm_assign_control >>>>>>>>> >>>>>>>>> #echo "test//mbm_local_bytes:0=u;1=u" > >>>>>>>>> /sys/fs/resctrl/info/L3_MON/mbm_assign_control >>>>>>>>> >>>>>>>>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control >>>>>>>>> test//mbm_total_bytes:0=u;1=u >>>>>>>>> test//mbm_local_bytes:0=u;1=u >>>>>>>>> //mbm_total_bytes:0=e;1=e >>>>>>>>> //mbm_local_bytes:0=e;1=e >>>>>>>>> >>>>>>>>> The corresponding events will be read in >>>>>>>>> >>>>>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes >>>>>>>>> /sys/fs/resctrl/mon_data/mon_L3_01/mbm_total_bytes >>>>>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes >>>>>>>>> /sys/fs/resctrl/mon_data/mon_L3_01/mbm_local_bytes >>>>>>>>> /sys/fs/resctrl/test/mon_data/mon_L3_00/mbm_total_bytes >>>>>>>>> /sys/fs/resctrl/test/mon_data/mon_L3_01/mbm_total_bytes >>>>>>>>> /sys/fs/resctrl/test/mon_data/mon_L3_00/mbm_local_bytes >>>>>>>>> /sys/fs/resctrl/test/mon_data/mon_L3_01/mbm_local_bytes >>>>>>>>> >>>>>>>>> 7. In the first stage, only two configurations(mbm_total_bytes and >>>>>>>>> mbm_local_bytes) will be supported. >>>>>>>>> >>>>>>>>> 8. In the future, there will be options to create multiple configurations >>>>>>>>> and corresponding directory will be created in >>>>>>>>> /sysf/fs/resctrl/test/mon_data/mon_L3_00/<configation name>. >>>>>>> >>>>>>> Would this be done by creating a new file in the /sys/fs/resctrl/info/L3_MON/counter_configs >>>>>>> directory? Like this: >>>>>>> >>>>>>> # echo "LclFill, LclNTWr, RmtFill" > >>>>>>> /sys/fs/resctrl/info/L3_MON/counter_configs/cache_stuff >>>>>>> >>>>>>> This seems OK (dependent on the user picking meaningful names for >>>>>>> the set of attributes picked ... but if they want to name this >>>>>>> monitor file "brian" then they have to live with any confusion >>>>>>> that they bring on themselves). >>>>>>> >>>>>>> Would this involve an extension to kernfs? I don't see a function >>>>>>> pointer callback for file creation in kernfs_syscall_ops. >>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> I know you are all busy with multiple series going on parallel. I am still >>>>>>>> waiting for the inputs on this. It will be great if you can spend some time >>>>>>>> on this to see if we can find common ground on the interface. >>>>>>>> >>>>>>>> Thanks >>>>>>>> Babu >>>>>>> >>>>>>> -Tony >>>>>>> >>>>>> >>>>>> >>>>>> thanks >>>>>> Babu >>>>> >>>>> Reinette >>>>> >>>>> >>>> >>> >>> >> > > -- Thanks Babu Moger