-Fenghua (his email address does not work anymore) On 2/12/25 3:33 PM, Reinette Chatre wrote: > Hi Dave, > > On 2/12/25 9:46 AM, Dave Martin wrote: >> Hi there, >> >> On Wed, Jan 22, 2025 at 02:20:08PM -0600, Babu Moger wrote: >>> >>> This series adds the support for Assignable Bandwidth Monitoring Counters >>> (ABMC). It is also called QoS RMID Pinning feature >>> >>> Series is written such that it is easier to support other assignable >>> features supported from different vendors. >>> >>> The feature details are documented in the APM listed below [1]. >>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming >>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth >>> Monitoring (ABMC). The documentation is available at >>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 >>> >>> The patches are based on top of commit >>> d361b84d51bfe (tip/master) Merge branch into tip/master: 'x86/tdx' >>> >>> # Introduction >> >> [...] >> >>> # Examples >>> >>> a. Check if ABMC support is available >>> #mount -t resctrl resctrl /sys/fs/resctrl/ >>> >>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >>> [mbm_cntr_assign] >>> default >> >> (Nit: can this be called "mbm_counter_assign"? The name is already >> long, so I wonder whether anything is gained by using a cryptic >> abbreviation for "counter". Same with all the "cntrs" elsewhere. >> This is purely cosmetic, though -- the interface works either way.) >> >>> ABMC feature is detected and it is enabled. >>> >>> b. Check how many ABMC counters are available. >>> >>> # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs >>> 32 >> >> Is this file needed? >> >> With MPAM, it is more difficult to promise that the same number of >> counters will be available everywhere. >> >> Rather than lie, or report a "safe" value here that may waste some >> counters, can we just allow the number of counters to be be discovered >> per domain via available_mbm_cntrs? > > This sounds reasonable to me. I think us having trouble with the > user documentation of this file so late in development should also have been > a sign to rethink its value. > > For a user to discover the number of counters supported via available_mbm_cntrs > would require the file's contents to be captured right after mount. Since we've > had scenarios where new userspace needs to discover an up-and-running system's > configuration this may not be possible. I thus wonder instead of removing > num_mbm_cntrs, it could be modified to return the per-domain supported counters > instead of a single value? > >> num_closids and num_rmids are already problematic for MPAM, so it would >> be good to avoid any more parameters of this sort from being reported >> to userspace unless there is a clear understanding of why they are >> needed. > > Yes. Appreciate your help in identifying what could be problematic for MPAM. > >> >> Reporting number of counters per monitoring domain is a more natural >> fit for MPAM, as below: >> >>> c. Check how many ABMC counters are available in each domain. >>> >>> # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs >>> 0=30;1=30 >> >> For MPAM, this seems supportable. Each monitoring domain will have >> some counters, and a well-defined number of them will be available for >> allocation at any one time. >> >>> d. Create few resctrl groups. >>> >>> # mkdir /sys/fs/resctrl/mon_groups/child_default_mon_grp >>> # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp >>> # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp/mon_groups/child_non_default_mon_grp >>> >>> e. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control >>> to list and modify any group's monitoring states. File provides single place >>> to list monitoring states of all the resctrl groups. It makes it easier for >>> user space to learn about the used counters without needing to traverse all >>> the groups thus reducing the number of file system calls. >>> >>> The list follows the following format: >>> >>> "<CTRL_MON group>/<MON group>/<domain_id>=<flags>" >>> >>> Format for specific type of groups: >>> >>> * Default CTRL_MON group: >>> "//<domain_id>=<flags>" >> >> [...] >> >>> Flags can be one of the following: >>> >>> t MBM total event is enabled. >>> l MBM local event is enabled. >>> tl Both total and local MBM events are enabled. >>> _ None of the MBM events are enabled >>> >>> Examples: >> >> [...] >> >> I think that this basically works for MPAM. >> >> The local/total distinction doesn't map in a consistent way onto MPAM, >> but this problem is not specific to ABMC. It feels sensible for ABMC >> to be built around the same concepts that resctrl already has elsewhere >> in the interface. MPAM will do its best to fit (as already). >> >> Regarding Peter's use case of assiging multiple counters to a >> monitoring group [1], I feel that it's probably good enough to make >> sure that the ABMC interface can be extended in future in a backwards >> compatible way so as to support this, without trying to support it >> immediately. >> >> [1] https://lore.kernel.org/lkml/CALPaoCjY-3f2tWvBjuaQPfoPhxveWxxCxHqQMn4BEaeBXBa0bA@xxxxxxxxxxxxxx/ >> > > I do not think that resctrl's current support of the mbm_total_bytes and > mbm_local_bytes should be considered as the "only" two available "slots" > into which all possible events should be forced into. "mon_features" exists > to guide user space to which events are supported and as I see it new events > can be listed here to inform user space of their availability, with their > associated event files available in the resource groups. > >> >> For example, if we added new generic "letters" -- say, "0" to "9", >> combined with new counter files in resctrlfs, that feels like a >> possible approach. ABMC (as in this series) should just reject such >> such assignments, and the new counter files wouldn't exist. >> >> Availability of this feature could also be reported as a distinct mode >> in mbm_assign_mode, say "mbm_cntr_generic", or whatever. >> >> >> A _sketch_ of this follows. This is NOT a proposal -- the key >> question is whether we are confident that we can extend the interface >> in this way in the future without breaking anything. >> >> If "yes", then the ABMC interface (as proposed by this series) works as >> a foundation to build on. >> >> --8<-- >> >> [artists's impression] >> >> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> mbm_cntr_generic >> [mbm_cntr_assign] >> default >> >> # echo mbm_cntr_generic >/sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> # echo '//0=01;1=23' >/sys/fs/resctrl/info/L3_MON/mbm_assign_control >> # echo t >/sys/fs/resctrl/info/L3_MON/mbm_counter0_bytes_type >> # echo l >/sys/fs/resctrl/info/L3_MON/mbm_counter1_bytes_type >> # echo t >/sys/fs/resctrl/info/L3_MON/mbm_counter2_bytes_type >> # echo l >/sys/fs/resctrl/info/L3_MON/mbm_counter3_bytes_type >> >> ... >> >> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_counter1_bytes >> >> etc. >> > > It is not clear to me what additional features such an interface enables. It > also looks like user space will need to track and manage counter IDs? > > It sounds to me as though the issue starts with your statement > "The local/total distinction doesn't map in a consistent way onto MPAM". To > address this I expect that an MPAM system will not support nor list > mbm_total_bytes and/or mbm_local_bytes in its mon_features file (*)? Instead, > it would list the events that are appropriate to the system? Trying to match > with what Peter said [1] in the message you refer to, this may be possible: > > # cat /sys/fs/resctrl/info/L3_MON/mon_features > mbm_local_read_bytes > mbm_local_write_bytes > mbm_local_bytes > > (*) I am including mbm_local_bytes since it could be an event that can be software > defined as a sum of mbm_local_read_bytes and mbm_local_write_bytes when they are both > counted. > > I see the support for MPAM events distinct from the support of assignable counters. > Once the MPAM events are sorted, I think that they can be assigned with existing interface. > Please help me understand if you see it differently. > > Doing so would need to come up with alphabetical letters for these events, > which seems to be needed for your proposal also? If we use possible flags of: > > mbm_local_read_bytes a > mbm_local_write_bytes b > > Then mbm_assign_control can be used as: > # echo '//0=ab;1=b' >/sys/fs/resctrl/info/L3_MON/mbm_assign_control > # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_read_bytes > <value> > # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes > <sum of mbm_local_read_bytes and mbm_local_write_bytes> > > One issue would be when resctrl needs to support more than 26 events (no more flags available), > assuming that upper case would be used for "shared" counters (unless this interface is defined > differently and only few uppercase letters used for it). Would this be too low of a limit? > > Reinette > > [1] https://lore.kernel.org/lkml/CALPaoCjY-3f2tWvBjuaQPfoPhxveWxxCxHqQMn4BEaeBXBa0bA@xxxxxxxxxxxxxx/