Hi Babu,

On 2/23/2024 12:11 PM, Moger, Babu wrote:
> On 2/23/24 11:17, Reinette Chatre wrote:
>>
>>
>> On 2/20/2024 12:48 PM, Moger, Babu wrote:
>>> On 2/20/24 09:21, James Morse wrote:
>>>> On 19/01/2024 18:22, Babu Moger wrote:
>>
>>>>> e. Enable ABMC mode.
>>>>>
>>>>> #echo 1 > /sys/fs/resctrl/info/L3_MON/mbm_assign_enable
>>>>> #cat /sys/fs/resctrl/info/L3_MON/mbm_assign_enable
>>>>> 1
>>>>
>>>> Why does this mode need enabling? Can't it be enabled automatically on hardware that
>>>> supports it, or enabled implicitly when the first assignment attempt arrives?
>>>>
>>>> I guess this is really needed for a reset - could we implement that instead? This way
>>>> there isn't an extra step user-space has to do to make the assignments work.
>>>
>>> Mostly the new features are added as an opt-in method. So, kept it that
>>> way. If we enable this feature automatically, then we have provide an
>>> option to disable it.
>>>
>>
>> At the same time it sounds to me like ABMC can improve current users'
>> experience without requiring them to do anything. This sounds appealing.
>> For example, if I understand correctly, it may be possible to start resctrl
>> with ABMC enabled by default and the number of monitoring groups (currently
>> exposed to user space via "num_rmids") limited to the number of counters
>> supported by ABMC. Existing users would then by default obtain better behavior
>> of counters not resetting.
>
> Yes, I like the idea. But i will break compatibility with pqos
> tool(intel_cmt_cat utility). pqos tool monitoring will not work without
> supporting ABMC enablement in the tool. ABMC feature requires an extra
> step to assign the counters for monitor to work.

I am considering two scenarios: the "default behavior", which is what a user
will experience when booting resctrl on an ABMC system, and the "new feature
behavior", where a user can take full advantage of all that ABMC (and soft
RMID, and MPAM) can offer.

So, first, on an ABMC system in the "default behavior" scenario I expect
that resctrl can do the required ABMC counter configuration automatically at
the time a monitor group is created. In this "default behavior" scenario
resctrl would expose "num_rmids" to be half of the number of assignable
counters. When a user then creates a monitor group two counters will be used
and configured to count the local and total bytes respectively. If two
counters are not available then ENOSPC is returned, just like when the system
is out of closid/rmid. With this "default behavior" user space thus gets
improved behavior without making any changes on its part. I do not have
insight into how many counters ABMC could be expected to expose though ...
so some users may be surprised at how few monitor groups can be created with
new hardware? This may not be an issue since that would accurately reflect
how many _reliable_ monitor groups can be created and if a user needs more
monitor groups then that would be a time to explore the "new feature" that
requires changes in how the user interacts with resctrl.

Apart from the "default behavior" there are two options to consider ...
(a) the "original" behavior(? I do not know what to call it) - this would be
    where user space wants(?) to have the current non-ABMC behavior on an
    ABMC system, where the previous "num_rmids" monitor groups can be created
    but the counters are reset unpredictably ... should this still be
    supported on ABMC systems though?
(b) the "new feature" behavior where user space gets full benefit of ABMC
    that allows user space to create any number of monitor groups but then
    user space needs to let hardware (via resctrl) know which events should
    be counted.

I expect that only (b) above would require user space change.

Considering that per documentation, "num_rmids" means "This is the upper
bound for how many "CTRL_MON" + "MON" groups can be created" I expect that
"num_rmids" becomes undefined when "new feature" is enabled.
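
To make the "default behavior" accounting above concrete, here is a toy
user-space model in Python. It is only a sketch under stated assumptions, not
resctrl code: the class and method names are hypothetical, and the counter
count of 4 in the usage note is arbitrary since I do not know what hardware
will expose.

```python
import errno


class DefaultBehaviorModel:
    """Toy model of the "default behavior" on an ABMC system.

    All names here are hypothetical; this is not resctrl kernel code.
    """

    COUNTERS_PER_GROUP = 2  # one counter for local bytes, one for total bytes

    def __init__(self, num_counters):
        self.free_counters = list(range(num_counters))
        # "num_rmids" as exposed to user space: half the assignable counters.
        self.num_rmids = num_counters // self.COUNTERS_PER_GROUP
        self.groups = {}

    def mkdir_mon_group(self, name):
        """Create a monitor group and assign its two counters automatically."""
        if name in self.groups:
            raise OSError(errno.EEXIST, "monitor group exists")
        if len(self.free_counters) < self.COUNTERS_PER_GROUP:
            # Same error as when the system is out of closid/rmid.
            raise OSError(errno.ENOSPC, "no free counters")
        self.groups[name] = {
            "mbm_local_bytes": self.free_counters.pop(0),
            "mbm_total_bytes": self.free_counters.pop(0),
        }
        return self.groups[name]

    def rmdir_mon_group(self, name):
        """Remove a monitor group and return its counters to the free pool."""
        self.free_counters.extend(self.groups.pop(name).values())
```

With DefaultBehaviorModel(num_counters=4), "num_rmids" is 2, two monitor
groups can be created, and a third mkdir fails with ENOSPC until one of the
existing groups is removed. User space does not assign anything itself, which
is the point of this scenario.
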
When this new feature is enabled then user space is no longer limited by
the number of RMIDs on how many monitor groups can be created and this is
the point that the user interface that you and Peter have ideas about comes
into play. Specifically, user space needing a way to specify:
(a) "let me create more monitor groups than the hardware can support"/"let
    me control which events/monitor groups are counted" (like the
    "mbm_assign" file in your proposal)
(b) "here are the events that need to be counted" (like the "monitor_state"
    and "mbm_{local,total}_bytes_assigned" proposals)

Reinette