Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
d. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control to list and modify the group's assignment states.
There was a lot of discussion resulting in this centralized file. At first glance this file appears to be very complicated and I believe any reasonable person would wonder if all of this is necessary. I recommend that you add a motivation for why this file is needed. Some items I recall: it makes it easier for user space to learn how counters are used (no need to traverse resctrl and open()/close() many files), and on the resctrl side it makes it possible to support counter re-assignment with a single IPI. There may be other motivations that I am forgetting now.

Also, could the name just be "mbm_control"? What is enabled at this time is "assignable counters", but in the future we may want to add support for other flags that have nothing to do with "assignable counters".
The list follows the following format: "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"
"assignment_flags" -> "flags" ? (throughout)
Format for specific type of groups:

 * Default CTRL_MON group:
   "//<domain_id>=<assignment_flags>"

 * Non-default CTRL_MON group:
   "<CTRL_MON group>//<domain_id>=<assignment_flags>"

 * Child MON group of default CTRL_MON group:
   "/<MON group>/<domain_id>=<assignment_flags>"

 * Child MON group of non-default CTRL_MON group:
   "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"

Assignment flags can be one of the following:

 t  MBM total event is enabled
 l  MBM local event is enabled
 tl Both total and local MBM events are enabled
 _  None of the MBM events are enabled

Examples:

 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
 non_default_ctrl_mon_grp//0=tl;1=tl;
 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
 //0=tl;1=tl;
 /child_default_mon_grp/0=tl;1=tl;

There are four groups and all the groups have local and total event enabled on domain 0 and 1.

=tl means both total and local events are enabled.

"//" - This is a default CONTROL MON group

"non_default_ctrl_mon_grp//" - This is non default CONTROL MON group
Be consistent with "non-default" (vs non default) as well as "CTRL_MON" (vs CONTROL MON).
"/child_default_mon_grp/" - This is Child MON group of the defult group
"Child" -> "child" "defult" -> "default"
"non_default_ctrl_mon_grp/child_non_default_mon_grp/" - This is child MON group of the non default group
non-default
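To make the listing format above concrete, here is a minimal Python sketch of how user space might parse one line of the file. It assumes the format exactly as quoted in the cover letter (two "/" separators, then ";"-separated "<domain_id>=<flags>" entries) and only the flags "t", "l" and "_". The function name parse_line is hypothetical and not part of the series.

```python
KNOWN_FLAGS = {"t", "l"}  # assumption: only the flags listed in the cover letter


def parse_line(line):
    """Parse '<CTRL_MON group>/<MON group>/<dom>=<flags>;...' into a tuple.

    Returns (ctrl_group, mon_group, {domain_id: set_of_flags}).
    Empty group names denote the default CTRL_MON group / no MON group.
    """
    # Group names are directory names, so they cannot contain '/'.
    ctrl, mon, domains = line.strip().split("/", 2)
    states = {}
    for entry in domains.split(";"):
        if not entry:  # tolerate the trailing ';' shown in the examples
            continue
        dom, flags = entry.split("=", 1)
        if flags == "_":
            parsed = set()
        else:
            parsed = set(flags)
            if not parsed or not parsed <= KNOWN_FLAGS:
                raise ValueError(f"unknown flags: {flags!r}")
        states[int(dom)] = parsed
    return ctrl, mon, states
```

For example, parse_line("//0=tl;1=tl;") yields empty strings for both group names, which is how the default CTRL_MON group is identified in this sketch.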
e. Update the group assignment states using the interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control.

The write format is similar to the above list format with addition of op-code for the assignment operation.

 * Default CTRL_MON group:
   "//<domain_id><op-code><assignment_flags>"

 * Non-default CTRL_MON group:
   "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"

 * Child MON group of default CTRL_MON group:
   "/<MON group>/<domain_id><op-code><assignment_flags>"

 * Child MON group of non-default CTRL_MON group:
   "<CTRL_MON group>/<MON group>/<domain_id><op-code><assignment_flags>"

Op-code can be one of the following:

 = Update the assignment to match the flags
 + Assign a new state
 - Unassign a new state
Looking here and at the implementation, it seems that "+_" and "-_" are supported. I think that should be invalid. Only "=_" seems appropriate to me.

Also, please take care to not have a catchall "default" that does an unassign. Doing something like that will prevent us from ever being able to add any flags in the future.
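The validation asked for above can be sketched in Python as follows. This is an illustration of the intended semantics only, not the kernel implementation: apply_op and KNOWN_FLAGS are hypothetical names, and the flag set is assumed to be just "t" and "l". The point is that "_" is accepted only with "=", and that unknown flags or op-codes are rejected explicitly rather than falling through to an unassign.

```python
KNOWN_FLAGS = {"t", "l"}  # assumption: the only flags defined today


def apply_op(current, op, flags):
    """Return the new flag set after applying '<op><flags>' to `current`."""
    if flags == "_":
        # "_" means "no events"; it only makes sense as an absolute state.
        if op != "=":
            raise ValueError(f"'{op}_' is invalid, only '=_' is allowed")
        return set()

    requested = set(flags)
    if not requested or not requested <= KNOWN_FLAGS:
        # Reject instead of silently ignoring, so new flags can be added later.
        raise ValueError(f"unknown flags: {flags!r}")

    if op == "=":
        return requested
    if op == "+":
        return set(current) | requested
    if op == "-":
        return set(current) - requested
    # No catchall that unassigns: an unknown op-code is an error.
    raise ValueError(f"unknown op-code: {op!r}")
```

With this shape, adding a future flag only requires extending KNOWN_FLAGS; an old implementation that sees the new flag fails loudly instead of dropping an assignment.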
Initial group status:

 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
 non_default_ctrl_mon_grp//0=tl;1=tl;
 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
 //0=tl;1=tl;
 /child_default_mon_grp/0=tl;1=tl;

To update the default group to enable only total event on domain 0:

 # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control

Assignment status after the update:

 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
 non_default_ctrl_mon_grp//0=tl;1=tl;
 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
 //0=t;1=tl;
 /child_default_mon_grp/0=tl;1=tl;

To update the MON group child_default_mon_grp to remove total event on domain 1:

 # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control

Assignment status after the update:

 $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
 non_default_ctrl_mon_grp//0=tl;1=tl;
 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
 //0=t;1=l;
 /child_default_mon_grp/0=t;1=tl;
This does not look right. Why did domain #1 of the default CTRL_MON group change also?
To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to remove both local and total events on domain 1:

 # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control

Assignment status after the update:

 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
 non_default_ctrl_mon_grp//0=tl;1=tl;
 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
 //0=t;1=l;
 /child_default_mon_grp/0=t;1=tl;

To update the default group to add a total event domain 1.

 # echo "//1+t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
Unclear where "t" flag was removed.
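To illustrate why the listings above look inconsistent, the following Python sketch replays the four writes from the cover letter against the initial state, under the assumption that each write affects only the group and domain it names. The helper names (replay, show, WRITE_RE) are hypothetical and the model is deliberately simplistic; it is not the kernel's behaviour, only what the documented format suggests.

```python
import re

# '<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>'
WRITE_RE = re.compile(r"^([^/]*)/([^/]*)/(\d+)([=+-])([tl]+|_)$")


def replay(state, writes):
    """Apply each write to `state`, a dict keyed by (ctrl, mon, domain)."""
    for write in writes:
        m = WRITE_RE.match(write)
        if m is None:
            raise ValueError(f"malformed write: {write!r}")
        ctrl, mon, dom, op, flags = m.groups()
        if flags == "_" and op != "=":
            raise ValueError(f"'{op}_' is not accepted in this sketch")
        key = (ctrl, mon, int(dom))
        new = set() if flags == "_" else set(flags)
        if op == "=":
            state[key] = new
        elif op == "+":
            state[key] = state[key] | new
        else:  # op == "-"
            state[key] = state[key] - new
    return state


def show(flags):
    """Render a flag set the way the file does: 't', 'l', 'tl' or '_'."""
    return "".join(f for f in "tl" if f in flags) or "_"


groups = [("non_default_ctrl_mon_grp", ""),
          ("non_default_ctrl_mon_grp", "child_non_default_mon_grp"),
          ("", ""),
          ("", "child_default_mon_grp")]
state = {(c, m, d): {"t", "l"} for c, m in groups for d in (0, 1)}

replay(state, ["//0=t",
               "/child_default_mon_grp/1-t",
               "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_",
               "//1+t"])
```

In this model the second write changes only domain 1 of child_default_mon_grp (to "l"), the default group's domain 1 stays "tl" throughout, and the final "//1+t" is therefore a no-op.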
Assignment status after the update:

 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
 non_default_ctrl_mon_grp//0=tl;1=tl;
 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
 //0=t;1=tl;
 /child_default_mon_grp/0=t;1=tl;

f. Read the event mbm_total_bytes and mbm_local_bytes of the default group. There is no change in reading the evetns with ABMC. If the event is unassigned
"evetns" -> "events"
when reading, then the read will come back as Unavailable.
Should this not rather be "Unassigned"? According to the docs, the counters will return "Unavailable" right after a reconfigure, so it seems that there are scenarios where an "assigned" counter returns "Unavailable". It seems more useful to return "Unassigned", which would have a new, specific meaning, than to overload the existing "Unavailable", whose original meaning is "try again" ... but in this case trying again will be futile.
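From the user space side, the distinction matters because the two strings call for different handling. A minimal Python sketch, assuming the proposed "Unassigned" string were adopted (it is only a suggestion in this review, not existing behaviour) and using a hypothetical helper read_event that takes a callable returning the file contents:

```python
import time


def read_event(read_fn, retries=3, delay=0.1):
    """Read an MBM event value via read_fn(), a callable returning the file text.

    Returns the byte count, or None when no counter is assigned.
    Retries only on "Unavailable", where trying again can help.
    """
    for _ in range(retries):
        text = read_fn().strip()
        if text == "Unassigned":   # proposed string: retrying is futile
            return None
        if text != "Unavailable":  # a plain number
            return int(text)
        time.sleep(delay)          # transient state, try again
    raise TimeoutError("event still Unavailable after retries")
```

If both cases returned "Unavailable", a reader like this could not tell when to stop retrying.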
 # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
 779247936
 # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
 765207488

g. Users will have the option to go back to legacy_mbm mode if required. This can be done using the following command.

 # echo "legacy_mbm" > /sys/fs/resctrl/info/L3_MON/mbm_assign
 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign
 abmc
 [mbm_legacy]
It is confusing for the value written by user space to be different from the value displayed: "legacy_mbm" vs "mbm_legacy".

This is still missing information about what happens to the counters/events on such a switch. Will events just keep counting? Will they be reset? ...?

I also think we should try to find a more generic name for this file. "mbm_cntr_mode" or "mbm_mode" maybe?
h. Check the bandwidth configuration for the group. Note that bandwidth configuration has a domain scope. Total event defaults to 0x7F (to count all the events) and local event defaults to 0x15 (to count all the local numa events). The event bitmap decoding is available at https://www.kernel.org/doc/Documentation/x86/resctrl.rst in section "mbm_total_bytes_config", "mbm_local_bytes_config":

 #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
 0=0x7f;1=0x7f

 #cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
 0=0x15;1=0x15

j. Change the bandwidth source for domain 0 for the total event to count only reads. Note that this change effects total events on the domain 0.

 #echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
 #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
 0=0x33;1=0x7F

k. Now read the total event again. The mbm_total_bytes should display only the read events.

 #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
 314101
According to the doc, right after a BMEC change the counter will read "Unavailable". Is this not the case here?
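For readers following the bitmap values quoted above (0x7f, 0x15, 0x33), here is a small Python sketch that decodes an event configuration value. The bit descriptions are as I recall them from the "mbm_total_bytes_config" section of the resctrl documentation and should be treated as an assumption to be checked against that document; decode_config is a hypothetical helper name.

```python
# Bit meanings as recalled from the resctrl documentation (assumption, verify there).
BMEC_BITS = {
    0: "Reads to memory in the local NUMA domain",
    1: "Reads to memory in the non-local NUMA domain",
    2: "Non-temporal writes to local NUMA domain",
    3: "Non-temporal writes to non-local NUMA domain",
    4: "Reads to slow memory in the local NUMA domain",
    5: "Reads to slow memory in the non-local NUMA domain",
    6: "Dirty Victims from the QOS domain to all types of memory",
}


def decode_config(value):
    """Return the descriptions of all event types selected by `value`."""
    if value & ~0x7F:
        raise ValueError(f"bits outside 0x7f set: {value:#x}")
    return [desc for bit, desc in BMEC_BITS.items() if value & (1 << bit)]


for value in (0x7F, 0x15, 0x33):
    print(f"{value:#04x}:")
    for desc in decode_config(value):
        print(f"  {desc}")
```

Under these bit assignments 0x33 selects only the four read types, which matches the "count only reads" description in step j.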
l. Unmount the resctrl

 #umount /sys/fs/resctrl/
Reinette