Hi Reinette,
On 8/16/24 16:28, Reinette Chatre wrote:
Hi Babu,
On 8/6/24 3:00 PM, Babu Moger wrote:
This feature adds the following interface files:
/sys/fs/resctrl/info/L3_MON/mbm_mode: Reports the list of assignable
monitoring features supported. The enclosed brackets indicate which
feature is enabled.
I've been considering this file as a generic file where all future "MBM modes"
can be captured, while this series treats it as specific to "assignable
monitoring features" (btw, should this be "assignable monitoring modes" to
match the name?).
Looking closer at this implementation it does make things easier that "mbm_mode"
is specific to "assignable monitoring features" but when doing so I think it
should have a less generic name to avoid the obstacles we have with the existing
"mon_features".
Apologies that this goes back to be close to what you had earlier ... maybe
"mbm_assign_mode"?
Let's see:
# cat /sys/fs/resctrl/info/L3_MON/mbm_mode
[mbm_cntr_assign] <- This already says 'assign'. Isn't that enough?
default <- Default mode is not related to assignable features.
I would think mbm_mode is fine. Let me know.
/sys/fs/resctrl/info/L3_MON/num_mbm_cntrs: Reports the number of monitoring
counters available for assignment.
/sys/fs/resctrl/info/L3_MON/mbm_control: Reports the resctrl group and monitor
status of each group. Assignment state can be updated by writing to the
interface.
# Examples
a. Check if ABMC support is available
# mount -t resctrl resctrl /sys/fs/resctrl/
# cat /sys/fs/resctrl/info/L3_MON/mbm_mode
[mbm_cntr_assign]
legacy
ABMC feature is detected and it is enabled.
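For illustration only (not part of this series): a minimal Python sketch of how user space might pick the enabled mode out of mbm_mode, assuming the format shown above (one mode per line, the enabled one in brackets). The function name parse_mbm_mode is made up for this example.

```python
def parse_mbm_mode(text):
    """Return (enabled_mode, all_modes) from the contents of mbm_mode.

    Assumes one mode per line with the enabled mode wrapped in brackets.
    """
    enabled = None
    modes = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            enabled = name
        modes.append(name)
    return enabled, modes


# Sample taken from the output in step a; on a real system the text would
# come from reading /sys/fs/resctrl/info/L3_MON/mbm_mode.
enabled, modes = parse_mbm_mode("[mbm_cntr_assign]\nlegacy\n")
print(enabled, modes)
```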
b. Check how many ABMC counters are available.
# cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
32
c. Create a few resctrl groups.
# mkdir /sys/fs/resctrl/mon_groups/child_default_mon_grp
# mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp
# mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp/mon_groups/child_non_default_mon_grp
d. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_control
to list and modify the group's monitoring states. The file provides a single
place to list monitoring states of all the resctrl groups. It makes it easier
for user space to learn about the counters are used without needing to traverse
"to learn about the counters are used" -> "to learn the counters that are used"
or "to learn about the used counters" or ...?
Sure.
all the groups thus reducing the number of file system calls.
The list uses the following format:
"<CTRL_MON group>/<MON group>/<domain_id>=<flags>"
Format for specific types of groups:
* Default CTRL_MON group:
"//<domain_id>=<flags>"
* Non-default CTRL_MON group:
"<CTRL_MON group>//<domain_id>=<flags>"
* Child MON group of default CTRL_MON group:
"/<MON group>/<domain_id>=<flags>"
* Child MON group of non-default CTRL_MON group:
"<CTRL_MON group>/<MON group>/<domain_id>=<flags>"
Flags can be one of the following:
t   MBM total event is enabled.
l   MBM local event is enabled.
tl  Both total and local MBM events are enabled.
_   None of the MBM events are enabled.
Examples:
# cat /sys/fs/resctrl/info/L3_MON/mbm_control
non_default_ctrl_mon_grp//0=tl;1=tl;
non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
//0=tl;1=tl;
/child_default_mon_grp/0=tl;1=tl;
There are four groups and all the groups have the local and total
events enabled on domains 0 and 1.
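For illustration only: a small Python sketch of parsing the listing above into a per-group, per-domain map. It assumes exactly the format described in step d; parse_mbm_control is a hypothetical helper name, not something this series provides.

```python
def parse_mbm_control(text):
    """Map (ctrl_mon_group, mon_group) -> {domain_id: set of flags}.

    An empty group name stands for "default CTRL_MON group" or
    "no child MON group", matching the "//" forms in the listing.
    """
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right: the domain list never contains "/".
        ctrl, mon, domains = line.rsplit("/", 2)
        per_domain = {}
        for entry in domains.split(";"):
            if not entry:
                continue  # the trailing ";" leaves an empty field
            dom, flags = entry.split("=", 1)
            per_domain[int(dom)] = set() if flags == "_" else set(flags)
        state[(ctrl, mon)] = per_domain
    return state


listing = """\
non_default_ctrl_mon_grp//0=tl;1=tl;
non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
//0=tl;1=tl;
/child_default_mon_grp/0=tl;1=tl;
"""
for group, domains in parse_mbm_control(listing).items():
    print(group, domains)
```

Keying on the (CTRL_MON, MON) pair keeps the four group types distinguishable without special cases.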
e. Update the group assignment states using the interface file
/sys/fs/resctrl/info/L3_MON/mbm_control.
The write format is similar to the above list format with the addition
of an opcode for the assignment operation.
"<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>"
* Default CTRL_MON group:
"//<domain_id><opcode><flags>"
* Non-default CTRL_MON group:
"<CTRL_MON group>//<domain_id><opcode><flags>"
* Child MON group of default CTRL_MON group:
"/<MON group>/<domain_id><opcode><flags>"
* Child MON group of non-default CTRL_MON group:
"<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>"
Opcode can be one of the following:
= Update the assignment to match the flag.
+ Assign a new event.
- Unassign an event.
Since user space can provide more than one flag the text could be more accurate
noting this. E.g. "Update the assignment to match the flag" -> "Update the
assignment to match the flags.".
Sure.
Flags can be one of the following:
t   MBM total event.
l   MBM local event.
tl  Both total and local MBM events.
_   None of the MBM events. Only works with '=' opcode.
Please take care with the implementation that seems to support a variety of
combinations. If I understand correctly the implementation supports flags like,
for example, "tttt", "llll", "ltlt" ... those may not be an issue but of most
concern is, for example, a pattern like "_lt" that (unexpectedly) appears to
result in both total and local being set.
Yes. Should we disallow flag combinations that include "_"?
I am not sure how to go about this.
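To make the question concrete, here is one possible policy sketched in Python. It is only an illustration of a strict parser, not what the series currently implements: "_" is accepted only on its own and only with "=", and repeated flags are rejected. The function name parse_flags and the choice to reject duplicates are assumptions for this example.

```python
def parse_flags(flags, opcode):
    """Validate a flag string and return the requested set of events.

    Hypothetical strict policy:
      - "_" must appear alone and only with the "=" opcode
      - otherwise only "t" and "l" are allowed, each at most once
    """
    if flags == "_":
        if opcode != "=":
            raise ValueError("'_' is only valid with the '=' opcode")
        return set()
    if not flags:
        raise ValueError("empty flag string")
    seen = set()
    for ch in flags:
        if ch not in ("t", "l"):
            # Also catches "_" mixed with other flags, e.g. "_lt".
            raise ValueError(f"unknown or misplaced flag {ch!r}")
        if ch in seen:
            raise ValueError(f"duplicate flag {ch!r}")
        seen.add(ch)
    return seen


for candidate in ("tl", "_", "_lt", "tttt"):
    try:
        print(candidate, "->", sorted(parse_flags(candidate, "=")))
    except ValueError as err:
        print(candidate, "-> rejected:", err)
```

A looser alternative would be to accept duplicates such as "tttt" and reject only strings that mix "_" with other flags.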
Initial group status:
# cat /sys/fs/resctrl/info/L3_MON/mbm_control
non_default_ctrl_mon_grp//0=tl;1=tl;
non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
//0=tl;1=tl;
/child_default_mon_grp/0=tl;1=tl;
To update the default group to enable only the total event on domain 0:
# echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_control
Assignment status after the update:
# cat /sys/fs/resctrl/info/L3_MON/mbm_control
non_default_ctrl_mon_grp//0=tl;1=tl;
non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
//0=t;1=tl;
/child_default_mon_grp/0=tl;1=tl;
To update the MON group child_default_mon_grp to remove the total event
on domain 1:
# echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_control
Assignment status after the update:
# cat /sys/fs/resctrl/info/L3_MON/mbm_control
non_default_ctrl_mon_grp//0=tl;1=tl;
non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
//0=t;1=tl;
/child_default_mon_grp/0=tl;1=l;
To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
remove both local and total events on domain 1:
# echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" > /sys/fs/resctrl/info/L3_MON/mbm_control
Assignment status after the update:
# cat /sys/fs/resctrl/info/L3_MON/mbm_control
non_default_ctrl_mon_grp//0=tl;1=tl;
non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
//0=t;1=tl;
/child_default_mon_grp/0=tl;1=l;
To update the default group to add the local event on domain 0:
# echo "//0+l" > /sys/fs/resctrl/info/L3_MON/mbm_control
Assignment status after the update:
# cat /sys/fs/resctrl/info/L3_MON/mbm_control
non_default_ctrl_mon_grp//0=tl;1=tl;
non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
//0=tl;1=tl;
/child_default_mon_grp/0=tl;1=l;
To update the non-default CTRL_MON group non_default_ctrl_mon_grp to unassign
all the MBM events on all the domains:
# echo "non_default_ctrl_mon_grp//*=_" > /sys/fs/resctrl/info/L3_MON/mbm_control
Assignment status after the update:
# cat /sys/fs/resctrl/info/L3_MON/mbm_control
non_default_ctrl_mon_grp//0=_;1=_;
non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
//0=tl;1=tl;
/child_default_mon_grp/0=tl;1=l;
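For illustration only: the three opcodes in step e behave like set operations on a domain's assigned events. The Python sketch below replays some of the updates shown above under that reading; apply_opcode and show are hypothetical helper names, and the "tl" output order is assumed from the listings.

```python
def apply_opcode(current, opcode, flags):
    """Return the new set of assigned events for one domain."""
    requested = set() if flags == "_" else set(flags)
    if opcode == "=":
        return requested            # replace the assignment
    if opcode == "+":
        return current | requested  # assign additional events
    if opcode == "-":
        return current - requested  # unassign events
    raise ValueError(f"unknown opcode {opcode!r}")


def show(flags):
    """Render a set of events the way the listing does ("tl", "t", "l", "_")."""
    return "".join(f for f in "tl" if f in flags) or "_"


# Replay of the default group's domain 0 from the examples above.
dom0 = {"t", "l"}
dom0 = apply_opcode(dom0, "=", "t")   # echo "//0=t"
print(show(dom0))
dom0 = apply_opcode(dom0, "+", "l")   # echo "//0+l"
print(show(dom0))
```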
f. Read the events mbm_total_bytes and mbm_local_bytes of the default group.
There is no change in reading the events with ABMC. If the event is unassigned
when reading, then the read will come back as "Unassigned".
# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
779247936
# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
765207488
g. Check the bandwidth configuration for the group. Note that bandwidth
configuration has a domain scope. The total event defaults to 0x7F (to
count all the events) and the local event defaults to 0x15 (to count all
the local NUMA events). The event bitmap decoding is available at
https://www.kernel.org/doc/Documentation/x86/resctrl.rst
in sections "mbm_total_bytes_config" and "mbm_local_bytes_config":
# cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
0=0x7f;1=0x7f
# cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
0=0x15;1=0x15
h. Change the bandwidth source for domain 0 for the total event to count
only reads.
Note that this change affects total events on domain 0.
# echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
# cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
0=0x33;1=0x7F
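For illustration only: a Python sketch that parses the per-domain configuration string and decodes the bitmaps used above (0x7f, 0x15, 0x33). The bit descriptions are reproduced from memory of the resctrl documentation referenced in step g, so please check them against that document; the helper names are made up for this example.

```python
# Bit meanings as recalled from the resctrl documentation sections
# "mbm_total_bytes_config" / "mbm_local_bytes_config" (verify before relying on them).
EVENT_BITS = {
    0: "Reads to memory in the local NUMA domain",
    1: "Reads to memory in the non-local NUMA domain",
    2: "Non-temporal writes to local NUMA domain",
    3: "Non-temporal writes to non-local NUMA domain",
    4: "Reads to slow memory in the local NUMA domain",
    5: "Reads to slow memory in the non-local NUMA domain",
    6: "Dirty victims from the QOS domain to all types of memory",
}


def parse_config(text):
    """Turn '0=0x33;1=0x7F' into {0: 0x33, 1: 0x7f}."""
    result = {}
    for entry in text.strip().split(";"):
        if entry:
            dom, value = entry.split("=", 1)
            result[int(dom)] = int(value, 16)
    return result


def decode(mask):
    """List the event types selected by a configuration bitmap."""
    return [name for bit, name in EVENT_BITS.items() if mask & (1 << bit)]


for dom, mask in parse_config("0=0x33;1=0x7F").items():
    print(f"domain {dom}: {mask:#04x}")
    for name in decode(mask):
        print("   ", name)
```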
i. Now read the total event again. The first read will come back with
"Unavailable" status. The subsequent read of mbm_total_bytes will display
only the read events.
# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
Unavailable
# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
314101
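For illustration only: a Python sketch of how a monitoring tool might cope with the non-numeric results mentioned in steps f and i ("Unassigned" and "Unavailable"). The retry count and the helper names are assumptions for this example; read_fn stands in for whatever reads the event file.

```python
def parse_counter(text):
    """Return the counter value, or None for a non-numeric status string."""
    value = text.strip()
    if value in ("Unavailable", "Unassigned"):
        return None
    return int(value)


def read_event(read_fn, attempts=2):
    """Call read_fn() up to `attempts` times until a numeric value comes back."""
    for _ in range(attempts):
        value = parse_counter(read_fn())
        if value is not None:
            return value
    return None


# Simulated reads matching step i: first "Unavailable", then a byte count.
replies = iter(["Unavailable\n", "314101\n"])
print(read_event(lambda: next(replies)))
```

Retrying does not help for "Unassigned"; in that case a counter has to be assigned to the event first.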
j. Users will have the option to go back to legacy mbm_mode if required.
This can be done using the following command. Note that switching the
mbm_mode will reset all the mbm counters of all resctrl groups.
"reset all the mbm counters" -> "reset all the MBM counters"
Sure.
# echo "legacy" > /sys/fs/resctrl/info/L3_MON/mbm_mode
# cat /sys/fs/resctrl/info/L3_MON/mbm_mode
mbm_cntr_assign
[legacy]
k. Unmount resctrl.
# umount /sys/fs/resctrl/
---
v6:
We still need to finalize a few interface details on mbm_mode and mbm_control
in case of ABMC and Soft-ABMC. We can continue the discussion with this series.
Could you please list the details that need to be finalized?
1. mbm_mode display
# cat /sys/fs/resctrl/info/L3_MON/mbm_mode
mbm_cntr_assign
[legacy]
Are we sticking with "mbm_cntr_assign" for ABMC?
What should we name the mode for soft-ABMC?
2. Also, we had some concerns about individual event assignment (ABMC)
and group assignment (soft-ABMC).
Are the flags "t" and "l" good for both these modes?
Thank you
Reinette
--
Thanks
Babu Moger