Re: [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)

Hi Reinette,

Thanks for the feedback on the series.

On 6/13/24 19:54, Reinette Chatre wrote:
Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:


d. This series adds a new interface file
    /sys/fs/resctrl/info/L3_MON/mbm_assign_control
    to list and modify the group's assignment states.

There was a lot of discussion resulting in this centralized file. At first
glance this file appears to be very complicated and I believe any reasonable
person would wonder if all of this is necessary. I recommend that you add a
motivation for why this file is needed. Some items I recall are: it makes it
easier for user space to learn how counters are used (no need to traverse
resctrl and open()/close() many files), on the resctrl side it makes it
possible to support counter re-assignment with a single IPI. There may be
other motivations that I am forgetting now.

Sure. Will add those details.

Also, could the name just be "mbm_control"? What is enabled at this time are
"assignable counters" but in the future we may want to add support for other
flags that have nothing to do with "assignable counters".

Yes. Sure.



    The list uses the following format:

    "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"

"assignment_flags" -> "flags" ? (throughout)

Yes.




    Format for specific type of groups:

    * Default CTRL_MON group:
            "//<domain_id>=<assignment_flags>"

    * Non-default CTRL_MON group:
            "<CTRL_MON group>//<domain_id>=<assignment_flags>"

    * Child MON group of default CTRL_MON group:
            "/<MON group>/<domain_id>=<assignment_flags>"

    * Child MON group of non-default CTRL_MON group:
            "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"

    Assignment flags can be one of the following:

     t  MBM total event is enabled
     l  MBM local event is enabled
     tl Both total and local MBM events are enabled
     _  None of the MBM events are enabled

    Examples:

    # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
    non_default_ctrl_mon_grp//0=tl;1=tl;
    non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
    //0=tl;1=tl;
    /child_default_mon_grp/0=tl;1=tl;

    There are four groups, and all of them have the local and total
    events enabled on domains 0 and 1.

    =tl means both total and local events are enabled.

    "//" - This is a default CONTROL MON group

    "non_default_ctrl_mon_grp//" - This is non default CONTROL MON group

Be consistent with "non-default" (vs non default) as well as "CTRL_MON" (vs
CONTROL MON).

Sure.



    "/child_default_mon_grp/"  - This is Child MON group of the defult group

"Child" -> "child"
"defult" -> "default"

Yes.


    "non_default_ctrl_mon_grp/child_non_default_mon_grp/" - This is child
    MON group of the non default group

non-default

Sure.
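
For readers who want to consume the listing from user space, here is a minimal Python sketch of a parser for the read format quoted above. The function name parse_listing_line is hypothetical and not part of the series; the sketch assumes group names never contain "/" and that each line ends with a trailing ";" as in the example output.

```python
from typing import Dict, Tuple


def parse_listing_line(line: str) -> Tuple[str, str, Dict[int, str]]:
    """Split '<CTRL_MON group>/<MON group>/<domain_id>=<flags>;...' into parts.

    An empty CTRL_MON name means the default group; an empty MON name means
    the line describes the CTRL_MON group itself.
    """
    ctrl_group, mon_group, domains = line.strip().split("/", 2)
    flags_by_domain: Dict[int, str] = {}
    for entry in domains.split(";"):
        if not entry:  # tolerate the trailing ';'
            continue
        domain_id, flags = entry.split("=", 1)
        flags_by_domain[int(domain_id)] = flags
    return ctrl_group, mon_group, flags_by_domain


# The four lines from the example listing in the cover letter.
listing = """\
non_default_ctrl_mon_grp//0=tl;1=tl;
non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
//0=tl;1=tl;
/child_default_mon_grp/0=tl;1=tl;
"""

parsed = [parse_listing_line(entry) for entry in listing.splitlines()]
for ctrl, mon, flags in parsed:
    print(repr(ctrl), repr(mon), flags)
```

One read of the file yields the state of every group, which is the "no need to traverse resctrl" motivation mentioned earlier in the thread.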



e. Update the group assignment states using the interface file
/sys/fs/resctrl/info/L3_MON/mbm_assign_control.

    The write format is similar to the above list format with addition of
    op-code for the assignment operation.

    * Default CTRL_MON group:
            "//<domain_id><op-code><assignment_flags>"
    * Non-default CTRL_MON group:
            "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"
    * Child MON group of default CTRL_MON group:
            "/<MON group>/<domain_id><op-code><assignment_flags>"
    * Child MON group of non-default CTRL_MON group:
            "<CTRL_MON group>/<MON group>/<domain_id><op-code><assignment_flags>"

    Op-code can be one of the following:

     = Update the assignment to match the flags
     + Assign a new state
     - Unassign a new state

Looking here and at the implementation it seems that "+_" and "-_" are
supported. I think that should be invalid. Only "=_" seems appropriate to me.
Also please take care to not have a catchall "default" that does an
unassign. Doing something like that will prevent us from ever being
able to add any flags in the future.

Yes. Good catch. Will fix it.
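
To make the validation rule concrete, the following Python sketch models the op-code semantics with explicit checks. It is an illustration only, not the kernel implementation: apply_op and VALID_FLAGS are hypothetical names, and the model assumes "t" and "l" are the only flags defined today. Every unrecognized op-code or flag is rejected explicitly, so no catch-all branch silently turns into an unassign.

```python
VALID_FLAGS = {"t", "l"}  # total and local events, as listed in the cover letter


def apply_op(current: set, op: str, flags: str) -> set:
    """Return the new flag set for one domain after '<op-code><flags>'."""
    if op not in ("=", "+", "-"):
        raise ValueError(f"unknown op-code {op!r}")
    if flags == "_":
        # "_" means "no events"; it only makes sense with "=".
        if op != "=":
            raise ValueError(f"{op}_ is not a valid request")
        return set()
    requested = set(flags)
    if not requested or requested - VALID_FLAGS:
        # Reject unknown flags instead of falling through to a default.
        raise ValueError(f"unknown flags in {flags!r}")
    if op == "=":
        return requested
    if op == "+":
        return current | requested
    return current - requested  # op == "-"


print(sorted(apply_op({"t", "l"}, "-", "t")))  # removes only the total event
print(sorted(apply_op({"t"}, "=", "_")))       # clears everything
```

Rejecting unknown input up front leaves room to add new flags later without changing the meaning of requests that are valid today.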


    Initial group status:
    # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
    non_default_ctrl_mon_grp//0=tl;1=tl;
    non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
    //0=tl;1=tl;
    /child_default_mon_grp/0=tl;1=tl;

    To update the default group to enable only total event on domain 0:
    # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control

    Assignment status after the update:
    # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
    non_default_ctrl_mon_grp//0=tl;1=tl;
    non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
    //0=t;1=tl;
    /child_default_mon_grp/0=tl;1=tl;

    To update the MON group child_default_mon_grp to remove total event on domain 1:
    # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control

    Assignment status after the update:
    # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
    non_default_ctrl_mon_grp//0=tl;1=tl;
    non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
    //0=t;1=l;
    /child_default_mon_grp/0=t;1=tl;

This does not look right. Why did domain #1 of the default CTRL_MON group
change also?

Will correct it.
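
As a cross-check of what the listing should look like, the two writes can be replayed against a small Python model of the described semantics. This is a sketch under stated assumptions, not output from real hardware: replay, render, and WRITE_RE are hypothetical names, each write is assumed to touch only the group and domain it names, and input validation is left out for brevity.

```python
import re

# <CTRL_MON group>/<MON group>/<domain_id><op-code><flags>
WRITE_RE = re.compile(r"^([^/]*)/([^/]*)/(\d+)([=+-])([tl]+|_)$")


def replay(state, command):
    """Apply one write request to the in-memory model."""
    match = WRITE_RE.match(command)
    if match is None:
        raise ValueError(f"malformed request: {command!r}")
    ctrl, mon, domain, op, flags = match.groups()
    requested = set() if flags == "_" else set(flags)
    domains = state[(ctrl, mon)]
    current = domains[int(domain)]
    if op == "=":
        domains[int(domain)] = requested
    elif op == "+":
        domains[int(domain)] = current | requested
    else:
        domains[int(domain)] = current - requested


def render(state):
    """Format the model the same way the listing is printed."""
    lines = []
    for (ctrl, mon), domains in state.items():
        fields = ""
        for domain in sorted(domains):
            flags = "".join(f for f in "tl" if f in domains[domain]) or "_"
            fields += f"{domain}={flags};"
        lines.append(f"{ctrl}/{mon}/{fields}")
    return lines


groups = [
    ("non_default_ctrl_mon_grp", ""),
    ("non_default_ctrl_mon_grp", "child_non_default_mon_grp"),
    ("", ""),
    ("", "child_default_mon_grp"),
]
# Initial status: every group has both events on domains 0 and 1.
state = {group: {0: {"t", "l"}, 1: {"t", "l"}} for group in groups}

replay(state, "//0=t")
replay(state, "/child_default_mon_grp/1-t")
print("\n".join(render(state)))
```

Under this model the default group stays at "//0=t;1=tl;" after the second write and only the child group changes, to "/child_default_mon_grp/0=tl;1=l;".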


    To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
    remove both local and total events on domain 1:
    # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control

    Assignment status after the update:
    # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
    non_default_ctrl_mon_grp//0=tl;1=tl;
    non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
    //0=t;1=l;
    /child_default_mon_grp/0=t;1=tl;

    To update the default group to add the total event on domain 1:
    # echo "//1+t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control

Unclear where "t" flag was removed.

Yes. Will correct.


    Assignment status after the update:
    # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
    non_default_ctrl_mon_grp//0=tl;1=tl;
    non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
    //0=t;1=tl;
    /child_default_mon_grp/0=t;1=tl;

f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
    There is no change in reading the evetns with ABMC. If the event is unassigned

"evetns" -> "events"

Sure.


    when reading, then the read will come back as Unavailable.

Should this not rather be "Unassigned"? According to the docs the counters
will return "Unavailable" right after reconfigure so it seems that there are
scenarios where an "assigned" counter returns "Unavailable". It seems more
useful to return "Unassigned", which will have a new specific meaning, than to
overload the existing "Unavailable", which has the original meaning of "try
again" ... but in this case trying again will be futile.

Hardware returns "Unavailable" in both cases, so I thought of reporting the same without any interpretation.


    # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
    779247936
    # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
    765207488
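
From the user-space side, the difference between the two proposals shows up in how a reader of mbm_total_bytes has to classify the result. The Python sketch below is hypothetical: parse_counter is not an existing API, and "Unassigned" is only the string suggested in this discussion, not something the series as posted returns.

```python
from typing import Optional, Tuple


def parse_counter(text: str) -> Tuple[Optional[int], str]:
    """Classify the contents of an mbm_*_bytes file."""
    value = text.strip()
    if value.isdigit():
        return int(value), "ok"
    if value == "Unavailable":
        # With the series as posted this covers both "try again" and
        # "no counter assigned", so the caller cannot tell them apart.
        return None, "unavailable"
    if value == "Unassigned":
        # Assumption: only returned if the suggestion above is adopted.
        return None, "unassigned"
    return None, "error"


print(parse_counter("779247936\n"))
print(parse_counter("Unavailable\n"))
```

With a distinct "Unassigned" string, a monitoring tool could skip the retry and go straight to assigning a counter.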

g. Users will have the option to go back to legacy_mbm mode if required.
    This can be done using the following command.

    # echo "legacy_mbm" > /sys/fs/resctrl/info/L3_MON/mbm_assign
    # cat /sys/fs/resctrl/info/L3_MON/mbm_assign
         abmc
         [mbm_legacy]

It is confusing for the value written by user space to be different from
the value displayed: "legacy_mbm" vs "mbm_legacy".

My bad. Both should have been "legacy_mbm".


This is still missing information about what happens to the counters/events
on such a switch. Will events just keep counting? Will they be reset? ...?

They will all be reset.


I also think we should try to find a more generic name for this file.
"mbm_cntr_mode" or "mbm_mode" maybe?

"mbm_mode" looks better.  Then I will change "legacy_mbm" to "mbm_legacy".
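
Whatever the file ends up being called, the display convention in the example (one mode per line, the active one in square brackets) is simple to consume. A minimal Python sketch, assuming that layout stays as shown; parse_modes is a hypothetical helper and the mode names are the ones under discussion here, not final.

```python
from typing import List, Optional, Tuple


def parse_modes(text: str) -> Tuple[List[str], Optional[str]]:
    """Return (all supported modes, the active mode) from the file contents."""
    modes: List[str] = []
    active: Optional[str] = None
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # brackets mark the active mode
            active = name
        modes.append(name)
    return modes, active


print(parse_modes("abmc\n[mbm_legacy]\n"))
```

A tool can only write back a name returned by this function if the written and displayed names match, which is why the naming mismatch above matters.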



h. Check the bandwidth configuration for the group. Note that bandwidth
    configuration has a domain scope. Total event defaults to 0x7F (to
    count all the events) and local event defaults to 0x15 (to count all
    the local NUMA events). The event bitmap decoding is available at
    https://www.kernel.org/doc/Documentation/x86/resctrl.rst
    in section "mbm_total_bytes_config", "mbm_local_bytes_config":

    # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
    0=0x7f;1=0x7f
    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
    0=0x15;1=0x15

j. Change the bandwidth source for domain 0 for the total event to count only reads.
    Note that this change affects total events on domain 0.

    # echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
    # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
    0=0x33;1=0x7F

k. Now read the total event again. The mbm_total_bytes should display
    only the read events.

    # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
    314101

According to the doc, right after a BMEC change the counter will read
"Unavailable". Is this not the case here?

Yes. The first read will come back with "Unavailable". Will add a line about that here.
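
For reference, the masks used in steps h, j and k can be decoded with a short Python sketch. The bit descriptions follow the "mbm_total_bytes_config" section of the resctrl documentation; decode_bmec and parse_config are hypothetical helper names, not part of any existing tool.

```python
# Bit positions of the BMEC event configuration, per the resctrl documentation.
BMEC_BITS = {
    0: "reads to memory in the local NUMA domain",
    1: "reads to memory in the non-local NUMA domain",
    2: "non-temporal writes to the local NUMA domain",
    3: "non-temporal writes to the non-local NUMA domain",
    4: "reads to slow memory in the local NUMA domain",
    5: "reads to slow memory in the non-local NUMA domain",
    6: "dirty victims from the QoS domain to all types of memory",
}


def decode_bmec(mask: int) -> list:
    """List the event types selected by a configuration mask."""
    if mask < 0 or mask & ~0x7F:
        raise ValueError(f"mask {mask:#x} has bits outside 0x7f")
    return [desc for bit, desc in BMEC_BITS.items() if mask & (1 << bit)]


def parse_config(text: str) -> dict:
    """Parse '0=0x33;1=0x7F' into {domain_id: mask}."""
    config = {}
    for entry in text.strip().split(";"):
        if entry:
            domain_id, mask = entry.split("=", 1)
            config[int(domain_id)] = int(mask, 16)
    return config


for domain, mask in parse_config("0=0x33;1=0x7F").items():
    print(domain, hex(mask), decode_bmec(mask))
```

Decoding 0x33 gives the four read event types and nothing else, matching "count only reads", and 0x15 selects only the three local-domain events.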


l. Unmount the resctrl
    # umount /sys/fs/resctrl/

Reinette



--
Thanks
Babu Moger



