Re: [PATCH v9 08/26] x86/resctrl: Introduce the interface to display monitor mode

Hi Babu,

On 11/18/24 11:04 AM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 11/15/24 18:00, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 10/29/24 4:21 PM, Babu Moger wrote:
>>> Introduce the interface file "mbm_assign_mode" to list monitor modes
>>> supported.
>>>
>>> The "mbm_cntr_assign" mode provides the option to assign a counter to
>>> an RMID, event pair and monitor the bandwidth as long as it is assigned.
>>>
>>> On AMD systems "mbm_cntr_assign" is backed by the ABMC (Assignable
>>> Bandwidth Monitoring Counters) hardware feature and is enabled by default.
>>>
>>> The "default" mode is the existing monitoring mode that works without the
>>> explicit counter assignment, instead relying on dynamic counter assignment
>>> by hardware that may result in hardware not dedicating a counter resulting
>>> in monitoring data reads returning "Unavailable".
>>>
>>> Provide an interface to display the monitor mode on the system.
>>> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>>> [mbm_cntr_assign]
>>> default
>>>
>>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
>>> ---

...

>> I'm concerned that users with Intel platforms may want to use the "mbm_cntr_assign" mode
>> to make the event data "more predictable" and then be concerned when the mode does
>> not exist.
>>
>> As an alternative, is it possible to know the number of hardware counters on AMD systems
>> without ABMC? I wonder if we could perhaps always expose num_mbm_cntrs as a way for
>> users to know if their platform may be impacted by this type of "unpredictability" (by comparing 
>> num_mbm_cntrs to num_rmids).
> 
> There is a roundabout (or hacky) way to find out the number of RMIDs
> that can be active.

Does this give consistent and accurate data? Is this something that can be added to resctrl?
(Reading your other message [1] it does not sound as though it can produce an accurate
number on boot.)
If not then it will be up to the documentation to be accurate.
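
For illustration only, here is a minimal user-space sketch of the comparison suggested above. It assumes resctrl is mounted at /sys/fs/resctrl and that num_mbm_cntrs sits next to num_rmids under info/L3_MON; whether num_mbm_cntrs is exposed on every platform is exactly what is being discussed, so the reader returns None when a file is missing. The function names are hypothetical. The mode parser relies only on the bracket convention shown in the changelog example, where the active mode is printed in square brackets.

```python
from pathlib import Path
from typing import List, Optional, Tuple

# Assumed location: resctrl mounted at /sys/fs/resctrl, with num_mbm_cntrs
# exposed next to num_rmids (the thread only proposes always exposing it).
INFO_DIR = Path("/sys/fs/resctrl/info/L3_MON")


def parse_assign_mode(text: str) -> Tuple[Optional[str], List[str]]:
    """Parse mbm_assign_mode content into (active mode, all listed modes).

    The active mode is the one printed in square brackets.
    """
    active = None
    modes = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        modes.append(name)
    return active, modes


def read_int(name: str, info_dir: Path = INFO_DIR) -> Optional[int]:
    """Read an integer info file; return None if it is missing or malformed."""
    try:
        return int((info_dir / name).read_text().strip())
    except (OSError, ValueError):
        return None


def fewer_counters_than_rmids(num_mbm_cntrs: int, num_rmids: int) -> bool:
    """True when not every RMID can hold a hardware counter at the same time."""
    return num_mbm_cntrs < num_rmids
```

A user could call read_int("num_mbm_cntrs") and read_int("num_rmids") and, when both are available, pass them to fewer_counters_than_rmids() to judge whether the platform may be affected. This only helps if the counter count itself is accurate.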


>>> +
>>> +	AMD Platforms with ABMC (Assignable Bandwidth Monitoring Counters) feature
>>> +	enable this mode by default so that counters remain assigned even when the
>>> +	corresponding RMID is not in use by any processor.
>>> +
>>> +	"default":
>>> +
>>> +	In default mode resctrl assumes there is a hardware counter for each
>>> +	event within every CTRL_MON and MON group. Reading mbm_total_bytes or
>>> +	mbm_local_bytes may report 'Unavailable' if there is no counter associated
>>> +	with that event.
>>
>> If I understand correctly, on AMD platforms without ABMC the events only report
>> "Unavailable" if there is no counter assigned at the time of the query. If a counter
>> is unassigned and then reassigned then the event count will reset and the user
>> will get some data back but it may thus be unpredictable (to match earlier language).
>> Is this correct? Any AMD platform in "default" mode may thus be vulnerable to
>> "unpredictable" event counts (not just "Unavailable") ... this gets complicated
> 
> Yes. All the AMD systems without ABMC are affected by this problem.
> 
>> because users should be steered away from "default" mode if mbm_assign_mode is
>> available, while not being made wary of using "default" mode on Intel where
>> mbm_assign_mode is not available.
> 
> Can we add text to clarify this?

Please do.
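
As a rough sketch of what the clarified text needs to convey from the user's side: a read of mbm_total_bytes or mbm_local_bytes can return "Unavailable", but a count that restarted after a counter was unassigned and reassigned looks like ordinary data. The helper names below are hypothetical, and the "count went down" check is only a heuristic, since a reset followed by enough traffic would go unnoticed.

```python
from typing import Optional


def parse_mbm_reading(text: str) -> Optional[int]:
    """Return the byte count, or None if the file reported 'Unavailable'.

    Any other non-numeric content raises ValueError.
    """
    value = text.strip()
    if value == "Unavailable":
        return None
    return int(value)


def classify(prev: Optional[int], curr: Optional[int]) -> str:
    """Classify a reading relative to the previous one.

    'unavailable': no counter was associated with the event at read time.
    'reset':       the count went down, so the counter was likely reassigned
                   in between (heuristic; it cannot catch every reset).
    'ok':          nothing suspicious was detected.
    """
    if curr is None:
        return "unavailable"
    if prev is not None and curr < prev:
        return "reset"
    return "ok"
```

The "reset" case is the one that is easy to miss, because unlike "Unavailable" nothing in the output flags it.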

Reinette

[1] https://lore.kernel.org/all/35fc70fd-0281-4ac8-b32b-efa2f4516901@xxxxxxx/



