Re: [RFC PATCH v3 00/17] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)

Hi Peter/Reinette,

On 5/3/2024 3:44 PM, Moger, Babu wrote:
Hi Peter,

On 5/2/2024 7:57 PM, Peter Newman wrote:
Hi Reinette,

On Thu, May 2, 2024 at 4:21 PM Reinette Chatre
<reinette.chatre@xxxxxxxxx> wrote:

Hi Peter and Babu,

On 5/2/2024 1:14 PM, Moger, Babu wrote:
Are you suggesting to enable ABMC by default when available?

I do think ABMC should be enabled by default when available and it looks
to be what this series aims to do [1]. The way I reason about this is
that legacy user space gets more reliable monitoring behavior without
needing to change behavior.

I don't like that for a monitor assignment-aware user, following the
creation of new monitoring groups, there will be fewer monitors
available for assignment. If the user wants precise control over where
monitors are allocated, they would need to manually unassign the
automatically-assigned monitor after creating new groups.

It's an annoyance, but I'm not sure if it would break any realistic
usage model. Maybe if the monitoring agent operates independently of

Yes, it is an annoyance.

But if you think about it, normal users don't create too many groups.
They won't have to worry about the assign/unassign headache if we enable monitor assignment automatically. There is also the pqos tool, which uses this interface; it does not need to know about assign/unassign.


whoever creates monitoring groups it could result in brief periods
where fewer monitors than expected are available because whoever just
created a new monitoring group hasn't given the automatically-assigned
monitors back yet.
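
To make the concern above concrete, here is a toy user-space model (not kernel code) of a fixed counter pool with auto-assignment on group creation. Everything in it is an assumption for illustration: the class and method names are hypothetical, the pool size of 4 is arbitrary, and it simplifies to one counter per group rather than one per event.

```python
class CounterPool:
    """Toy model of a fixed pool of assignable monitoring counters."""

    def __init__(self, num_counters):
        self.free = num_counters
        self.assigned = {}  # group name -> number of counters held

    def create_group(self, name, auto_assign=True):
        """Create a monitoring group; optionally auto-assign one counter."""
        self.assigned[name] = 0
        if auto_assign and self.free > 0:
            self.free -= 1
            self.assigned[name] = 1

    def unassign(self, name):
        """Return a group's counter to the pool, if it holds one."""
        if self.assigned.get(name, 0) > 0:
            self.assigned[name] -= 1
            self.free += 1


pool = CounterPool(num_counters=4)  # pool size is an arbitrary assumption
pool.create_group("grp_a")
pool.create_group("grp_b")
free_after_create = pool.free       # counters left right after creation

# An assignment-aware agent has to hand the counters back explicitly.
pool.unassign("grp_a")
pool.unassign("grp_b")
free_after_unassign = pool.free
```

Between create_group() and unassign(), the pool is smaller than the agent expects, which is the brief window described above.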


I thought there was discussion about communicating to user space
when an attempt is made to read data from an event that does not
have a counter assigned. Something like below but I did not notice this
in this series.

# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
Unassigned
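
If the file did report something like the above, a user-space reader might handle it roughly as follows. This is only a sketch: the "Unassigned" string is the proposal from this thread and not something the series currently emits, and parse_mbm_reading() is a hypothetical helper name.

```python
def parse_mbm_reading(text):
    """Split the contents of an mbm_* file into (value, status).

    value is the byte count as an int, or None when the file holds a
    status word instead of a number (e.g. the proposed "Unassigned").
    """
    token = text.strip()
    if token.isdigit():
        return int(token), "ok"
    status = token.lower() if token else "empty"
    return None, status


# Example inputs; a real caller would read them from
# /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes.
counted = parse_mbm_reading("12345\n")
unassigned = parse_mbm_reading("Unassigned\n")
```

Treating every non-numeric token as a status word means the caller does not need a code change if further status strings are added later.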


Then provide a mount option to switch back to legacy mode?
I am fine with that if we all agree on that.

Why is a mount option needed? I think we should avoid requiring a remount
unless required and I would like to understand why it is required here.

Peter: could you please elaborate on what you mean when you say it makes
it more difficult for the FS code to generically manage monitor assignment?

Why would user space be required to recreate all control and monitor
groups if wanting to change how memory bandwidth monitoring is done?

I was looking at this more from the perspective of whether it's
necessary to support the live transition of the groups' configuration
back and forth between programming models.  I find it very unlikely
for the userspace controller software to change its mind about the
programming model for monitoring in a running system, so I thought
this would be in the same category as choosing at mount time whether
or not to use CDP or the MBA software controller.

The good thing about the mount option is that we don't create extra files for monitor assignment in /sys/fs/resctrl when we mount with the legacy option.

I think we still have not decided about the "mount" option for switching to legacy monitoring. A mount option seems safe at this point: with it, we don't have to deal with the extra files in the resctrl filesystem that dynamic switching would involve.


Also, in the software implementation of monitor assignment for older
AMD processors, which is based on allocating a subset of RMIDs, I'm
concerned that the context switch handler would want to read the
monitors associated with the incoming thread's current group to
determine whether it should use one of the tracked RMIDs. I believe it
would be cleaner if the lifetime of the generic monitor-tracking
structures would last until the static branches gating
__resctrl_sched_in() could be disabled.
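
As a rough illustration of the lookup described above (a user-space toy, not the prototype itself): the names below are hypothetical, and the use of a single shared fallback RMID for groups without a tracked one is an assumption.

```python
DEFAULT_RMID = 0  # assumed shared RMID for groups that hold no tracked RMID


def pick_rmid(group, tracked_rmids):
    """Return the RMID to load when a thread of `group` is scheduled in.

    tracked_rmids maps group name -> tracked RMID and stands in for the
    generic monitor-tracking structure the handler would consult.
    """
    return tracked_rmids.get(group, DEFAULT_RMID)


tracked = {"grp_a": 5}                   # grp_a currently holds a tracked RMID
rmid_tracked = pick_rmid("grp_a", tracked)
rmid_untracked = pick_rmid("grp_b", tracked)
```

Because the lookup runs on every switch-in, the tracking table has to stay valid for as long as the handler can run, which is the lifetime concern raised above.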


From this implementation it has been difficult to understand the impact
of switching between ABMC and legacy.

I'll see if there's a good way to share my software monitor assignment
prototype so it's clearer how the user interface would interact with
diverse implementations. Unfortunately, it's difficult to see the
required abstraction boundaries without the fs/resctrl refactoring
changes[1] applied. It would also require my changes[2] for reading a
thread's RMID from the FS structures to prevent monitor assignments
from forcing an update of all task_structs in the system.

-Peter

[1] https://lore.kernel.org/lkml/20240426150537.8094-1-Dave.Martin@xxxxxxx/
[2] https://lore.kernel.org/lkml/20240325172707.73966-1-peternewman@xxxxxxxxxx/



--
- Babu Moger