Re: [PATCH v11 07/23] x86/resctrl: Introduce the interface to display monitor mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi James,

On 2/21/2025 12:06 PM, James Morse wrote:
Hi Babu,

On 22/01/2025 20:20, Babu Moger wrote:
Introduce the interface file "mbm_assign_mode" to list monitor modes
supported.

The "mbm_cntr_assign" mode provides the option to assign a counter to
an RMID, event pair and monitor the bandwidth as long as it is assigned.

On AMD systems "mbm_cntr_assign" is backed by the ABMC (Assignable
Bandwidth Monitoring Counters) hardware feature and is enabled by default.

The "default" mode is the existing monitoring mode that works without the
explicit counter assignment, instead relying on dynamic counter assignment
by hardware that may result in hardware not dedicating a counter resulting
in monitoring data reads returning "Unavailable".

Provide an interface to display the monitor mode on the system.
$ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
[mbm_cntr_assign]
default

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index fb90f08e564e..b5defc5bce0e 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -257,6 +257,32 @@ with the following files:
  	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
  	    0=0x30;1=0x30;3=0x15;4=0x15
+"mbm_assign_mode":
+	Reports the list of monitoring modes supported. The enclosed brackets
+	indicate which mode is enabled.
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+	  [mbm_cntr_assign]
+	  default
+
+	"mbm_cntr_assign":
+
+	In mbm_cntr_assign, monitoring event can only accumulate data while
+	it is backed by a hardware counter. The user-space is able to specify
+	which of the events in CTRL_MON or MON groups should have a counter
+	assigned using the "mbm_assign_control" file. The number of counters
+	available is described in the "num_mbm_cntrs" file. Changing the mode
+	may cause all counters on a resource to reset.

+	"default":
+
+	In default mode, resctrl assumes there is a hardware counter for each
+	event within every CTRL_MON and MON group. On AMD platforms, it is
+	recommended to use mbm_cntr_assign mode if supported, because reading
+	"mbm_total_bytes" or "mbm_local_bytes" will report 'Unavailable' if
+	there is no counter associated with that event.

But if you read a value instead of "Unavailable", that doesn't mean the value is correct.
For two reads that succeed, the counter may have been reset in the middle.

I'm suggesting something like:
| it is recommended to use mbm_cntr_assign mode if supported, to avoid counters
| being re-allocated by hardware. This can cause a misleading value to be read,
| or if no counter is associated with that event "Unavailable".

Looks good with slight modification.

On AMD platforms, it is recommended to use the mbm_cntr_assign mode, if supported, to prevent the hardware from resetting counters between reads. This can result in misleading values or display "Unavailable" if no counter is assigned to the event.



  "max_threshold_occupancy":
  		Read/write file provides the largest value (in
  		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f91fe605766f..3880480a41d2 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -854,6 +854,30 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
  	return ret;
  }
+static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
+					struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+
+	mutex_lock(&rdtgroup_mutex);
+
+	if (r->mon.mbm_cntr_assignable) {
+		if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
+			seq_puts(s, "[mbm_cntr_assign]\n");
+			seq_puts(s, "default\n");
+		} else {
+			seq_puts(s, "mbm_cntr_assign\n");
+			seq_puts(s, "[default]\n");
+		}

What do you think to an architecture being able to opt-out of this flexibility?

If there aren't enough counters I can expose what the hardware has through this interface
- but if user-space turns it off ... then what?

For MPAM this would need to be some best-effort software allocation strategy that I'd
rather not write - its not a problem that can be solved, and any value that is reported is
likely to be wrong. For ABMC platforms, existing stable kernels expose a value, so being
able to preserve the existing behaviour makes sense. MPAM doesn't have this problem.

Something like this:
----------%<----------
@@ -861,16 +861,21 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open
_file *of,
                                         struct seq_file *s, void *v)
  {
         struct rdt_resource *r = of->kn->parent->priv;
+       bool enabled = resctrl_arch_mbm_cntr_assign_enabled(r);

         mutex_lock(&rdtgroup_mutex);

         if (r->mon.mbm_cntr_assignable) {
-               if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
+               if (enabled)
                         seq_puts(s, "[mbm_cntr_assign]\n");
-                       seq_puts(s, "default\n");
-               } else {
-                       seq_puts(s, "mbm_cntr_assign\n");
+               else
                         seq_puts(s, "[default]\n");
+
+               if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) {
+                       if (enabled)
+                               seq_puts(s, "default\n");
+                       else
+                               seq_puts(s, "mbm_cntr_assign\n");
                 }
         } else {
                 seq_puts(s, "[default]\n");
----------%<----------

x86 wouldn't define CONFIG_RESCTRL_ASSIGN_FIXED, arm64 would, meaning for MPAM the file
would be either:
  | [default]
or
| [mbm_cntr_assign]


Looks good mostly.

resctrl_arch_mbm_cntr_assign_enabled(r) needs to be called with the lock.

Also, there is no reference of CONFIG_RESCTRL_ASSIGN_FIXED in this series. Hope that is not an issue.

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 3880480a41d2..2907bb7bfa56 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -858,16 +858,22 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
                                        struct seq_file *s, void *v)
 {
        struct rdt_resource *r = of->kn->parent->priv;
+       bool enabled;

        mutex_lock(&rdtgroup_mutex);
+       enabled = resctrl_arch_mbm_cntr_assign_enabled(r);

        if (r->mon.mbm_cntr_assignable) {
-               if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
+               if (enabled)
                        seq_puts(s, "[mbm_cntr_assign]\n");
-                       seq_puts(s, "default\n");
-               } else {
-                       seq_puts(s, "mbm_cntr_assign\n");
+               else
                        seq_puts(s, "[default]\n");
+
+               if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)) {
+                       if (enabled)
+                               seq_puts(s, "default\n");
+                       else
+                               seq_puts(s, "mbm_cntr_assign\n");
                }
        } else {
                seq_puts(s, "[default]\n");





+	} else {
+		seq_puts(s, "[default]\n");
+	}
+
+	mutex_unlock(&rdtgroup_mutex);
+
+	return 0;
+}


Thanks,

James


Thanks
Babu




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux