Re: [PATCH v6 20/22] x86/resctrl: Enable AMD ABMC feature by default when supported

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Babu,

On 8/20/24 1:18 PM, Moger, Babu wrote:
On 8/20/24 15:04, Moger, Babu wrote:
On 8/20/24 13:12, Reinette Chatre wrote:
On 8/19/24 11:18 AM, Moger, Babu wrote:
On 8/16/24 17:33, Reinette Chatre wrote:
On 8/6/24 3:00 PM, Babu Moger wrote:

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 66febff2a3d3..d15fd1bde5f4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2756,6 +2756,23 @@ void resctrl_arch_mbm_cntr_assign_disable(void)
        }
    }
    +void resctrl_arch_mbm_cntr_assign_configure(void)
+{
+    struct rdt_resource *r =
&rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+    bool enable = true;
+
+    mutex_lock(&rdtgroup_mutex);
+
+    if (r->mon.mbm_cntr_assignable) {
+        if (!hw_res->mbm_cntr_assign_enabled)
+            hw_res->mbm_cntr_assign_enabled = true;
+        resctrl_abmc_set_one_amd(&enable);

Earlier changelogs mentioned that counters are reset when ABMC is enabled.
How does that behave here when one CPU comes online? Consider the scenario
where
a system is booted without all CPUs online. ABMC is initially enabled on
all online
CPUs with this flow ... user space could start using resctrl fs and create
monitor groups that start accumulating architectural state. If the
remaining
CPUs come online at this point and this snippet enables ABMC, would it
reset
all counters? Should the architectural state be cleared?

When new cpu comes online, it should inherit the abmc state which is set
already. it should not force it either way. In that case, it is not
required to reset the architectural state.

Responded to your earlier comment.
https://lore.kernel.org/lkml/0256b457-175d-4923-aa49-00e8e52b865b@xxxxxxx/



Also, it still does not look right that the architecture decides the
policy.
Could this enabling be moved to resctrl_online_cpu() for resctrl fs to
request architecture to enable assignable counters if it is supported?

Sure. Will move the resctrl_arch_mbm_cntr_assign_configure() here with
changes just to update the abmc state which is set during the init.


I do not think we are seeing it the same way. In your earlier comment you
mention:

We need to set abmc state to "enabled" during the init when abmc is
detected.  resctrl_late_init -> .. -> rdt_get_mon_l3_config

This only happens once during the init.


I do not think that the ABMC state can be set during init since that runs
before the fs code and thus the arch code cannot be aware of the fs policy
that "mbm_assign_mode" is the default. This may become clear when you move
resctrl_arch_mbm_cntr_assign_configure() to resctrl_online_cpu() though
since I expect that the r->mon.mbm_cntr_assignable check will move
into the fs resctrl_online_cpu() that will call the arch helper to
set the state to enabled.

There are couple of problems here.

1. Hotplug with ABMC enabled.

   System is running with ABMC enabled. Now, new cpu cames online.
   The function resctrl_arch_mbm_cntr_assign_configure() will set the MSR
MSR_IA32_L3_QOS_EXT_CFG to enable ABMC on the new CPU. This scenario works
fine.


2. Hotplug with ABMC disabled.
   Current code will force the system to enable ABMC on the new CPU.
   That is not correct.


We need to address both these cases.


I was thinking of separating the functionality in
resctrl_arch_mbm_cntr_assign_configure() into two.

a. Just set the mbm_cntr_assign_enabled to true during the init.
    if (r->mon.mbm_cntr_assignable)
        hw_res->mbm_cntr_assign_enabled = true;

    This is similar to rdtgroup_setup_default().  Isn't it?

Right. I (mis)understood from your earlier comment that this was planned to
be done from "resctrl_late_init -> .. -> rdt_get_mon_l3_config", but
handling that within rdtgroup_init() (after MPAM it will be resctrl_init())
seem appropriate to me since it is where the fs code can set its policy.


I just noticed that I cannot access rdt_hw_resource here. I may have to
write another function resctrl_arch_mbm_cntr_assign_set_default() to do
this. What do you think?

Sounds good.





b. Change the functionality in resctrl_arch_mbm_cntr_assign_configure()
    to update the MSR MSR_IA32_L3_QOS_EXT_CFG based on
hw_res->mbm_cntr_assign_enabled.  Something like this.


    void resctrl_arch_mbm_cntr_assign_configure(void)
{
     ---
     if (r->mon.mbm_cntr_assignable &&  hw_res->mbm_cntr_assign_enabled)
             abmc_set_one_amd(&enable);
    ---
}


Yes.  The function resctrl_arch_mbm_cntr_assign_configure() will be called
from resctrl_online_cpu().

Checking r->mon.mbm_cntr_assignable may be redundant since I expect that
resctrl_online_cpu() will also do the check. Would do no harm though.


Does it make sense?  Any other idea?


Yes, this makes sense to me. Thank you.

Reinette





[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux