Re: [PATCH v8 1/7] x86/resctrl: Prepare for per-ctrl_mon group mba_MBps control

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tony,

On 10/29/24 10:28 AM, Tony Luck wrote:
> Resctrl uses local memory bandwidth event as input to the feedback
> loop when the mba_MBps mount option is used. This means that this
> mount option cannot be used on systems that only support monitoring
> of total bandwidth.
> 
> Prepare to allow users to choose the input event independently for
> each ctrl_mon group.
> 
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
>  include/linux/resctrl.h                | 2 ++
>  arch/x86/kernel/cpu/resctrl/internal.h | 2 ++
>  arch/x86/kernel/cpu/resctrl/core.c     | 5 +++++
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 ++++++
>  4 files changed, 15 insertions(+)
> 
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index d94abba1c716..fd05b937e2f4 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -49,6 +49,8 @@ enum resctrl_event_id {
>  	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
>  };
>  
> +extern enum resctrl_event_id mba_mbps_default_event;
> +
>  /**
>   * struct resctrl_staged_config - parsed configuration to be applied
>   * @new_ctrl:		new ctrl value to be loaded
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 955999aecfca..a6f051fb2e69 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -283,6 +283,7 @@ struct pseudo_lock_region {
>   *				monitor only or ctrl_mon group
>   * @mon:			mongroup related data
>   * @mode:			mode of resource group
> + * @mba_mbps_event:		input event id when mba_sc mode is active

Referring to mba_sc as a "mode" is potentially confusing when considering the
line right above it (specifically, mba_sc is not a possible value of @mode).
How about "input monitoring event id when MBA software controller (mba_sc) is enabled"
or "input monitoring event id when mba_sc is enabled"?


>   * @plr:			pseudo-locked region
>   */
>  struct rdtgroup {
> @@ -295,6 +296,7 @@ struct rdtgroup {
>  	enum rdt_group_type		type;
>  	struct mongroup			mon;
>  	enum rdtgrp_mode		mode;
> +	enum resctrl_event_id		mba_mbps_event;
>  	struct pseudo_lock_region	*plr;
>  };
>  
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index b681c2e07dbf..5b55a7ac7013 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -958,6 +958,11 @@ static __init bool get_rdt_mon_resources(void)
>  	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL))
>  		rdt_mon_features |= (1 << QOS_L3_MBM_LOCAL_EVENT_ID);
>  
> +	if (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID))
> +		mba_mbps_default_event = QOS_L3_MBM_LOCAL_EVENT_ID;
> +	else if (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID))
> +		mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
> +
>  	if (!rdt_mon_features)
>  		return false;

Shouldn't checking of individual monitoring features be at this point? That is,
after confirming that there are indeed monitoring features?

fyi ... since these checks are not architectural I expect that they will
move to resctrl_mon_resource_init() as part of the resctrl fs/arch split.
https://lore.kernel.org/all/20241004180347.19985-17-james.morse@xxxxxxx/
This move will be supported when using the helpers as Fenghua suggested.


>  
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index d7163b764c62..dbfb9d11f3f8 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -65,6 +65,8 @@ static void rdtgroup_destroy_root(void);
>  
>  struct dentry *debugfs_resctrl;
>  
> +enum resctrl_event_id mba_mbps_default_event;
> +
>  static bool resctrl_debug;
>  
>  void rdt_last_cmd_clear(void)
> @@ -2665,6 +2667,8 @@ static int rdt_get_tree(struct fs_context *fc)
>  	if (ret)
>  		goto out_schemata_free;
>  
> +	rdtgroup_default.mba_mbps_event = mba_mbps_default_event;
> +

The pattern of the default group properties are to initialize them once
in rdtgroup_setup_default() and then do any needed reset in rdt_kill_sb().
You can compare with how rdtgroup_default.mode is handled.
If existing pattern is not acceptable then it could be reworked to have one pattern
support all scenarios instead of fragmenting the default group's initialization and
reset.

I do notice that the current MPAM proposal I linked above calls 
rdtgroup_setup_default() before resctrl_mon_resource_init() so I think that
needs to be swapped to support this feature.


>  	kernfs_activate(rdtgroup_default.kn);
>  
>  	ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
> @@ -3624,6 +3628,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>  		}
>  	}
>  
> +	rdtgrp->mba_mbps_event = mba_mbps_default_event;
> +

Should this be in the "if (resctrl_arch_mon_capable())" section? Looking at rdtgroup_default
I also do not see any check for whether monitoring is supported. Is this intended? 
mba_mbps_default_event is only initialized if monitoring is supported so it *seems* as though
the value of 0, which is not a valid value of enum resctrl_event_id is intended as an
"uninitialized"? Unclear how to interpret this implementation at this point. Reading
the changelog again there are no details about these subtleties.

>  	goto out_unlock;
>  
>  out_del_list:

Reinette




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux