Hi Reinette, On 9/13/24 15:51, Reinette Chatre wrote: > Hi Babu, > > On 8/16/24 9:16 AM, Babu Moger wrote: >> The SDCIAE (SDCI Allocation Enforcement) PQE feature allows system software >> to configure the portion of the L3 cache used for SDCI. >> >> When enabled, SDCIAE forces all SDCI lines to be placed into the L3 cache >> partitions identified by the highest-supported L3_MASK_n register as >> reported by CPUID Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15, >> SDCI lines will be allocated into the L3 cache partitions determined by >> the bitmask in the L3_MASK_15 register. >> >> Introduce interface to enable/disable SDCIAE feature on user input. >> >> Signed-off-by: Babu Moger <babu.moger@xxxxxxx> >> --- >> Documentation/arch/x86/resctrl.rst | 22 +++++++ >> arch/x86/kernel/cpu/resctrl/core.c | 1 + >> arch/x86/kernel/cpu/resctrl/internal.h | 1 + >> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 88 ++++++++++++++++++++++++++ >> 4 files changed, 112 insertions(+) >> >> diff --git a/Documentation/arch/x86/resctrl.rst >> b/Documentation/arch/x86/resctrl.rst >> index a824affd741d..cb1532dd843f 100644 >> --- a/Documentation/arch/x86/resctrl.rst >> +++ b/Documentation/arch/x86/resctrl.rst >> @@ -135,6 +135,28 @@ related to allocation: >> "1": >> Non-contiguous 1s value in CBM is supported. >> +"sdciae": >> + Indicates if the system can support SDCIAE (L3 Smart Data Cache >> + Injection Allocation Enforcement) feature. >> + >> + Smart Data Cache Injection (SDCI) is a mechanism that enables >> + direct insertion of data from I/O devices into the L3 cache. >> + By directly caching data from I/O devices rather than first >> + storing the I/O data in DRAM, SDCI reduces demands on DRAM >> + bandwidth and reduces latency to the processor consuming the >> + I/O data. The SDCIAE feature allows system software to configure >> + limit the portion of the L3 cache used for SDCI. > > Above needs to change to a generic resctrl fs feature. Agree. Will rephrase it > >> + >> + "0": >> + Feature is not enabled. >> + "1": >> + Feature is enabled. > > What does "feature is enabled" and "feature is not enabled" mean to the user? > What can the user expect to happen after enabling/disabling this feature? Ok. Will add few more details. > >> + >> + Feature can be enabled/disabled by writing to the interface. >> + Example:: >> + >> + # echo 1 > /sys/fs/resctrl/info/L3/sdciae >> + >> Memory bandwidth(MB) subdirectory contains the following files >> with respect to allocation: >> diff --git a/arch/x86/kernel/cpu/resctrl/core.c >> b/arch/x86/kernel/cpu/resctrl/core.c >> index e4381e3feb75..6a9512008a4a 100644 >> --- a/arch/x86/kernel/cpu/resctrl/core.c >> +++ b/arch/x86/kernel/cpu/resctrl/core.c >> @@ -299,6 +299,7 @@ static void rdt_get_cdp_config(int level) >> static void rdt_get_sdciae_alloc_cfg(struct rdt_resource *r) >> { >> r->sdciae_capable = true; >> + resctrl_sdciae_rftype_init(); >> } >> static void rdt_get_cdp_l3_config(void) >> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h >> b/arch/x86/kernel/cpu/resctrl/internal.h >> index ceb0e8e1ed76..9a3da6d49144 100644 >> --- a/arch/x86/kernel/cpu/resctrl/internal.h >> +++ b/arch/x86/kernel/cpu/resctrl/internal.h >> @@ -662,6 +662,7 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource >> *r); >> void __init thread_throttle_mode_init(void); >> void __init mbm_config_rftype_init(const char *config); >> void rdt_staged_configs_clear(void); >> +void __init resctrl_sdciae_rftype_init(void); >> bool closid_allocated(unsigned int closid); >> int resctrl_find_cleanest_closid(void); >> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> index c62d6183bfe4..58e4df195207 100644 >> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >> @@ -171,6 +171,27 @@ void closid_free(int closid) >> __set_bit(closid, &closid_free_map); >> } >> +/* >> + * SDCIAE feature uses max CLOSID to route the SDCI traffic. >> + * Get the max CLOSID number >> + */ >> +static u32 get_sdciae_closid(struct rdt_resource *r) >> +{ >> + return resctrl_arch_get_num_closid(r) - 1; >> +} >> + >> +static int closid_alloc_sdciae(struct rdt_resource *r) >> +{ >> + u32 sdciae_closid = get_sdciae_closid(r); >> + >> + if (closid_free_map & (1 << sdciae_closid)) { >> + closid_free_map &= ~(1 << sdciae_closid); >> + return sdciae_closid; >> + } else { >> + return -ENOSPC; >> + } >> +} > > How does this interact with CDP? It seems to me that the above would > cause overflow on a CDP system since the closid_free_map is sized to > half of what the hardware supports. This also seems to still allow > creating resource groups that may end up using the CLOSID dedicated > to SDCIAE here since the (when CDP enabled) resource groups use > software closid and then hardware configuration is done with the > hardware closid. When CDP is enabled it seems possible to still > create a resource group and modify the CBM of what is then intended to > be sdciae allocations? Yes. This code is problematic when CDP is enabled. Hardware techically supports both CDP and SDCIAE together. In that case SDCIAE mask is shared with COS=7 Instruction mask. We have couple of options. 1. Dont allow SDCIAE when CDP is enabled. 2. Modify the code to handle both CDP and SDCIAE. Add documentation to clarify about sharing of SDCIAE and CDP masks. I would think option 2 may be better. > > Also, please be consistent with function naming, note how the above > two functions differ wrt namespace and verb. resctrl is surely not > consistent in this regard but it really helps to have partner functions > look similar. For example, this could be "feature_closid_get()" and > "feature_closid_alloc()". > Sure. > Also, there looks to be opportunity to use bitops here ... perhaps > __test_and_clear_bit()? Sure. > > >> + >> /** >> * closid_allocated - test if provided closid is in use >> * @closid: closid to be tested >> @@ -1850,6 +1871,57 @@ int resctrl_arch_set_sdciae_enabled(enum >> resctrl_res_level l, bool enable) >> return 0; >> } >> +static int resctrl_sdciae_show(struct kernfs_open_file *of, >> + struct seq_file *seq, void *v) >> +{ >> + seq_printf(seq, "%x\n", >> resctrl_arch_get_sdciae_enabled(RDT_RESOURCE_L3)); >> + return 0; >> +} > > This does not look right ... this file has flags "RFTYPE_CTRL_INFO | > RFTYPE_RES_CACHE" > which means that it will be created for all CAT resources. So, on a system > that supports > L2 CAT, the "sdciae" file will be created for the L2 resource and it will > show whether > "sdciae" is enabled on the *L3* resource? Yes. I will have to get it from resctrl_schema. Good catch. struct resctrl_schema *s = of->kn->parent->priv; struct rdt_resource *r = s->res; seq_printf(seq, "%x\n", resctrl_arch_get_sdciae_enabled(r->rid)); return 0; > >> + >> +static ssize_t resctrl_sdciae_write(struct kernfs_open_file *of, char >> *buf, >> + size_t nbytes, loff_t off) >> +{ >> + struct resctrl_schema *s = of->kn->parent->priv; >> + struct rdt_resource *r = s->res; >> + unsigned int enable; >> + u32 sdciae_closid; >> + int ret; >> + >> + if (!r->sdciae_capable) >> + return -EINVAL; >> + >> + ret = kstrtouint(buf, 0, &enable); > > How about kstrtobool()? This will make things more consistent with > resctrl_arch_set_sdciae_enabled() expecting a bool. Or should we be > looking ahead > at this file later possibly needing to support more integers to activate > more capabilities > related to this feature? In that case this implementation needs to take > care since instead > of supporting "0" and "1" it supports "0" and "anything but 0" that > prevents any such > future enhancements. I will change it kstrtobool(). > > >> + if (ret) >> + return ret; >> + >> + cpus_read_lock(); >> + mutex_lock(&rdtgroup_mutex); >> + >> + rdt_last_cmd_clear(); >> + >> + /* Update the MSR only when there is a change */ > > The resctrl fs cannot make predictions on what arch code needs to do to > enable feature. Sure. Will remove it. > >> + if (resctrl_arch_get_sdciae_enabled(RDT_RESOURCE_L3) != enable) { > > (same issue with this file being present under L2 resource creating > confusion) Ok. Will address it. > >> + if (enable) { >> + ret = closid_alloc_sdciae(r); >> + if (ret < 0) { >> + rdt_last_cmd_puts("SDCIAE CLOSID is not available\n"); >> + goto out_sdciae; >> + } >> + } else { >> + sdciae_closid = get_sdciae_closid(r); >> + closid_free(sdciae_closid); >> + } > > >> + >> + ret = resctrl_arch_set_sdciae_enabled(RDT_RESOURCE_L3, enable); > > I assume that once SDCIAE is enabled the I/O traffic will start flowing to > whatever > was the last CBM of the max CLOSID? Is this intended or should there be > some default > CBM that this feature should start with? It will start with whatever the last CBM for max CLOSID. > >> + } >> + >> +out_sdciae: >> + mutex_unlock(&rdtgroup_mutex); >> + cpus_read_unlock(); >> + >> + return ret ?: nbytes; >> +} >> + >> /* rdtgroup information files for one cache resource. */ >> static struct rftype res_common_files[] = { >> { >> @@ -2002,6 +2074,13 @@ static struct rftype res_common_files[] = { >> .seq_show = rdtgroup_schemata_show, >> .fflags = RFTYPE_CTRL_BASE, >> }, >> + { >> + .name = "sdciae", >> + .mode = 0644, >> + .kf_ops = &rdtgroup_kf_single_ops, >> + .seq_show = resctrl_sdciae_show, >> + .write = resctrl_sdciae_write, >> + }, >> { >> .name = "mode", >> .mode = 0644, >> @@ -2101,6 +2180,15 @@ void __init mbm_config_rftype_init(const char >> *config) >> rft->fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE; >> } >> +void __init resctrl_sdciae_rftype_init(void) >> +{ >> + struct rftype *rft; >> + >> + rft = rdtgroup_get_rftype_by_name("sdciae"); >> + if (rft) >> + rft->fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE; >> +} >> + > > Another instance of the pattern that is impacted by the ABMC and MPAM work. Yes. Agree. > >> /** >> * rdtgroup_kn_mode_restrict - Restrict user access to named resctrl file >> * @r: The resource group with which the file is associated. > > Reinette > -- Thanks Babu Moger