Hi Sanjaya,

On 10/11/22 04:06, Bagas Sanjaya wrote:
> On Mon, Oct 10, 2022 at 03:30:40PM -0500, Babu Moger wrote:
>> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
>> index 71a531061e4e..e2a59249d183 100644
>> --- a/Documentation/x86/resctrl.rst
>> +++ b/Documentation/x86/resctrl.rst
>> @@ -17,14 +17,16 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
>>  This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
>>  flag bits:
>>
>> -============================================= ================================
>> +=============================================== ================================
>>  RDT (Resource Director Technology) Allocation   "rdt_a"
>>  CAT (Cache Allocation Technology)               "cat_l3", "cat_l2"
>>  CDP (Code and Data Prioritization)              "cdp_l3", "cdp_l2"
>>  CQM (Cache QoS Monitoring)                      "cqm_llc", "cqm_occup_llc"
>>  MBM (Memory Bandwidth Monitoring)               "cqm_mbm_total", "cqm_mbm_local"
>>  MBA (Memory Bandwidth Allocation)               "mba"
>> -============================================= ================================
>> +SMBA (Slow Memory Bandwidth Allocation)         "smba"
>> +BMEC (Bandwidth Monitoring Event Configuration) "bmec"
>> +=============================================== ================================
>>
>>  To use the feature mount the file system::
>>
>> @@ -161,6 +163,79 @@ with the following files:
>>  "mon_features":
>>                  Lists the monitoring events if
>>                  monitoring is enabled for the resource.
>> +                Example::
>> +
>> +                        # cat /sys/fs/resctrl/info/L3_MON/mon_features
>> +                        llc_occupancy
>> +                        mbm_total_bytes
>> +                        mbm_local_bytes
>> +
>> +                If the system supports Bandwidth Monitoring Event
>> +                Configuration (BMEC), then the bandwidth events will
>> +                be configurable. The output will be::
>> +
>> +                        # cat /sys/fs/resctrl/info/L3_MON/mon_features
>> +                        llc_occupancy
>> +                        mbm_total_bytes
>> +                        mbm_total_config
>> +                        mbm_local_bytes
>> +                        mbm_local_config
>> +
>> +"mbm_total_config", "mbm_local_config":
>> +        These files contain the current event configuration for the events
>> +        mbm_total_bytes and mbm_local_bytes, respectively, when the
>> +        Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
>> +        The event configuration settings are domain specific and will affect
>> +        all the CPUs in the domain.
>> +
>> +        Following are the types of events supported:
>> +
>> +        ====    ========================================================
>> +        Bits    Description
>> +        ====    ========================================================
>> +        6       Dirty Victims from the QOS domain to all types of memory
>> +        5       Reads to slow memory in the non-local NUMA domain
>> +        4       Reads to slow memory in the local NUMA domain
>> +        3       Non-temporal writes to non-local NUMA domain
>> +        2       Non-temporal writes to local NUMA domain
>> +        1       Reads to memory in the non-local NUMA domain
>> +        0       Reads to memory in the local NUMA domain
>> +        ====    ========================================================
>> +
>> +        By default, the mbm_total_bytes configuration is set to 0x7f to count
>> +        all the event types and the mbm_local_bytes configuration is set to
>> +        0x15 to count all the local memory events.
>> +
>> +        Examples:
>> +
>> +        * To view the current configuration::
>> +          ::
>> +
>> +            # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
>> +            0=0x7f;1=0x7f;2=0x7f;3=0x7f
>> +
>> +            # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
>> +            0=0x15;1=0x15;3=0x15;4=0x15
>> +
>> +        * To change the mbm_total_bytes to count only reads on domain 0.
>> +          To achieve this, the bits 0, 1, 4 and 5 needs to be set which is
>> +          110011b (in hex 0x33).
>> +          ::
>> +
>> +            # echo "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_config
>> +
>> +            # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
>> +            0=0x33;1=0x7f;2=0x7f;3=0x7f
>> +
>> +        * To change the mbm_local_bytes to count all the slow memory reads
>> +          on domain 1. To achieve this, the bits 4 and 5 needs to be set
>> +          which is 110000b (in hex 0x30).
>> +          ::
>> +
>> +            # echo "1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_config
>> +
>> +            # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
>> +            0=0x15;1=0x30;3=0x15;4=0x15
>>
>>  "max_threshold_occupancy":
>>                  Read/write file provides the largest value (in
>> @@ -464,6 +539,25 @@ Memory bandwidth domain is L3 cache.
>>
>>          MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
>>
>> +Slow Memory bandwidth Allocation (SMBA)
>> +-------------------------------------------------
>> +AMD hardware can support slow Memory bandwidth Allocation feature.
>> +Currently, CXL.memory is the only supported "slow" memory device.
>> +With the support of SMBA feature the hardware enables bandwidth
>> +allocation on the slow memory devices. If there are multiple slow
>> +memory devices in the system, then the throttling logic groups all
>> +the slow sources together and applies the limit on them as a whole.
>> +
>> +The presence of the SMBA feature(with CXL.memory) is independent
>> +of whether slow memory device is actually present in the system.
>> +If there is no slow memory in the system, then setting a SMBA limit
>> +will have no impact on the performance of the system.
>> +
>> +Slow Memory bandwidth domain is L3 cache.
>> +::
>> +
>> +        SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
>> +
>>  Reading/writing the schemata file
>>  ---------------------------------
>>  Reading the schemata file will show the state of all resources
>> @@ -479,6 +573,44 @@ which you wish to change. E.g.
>>      L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
>>      L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
>>
>> +Reading/writing the schemata file (on AMD systems)
>> +--------------------------------------------------
>> +Reading the schemata file will show the state of all resources
>> +on all domains. When writing the memory bandwidth allocation you
>> +only need to specify those values in an absolute number expressed
>> +in 1/8 GB/s increments. To allocate bandwidth limit of 2GB, you
>> +need to specify the value 16 (16 * 1/8 = 2). For example:
>> +::
>> +
>> +  # cat schemata
>> +    MB:0=2048;1=2048;2=2048;3=2048
>> +    L3:0=ffff;1=ffff;2=ffff;3=ffff
>> +
>> +  # echo "MB:1=16" > schemata
>> +  # cat schemata
>> +    MB:0=2048;1= 16;2=2048;3=2048
>> +    L3:0=ffff;1=ffff;2=ffff;3=ffff
>> +
>> +Reading/writing the schemata file (on AMD systems) with SMBA feature
>> +-------------------------------------------------------------------
> The heading above produces htmldocs warnings:
>
> Documentation/x86/resctrl.rst:595: WARNING: Title underline too short.
>
> Reading/writing the schemata file (on AMD systems) with SMBA feature
> -------------------------------------------------------------------
> Documentation/x86/resctrl.rst:595: WARNING: Title underline too short.
>
> Reading/writing the schemata file (on AMD systems) with SMBA feature
> -------------------------------------------------------------------
>
> I have applied the fixup:

Thanks

>
> ---- >8 ----
>
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index e2a59249d18322..145916828f2bae 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -592,7 +592,7 @@ need to specify the value 16 (16 * 1/8 = 2). For example:
>      L3:0=ffff;1=ffff;2=ffff;3=ffff
>
>  Reading/writing the schemata file (on AMD systems) with SMBA feature
> --------------------------------------------------------------------
> +--------------------------------------------------------------------
>  Reading the schemata file will show the state of all resources
>  on all domains. When writing the memory bandwidth allocation you
>  only need to specify those values in an absolute number expressed
>
>> +Reading the schemata file will show the state of all resources
>> +on all domains. When writing the memory bandwidth allocation you
>> +only need to specify those values in an absolute number expressed
>> +in 1/8 GB/s increments. To allocate bandwidth limit of 8GB, you
>> +need to specify the value 64 (64 * 1/8 = 8). E.g.
>> +::
>> +
>> +  # cat schemata
>> +    SMBA:0=2048;1=2048;2=2048;3=2048
>> +    MB:0=2048;1=2048;2=2048;3=2048
>> +    L3:0=ffff;1=ffff;2=ffff;3=ffff
>> +
>> +  # echo "SMBA:1=64" > schemata
>> +  # cat schemata
>> +    SMBA:0=2048;1= 64;2=2048;3=2048
>> +    MB:0=2048;1=2048;2=2048;3=2048
>> +    L3:0=ffff;1=ffff;2=ffff;3=ffff
>> +
>>  Cache Pseudo-Locking
>>  ====================
>>  CAT enables a user to specify the amount of cache space that an
>>
>>
> The rest of prose can be improved:
>
> ---- >8 ----
>
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index 145916828f2bae..92b2c4e03a4a26 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -217,9 +217,9 @@ with the following files:
>              # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
>              0=0x15;1=0x15;3=0x15;4=0x15
>
> -        * To change the mbm_total_bytes to count only reads on domain 0.
> -          To achieve this, the bits 0, 1, 4 and 5 needs to be set which is
> -          110011b (in hex 0x33).
> +        * To change the mbm_total_bytes to count only reads on domain 0
> +          (the bits 0, 1, 4 and 5 needs to be set, which means 110011b
> +          {in hex 0x33}):
>            ::
>
>              # echo "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_config
> @@ -228,8 +228,8 @@ with the following files:
>              0=0x33;1=0x7f;2=0x7f;3=0x7f
>
>          * To change the mbm_local_bytes to count all the slow memory reads
> -          on domain 1. To achieve this, the bits 4 and 5 needs to be set
> -          which is 110000b (in hex 0x30).
> +          on domain 1 (the bits 4 and 5 needs to be set, which means 110000b
> +          {in hex 0x30}):
>            ::
>
>              # echo "1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_config
> @@ -540,20 +540,21 @@ Memory bandwidth domain is L3 cache.
>          MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
>
>  Slow Memory bandwidth Allocation (SMBA)
> --------------------------------------------------
> -AMD hardware can support slow Memory bandwidth Allocation feature.
> +---------------------------------------
> +AMD hardwares support Slow Memory bandwidth Allocation (SMBA) feature.
>  Currently, CXL.memory is the only supported "slow" memory device.
> -With the support of SMBA feature the hardware enables bandwidth
> -allocation on the slow memory devices. If there are multiple slow
> -memory devices in the system, then the throttling logic groups all
> -the slow sources together and applies the limit on them as a whole.
> +With the support of SMBA, the hardware enables bandwidth allocation
> +on the slow memory devices. If there are multiple such devices in the
> +system, the throttling logic groups all the slow sources together
> +and applies the limit on them as a whole.
>
> -The presence of the SMBA feature(with CXL.memory) is independent
> -of whether slow memory device is actually present in the system.
> -If there is no slow memory in the system, then setting a SMBA limit
> -will have no impact on the performance of the system.
> +The presence of SMBA (with CXL.memory) is independent of slow memory
> +devices presence. If there is no such devices on the system, then
> +setting the configuring SMBA will have no impact on the performance
> +of the system.
>
> -Slow Memory bandwidth domain is L3 cache.
> +The bandwidth domain for slow memory is L3 cache. Its schemata file
> +is formatted as:
>  ::
>
>          SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
> @@ -575,11 +576,13 @@ which you wish to change. E.g.
>
>  Reading/writing the schemata file (on AMD systems)
>  --------------------------------------------------
> -Reading the schemata file will show the state of all resources
> -on all domains. When writing the memory bandwidth allocation you
> -only need to specify those values in an absolute number expressed
> -in 1/8 GB/s increments. To allocate bandwidth limit of 2GB, you
> -need to specify the value 16 (16 * 1/8 = 2). For example:
> +Reading the schemata file will show the current bandwidth limit on all
> +domains. The allocated resources are in multiples of one eighth GB/s.
> +When writing to the file, you need to specify what cache id you wish to
> +configure the bandwidth limit.
> +
> +For example, to allocate 2GB/s limit on the first cache id:
> +
>  ::
>
>    # cat schemata
> @@ -593,11 +596,11 @@ need to specify the value 16 (16 * 1/8 = 2). For example:
>
>  Reading/writing the schemata file (on AMD systems) with SMBA feature
>  --------------------------------------------------------------------
> -Reading the schemata file will show the state of all resources
> -on all domains. When writing the memory bandwidth allocation you
> -only need to specify those values in an absolute number expressed
> -in 1/8 GB/s increments. To allocate bandwidth limit of 8GB, you
> -need to specify the value 64 (64 * 1/8 = 8). E.g.
> +Reading and writing the schemata file is the same as without SMBA in
> +above section.
> +
> +For example, to allocate 8GB/s limit on the first cache id:
> +
>  ::
>
>    # cat schemata
>

Thanks
Babu Moger
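
For reference, the mask values in the quoted examples (0x33, 0x30, and the defaults 0x7f and 0x15) follow directly from the bit table in the patch, and the schemata values from the 1/8 GB/s unit. The Python sketch below only illustrates that arithmetic; the constant and function names are hypothetical, chosen for this sketch, and are not kernel symbols or part of the patch.

```python
# Bit positions copied from the BMEC event table in the quoted patch.
# All names here are illustrative assumptions, not kernel identifiers.
READS_LOCAL = 0        # Reads to memory in the local NUMA domain
READS_REMOTE = 1       # Reads to memory in the non-local NUMA domain
NT_WRITES_LOCAL = 2    # Non-temporal writes to local NUMA domain
NT_WRITES_REMOTE = 3   # Non-temporal writes to non-local NUMA domain
SLOW_READS_LOCAL = 4   # Reads to slow memory in the local NUMA domain
SLOW_READS_REMOTE = 5  # Reads to slow memory in the non-local NUMA domain
DIRTY_VICTIMS = 6      # Dirty Victims from the QOS domain to all memory


def event_mask(bits):
    """OR together the given bit positions into one event mask."""
    mask = 0
    for bit in bits:
        if not 0 <= bit <= 6:
            raise ValueError(f"bit {bit} is outside the documented range 0-6")
        mask |= 1 << bit
    return mask


def config_entry(domain, bits):
    """Format one '<domain>=<hex mask>' entry as written to the config file."""
    return f"{domain}={event_mask(bits):#x}"


def bandwidth_units(gb_per_s):
    """Convert a limit in GB/s to the 1/8 GB/s units used in schemata."""
    units = gb_per_s * 8
    if units != int(units):
        raise ValueError("limit must be a multiple of 1/8 GB/s")
    return int(units)


if __name__ == "__main__":
    # Only reads (normal and slow, local and non-local) on domain 0.
    print(config_entry(0, [READS_LOCAL, READS_REMOTE,
                           SLOW_READS_LOCAL, SLOW_READS_REMOTE]))  # 0=0x33
    # Only slow memory reads on domain 1.
    print(config_entry(1, [SLOW_READS_LOCAL, SLOW_READS_REMOTE]))  # 1=0x30
    # Schemata values for the 2GB/s and 8GB/s limits in the examples.
    print(bandwidth_units(2), bandwidth_units(8))  # 16 64
```

The printed strings match what the examples echo into mbm_total_config and mbm_local_config, and the last line matches the values 16 and 64 written to schemata.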