Re: [PATCH V4 00/15] selftests/resctrl: Support diverse platforms with MBM and MBA tests

On 10/24/24 15:18, Reinette Chatre wrote:
Changes since V3:
- V3: https://lore.kernel.org/all/cover.1729218182.git.reinette.chatre@xxxxxxxxx/
- Rebased on HEAD 2a027d6bb660 of kselftest/next.
- Fix empty string parsing issues pointed out by Ilpo.
- Add Reviewed-by tags.
- Please see individual patches for detailed changes.

Changes since V2:
- V2: https://lore.kernel.org/all/cover.1726164080.git.reinette.chatre@xxxxxxxxx/
- Add fix to protect against buffer overflow when parsing text from sysfs files.
- Add cleanup patch to address use of magic constants as pointed out by
   Ilpo.
- Add Reviewed-by tags where received, except for "selftests/resctrl: Use cache
   size to determine "fill_buf" buffer size" that changed too much since
   receiving the Reviewed-by tag.
- Please see individual patches for detailed changes.

Changes since V1:
- V1: https://lore.kernel.org/cover.1724970211.git.reinette.chatre@xxxxxxxxx/
- V2 contains the same general solutions to the stated problem as V1, but these
   are now preceded by more fixes (patches 1 to 5) and improved robustness
   (patches 6 to 9) to existing tests before the series gets back
   to solving the original problem with more confidence in patches 10 to 13.
- The possibility of making "memflush = false" for the CMT test was discussed
   during V1. Modifying this setting does not have a significant impact on the
   observed results, which are already well within the acceptable range, and this
   version thus keeps the original default. If performance was a goal it may
   be possible to do further experimentation where "memflush = false" could
   eliminate the need for the sleep(1) within the test wrapper, but
   improving the performance is not a goal of this work.
- (New) Support what seems to be an unintended ability for user space to provide
   parameters to "fill_buf" by making the parsing robust and only allowing
   changes to parameters that are meant to be changed. Drop support for
   the "write" operation since it has never been measured.
- (New) Improve wraparound handling. (Ilpo)
- (New) A couple of new fixes addressing issues discovered during development.
- (Change from V1) To support fill_buf parameters provided by user space as
   well as test-specific fill_buf parameters, struct fill_buf_param is no longer
   just a member of struct resctrl_val_param. Instead there can be at most
   two instances of struct fill_buf_param: the immutable parameters provided
   by user space and the parameters used by individual tests. (Ilpo)
- Please see individual patches for detailed changes.

V1 cover:

The resctrl selftests for Memory Bandwidth Allocation (MBA) and Memory
Bandwidth Monitoring (MBM) are failing on some (for example [1]) Emerald
Rapids systems. The test failures result from the following two
properties of these systems:
1) Emerald Rapids systems can have up to 320MB L3 cache. The resctrl
    MBA and MBM selftests measure memory traffic for which a hardcoded
    250MB buffer has been sufficient so far. On platforms with L3 cache
    larger than the buffer, the buffer fits in the L3 cache and thus
    no/very little memory traffic is generated during the "memory
    bandwidth" tests.
2) Some platform features, for example RAS features or memory
    performance features that generate memory traffic, may drive accesses
    that are counted differently by performance counters and MBM
    respectively, for instance generating "overhead" traffic which is not
    counted against any specific RMID. Until now these counting
    differences have always been "in the noise". On Emerald Rapids
    systems the maximum MBA throttling (10% memory bandwidth)
    throttles memory bandwidth to where memory accesses by these other
    platform features push the memory bandwidth difference between
    memory controller performance counters and resctrl (MBM) beyond the
    tests' hardcoded tolerance.

Make the tests more robust against platform variations:
1) Let the buffer used by memory bandwidth tests be guided by the size
    of the L3 cache.
2) Larger buffers require longer initialization time before the buffer can
    be used for measurement. Rework the tests to ensure that buffer
    initialization is complete before measurements start.
3) Do not compare performance counters and MBM measurements at low
    bandwidth. The value of "low" is hardcoded to 750MiB based on
    measurements on Emerald Rapids, Sapphire Rapids, and Ice Lake
    systems. This limit is not applicable to AMD systems since it
    only applies to the MBA and MBM tests that are isolated to Intel.
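
Items 1 and 3 above can be illustrated with a minimal sketch. This is not
the code from the patches: the function names, the constant names, and the
factor of two applied to the L3 size are assumptions made for illustration.
Only the 250MB previous buffer size and the 750MiB limit come from the text
above, and the bandwidth values are assumed to be in the same unit the tests
report.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; only the two numeric values come from the cover letter. */
#define MIN_BUF_SIZE	(250UL * 1024 * 1024)	/* previously hardcoded buffer size */
#define LOW_BW_MIB	750UL			/* "low" bandwidth limit */

/*
 * Pick a buffer that cannot fit in the L3 cache: never smaller than the
 * old hardcoded size, and (assumption) twice the L3 size when that is larger.
 */
static size_t pick_buf_size(size_t l3_size)
{
	size_t scaled = l3_size * 2;

	return scaled > MIN_BUF_SIZE ? scaled : MIN_BUF_SIZE;
}

/*
 * Compare performance counter and MBM measurements only when both are
 * at or above the "low" bandwidth limit.
 */
static bool should_compare(unsigned long bw_imc_mib, unsigned long bw_resctrl_mib)
{
	return bw_imc_mib >= LOW_BW_MIB && bw_resctrl_mib >= LOW_BW_MIB;
}
```

Under these assumptions a system with a 320MB L3 cache would get a 640MB
buffer, while a system with a 32MB L3 cache would keep the 250MB buffer. A
measurement pair where either value is below the limit would be skipped
rather than checked against the tolerance.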

[1]
https://ark.intel.com/content/www/us/en/ark/products/237261/intel-xeon-platinum-8592-processor-320m-cache-1-9-ghz.html

Reinette Chatre (15):
   selftests/resctrl: Make functions only used in same file static
   selftests/resctrl: Print accurate buffer size as part of MBM results
   selftests/resctrl: Fix memory overflow due to unhandled wraparound
   selftests/resctrl: Protect against array overrun during iMC config
     parsing
   selftests/resctrl: Protect against array overflow when reading strings
   selftests/resctrl: Make wraparound handling obvious
   selftests/resctrl: Remove "once" parameter required to be false
   selftests/resctrl: Only support measured read operation
   selftests/resctrl: Remove unused measurement code
   selftests/resctrl: Make benchmark parameter passing robust
   selftests/resctrl: Ensure measurements skip initialization of default
     benchmark
   selftests/resctrl: Use cache size to determine "fill_buf" buffer size
   selftests/resctrl: Do not compare performance counters and resctrl at
     low bandwidth
   selftests/resctrl: Keep results from first test run
   selftests/resctrl: Replace magic constants used as array size

  tools/testing/selftests/resctrl/cmt_test.c    |  37 +-
  tools/testing/selftests/resctrl/fill_buf.c    |  45 +-
  tools/testing/selftests/resctrl/mba_test.c    |  54 ++-
  tools/testing/selftests/resctrl/mbm_test.c    |  37 +-
  tools/testing/selftests/resctrl/resctrl.h     |  79 +++-
  .../testing/selftests/resctrl/resctrl_tests.c |  95 +++-
  tools/testing/selftests/resctrl/resctrl_val.c | 447 +++++-------------
  tools/testing/selftests/resctrl/resctrlfs.c   |  19 +-
  8 files changed, 354 insertions(+), 459 deletions(-)


base-commit: 2a027d6bb66002c8e50e974676f932b33c5fce10

Is this patch series ready to be applied?

thanks,
-- Shuah




