Re: [RFC 00/10] Introduce In Field Scan driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Tue, Mar 1, 2022, at 11:54 AM, Jithu Joseph wrote:
> Note to Maintainers:
> Requesting x86 Maintainers to take a look at patch01 as it
> touches arch/x86 portion of the kernel. Also would like to guide them
> to patch07 which sets up hotplug notifiers and creates kthreads.
>
> Patch 2/10 - Adds Documentation. Requesting Documentation maintainer to 
> review it.
>
> Requesting Greg KH to review the sysfs changes added by patch08.
>
> Patch10 adds tracing support, requesting Steven Rostedt to review that.
>
> Rest of the patches adds the IFS platform driver, requesting Platform 
> driver maintainers
> to review them.
>
>
> In Field Scan (IFS) is a hardware feature to run circuit level tests on
> a CPU core to detect problems that are not caught by parity or ECC checks.
>
> Intel will provide a firmware file containing the scan tests.  Similar to
> microcode there is a separate file for each family-model-stepping. The
> tests in the file are divided into some number of "chunks" that can be
> run individually.
>
> The driver loads the tests into memory reserved BIOS local to each CPU
> socket in a two step process using writes to MSRs to first load the
> SHA hashes for the test. Then the tests themselves. Status MSRs provide
> feedback on the success/failure of these steps.
>
> Tests are run by synchronizing execution of all threads on a core and
> then writing to the ACTIVATE_SCAN MSR on all threads. Instruction
> execution continues when:
>
> 1) all tests have completed
> 2) execution was interrupted
> 3) a test detected a problem
>
> In all cases reading the SCAN_STATUS MSR provides details on what
> happened. Interrupted tests may be restarted.
>
> The IFS driver provides interfaces from /sys to reload tests and to
> control execution:
>
> /sys/devices/system/cpu/ifs/reload
>   Writing "1" to this file will reload the tests from
>   /lib/firmware/intel/ifs/{ff-mm-ss}.scan

IMO this interface is wrong.  /lib/firmware is for firmware (or ucode, etc) files that should be provided by a distribution and loaded, as needed, by a driver so the hardware can function.  This is not at all what IFS does. For IFS, an administrator wants to run a specific test, and the test blob is part of the instruction to run the test.  The distribution should not be involved, and this should work even on systems where /lib/firmware is immutable.

So either the blob should be written to a file in sysfs or it should be supplied by write or ioctl to a device node.

>
> /sys/devices/system/cpu/ifs/run_test
>   Writing "1" to this file will trigger a scan on each core
>   sequentially by logical CPU number (when HT is enabled this only
>   runs the tests once for each core)
>
> /sys/devices/system/cpu/cpu#/ifs/run_test
>   Writing "1" to one of these files will trigger a scan on just
>   that core.
>
> Results of the tests are also provided in /sys:
>
> /sys/devices/system/cpu/ifs/status
>   Global status. Will show the most serious status across
>   all cores (fail > untested > pass)
>
> /sys/devices/system/cpu/ifs/cpu_fail_list
> /sys/devices/system/cpu/ifs/cpu_pass_list
> /sys/devices/system/cpu/ifs/cpu_untested_list
>   CPU lists showing which CPUs have which test status
>
> /sys/devices/system/cpu/cpu#/ifs/status
>   Status (pass/fail/untested) of each core
>
> /sys/devices/system/cpu/cpu#/ifs/details
>   Hex value of the SCAN_STATUS MSR for the most recent test on
>   this core. Note that the error_code field may contain driver
>   defined software code not defined in the Intel SDM.
>
> Current driver limitations:
>
> 1) The ACTIVATE_SCAN MSR allows for running any consecutive subrange or
> available tests. But the driver always tries to run all tests and only
> uses the subrange feature to restart an interrupted test.
>
> 2) Hardware allows for some number of cores to be tested in parallel.
> The driver does not make use of this, it only tests one core at a time.
>
>
> Jithu Joseph (8):
>   x86/microcode/intel: expose collect_cpu_info_early() for IFS
>   platform/x86/intel/ifs: Add driver for In-Field Scan
>   platform/x86/intel/ifs: Load IFS Image
>   platform/x86/intel/ifs: Check IFS Image sanity
>   platform/x86/intel/ifs: Authenticate and copy to secured memory
>   platform/x86/intel/ifs: Create kthreads for online cpus for scan test
>   platform/x86/intel/ifs: Add IFS sysfs interface
>   platform/x86/intel/ifs: add ABI documentation for IFS
>
> Tony Luck (2):
>   Documentation: In-Field Scan
>   trace: platform/x86/intel/ifs: Add trace point to track Intel IFS
>     operations
>
>  Documentation/ABI/stable/sysfs-driver-ifs |  85 +++++
>  Documentation/x86/ifs.rst                 | 108 ++++++
>  Documentation/x86/index.rst               |   1 +
>  MAINTAINERS                               |   7 +
>  arch/x86/include/asm/microcode_intel.h    |   6 +
>  arch/x86/kernel/cpu/microcode/intel.c     |   8 +-
>  drivers/platform/x86/intel/Kconfig        |   1 +
>  drivers/platform/x86/intel/Makefile       |   1 +
>  drivers/platform/x86/intel/ifs/Kconfig    |   9 +
>  drivers/platform/x86/intel/ifs/Makefile   |   7 +
>  drivers/platform/x86/intel/ifs/core.c     | 387 +++++++++++++++++++++
>  drivers/platform/x86/intel/ifs/ifs.h      | 155 +++++++++
>  drivers/platform/x86/intel/ifs/load.c     | 299 ++++++++++++++++
>  drivers/platform/x86/intel/ifs/sysfs.c    | 394 ++++++++++++++++++++++
>  include/trace/events/ifs.h                |  38 +++
>  15 files changed, 1503 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/ABI/stable/sysfs-driver-ifs
>  create mode 100644 Documentation/x86/ifs.rst
>  create mode 100644 drivers/platform/x86/intel/ifs/Kconfig
>  create mode 100644 drivers/platform/x86/intel/ifs/Makefile
>  create mode 100644 drivers/platform/x86/intel/ifs/core.c
>  create mode 100644 drivers/platform/x86/intel/ifs/ifs.h
>  create mode 100644 drivers/platform/x86/intel/ifs/load.c
>  create mode 100644 drivers/platform/x86/intel/ifs/sysfs.c
>  create mode 100644 include/trace/events/ifs.h
>
> -- 
> 2.17.1



[Index of Archives]     [Linux Kernel Development]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux