Re: [PATCH v2 3/4] platform/x86/intel/ifs: Add SBAF test support

On Thu, 11 Jul 2024, Kuppuswamy Sathyanarayanan wrote:
> On 7/11/24 2:44 AM, Ilpo Järvinen wrote:
> > On Wed, 10 Jul 2024, Kuppuswamy Sathyanarayanan wrote:
> >
> >> From: Jithu Joseph <jithu.joseph@xxxxxxxxx>
> >>
> >> In a core, the SBAF test engine is shared between sibling CPUs.
> >>
> >> An SBAF test image contains multiple bundles. Each bundle is further
> >> composed of subunits called programs. When an SBAF test (for a particular
> >> core) is triggered by the user, each SBAF bundle from the loaded test
> >> image is executed sequentially on all the threads on the core using
> >> the stop_core_cpuslocked mechanism. Each bundle execution is initiated by
> >> writing to MSR_ACTIVATE_SBAF.
> >>
> >> SBAF test bundle execution may be aborted when an interrupt occurs or
> >> if the CPU does not have enough power budget for the test. In these
> >> cases the kernel restarts the test from the aborted bundle. SBAF
> >> execution is not retried if the test fails or if the test makes no
> >> forward progress after 5 retries.
> >>
> >> Reviewed-by: Ashok Raj <ashok.raj@xxxxxxxxx>
> >> Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
> >> Signed-off-by: Jithu Joseph <jithu.joseph@xxxxxxxxx>
> >> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> >> ---
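
The retry policy described in the commit message above can be sketched in
user space roughly as follows. This is only an illustration under stated
assumptions: run_all_bundles(), enum outcome and the callbacks are
hypothetical names, the callback stands in for the write to
MSR_ACTIVATE_SBAF, and progress is tracked per whole bundle, which may be
coarser than what the driver actually does.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NO_PROGRESS_RETRIES 5	/* "5 retries" from the commit message */

/* Hypothetical result of one activation; not the real MSR status layout. */
enum outcome { BUNDLE_DONE, BUNDLE_ABORTED, BUNDLE_FAILED };

/* Hypothetical callback standing in for one MSR_ACTIVATE_SBAF write. */
typedef enum outcome (*run_bundle_fn)(int bundle, void *ctx);

/* Returns the number of bundles completed; equals total on full success. */
static int run_all_bundles(int total, run_bundle_fn run, void *ctx)
{
	int bundle = 0, retries = 0;

	assert(run != NULL);

	while (bundle < total) {
		enum outcome o = run(bundle, ctx);

		if (o == BUNDLE_FAILED)
			break;		/* a real test failure is never retried */
		if (o == BUNDLE_DONE) {
			bundle++;	/* forward progress resets the retry budget */
			retries = 0;
			continue;
		}
		/* Aborted: restart from the same bundle, up to the limit. */
		if (++retries > MAX_NO_PROGRESS_RETRIES)
			break;
	}
	return bundle;
}

/* Example callback: aborts the first *ctx calls, then always completes. */
static enum outcome flaky(int bundle, void *ctx)
{
	int *aborts_left = ctx;

	(void)bundle;
	if (*aborts_left > 0) {
		(*aborts_left)--;
		return BUNDLE_ABORTED;
	}
	return BUNDLE_DONE;
}

/* Example callback: bundles 0 and 1 complete, bundle 2 fails. */
static enum outcome fail_at_two(int bundle, void *ctx)
{
	(void)ctx;
	return bundle == 2 ? BUNDLE_FAILED : BUNDLE_DONE;
}
```

With these callbacks, five consecutive aborts are still recovered from,
while a sixth abort without progress ends the run early.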

> >> +static const char * const sbaf_test_status[] = {
> >> +	[IFS_SBAF_NO_ERROR] = "SBAF no error",
> >> +	[IFS_SBAF_OTHER_THREAD_COULD_NOT_JOIN] = "Other thread could not join.",
> >> +	[IFS_SBAF_INTERRUPTED_BEFORE_RENDEZVOUS] = "Interrupt occurred prior to SBAF coordination.",
> >> +	[IFS_SBAF_UNASSIGNED_ERROR_CODE3] = "Unassigned error code 0x3",
> >> +	[IFS_SBAF_INVALID_BUNDLE_INDEX] = "Non valid sbaf bundles. Reload test image",
> > Non-valid SBAF
> >
> > ...but given your define is named "INVALID", why not use just Invalid 
> > SBAF?
> 
> The above string is from the specification document. But I think it is ok to use
> "Invalid" or "Non-valid".
> 
> Jithu, any concerns?
> 
> >> +	[IFS_SBAF_MISMATCH_ARGS_BETWEEN_THREADS] = "Mismatch in arguments between threads T0/T1.",
> >> +	[IFS_SBAF_CORE_NOT_CAPABLE_CURRENTLY] = "Core not capable of performing SBAF currently",
> >> +	[IFS_SBAF_UNASSIGNED_ERROR_CODE7] = "Unassigned error code 0x7",
> >> +	[IFS_SBAF_EXCEED_NUMBER_OF_THREADS_CONCURRENT] = "Exceeded number of Logical Processors (LP) allowed to run Scan-At-Field concurrently",
> >> +	[IFS_SBAF_INTERRUPTED_DURING_EXECUTION] = "Interrupt occurred prior to SBAF start",
> >> +	[IFS_SBAF_INVALID_PROGRAM_INDEX] = "SBAF program index not valid",
> >> +	[IFS_SBAF_CORRUPTED_CHUNK] = "SBAF operation aborted due to corrupted chunk",
> >> +	[IFS_SBAF_DID_NOT_START] = "SBAF operation did not start",
> >> +};
> >> +
> >> +static void sbaf_message_not_tested(struct device *dev, int cpu, u64 status_data)
> >> +{
> >> +	union ifs_sbaf_status status = (union ifs_sbaf_status)status_data;
> >> +
> >> +	if (status.error_code < ARRAY_SIZE(sbaf_test_status)) {
> >> +		dev_info(dev, "CPU(s) %*pbl: SBAF operation did not start. %s\n",
> >> +			 cpumask_pr_args(cpu_smt_mask(cpu)),
> >> +			 sbaf_test_status[status.error_code]);
> >> +	} else if (status.error_code == IFS_SW_TIMEOUT) {
> >> +		dev_info(dev, "CPU(s) %*pbl: software timeout during scan\n",
> >> +			 cpumask_pr_args(cpu_smt_mask(cpu)));
> >> +	} else if (status.error_code == IFS_SW_PARTIAL_COMPLETION) {
> >> +		dev_info(dev, "CPU(s) %*pbl: %s\n",
> >> +			 cpumask_pr_args(cpu_smt_mask(cpu)),
> >> +			 "Not all SBAF bundles executed. Maximum forward progress retries exceeded");
> >> +	} else {
> >> +		dev_info(dev, "CPU(s) %*pbl: SBAF unknown status %llx\n",
> >> +			 cpumask_pr_args(cpu_smt_mask(cpu)), status.data);
> >> +	}
> >> +}
> >> +
> >> +static void sbaf_message_fail(struct device *dev, int cpu, union ifs_sbaf_status status)
> >> +{
> >> +	/* Failed signature check is set when SBAF signature did not match the expected value */
> >> +	if (status.sbaf_status == SBAF_STATUS_SIGN_FAIL) {
> >> +		dev_err(dev, "CPU(s) %*pbl: Failed signature check\n",
> >> +			cpumask_pr_args(cpu_smt_mask(cpu)));
> >> +	}
> >> +
> >> +	/* Failed to reach end of test */
> >> +	if (status.sbaf_status == SBAF_STATUS_TEST_FAIL) {
> >> +		dev_err(dev, "CPU(s) %*pbl: Failed to complete test\n",
> >> +			cpumask_pr_args(cpu_smt_mask(cpu)));
> >> +	}
> >> +}
> >> +
> >> +static bool sbaf_bundle_completed(union ifs_sbaf_status status)
> >> +{
> >> +	if (status.sbaf_status || status.error_code)
> >> +		return false;
> >> +	return true;
> > This is the same as:
> >
> > 	return !status.sbaf_status && !status.error_code;
> 
> Yes. Your version looks good. Do you want me to send a new version with
> this change, or can you include it when merging?

Please do send a new version; there were too many things to change for
me to do it while applying.
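
For reference, the equivalence of the two forms of sbaf_bundle_completed()
can be checked with a small user-space sketch. struct sbaf_status_sketch is
a hypothetical stand-in that carries only the two fields the helper reads;
the real layout of union ifs_sbaf_status is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: only the two fields the helper looks at. */
struct sbaf_status_sketch {
	uint32_t sbaf_status;
	uint32_t error_code;
};

/* Form used in the posted patch. */
static bool completed_if_form(struct sbaf_status_sketch s)
{
	if (s.sbaf_status || s.error_code)
		return false;
	return true;
}

/* Single-expression form suggested in review. */
static bool completed_expr_form(struct sbaf_status_sketch s)
{
	return !s.sbaf_status && !s.error_code;
}

/* Assert that both forms agree over a small grid of field values. */
static void check_equivalence(void)
{
	for (uint32_t st = 0; st < 4; st++) {
		for (uint32_t ec = 0; ec < 16; ec++) {
			struct sbaf_status_sketch s = { st, ec };

			assert(completed_if_form(s) == completed_expr_form(s));
		}
	}
}
```

Both forms report completion only when both fields are zero.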

-- 
 i.
