On Thu, 11 Jul 2024, Kuppuswamy Sathyanarayanan wrote:
> On 7/11/24 2:44 AM, Ilpo Järvinen wrote:
> > On Wed, 10 Jul 2024, Kuppuswamy Sathyanarayanan wrote:
> >
> >> From: Jithu Joseph <jithu.joseph@xxxxxxxxx>
> >>
> >> In a core, the SBAF test engine is shared between sibling CPUs.
> >>
> >> An SBAF test image contains multiple bundles. Each bundle is further
> >> composed of subunits called programs. When a SBAF test (for a particular
> >> core) is triggered by the user, each SBAF bundle from the loaded test
> >> image is executed sequentially on all the threads on the core using
> >> the stop_core_cpuslocked mechanism. Each bundle execution is initiated by
> >> writing to MSR_ACTIVATE_SBAF.
> >>
> >> SBAF test bundle execution may be aborted when an interrupt occurs or
> >> if the CPU does not have enough power budget for the test. In these
> >> cases the kernel restarts the test from the aborted bundle. SBAF
> >> execution is not retried if the test fails or if the test makes no
> >> forward progress after 5 retries.
> >>
> >> Reviewed-by: Ashok Raj <ashok.raj@xxxxxxxxx>
> >> Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
> >> Signed-off-by: Jithu Joseph <jithu.joseph@xxxxxxxxx>
> >> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> >> ---
> >> +static const char * const sbaf_test_status[] = {
> >> +	[IFS_SBAF_NO_ERROR] = "SBAF no error",
> >> +	[IFS_SBAF_OTHER_THREAD_COULD_NOT_JOIN] = "Other thread could not join.",
> >> +	[IFS_SBAF_INTERRUPTED_BEFORE_RENDEZVOUS] = "Interrupt occurred prior to SBAF coordination.",
> >> +	[IFS_SBAF_UNASSIGNED_ERROR_CODE3] = "Unassigned error code 0x3",
> >> +	[IFS_SBAF_INVALID_BUNDLE_INDEX] = "Non valid sbaf bundles. Reload test image",
> > Non-valid SBAF
> >
> > ...but given your define is named "INVALID", why not use just Invalid
> > SBAF?
>
> Above string is from the specification document. But I think it is ok to use
> "Invalid" or "Non-valid".
>
> Jithu, any concerns?
>
> >> +	[IFS_SBAF_MISMATCH_ARGS_BETWEEN_THREADS] = "Mismatch in arguments between threads T0/T1.",
> >> +	[IFS_SBAF_CORE_NOT_CAPABLE_CURRENTLY] = "Core not capable of performing SBAF currently",
> >> +	[IFS_SBAF_UNASSIGNED_ERROR_CODE7] = "Unassigned error code 0x7",
> >> +	[IFS_SBAF_EXCEED_NUMBER_OF_THREADS_CONCURRENT] = "Exceeded number of Logical Processors (LP) allowed to run Scan-At-Field concurrently",
> >> +	[IFS_SBAF_INTERRUPTED_DURING_EXECUTION] = "Interrupt occurred prior to SBAF start",
> >> +	[IFS_SBAF_INVALID_PROGRAM_INDEX] = "SBAF program index not valid",
> >> +	[IFS_SBAF_CORRUPTED_CHUNK] = "SBAF operation aborted due to corrupted chunk",
> >> +	[IFS_SBAF_DID_NOT_START] = "SBAF operation did not start",
> >> +};
> >> +
> >> +static void sbaf_message_not_tested(struct device *dev, int cpu, u64 status_data)
> >> +{
> >> +	union ifs_sbaf_status status = (union ifs_sbaf_status)status_data;
> >> +
> >> +	if (status.error_code < ARRAY_SIZE(sbaf_test_status)) {
> >> +		dev_info(dev, "CPU(s) %*pbl: SBAF operation did not start. %s\n",
> >> +			 cpumask_pr_args(cpu_smt_mask(cpu)),
> >> +			 sbaf_test_status[status.error_code]);
> >> +	} else if (status.error_code == IFS_SW_TIMEOUT) {
> >> +		dev_info(dev, "CPU(s) %*pbl: software timeout during scan\n",
> >> +			 cpumask_pr_args(cpu_smt_mask(cpu)));
> >> +	} else if (status.error_code == IFS_SW_PARTIAL_COMPLETION) {
> >> +		dev_info(dev, "CPU(s) %*pbl: %s\n",
> >> +			 cpumask_pr_args(cpu_smt_mask(cpu)),
> >> +			 "Not all SBAF bundles executed. Maximum forward progress retries exceeded");
> >> +	} else {
> >> +		dev_info(dev, "CPU(s) %*pbl: SBAF unknown status %llx\n",
> >> +			 cpumask_pr_args(cpu_smt_mask(cpu)), status.data);
> >> +	}
> >> +}
> >> +
> >> +static void sbaf_message_fail(struct device *dev, int cpu, union ifs_sbaf_status status)
> >> +{
> >> +	/* Failed signature check is set when SBAF signature did not match the expected value */
> >> +	if (status.sbaf_status == SBAF_STATUS_SIGN_FAIL) {
> >> +		dev_err(dev, "CPU(s) %*pbl: Failed signature check\n",
> >> +			cpumask_pr_args(cpu_smt_mask(cpu)));
> >> +	}
> >> +
> >> +	/* Failed to reach end of test */
> >> +	if (status.sbaf_status == SBAF_STATUS_TEST_FAIL) {
> >> +		dev_err(dev, "CPU(s) %*pbl: Failed to complete test\n",
> >> +			cpumask_pr_args(cpu_smt_mask(cpu)));
> >> +	}
> >> +}
> >> +
> >> +static bool sbaf_bundle_completed(union ifs_sbaf_status status)
> >> +{
> >> +	if (status.sbaf_status || status.error_code)
> >> +		return false;
> >> +	return true;
> > This is same as:
> >
> > 	return !status.sbaf_status && !status.error_code;
>
> Yes. Your version looks good. Do you want me to send a version with
> this change or you can include it when merging it?

Please do send a new version, there were too many things to change for
me to do it while applying.

-- 
 i.
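
For reference, a minimal userspace sketch of why the two forms of
sbaf_bundle_completed() discussed above behave identically (De Morgan:
!(a || b) == !a && !b). The struct below is a hypothetical stand-in for
union ifs_sbaf_status, whose real bit layout is not shown in this thread;
the field widths and the helper names are assumptions made only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for union ifs_sbaf_status. Only the two fields
 * tested by sbaf_bundle_completed() are modelled; widths are assumed.
 */
struct sbaf_status_sketch {
	uint32_t sbaf_status;
	uint32_t error_code;
};

/* Form used in the posted patch. */
static bool completed_original(struct sbaf_status_sketch s)
{
	if (s.sbaf_status || s.error_code)
		return false;
	return true;
}

/* Form suggested in review. */
static bool completed_suggested(struct sbaf_status_sketch s)
{
	return !s.sbaf_status && !s.error_code;
}

/* Returns true when both forms agree for the given field values. */
static bool forms_agree(uint32_t sbaf_status, uint32_t error_code)
{
	struct sbaf_status_sketch s = { sbaf_status, error_code };

	return completed_original(s) == completed_suggested(s);
}
```

Both helpers report completion only when sbaf_status and error_code are
both zero, so the one-line return is a behaviour-preserving cleanup.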