Re: [PATCH v3] ACPI, APEI, EINJ: Relax platform response timeout to 1 second.

Hi Tony,

Thank you for your patient review. :)

Cheers,
Shuai

On 2021/10/27 AM1:05, Luck, Tony wrote:
> On Tue, Oct 26, 2021 at 03:28:29PM +0800, Shuai Xue wrote:
>> When injecting an error into the platform, the OSPM executes an
>> EXECUTE_OPERATION action to instruct the platform to begin the injection
>> operation. The OSPM then busy-waits, repeatedly executing the
>> CHECK_BUSY_STATUS action until the platform indicates that the operation
>> is complete. Currently, the platform must respond within 1 millisecond,
>> which is too strict for some platforms.
>>
>> For example, on an Arm platform, when injecting a Processor Correctable
>> error, the OSPM warns:
>>     Firmware does not respond in time.
>>
>> and the following message is printed on the console:
>>     echo: write error: Input/output error
>>
>> On an Arm platform, we observe that DDR error injection takes about
>> 10 ms to complete and PCIe error injection takes about 500 ms.
>>
>> This patch relaxes the response timeout to 1 second.
>>
>> Signed-off-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
> 
> Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
> 
> Rafael: Do you want to take this in the acpi tree? If not, I can
> apply it to the RAS tree (already at -rc7, so in next merge cycle
> after 5.16-rc1 comes out).
> 
>> ---
>> Changelog v2 -> v3:
>> - Implemented the timeout in usleep_range instead of msleep.
>> - Dropped command line interface of timeout.
>> - Link to the v1 patch: https://lkml.org/lkml/2021/10/14/1402
>> ---
>>  drivers/acpi/apei/einj.c | 15 ++++++++-------
>>  1 file changed, 8 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
>> index 133156759551..6e1ff4b62a8f 100644
>> --- a/drivers/acpi/apei/einj.c
>> +++ b/drivers/acpi/apei/einj.c
>> @@ -28,9 +28,10 @@
>>  #undef pr_fmt
>>  #define pr_fmt(fmt) "EINJ: " fmt
>>  
>> -#define SPIN_UNIT		100			/* 100ns */
>> -/* Firmware should respond within 1 milliseconds */
>> -#define FIRMWARE_TIMEOUT	(1 * NSEC_PER_MSEC)
>> +#define SLEEP_UNIT_MIN		1000			/* 1ms */
>> +#define SLEEP_UNIT_MAX		5000			/* 5ms */
>> +/* Firmware should respond within 1 seconds */
>> +#define FIRMWARE_TIMEOUT	(1 * USEC_PER_SEC)
>>  #define ACPI5_VENDOR_BIT	BIT(31)
>>  #define MEM_ERROR_MASK		(ACPI_EINJ_MEMORY_CORRECTABLE | \
>>  				ACPI_EINJ_MEMORY_UNCORRECTABLE | \
>> @@ -171,13 +172,13 @@ static int einj_get_available_error_type(u32 *type)
>>  
>>  static int einj_timedout(u64 *t)
>>  {
>> -	if ((s64)*t < SPIN_UNIT) {
>> +	if ((s64)*t < SLEEP_UNIT_MIN) {
>>  		pr_warn(FW_WARN "Firmware does not respond in time\n");
>>  		return 1;
>>  	}
>> -	*t -= SPIN_UNIT;
>> -	ndelay(SPIN_UNIT);
>> -	touch_nmi_watchdog();
>> +	*t -= SLEEP_UNIT_MIN;
>> +	usleep_range(SLEEP_UNIT_MIN, SLEEP_UNIT_MAX);
>> +
>>  	return 0;
>>  }
>>  
>> -- 
>> 2.20.1.12.g72788fdb
>>
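A note on the arithmetic in the new einj_timedout(): each call subtracts
SLEEP_UNIT_MIN (1 ms) from the budget but may sleep for anything between
SLEEP_UNIT_MIN and SLEEP_UNIT_MAX (5 ms). The sketch below is a user-space
model of that accounting only; it does not sleep and does not touch the
firmware. The struct poll_bounds type and the model_poll_bounds() helper are
hypothetical names introduced for illustration, and the constants are copied
from the diff above. It assumes the caller starts with a budget of
FIRMWARE_TIMEOUT and ignores the time spent in CHECK_BUSY_STATUS itself.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch above, all in microseconds. */
#define SLEEP_UNIT_MIN   1000ULL     /* 1 ms */
#define SLEEP_UNIT_MAX   5000ULL     /* 5 ms */
#define FIRMWARE_TIMEOUT 1000000ULL  /* 1 second (USEC_PER_SEC) */

/* Hypothetical result type, for illustration only. */
struct poll_bounds {
	uint64_t polls;       /* sleeps performed before the timeout fires */
	uint64_t min_wait_us; /* total if every sleep lasted min_us */
	uint64_t max_wait_us; /* total if every sleep lasted max_us */
};

/*
 * Model the budget accounting of einj_timedout() without sleeping:
 * while at least min_us of budget remains, charge min_us and count one
 * sleep that lasts somewhere between min_us and max_us.
 */
static struct poll_bounds model_poll_bounds(uint64_t budget_us,
					    uint64_t min_us, uint64_t max_us)
{
	struct poll_bounds b = { 0, 0, 0 };

	/* A zero decrement would never exhaust the budget. */
	assert(min_us > 0 && min_us <= max_us);

	while ((int64_t)budget_us >= (int64_t)min_us) {
		budget_us -= min_us;
		b.polls++;
		b.min_wait_us += min_us;
		b.max_wait_us += max_us;
	}
	return b;
}
```

Calling model_poll_bounds(FIRMWARE_TIMEOUT, SLEEP_UNIT_MIN, SLEEP_UNIT_MAX)
gives 1000 polls, a lower bound of 1 second, and about 5 seconds if every
sleep runs to the upper end of the range. Under this model the effective
wait before "Firmware does not respond in time" can therefore be longer
than the nominal 1 second, which seems acceptable for an error injection
path but is worth keeping in mind when reading the comment in the code.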


