Re: [PATCH v2] ACPI, APEI, EINJ: Remove memory range validation for CXL error types

Thanks for the review, Dan; responses inline.

On 5/31/23 4:31 PM, Dan Williams wrote:
> Hi Ben,
> 
> Ben Cheatham wrote:
>> From: Yazen Ghannam <yazen.ghannam@xxxxxxx>
>>
>> V2 Changes:
>>  - Added Link tags for links
>>  - removed stray unused variable
>>
>> This patch is a follow up to the discussion at [1], and builds on Tony's
>> CXL error patch at [2].
>>
>> The new CXL error types will use the Memory Address field in the
>> SET_ERROR_TYPE_WITH_ADDRESS structure in order to target a CXL 1.1
>> compliant memory-mapped Downstream port. The value of the Memory Address
>> will be in the port's MMIO range, and it will not represent physical
>> (normal or persistent) memory.
>>
>> Allow error injection for CXL 1.1 systems by skipping memory range
>> validation for CXL error injection types.
> 
> This just feels a bit too loose especially when the kernel has
> the cxl_acpi driver to perform the enumeration of CXL root ports.
> 
> I know that Terry and Robert are teaching the PCI AER core how to
> coordinate with RCRB information [1] (I still need to go dig in deeper
> on that set). I would expect ACPI EINJ could benefit from similar
> coordination and validate these addresses.
> Now, is it any address in the downstream-port RCRB range that is valid,
> or only the base?
> 
My response to the above is in the reply to your follow-up email.
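
To make the validation question concrete, here is a rough standalone sketch
of what checking the Memory Address against a port's RCRB window could look
like. It is not kernel code: struct rcrb_window, cxl_einj_addr_ok() and the
base_only switch are hypothetical names for illustration, and where the
window would come from (for example cxl_acpi enumeration) is assumed, not
shown. Only the bit positions are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the actbl1.h hunk of this patch. */
#define ACPI_EINJ_CXL_CACHE_CORRECTABLE		(1u << 12)
#define ACPI_EINJ_CXL_CACHE_UNCORRECTABLE	(1u << 13)
#define ACPI_EINJ_CXL_CACHE_FATAL		(1u << 14)
#define ACPI_EINJ_CXL_MEM_CORRECTABLE		(1u << 15)
#define ACPI_EINJ_CXL_MEM_UNCORRECTABLE		(1u << 16)
#define ACPI_EINJ_CXL_MEM_FATAL			(1u << 17)

#define CXL_ERROR_MASK	(ACPI_EINJ_CXL_CACHE_CORRECTABLE	| \
			 ACPI_EINJ_CXL_CACHE_UNCORRECTABLE	| \
			 ACPI_EINJ_CXL_CACHE_FATAL		| \
			 ACPI_EINJ_CXL_MEM_CORRECTABLE		| \
			 ACPI_EINJ_CXL_MEM_UNCORRECTABLE	| \
			 ACPI_EINJ_CXL_MEM_FATAL)

/* Hypothetical: MMIO window of one downstream port's RCRB. */
struct rcrb_window {
	uint64_t base;
	uint64_t size;
};

/*
 * Hypothetical helper: decide whether addr is an acceptable target for
 * a CXL error injection. base_only selects the stricter policy (only the
 * RCRB base is valid); otherwise any address inside the window passes.
 */
static bool cxl_einj_addr_ok(uint32_t type, uint64_t addr,
			     const struct rcrb_window *w, bool base_only)
{
	/* Non-CXL types are not handled by this check. */
	if (!(type & CXL_ERROR_MASK))
		return false;

	/* An empty window cannot contain any address. */
	if (w->size == 0)
		return false;

	if (base_only)
		return addr == w->base;

	/* Written as a subtraction so base + size cannot overflow. */
	return addr >= w->base && addr - w->base < w->size;
}
```

The base_only parameter just mirrors the open question above (whole range
versus base only); whichever answer the spec gives, the other branch would
go away.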

> Another minor comment below...
> 
> [1]: http://lore.kernel.org/r/20230523232214.55282-1-terry.bowman@xxxxxxx
> 
>>
>> Output trying to inject CXL.mem error without patch:
>>
>> -bash: echo: write error: Invalid argument
>>
>> [1]:
>> Link: https://lore.kernel.org/linux-acpi/20221206205234.606073-1-Benjamin.Cheatham@xxxxxxx/
>> [2]:
>> Link: https://lore.kernel.org/linux-cxl/CAJZ5v0hNQUfWViqxbJ5B4JCGJUuHpWWSpqpCFWPNpGuagoFbsQ@xxxxxxxxxxxxxx/T/#t
>>
>> Signed-off-by: Yazen Ghannam <yazen.ghannam@xxxxxxx>
>> Signed-off-by: Ben Cheatham <benjamin.cheatham@xxxxxxx>
>> ---
>>  drivers/acpi/apei/einj.c | 12 +++++++++++-
>>  include/acpi/actbl1.h    |  6 ++++++
>>  2 files changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
>> index 013eb621dc92..68a20326ed7c 100644
>> --- a/drivers/acpi/apei/einj.c
>> +++ b/drivers/acpi/apei/einj.c
>> @@ -37,6 +37,13 @@
>>  				ACPI_EINJ_MEMORY_UNCORRECTABLE | \
>>  				ACPI_EINJ_MEMORY_FATAL)
>>  
>> +#define CXL_ERROR_MASK		(ACPI_EINJ_CXL_CACHE_CORRECTABLE	| \
>> +				ACPI_EINJ_CXL_CACHE_UNCORRECTABLE	| \
>> +				ACPI_EINJ_CXL_CACHE_FATAL		| \
>> +				ACPI_EINJ_CXL_MEM_CORRECTABLE		| \
>> +				ACPI_EINJ_CXL_MEM_UNCORRECTABLE		| \
>> +				ACPI_EINJ_CXL_MEM_FATAL)
>> +
>>  /*
>>   * ACPI version 5 provides a SET_ERROR_TYPE_WITH_ADDRESS action.
>>   */
>> @@ -537,8 +544,11 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
>>  	if (type & ACPI5_VENDOR_BIT) {
>>  		if (vendor_flags != SETWA_FLAGS_MEM)
>>  			goto inject;
>> -	} else if (!(type & MEM_ERROR_MASK) && !(flags & SETWA_FLAGS_MEM))
>> +	} else if (!(type & MEM_ERROR_MASK) && !(flags & SETWA_FLAGS_MEM)) {
>> +		goto inject;
>> +	} else if (type & CXL_ERROR_MASK) {
>>  		goto inject;
>> +	}
>>  
>>  	/*
>>  	 * Disallow crazy address masks that give BIOS leeway to pick
>> diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
>> index 81b9e794424d..c39837266414 100644
>> --- a/include/acpi/actbl1.h
>> +++ b/include/acpi/actbl1.h
>> @@ -1044,6 +1044,12 @@ enum acpi_einj_command_status {
>>  #define ACPI_EINJ_PLATFORM_CORRECTABLE      (1<<9)
>>  #define ACPI_EINJ_PLATFORM_UNCORRECTABLE    (1<<10)
>>  #define ACPI_EINJ_PLATFORM_FATAL            (1<<11)
>> +#define ACPI_EINJ_CXL_CACHE_CORRECTABLE     BIT(12)
>> +#define ACPI_EINJ_CXL_CACHE_UNCORRECTABLE   BIT(13)
>> +#define ACPI_EINJ_CXL_CACHE_FATAL           BIT(14)
>> +#define ACPI_EINJ_CXL_MEM_CORRECTABLE       BIT(15)
>> +#define ACPI_EINJ_CXL_MEM_UNCORRECTABLE     BIT(16)
>> +#define ACPI_EINJ_CXL_MEM_FATAL             BIT(17)
> 
> I expect these to come from the next ACPICA update just like the other
> definitions. The change in style from (x<<N) to BIT(N) was a tip-off.
> The process is to submit a pull request to the ACPICA project, for
> example:
> 
> https://github.com/acpica/acpica/commit/e948142526c0

I wasn't aware of this; I'll go ahead and do that.
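
For reference, a sketch of how the six definitions might read in the
(1<<N) style used by the neighbouring entries in actbl1.h. The names and
bit positions are the ones from this patch; the final naming and placement
are an assumption on my part and would be decided in the ACPICA pull
request. The _Static_assert is only a sanity check that the combined mask
covers bits 12 through 17.

```c
/* Same values as the BIT(12)..BIT(17) definitions in the patch. */
#define ACPI_EINJ_CXL_CACHE_CORRECTABLE     (1<<12)
#define ACPI_EINJ_CXL_CACHE_UNCORRECTABLE   (1<<13)
#define ACPI_EINJ_CXL_CACHE_FATAL           (1<<14)
#define ACPI_EINJ_CXL_MEM_CORRECTABLE       (1<<15)
#define ACPI_EINJ_CXL_MEM_UNCORRECTABLE     (1<<16)
#define ACPI_EINJ_CXL_MEM_FATAL             (1<<17)

/* Mask as defined in the einj.c hunk of the patch. */
#define CXL_ERROR_MASK		(ACPI_EINJ_CXL_CACHE_CORRECTABLE	| \
				ACPI_EINJ_CXL_CACHE_UNCORRECTABLE	| \
				ACPI_EINJ_CXL_CACHE_FATAL		| \
				ACPI_EINJ_CXL_MEM_CORRECTABLE		| \
				ACPI_EINJ_CXL_MEM_UNCORRECTABLE		| \
				ACPI_EINJ_CXL_MEM_FATAL)

/* Bits 12..17 set: 0x3F000. Requires a C11 compiler. */
_Static_assert(CXL_ERROR_MASK == 0x3F000,
	       "CXL error mask should cover bits 12-17");
```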


