Re: [PATCH v11 1/3] ACPI: APEI: send SIGBUS to current task if synchronous memory error not recovered

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2024/2/26 18:29, Borislav Petkov wrote:
> On Sat, Feb 24, 2024 at 02:08:42PM +0800, Shuai Xue wrote:
>> @Borislav, do you have any other concerns?
> 
> Yes, this change needs to be further reviewed by an ARM person: I have
> no clue what those "abnormal synchronous errors" on ARM are 

Hi, Borislav,

May the `abnormal` is not inaccurate and misled you. I mean the preconditions
check before memory_failure_queue():

- `if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))` in ghes_handle_memory_failure()
- `if (flags == -1)` in ghes_handle_memory_failure()
- `if (!IS_ENABLED(CONFIG_ACPI_APEI_MEMORY_FAILURE))` in ghes_do_memory_failure()
- `if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) ` in ghes_do_memory_failure()

If the preconditions are not passed, the user-space process will trigger SEA again.
This loop can potentially exceed the platform firmware threshold or even
trigger a kernel hard lockup, leading to a system reboot.

> and how
> they're supposed to be handled properly there:
> 
> - what happens if you get such an error when ghes is disabled there?

If ghes_disable is set, the GHES driver will not be inited by acpi_ghes_init(),
so none of error notifications will be handled. IMHO, it is expected.

> 
> - is that even the right place to handle them?
> 
> James?
> 

Leave this to @James.

Thank you.

Best Regards,
Shuai





[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux