[BUG] kernel side can NOT trigger memory error with einj

Hi folks,

If we inject a memory error at a physical address used by a user-space
process, e.g. 0x92f033038:

	echo 0x92f033038 > /sys/kernel/debug/apei/einj/param1
	echo 0xfffffffffffff000 > /sys/kernel/debug/apei/einj/param2
	echo 0x1 > /sys/kernel/debug/apei/einj/flags
	echo 0x8 > /sys/kernel/debug/apei/einj/error_type
	echo 1 > /sys/kernel/debug/apei/einj/error_inject
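
For reference, here is a small user-space sketch of how I read the
param1/param2 pair: param2 is an address mask, so the affected range starts
at param1 & param2 and spans ~param2 + 1 bytes. The struct and function
names are made up for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Range described by the EINJ address (param1) and address mask (param2). */
struct einj_range {
	uint64_t base;	/* first byte covered by the mask */
	uint64_t size;	/* number of bytes covered by the mask */
};

/*
 * Hypothetical helper, not a kernel function. Assumes param2 is a
 * contiguous high-bit mask such as 0xfffffffffffff000.
 */
static struct einj_range einj_range_from_params(uint64_t param1,
						uint64_t param2)
{
	struct einj_range r;

	r.base = param1 & param2;	/* drop the bits the mask ignores */
	r.size = ~param2 + 1;		/* two's complement of the mask */
	return r;
}

/* Values from the commands above: one 4 KiB page at 0x92f033000. */
static void einj_range_example(void)
{
	struct einj_range r = einj_range_from_params(0x92f033038ULL,
						     0xfffffffffffff000ULL);

	assert(r.base == 0x92f033000ULL);
	assert(r.size == 0x1000ULL);
}
```

So the request above targets exactly the page that contains 0x92f033038.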

Then the following error will be reported in dmesg:

    ACPI: [Firmware Bug]: requested region covers kernel memory @ 0x000000092f033038

After digging into the EINJ trigger interface, I think it's a kernel bug.

On our platform, firmware relies on the kernel to trigger an injected error.
Specifically, it populates trigger_tab with the injected physical memory
address, which is set in param1. The kernel is expected to map that RAM
address and run a read action. The execution path is as follows:

    __einj_error_trigger
        => apei_resources_request
            => apei_exec_pre_map_gars
                => apei_exec_run

The root cause is as follows:

1. Commit fdea163d8c17 ("ACPI, APEI, EINJ, Fix resource conflict on some
machine") removes the injected memory address range, which conflicts with
regular memory, from the trigger table resources. That makes sense for the
apei_resources_request() call. **However, the actual mapping is done in
apei_exec_pre_map_gars() using trigger_ctx, and the conflicting physical
address is still present in trigger_ctx.**

2. apei_exec_pre_map_gars() then ends up calling acpi_os_ioremap().
The injected physical memory address is EFI_CONVENTIONAL_MEMORY and
memblock_is_map_memory() returns true (arch/arm64/kernel/acpi.c), so we see
the message printed above.

        case EFI_CONVENTIONAL_MEMORY:
        case EFI_PERSISTENT_MEMORY:
            if (memblock_is_map_memory(phys) ||
                !memblock_is_region_memory(phys, size)) {
                pr_warn(FW_BUG "requested region covers kernel memory @ %pa\n", &phys);
                return NULL;
            }

3. On the other hand, commit ba242d5b1a84 ("ACPI, APEI: Add RAM mapping support to ACPI")
added RAM mapping support with kmap. But since commit aafc65c731fe ("ACPI: add arm64 to the
platforms that use ioremap"), ioremap is used to map memory. However, the
ioremap implementation (arch/arm64/mm/ioremap.c) does not allow RAM to be
mapped at all.

    /*
     * Don't allow RAM to be mapped.
     */
    if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
        return NULL;

**As a result, the error cannot be triggered, which is not what we expect when
injecting an error into a physical page used by a process.**

In the normal workflow, a Generic Address Register (GAR) is mapped by
acpi_os_ioremap() and its virtual address is added to acpi_ioremaps. The
execution path is as follows:

    apei_exec_pre_map_gars
        => pre_map_gar_callback
            => apei_map_generic_address
                => acpi_os_map_generic_address
                    => acpi_os_map_iomem    /* add mapped VA into acpi_ioremaps */
                        =>    acpi_map
                            => acpi_os_ioremap /**/

Then a read or write action is taken. It checks whether the physical
address is already mapped in acpi_ioremaps. If so, the value is read
directly. Otherwise, the physical address is mapped with acpi_os_ioremap()
first. The execution path is as follows:

    __apei_exec_run
        => apei_exec_read_register
            => apei_read
                => acpi_os_read_memory
                    => acpi_map_vaddr_lookup    /* lookup VA of PA from acpi_ioremap */
                    => acpi_os_ioremap

This works well for reserved memory, but not for the common case in which we
want to inject an error into normal memory.
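
To make the failure mode concrete, here is a toy user-space model of the two
paths above. It is only a sketch: the RAM range, the non-RAM address used in
the check, and all toy_* names are invented, and toy_ioremap() merely stands
in for the "refuse to map RAM" behaviour of acpi_os_ioremap() on arm64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one RAM range [start, end); all else is non-RAM. */
static const uint64_t toy_ram_start = 0x900000000ULL;
static const uint64_t toy_ram_end   = 0x980000000ULL;

static bool toy_is_ram(uint64_t phys)
{
	return phys >= toy_ram_start && phys < toy_ram_end;
}

/* Stand-in for acpi_os_ioremap(): fails for RAM, succeeds otherwise. */
static bool toy_ioremap(uint64_t phys)
{
	return !toy_is_ram(phys);
}

/* Stand-in for the acpi_ioremaps list: remembers pre-mapped addresses. */
#define TOY_MAX_MAPS 8
static uint64_t toy_maps[TOY_MAX_MAPS];
static size_t toy_nr_maps;

static bool toy_lookup(uint64_t phys)
{
	for (size_t i = 0; i < toy_nr_maps; i++)
		if (toy_maps[i] == phys)
			return true;
	return false;
}

/* Pre-map step: map the address and remember it. Returns 0 or -1. */
static int toy_pre_map(uint64_t phys)
{
	if (toy_lookup(phys))
		return 0;
	if (toy_nr_maps == TOY_MAX_MAPS || !toy_ioremap(phys))
		return -1;
	toy_maps[toy_nr_maps++] = phys;
	return 0;
}

/* Read step: use a remembered mapping, else try a temporary ioremap. */
static int toy_read(uint64_t phys)
{
	if (toy_lookup(phys))
		return 0;
	return toy_ioremap(phys) ? 0 : -1;
}
```

In this model a non-RAM address can be pre-mapped and read, while a RAM
address such as 0x92f033038 fails in both the pre-map step and the read
fallback, because every path ends in the same mapping call.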


A hacky way to work around this issue is to map the RAM page with kmap()
instead of apei_exec_pre_map_gars(), and to read it directly instead of
calling apei_exec_run():

-       rc = apei_exec_pre_map_gars(&trigger_ctx);
-       if (rc)
-               goto out_release;
+       void *vaddr;
+       long tmp;
+       unsigned long pfn;
+       pfn = param1 >> PAGE_SHIFT;

-       rc = apei_exec_run(&trigger_ctx, ACPI_EINJ_TRIGGER_ERROR);
+       vaddr = kmap(pfn_to_page(pfn));
+       /* the in-page offset is in bytes: add it before casting to long * */
+       tmp = *(volatile long *)(vaddr + (param1 & ~PAGE_MASK));
+       kunmap(pfn_to_page(pfn));

-       apei_exec_post_unmap_gars(&trigger_ctx);
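
The address split used by the hack can be checked in user space. The sketch
below assumes 4 KiB pages, and the DEMO_* / demo_* names are made up. Note
that the in-page offset is a byte count, so it has to be added to a byte
pointer; adding it to a long * would scale it by sizeof(long).

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12			/* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE	(1ULL << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK	(~(DEMO_PAGE_SIZE - 1))

/* Page frame number of a physical address. */
static uint64_t demo_pfn(uint64_t phys)
{
	return phys >> DEMO_PAGE_SHIFT;
}

/* Byte offset of a physical address within its page. */
static uint64_t demo_page_offset(uint64_t phys)
{
	return phys & ~DEMO_PAGE_MASK;
}

/*
 * Read one long at the in-page offset of phys from an already mapped page.
 * The offset is applied to a byte pointer and only then cast to long *.
 */
static long demo_read_long(const void *page, uint64_t phys)
{
	const unsigned char *base = page;

	return *(const volatile long *)(base + demo_page_offset(phys));
}
```

For 0x92f033038 this gives pfn 0x92f033 and an in-page offset of 0x38 bytes.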

I am wondering whether we should use kmap to map RAM in acpi_map, or add
another path to address this issue. Any comments are welcome.

Best Regards,
Shuai


