Nam Cao <namcao@xxxxxxxxxxxxx> writes: > On 2024-04-18 Nam Cao wrote: >> There is nothing preventing kernel memory allocators from allocating a >> page that overlaps with PTR_ERR(), except for architecture-specific >> code that setup memblock. >> >> It was discovered that RISCV architecture doesn't setup memblock >> corectly, leading to a page overlapping with PTR_ERR() being allocated, >> and subsequently crashing the kernel (link in Close: ) >> >> The reported crash has nothing to do with PTR_ERR(): the last page >> (at address 0xfffff000) being allocated leads to an unexpected >> arithmetic overflow in ext4; but still, this page shouldn't be >> allocated in the first place. >> >> Because PTR_ERR() is an architecture-independent thing, we shouldn't >> ask every single architecture to set this up. There may be other >> architectures beside RISCV that have the same problem. >> >> Fix this one and for all by reserving the physical memory page that >> may be mapped to the last virtual memory page as part of low memory. >> >> Unfortunately, this means if there is actual memory at this reserved >> location, that memory will become inaccessible. However, if this page >> is not reserved, it can only be accessed as high memory, so this >> doesn't matter if high memory is not supported. Even if high memory is >> supported, it is still only one page. >> >> Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx >> Signed-off-by: Nam Cao <namcao@xxxxxxxxxxxxx> >> Cc: <stable@xxxxxxxxxxxxxxx> # all versions > > Sorry, forgot to add: > Reported-by: Björn Töpel <bjorn@xxxxxxxxxx> Hmm, can't we get rid of the whole check in arch/riscv/mm/init.c for 32b? --8<-- diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index fe8e159394d8..1e91d5728887 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -196,7 +196,6 @@ early_param("mem", early_mem); static void __init setup_bootmem(void) { phys_addr_t vmlinux_end = __pa_symbol(&_end); - phys_addr_t max_mapped_addr; phys_addr_t phys_ram_end, vmlinux_start; if (IS_ENABLED(CONFIG_XIP_KERNEL)) @@ -234,21 +233,6 @@ static void __init setup_bootmem(void) if (IS_ENABLED(CONFIG_64BIT)) kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base; - /* - * memblock allocator is not aware of the fact that last 4K bytes of - * the addressable memory can not be mapped because of IS_ERR_VALUE - * macro. Make sure that last 4k bytes are not usable by memblock - * if end of dram is equal to maximum addressable memory. For 64-bit - * kernel, this problem can't happen here as the end of the virtual - * address space is occupied by the kernel mapping then this check must - * be done as soon as the kernel mapping base address is determined. - */ - if (!IS_ENABLED(CONFIG_64BIT)) { - max_mapped_addr = __pa(~(ulong)0); - if (max_mapped_addr == (phys_ram_end - 1)) - memblock_set_current_limit(max_mapped_addr - 4096); - } - min_low_pfn = PFN_UP(phys_ram_base); max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end); high_memory = (void *)(__va(PFN_PHYS(max_low_pfn))); --8<-- Mike hints that's *not* the case (https://lore.kernel.org/linux-riscv/ZiAkRMUfiPDUGPdL@xxxxxxxxxx/). memblock_reserve() should disallow allocation as well, no? Thanks, and FWIW: Tested-by: Björn Töpel <bjorn@xxxxxxxxxxxx>