On Wed, Jan 13, 2021 at 01:56:45PM +0100, David Hildenbrand wrote:
> On 11.01.21 20:40, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > 
> > The first 4Kb of memory is a BIOS owned area and to avoid its allocation
> > for the kernel it was not listed in e820 tables as memory. As the result,
> > pfn 0 was never recognised by the generic memory management and it is not a
> > part of neither node 0 nor ZONE_DMA.
> > 
> > If set_pfnblock_flags_mask() would be ever called for the pageblock
> > corresponding to the first 2Mbytes of memory, having pfn 0 outside of
> > ZONE_DMA would trigger
> > 
> > 	VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page);
> > 
> > Along with reserving the first 4Kb in e820 tables, several first pages are
> > reserved with memblock in several places during setup_arch(). These
> > reservations are enough to ensure the kernel does not touch the BIOS area
> > and it is not necessary to remove E820_TYPE_RAM for pfn 0.
> > 
> > Remove the update of e820 table that changes the type of pfn 0 and move the
> > comment describing why it was done to trim_low_memory_range() that reserves
> > the beginning of the memory.
> > 
> > Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > ---
> >  arch/x86/kernel/setup.c | 20 +++++++++-----------
> >  1 file changed, 9 insertions(+), 11 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 740f3bdb3f61..3412c4595efd 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -660,17 +660,6 @@ static void __init trim_platform_memory_ranges(void)
> >  
> >  static void __init trim_bios_range(void)
> >  {
> > -	/*
> > -	 * A special case is the first 4Kb of memory;
> > -	 * This is a BIOS owned area, not kernel ram, but generally
> > -	 * not listed as such in the E820 table.
> > -	 *
> > -	 * This typically reserves additional memory (64KiB by default)
> > -	 * since some BIOSes are known to corrupt low memory.  See the
> > -	 * Kconfig help text for X86_RESERVE_LOW.
> > -	 */
> > -	e820__range_update(0, PAGE_SIZE, E820_TYPE_RAM, E820_TYPE_RESERVED);
> > -
> >  	/*
> >  	 * special case: Some BIOSes report the PC BIOS
> >  	 * area (640Kb -> 1Mb) as RAM even though it is not.
> > @@ -728,6 +717,15 @@ early_param("reservelow", parse_reservelow);
> >  
> >  static void __init trim_low_memory_range(void)
> >  {
> > +	/*
> > +	 * A special case is the first 4Kb of memory;
> > +	 * This is a BIOS owned area, not kernel ram, but generally
> > +	 * not listed as such in the E820 table.
> > +	 *
> > +	 * This typically reserves additional memory (64KiB by default)
> > +	 * since some BIOSes are known to corrupt low memory.  See the
> > +	 * Kconfig help text for X86_RESERVE_LOW.
> > +	 */
> >  	memblock_reserve(0, ALIGN(reserve_low, PAGE_SIZE));
> >  }
> >  
> > 
> 
> The only somewhat-confusing thing is that in-between
> e820__memblock_setup() and trim_low_memory_range(), we already have
> memblock allocations. So [0..4095] might look like ordinary memory until
> we reserve it later on.
> 
> E.g., reserve_real_mode() does a
> 
> mem = memblock_find_in_range(0, 1<<20, size, PAGE_SIZE);
> ...
> memblock_reserve(mem, size);
> set_real_mode_mem(mem);
> 
> which looks kind of suspicious to me. Most probably I am missing
> something, just wanted to point that out. We might want to do such
> trimming/adjustments before any kind of allocations.

You are right and it looks suspicious, but the first page is reserved at
the very beginning of x86::setup_arch() and, moreover, memblock never
allocates it (look at memblock::memblock_find_in_range_node()).

As for the range 0x1000 <-> reserve_low, we are unlikely to allocate it in
the default top-down mode.
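
To illustrate the two points above, here is a minimal userspace sketch in C.
It is a model under simplifying assumptions, not the kernel code: it treats
the whole search window as a single free region with nothing reserved yet,
and the names sketch_find_top_down(), sketch_round_down() and
SKETCH_PAGE_SIZE are hypothetical. It mimics how
memblock_find_in_range_node() raises the lower bound of the search to
PAGE_SIZE, and how a top-down search returns the highest block that fits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's PAGE_SIZE on x86 (4 KiB). */
#define SKETCH_PAGE_SIZE 4096ULL

/* Round x down to a multiple of align (align must be a power of two). */
static uint64_t sketch_round_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/*
 * Simplified model of a top-down search: assume [start, end) is entirely
 * free. Returns the base of a size-byte, align-aligned block, or 0 if no
 * such block fits.
 */
static uint64_t sketch_find_top_down(uint64_t start, uint64_t end,
				     uint64_t size, uint64_t align)
{
	uint64_t cand;

	assert(align != 0 && (align & (align - 1)) == 0);

	/* Never hand out the first page: clamp the lower bound. */
	if (start < SKETCH_PAGE_SIZE)
		start = SKETCH_PAGE_SIZE;
	if (end < start)
		end = start;

	/* The window is too small for the request. */
	if (end - start < size)
		return 0;

	/* Top-down: take the highest aligned base that still fits. */
	cand = sketch_round_down(end - size, align);
	return cand >= start ? cand : 0;
}
```

Under these assumptions, a lookup shaped like the reserve_real_mode() call
quoted above, sketch_find_top_down(0, 1 << 20, 4096, 4096), returns 0xFF000,
and a window that contains only page 0 yields no allocation at all.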
The bottom-up mode was only allocating memory above the kernel so this
would also prevent allocation of the lowest memory, at least until the
recent changes for CMA allocation:

https://lore.kernel.org/lkml/20201217201214.3414100-1-guro@xxxxxx

That said, we'd better consolidate all the trim_some_memory() and move it
closer to the beginning of setup_arch(). I'm going to take a look at it in
the next few days.

> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.