On Sat, Jul 20, 2019 at 03:52:04PM -0700, Sai Praneeth Prakhya wrote:
> Hi All,
>
> Disclaimer:
> 1. Please note that this discussion is x86 specific
> 2. Below stated things are my understanding about kernel and I could have
> missed somethings, so please let me know if I understood something wrong.
> 3. I have focused only on memblock here because if I understand correctly,
> memblock is the base that feeds other memory management subsystems in kernel
> (like the buddy allocator).
>
> On x86 platforms, there are two sources through which kernel learns about
> physical memory in the system namely E820 table and EFI Memory Map. Each table
> describes which regions of system memory is usable by kernel and which regions
> should be preserved (i.e. reserved regions that typically have BIOS code/data)
> so that no other component in the system could read/write to these regions. I
> think they are duplicating the information and hence I have couple of
> questions regarding these

But isn't it true that on x86 systems the E820 table is populated from
the EFI memory map? At least on systems with EFI firmware and a Linux
kernel that understands EFI. When booting through the EFI stub, the stub
takes the EFI memory map and assembles the E820 table that is passed as
part of the boot params [4]. It also handles the case where the table
has more than 128 entries [5]. Thus, when booted as an EFI application,
the kernel definitely uses the EFI memory map.

If Linux's EFI entry point is not used, the bootloader should do the
same. For instance, GRUB also reads the EFI memory map to assemble the
E820 memory map [6], [7], [8].

>
> 1. I see that only E820 table is being consumed by kernel [1] (i.e. memblock
> subsystem in kernel) to distinguish between "usable" vs "reserved" regions.
> Assume someone has called memblock_alloc(), the memblock subsystem would
> service the caller by allocating memory from "usable" regions and it knows
> this *only* from E820 table [2] (it does not check if EFI Memory Map also says
> that this region is usable as well). So, why isn't the kernel taking EFI
> Memory Map into consideration? (I see that it does happen only when
> "add_efi_memmap" kernel command line arg is passed i.e. passing this argument
> updates E820 table based on EFI Memory Map) [3]. The problem I see with
> memblock not taking EFI Memory Map into consideration is that, we are ignoring
> the main purpose for which EFI Memory Map exists.
>
> 2. Why doesn't the kernel have "add_efi_memmap" by default? From the commit
> "200001eb140e: x86 boot: only pick up additional EFI memmap if add_efi_memmap
> flag", I didn't understand why the decision was made so. Shouldn't we give
> more preference to EFI Memory map rather than E820 table as it's the latest
> and E820 is legacy?

I did a quick experiment with and without add_efi_memmap. The E820
table looked exactly the same in both cases. I guess this shows that
what I wrote above makes sense ;) . Have you observed a difference?

Thanks and BR,
Ricardo

[4]. https://elixir.bootlin.com/linux/latest/source/arch/x86/boot/compressed/eboot.c#L516
[5]. https://elixir.bootlin.com/linux/latest/source/arch/x86/boot/compressed/eboot.c#L493
[6]. http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/loader/i386/linux.c#n573
[7]. http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/mmap/mmap.c#n110
[8]. http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/mmap/efi/mmap.c#n139
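
P.S. To make the EFI-to-E820 conversion mentioned above more concrete,
here is a minimal user-space sketch of the idea. It is not the stub's
actual code: the struct and function names (efi_desc, e820_ent,
efi_map_to_e820) are made up for illustration, the type mapping is my
reading of what the stub does and should be checked against [4], and
the handling of more than 128 entries [5] is not modelled at all (the
sketch simply stops when the output array is full).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* EFI memory types, numbered as in the UEFI specification. */
enum efi_type {
    EFI_RESERVED_TYPE = 0, EFI_LOADER_CODE, EFI_LOADER_DATA,
    EFI_BOOT_SERVICES_CODE, EFI_BOOT_SERVICES_DATA,
    EFI_RUNTIME_SERVICES_CODE, EFI_RUNTIME_SERVICES_DATA,
    EFI_CONVENTIONAL_MEMORY, EFI_UNUSABLE_MEMORY,
    EFI_ACPI_RECLAIM_MEMORY, EFI_ACPI_MEMORY_NVS,
    EFI_MEMORY_MAPPED_IO, EFI_MEMORY_MAPPED_IO_PORT_SPACE,
    EFI_PAL_CODE, EFI_PERSISTENT_MEMORY
};

/* E820 region types. */
enum e820_type {
    E820_RAM = 1, E820_RESERVED = 2, E820_ACPI = 3,
    E820_NVS = 4, E820_UNUSABLE = 5, E820_PMEM = 7
};

/* Hypothetical, simplified descriptors (not the kernel's structs). */
struct efi_desc { enum efi_type type; uint64_t phys_addr; uint64_t num_pages; };
struct e820_ent { uint64_t addr; uint64_t size; enum e820_type type; };

/* Assumed mapping: memory the OS may reuse after boot becomes RAM. */
static enum e820_type efi_to_e820_type(enum efi_type t)
{
    switch (t) {
    case EFI_LOADER_CODE:
    case EFI_LOADER_DATA:
    case EFI_BOOT_SERVICES_CODE:
    case EFI_BOOT_SERVICES_DATA:
    case EFI_CONVENTIONAL_MEMORY:
        return E820_RAM;
    case EFI_ACPI_RECLAIM_MEMORY:
        return E820_ACPI;
    case EFI_ACPI_MEMORY_NVS:
        return E820_NVS;
    case EFI_UNUSABLE_MEMORY:
        return E820_UNUSABLE;
    case EFI_PERSISTENT_MEMORY:
        return E820_PMEM;
    default:
        /* Runtime services, MMIO, PAL code, reserved: keep away. */
        return E820_RESERVED;
    }
}

/*
 * Convert n EFI descriptors into at most cap E820 entries, merging
 * physically contiguous neighbours of the same E820 type.
 * Returns the number of E820 entries written.
 */
static size_t efi_map_to_e820(const struct efi_desc *in, size_t n,
                              struct e820_ent *out, size_t cap)
{
    size_t used = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        enum e820_type type = efi_to_e820_type(in[i].type);
        uint64_t size = in[i].num_pages << 12; /* EFI pages are 4 KiB */

        if (used > 0 && out[used - 1].type == type &&
            out[used - 1].addr + out[used - 1].size == in[i].phys_addr) {
            out[used - 1].size += size; /* extend the previous entry */
            continue;
        }
        if (used == cap)
            break; /* sketch only: overflow entries are dropped */

        out[used].addr = in[i].phys_addr;
        out[used].size = size;
        out[used].type = type;
        used++;
    }
    return used;
}
```

If the E820 table is built this way, every usable or reserved region in
it already reflects the EFI memory map, which would explain why
add_efi_memmap made no visible difference in my experiment.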