On Fri, Dec 15, 2023 at 1:17 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote: > > On 12/15/23 11:05, Chris Koch wrote: > > A relocatable kernel will relocate itself to pref_address if it is > > loaded below pref_address. This means a booted kernel may be relocating > > itself to an area with reserved memory on modern systems, potentially > > clobbering arbitrary data that may be important to the system. > > > > This is often the case, as the default value of PHYSICAL_START is > > 0x1000000 and kernels are typically loaded at 0x100000 or above by > > bootloaders like iPXE or kexec. GRUB behaves like this patch does. > > > > Also fixes the documentation around pref_address and PHYSICAL_START to > > be accurate. > > Are you reporting a bug and is this a bug fix? It's not super clear > from the changelog. I reported it as a bug yesterday in https://lkml.org/lkml/2023/12/14/1529 -- I'm happy to reword this in some way that indicates it's a bug fix. > > > > diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst > > index 22cc7a040dae..49bea8986620 100644 > > --- a/Documentation/arch/x86/boot.rst > > +++ b/Documentation/arch/x86/boot.rst > > @@ -878,7 +878,8 @@ Protocol: 2.10+ > > address if possible. > > > > A non-relocatable kernel will unconditionally move itself and to run > > - at this address. > > + at this address. A relocatable kernel will move itself to this address if it > > + loaded below this address. > > I think we should avoid saying the same things over and over again in > different spots. > > Here, it doesn't really help to enumerate the different interpretations > of 'pref_address'. All that matters is that the bootloader can avoid > the overhead of a later copy if it can place the kernel at > 'pref_address'. The exact reasons that various kernels might decide to > relocate are unimportant here. I think it's important documentation for bootloader authors. It's not about avoiding overhead, it's about avoiding clobbering areas of memory that may be reserved in e820 / EFI memory map, which the kernel will do when it relocates itself to pref_address without checking what's reserved and what's not. It emphasizes the importance of choosing an address above pref_address. Happy to reword some way to reflect that. > > > ============ ======= > > Field name: init_size > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > > index 3762f41bb092..1370f43328d7 100644 > > --- a/arch/x86/Kconfig > > +++ b/arch/x86/Kconfig > > @@ -2109,11 +2109,11 @@ config PHYSICAL_START > > help > > This gives the physical address where the kernel is loaded. > > > > - If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then > > - bzImage will decompress itself to above physical address and > > - run from there. Otherwise, bzImage will run from the address where > > - it has been loaded by the boot loader and will ignore above physical > > - address. > > + If the kernel is not relocatable (CONFIG_RELOCATABLE=n) then bzImage > > + will decompress itself to above physical address and run from there. > > + Otherwise, bzImage will run from the address where it has been loaded > > + by the boot loader. The only exception is if it is loaded below the > > + above physical address, in which case it will relocate itself there. > > I kinda dislike how this is written. It's written almost like code > where you're spelling out the conditions. I prefer something much > higher-level. > > This gives a minimum physical address at which the kernel can be > loaded. > > CONFIG_RELOCATABLE=n kernels will be decompressed to and must > run at PHYSICAL_START exactly. > > CONFIG_RELOCATABLE=y kernels can run at any address above > PHYSICAL_START. If a kernel is loaded below PHYSICAL_START, it > will relocate itself to PHYSICAL_START. Happy to change that, yours is better. > > > In normal kdump cases one does not have to set/change this option > > as now bzImage can be compiled as a completely relocatable image > > diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c > > index a61c12c01270..5dcd232d58bf 100644 > > --- a/arch/x86/kernel/kexec-bzimage64.c > > +++ b/arch/x86/kernel/kexec-bzimage64.c > > @@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel, > > kbuf.bufsz = kernel_len - kern16_size; > > kbuf.memsz = PAGE_ALIGN(header->init_size); > > kbuf.buf_align = header->kernel_alignment; > > - kbuf.buf_min = MIN_KERNEL_LOAD_ADDR; > > + if (header->pref_address < MIN_KERNEL_LOAD_ADDR) > > + kbuf.buf_min = MIN_KERNEL_LOAD_ADDR; > > + else > > + kbuf.buf_min = header->pref_address; > > kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; > > ret = kexec_add_buffer(&kbuf); > > if (ret) > > Comment, please. > > It isn't clear from this hunk why or how this fixes the bug. How does > this manage to avoid clobbering reserved areas? When allocated above pref_address, the kernel will not relocate itself to an area that potentially overlaps with reserved memory. I'll add a comment. Not sure what the etiquette is on immediately sending a patch v2, or waiting for more comments. I'll err on waiting on a couple more comments before sending v2. Thanks for the review Chris