On Tue, Apr 20, 2021 at 12:20 AM Lakshmi Ramasubramanian <nramas@xxxxxxxxxxxxxxxxxxx> wrote: > > On 4/19/21 10:00 PM, Dan Carpenter wrote: > > On Tue, Apr 20, 2021 at 09:30:16AM +1000, Michael Ellerman wrote: > >> Lakshmi Ramasubramanian <nramas@xxxxxxxxxxxxxxxxxxx> writes: > >>> On 4/16/21 2:05 AM, Michael Ellerman wrote: > >>> > >>>> Daniel Axtens <dja@xxxxxxxxxx> writes: > >>>>>> On 4/15/21 12:14 PM, Lakshmi Ramasubramanian wrote: > >>>>>> > >>>>>> Sorry - missed copying device-tree and powerpc mailing lists. > >>>>>> > >>>>>>> There are a few "goto out;" statements before the local variable "fdt" > >>>>>>> is initialized through the call to of_kexec_alloc_and_setup_fdt() in > >>>>>>> elf64_load(). This will result in an uninitialized "fdt" being passed > >>>>>>> to kvfree() in this function if there is an error before the call to > >>>>>>> of_kexec_alloc_and_setup_fdt(). > >>>>>>> > >>>>>>> Initialize the local variable "fdt" to NULL. > >>>>>>> > >>>>> I'm a huge fan of initialising local variables! But I'm struggling to > >>>>> find the code path that will lead to an uninit fdt being returned... > >>>>> > >>>>> The out label reads in part: > >>>>> > >>>>> /* Make kimage_file_post_load_cleanup free the fdt buffer for us. */ > >>>>> return ret ? ERR_PTR(ret) : fdt; > >>>>> > >>>>> As far as I can tell, any time we get a non-zero ret, we're going to > >>>>> return an error pointer rather than the uninitialised value... > >>> > >>> As Dan pointed out, the new code is in linux-next. > >>> > >>> I have copied the new one below - the function doesn't return fdt, but > >>> instead sets it in the arch specific field (please see the link to the > >>> updated elf_64.c below). > >>> > >>> https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/tree/arch/powerpc/kexec/elf_64.c?h=for-next > >>> > >>>>> > >>>>> (btw, it does look like we might leak fdt if we have an error after we > >>>>> successfully kmalloc it.) > >>>>> > >>>>> Am I missing something? Can you link to the report for the kernel test > >>>>> robot or from Dan? > >>> > >>> /* > >>> * Once FDT buffer has been successfully passed to > >>> kexec_add_buffer(), > >>> * the FDT buffer address is saved in image->arch.fdt. In that > >>> case, > >>> * the memory cannot be freed here in case of any other error. > >>> */ > >>> if (ret && !image->arch.fdt) > >>> kvfree(fdt); > >>> > >>> return ret ? ERR_PTR(ret) : NULL; > >>> > >>> In case of an error, the memory allocated for fdt is freed unless it has > >>> already been passed to kexec_add_buffer(). > >> > >> It feels like the root of the problem is that the kvfree of fdt is in > >> the wrong place. It's only allocated later in the function, so the error > >> path should reflect that. Something like the patch below. > >> > >> cheers > >> > >> > >> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c > >> index 5a569bb51349..02662e72c53d 100644 > >> --- a/arch/powerpc/kexec/elf_64.c > >> +++ b/arch/powerpc/kexec/elf_64.c > >> @@ -114,7 +114,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, > >> ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr, > >> initrd_len, cmdline); > >> if (ret) > >> - goto out; > >> + goto out_free_fdt; > >> > >> fdt_pack(fdt); > >> > >> @@ -125,7 +125,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, > >> kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; > >> ret = kexec_add_buffer(&kbuf); > >> if (ret) > >> - goto out; > >> + goto out_free_fdt; > >> > >> /* FDT will be freed in arch_kimage_file_post_load_cleanup */ > >> image->arch.fdt = fdt; > >> @@ -140,18 +140,14 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, > >> if (ret) > >> pr_err("Error setting up the purgatory.\n"); > >> > >> + goto out; > > > > This will leak. It would need to be something like: > > > > if (ret) { > > pr_err("Error setting up the purgatory.\n"); > > goto out_free_fdt; > > } > Once "fdt" buffer is successfully passed to kexec_add_buffer() it cannot > be freed here - it will be freed when the kexec cleanup function is called. That may be the case currently, but really if a function returns an error it should have undone anything it did like memory allocations. I don't think you should do that to fix this issue, but it would be a good clean-up. Rob