Re: issue with MEMBLOCK_NOMAP

On 29 January 2016 at 18:57, Mark Salter <msalter@xxxxxxxxxx> wrote:
> On Fri, 2016-01-29 at 17:16 +0100, Ard Biesheuvel wrote:
>> On 29 January 2016 at 16:53, Mark Salter <msalter@xxxxxxxxxx> wrote:
>> > On Fri, 2016-01-29 at 15:06 +0100, Ard Biesheuvel wrote:
>> > > On 29 January 2016 at 15:00, Mark Salter <msalter@xxxxxxxxxx> wrote:
>> > > > Hi Ard,
>> > > >
>> > > > I ran into an issue with your MEMBLOCK_NOMAP changes on a particular
>> > > > firmware. The symptom is the kernel panics at boot time when it hits
>> > > > an unmapped page while unpacking the initramfs. As it turns out, the
>> > > > start of the initramfs shares a 64k kernel page with the UEFI memmap.
>> > > > I can avoid the problem with:
>> > > >
>> > > > @@ -203,7 +203,7 @@ void __init efi_init(void)
>> > > >
>> > > >         reserve_regions();
>> > > >         early_memunmap(memmap.map, params.mmap_size);
>> > > > -       memblock_mark_nomap(params.mmap & PAGE_MASK,
>> > > > -                           PAGE_ALIGN(params.mmap_size +
>> > > > -                                      (params.mmap & ~PAGE_MASK)));
>> > > > +       memblock_reserve(params.mmap & PAGE_MASK,
>> > > > +                        PAGE_ALIGN(params.mmap_size +
>> > > > +                                   (params.mmap & ~PAGE_MASK)));
>> > > >  }
>> > > >
>> > > >
>> > > > But it makes me worry about the same potential problem with
>> > > > other reserved regions which we nomap. What do you think?
>> > > >
>> > >
>> > > So I take it this initramfs allocation is not made by the stub but by
>> > > GRUB? Since the stub rounds all allocations to 64 KB ...
>> > >
>> > Yes. GRUB.
>> >
>>
>> We already fixed EDK2 a while ago to round up all regions
>> returned by AllocatePages() to 64 KB. Do you know if this
>> is a GRUB issue (i.e., it traverses the memory map and finds a free
>> range and explicitly allocates it) or a firmware issue?
>
> GRUB uses AllocatePages() to get memory for the initrd. The firmware
> that hit this was fairly old (released last May I think). The problem
> didn't show up on newer firmware for the same platform but that doesn't
> really mean anything definitive.
>

Indeed. I added the alignment in EDK2 to ensure that runtime regions
are aligned to 64 KB, but there is no requirement to that effect for
LoaderData, which I presume is what is used for the initramfs.
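
To illustrate how an unaligned LoaderData allocation can end up sharing
a 64 KB kernel page with the memory map, here is a rough userspace
sketch of the rounding used in the memblock_mark_nomap() call quoted
above. It assumes 64 KB pages; the sk_ macros, struct and functions are
made-up stand-ins for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for PAGE_SIZE/PAGE_MASK/PAGE_ALIGN, assuming 64 KB pages. */
#define SK_PAGE_SIZE     UINT64_C(0x10000)
#define SK_PAGE_MASK     (~(SK_PAGE_SIZE - 1))
#define SK_PAGE_ALIGN(x) (((x) + SK_PAGE_SIZE - 1) & SK_PAGE_MASK)

struct sk_range {
	uint64_t base;
	uint64_t size;
};

/*
 * Page-granular span covering [addr, addr + size): round the base down
 * and round the size (plus the offset into the first page) up.
 */
static struct sk_range sk_page_span(uint64_t addr, uint64_t size)
{
	uint64_t offset = addr & ~SK_PAGE_MASK;
	struct sk_range r;

	assert(size <= UINT64_MAX - offset - (SK_PAGE_SIZE - 1));
	r.base = addr & SK_PAGE_MASK;
	r.size = SK_PAGE_ALIGN(size + offset);
	return r;
}

/* True if addr falls inside the range r. */
static bool sk_addr_in_range(struct sk_range r, uint64_t addr)
{
	return addr >= r.base && addr - r.base < r.size;
}
```

With invented numbers: a 0x800 byte memory map at 0x8000f000 gives a
span of one 64 KB page at 0x80000000, so an initrd starting at
0x8000fc00 would have its first page marked nomap along with the map.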

>>
>> > > In any case, regardless of the underlying cause, if any part of the
>> > > initramfs turns out not to be covered by the linear mapping, we should
>> > > invoke your code to move it. So I think it should be a matter of
>> > > refining the logic in relocate_initrd() to do the right thing in this
>> > > case.
>> >
>> > That thought had crossed my mind. I think it would be easy enough to
>> > trigger the copy if first or last page of initrd is unmapped.
>>
>> Indeed. If some page in the middle is missing, then you're really
>> doing something fishy, so I don't see why we should care about that as
>> well.
>>
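
A minimal sketch of that first/last page test, written as plain
userspace C rather than as a patch to relocate_initrd(). Everything
prefixed sk_ is hypothetical, including the is_mapped callback and the
demo address; in the kernel the callback would be whatever check tells
us that a page is covered by the linear mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, assuming 64 KB pages. */
#define SK_PAGE_SIZE UINT64_C(0x10000)
#define SK_PAGE_MASK (~(SK_PAGE_SIZE - 1))

/* Invented address of a page that the demo callback treats as nomap. */
#define SK_DEMO_NOMAP_PAGE UINT64_C(0x80000000)

typedef bool (*sk_mapped_fn)(uint64_t page_addr);

/* Demo callback: every page is mapped except SK_DEMO_NOMAP_PAGE. */
static bool sk_demo_is_mapped(uint64_t page_addr)
{
	return page_addr != SK_DEMO_NOMAP_PAGE;
}

/*
 * Returns true if the initrd occupying [start, end) should be copied
 * elsewhere because its first or last page is not mapped. Pages in the
 * middle are deliberately not checked.
 */
static bool sk_initrd_needs_copy(uint64_t start, uint64_t end,
				 sk_mapped_fn is_mapped)
{
	uint64_t first, last;

	assert(end > start);
	first = start & SK_PAGE_MASK;
	last = (end - 1) & SK_PAGE_MASK;
	return !is_mapped(first) || !is_mapped(last);
}
```

Checking only the two boundary pages keeps the test cheap and matches
the reasoning above that a hole in the middle is not worth handling.
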
>> > Somewhat
>> > related to this is that I want to rework this old patch to deal with
>> > ACPI tables outside mapped RAM:
>> >
>> >   https://lkml.org/lkml/2015/5/14/357
>> >
>> > Basically, we should be able to just do:
>> >
>> > diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
>> > index 15e0aad..4ea638c 100644
>> > --- a/arch/arm64/include/asm/acpi.h
>> > +++ b/arch/arm64/include/asm/acpi.h
>> > @@ -32,7 +32,7 @@
>> >  static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
>> >                                             acpi_size size)
>> >  {
>> > -       if (!page_is_ram(phys >> PAGE_SHIFT))
>> > +       if (!memblock_is_memory(phys))
>> >                 return ioremap(phys, size);
>> >
>> >         return ioremap_cache(phys, size);
>> >
>>
>> I think we should fix acpi_os_ioremap(). IIRC it is used via two
>> different code paths that distinguish between memory and I/O, and end
>> up using the same function for no good reason.
>
> I remember this being mentioned before. It would be a nice solution.
>

Indeed. I'll have a look into this on Monday.

>>
>> > But this doesn't currently work wrt mem= which works by removing
>> > the end range of memblocks. If I have mem= use the nomap flag
>> > rather than removing the range, the above acpi_os_ioremap change
>> > works, but other issues crop up due to memblock_end_of_DRAM()
>> > returning end of all DRAM regardless of mem=. So we end up with
>> > PFNs and struct pages for memory which will never be in linear
>> > map. Fixing memblock_end_of_DRAM() to look at the flags and
>> > return end of mapped DRAM gets things working but I wonder about
>> > other potential trouble spots with this approach. Any thoughts?
>> >
>>
>> Actually, I think mem= should be considered a development feature, not
>> a production feature, and if its use is suboptimal in this respect, so
>> be it.
>
> It is mostly a devel/debug feature but the production case is
> with kdump where the kexec'd kernel gathering the dump info has
> to be restricted to its own sandbox.
>

Well, with the upcoming changes for KASLR, mem= is not guaranteed to
reserve the memory you expect. It would be much better to define more
unambiguously which region is available for the kexec kernel.
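
For reference, the "end of mapped DRAM" idea mentioned in the quoted
text below (have memblock_end_of_DRAM() look at the flags) could be
modelled roughly like this. It is only a userspace sketch: the
sk_region struct and the SK_NOMAP bit are hypothetical stand-ins, not
the memblock data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for MEMBLOCK_NOMAP. */
#define SK_NOMAP 0x1u

struct sk_region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

/*
 * Highest end address over all regions that are not flagged nomap.
 * Returns 0 if there is no mapped region at all.
 */
static uint64_t sk_end_of_mapped_dram(const struct sk_region *regs, size_t n)
{
	uint64_t end = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t region_end;

		if (regs[i].flags & SK_NOMAP)
			continue;	/* never part of the linear map */
		assert(regs[i].size <= UINT64_MAX - regs[i].base);
		region_end = regs[i].base + regs[i].size;
		if (region_end > end)
			end = region_end;
	}
	return end;
}
```

With two invented 1 GB regions at 0x80000000 and 0xc0000000, the second
one flagged nomap, this returns 0xc0000000 rather than the end of all
DRAM.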

>>
>> But to address this particular issue, it would probably be better to
>> fix page_is_ram(). I have made some attempts in that direction in the
>> past, but that never landed anywhere. Since ACPI on arm64 is tightly
>> coupled to UEFI, implementing page_is_ram() as something that
>> interrogates the UEFI memory map if efi_enabled(EFI_MEMMAP) would be
>> reasonable imo. (Or perhaps putting that in acpi_os_ioremap()
>> directly?)
>>
>> >
>> > >
>> > > Your suggested change will break 32-bit ARM, since we use
>> > > ioremap_nocache() to map the UEFI memory map, and ARM does not allow
>> > > that on ranges that are part of the linear mapping.
>> >
>> > okay. I'll put together a patch to the initrd relocating code.
>> >
>>
>> Great!
>