On Wed, Aug 21, 2019 at 10:29:37AM +0300, Ard Biesheuvel wrote:
> On Wed, 21 Aug 2019 at 10:11, Mike Rapoport <rppt@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Aug 21, 2019 at 09:35:16AM +0300, Ard Biesheuvel wrote:
> > > On Wed, 21 Aug 2019 at 09:11, Chester Lin <clin@xxxxxxxx> wrote:
> > > >
> > > > On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> > > > > On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
> > > > > <linux@xxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Fri, Aug 02, 2019 at 05:38:54AM +0000, Chester Lin wrote:
> > > > > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > > > > index f3ce34113f89..909b11ba48d8 100644
> > > > > > > --- a/arch/arm/mm/mmu.c
> > > > > > > +++ b/arch/arm/mm/mmu.c
> > > > > > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > > > > > >               phys_addr_t block_start = reg->base;
> > > > > > >               phys_addr_t block_end = reg->base + reg->size;
> > > > > > >
> > > > > > > +             if (memblock_is_nomap(reg))
> > > > > > > +                     continue;
> > > > > > > +
> > > > > > >               if (reg->base < vmalloc_limit) {
> > > > > > >                       if (block_end > lowmem_limit)
> > > > > > >                               /*
> > > > > >
> > > > > > I think this hunk is sane - if the memory is marked nomap, then it isn't
> > > > > > available for the kernel's use, so as far as calculating where the
> > > > > > lowmem/highmem boundary is, it effectively doesn't exist and should be
> > > > > > skipped.
> > > > > >
> > > > >
> > > > > I agree.
> > > > >
> > > > > Chester, could you explain what you need beyond this change (and my
> > > > > EFI stub change involving TEXT_OFFSET) to make things work on the
> > > > > RPi2?
> > > > >
> > > >
> > > > Hi Ard,
> > > >
> > > > In fact I am working with Guillaume to try booting zImage kernel and openSUSE
> > > > from grub2.04 + arm32-efistub so that's why we get this issue on RPi2, which is
> > > > one of the test machines we have.
> > > > However we want a better solution for all
> > > > cases but not just RPi2 since we don't want to affect other platforms as well.
> > > >
> > >
> > > Thanks Chester, but that doesn't answer my question.
> > >
> > > Your fix is a single patch that changes various things that are only
> > > vaguely related. We have already identified that we need to take
> > > TEXT_OFFSET (minus some space used by the swapper page tables) into
> > > account into the EFI stub if we want to ensure compatibility with many
> > > different platforms, and as it turns out, this applies not only to
> > > RPi2 but to other platforms as well, most notably the ones that
> > > require a TEXT_OFFSET of 0x208000, since they also have reserved
> > > regions at the base of RAM.
> > >
> > > My question was what else we need beyond:
> > > - the EFI stub TEXT_OFFSET fix [0]
> > > - the change to disregard NOMAP memblocks in adjust_lowmem_bounds()
> > > - what else???
> >
> > I think the only missing part here is to ensure that non-reserved memory in
> > bank 0 starts from a PMD-aligned address. I believe this could be done in
> > the EFI stub, but I'm not really familiar with it so this is just a
> > semi-educated guess :)
> >
>
> Given that it is the ARM arch code that imposes this requirement, how
> about adding something like this to adjust_lowmem_bounds():
>
> if (memblock_start_of_DRAM() % PMD_SIZE)
>     memblock_mark_nomap(memblock_start_of_DRAM(),
>         PMD_SIZE - (memblock_start_of_DRAM() % PMD_SIZE));

memblock_start_of_DRAM() won't work here, as it returns the actual start of the
DRAM including NOMAP regions. Moreover, as we cannot mark a region NOMAP inside
for_each_memblock() this should be done beforehand.
I think something like this could work:

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 2f0f07e..f2b635b 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1178,6 +1178,19 @@ void __init adjust_lowmem_bounds(void)
 	 */
 	vmalloc_limit = (u64)(uintptr_t)vmalloc_min - PAGE_OFFSET + PHYS_OFFSET;
 
+	/*
+	 * The first usable region must be PMD aligned. Mark its start
+	 * as MEMBLOCK_NOMAP if it isn't
+	 */
+	for_each_memblock(memory, reg) {
+		if (!memblock_is_nomap(reg) && (reg->base % PMD_SIZE)) {
+			phys_addr_t size = PMD_SIZE - (reg->base % PMD_SIZE);
+
+			memblock_mark_nomap(reg->base, size);
+			break;
+		}
+	}
+
 	for_each_memblock(memory, reg) {
 		phys_addr_t block_start = reg->base;
 		phys_addr_t block_end = reg->base + reg->size;

> (and introduce the nomap check into the loop)

-- 
Sincerely yours,
Mike.
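
For reference, the size passed to memblock_mark_nomap() in the hunk above is
just the distance from the region base to the next PMD boundary. A minimal
userspace sketch of that arithmetic follows. The helper name
pmd_unaligned_head() and the EXAMPLE_PMD_SIZE value of 2 MiB are assumptions
made for illustration only; they are not kernel symbols, and the real PMD_SIZE
depends on the kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; not taken from any kernel header. */
#define EXAMPLE_PMD_SIZE ((uint64_t)1 << 21)

/*
 * Hypothetical helper: number of bytes between base and the next
 * pmd_size boundary, or 0 when base is already aligned. The nonzero
 * case is the same expression as in the hunk:
 * PMD_SIZE - (base % PMD_SIZE).
 */
static uint64_t pmd_unaligned_head(uint64_t base, uint64_t pmd_size)
{
	uint64_t rem;

	/* A zero size would make the modulo below undefined. */
	assert(pmd_size != 0);

	rem = base % pmd_size;
	return rem ? pmd_size - rem : 0;
}
```

With these assumptions, a made-up region base of 0x1000 would need its first
0x1ff000 bytes marked NOMAP, while a base of 0x200000 needs nothing.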