On Tue, 7 Jul 2020 at 11:40, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > On Tue, 7 Jul 2020 at 11:35, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > On Tue, 7 Jul 2020 at 10:58, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > > > > > > Hi Ard, > > > > > > On Tue, Jul 7, 2020 at 9:45 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > On Tue, 7 Jul 2020 at 10:39, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > > > > > On Tue, Jul 7, 2020 at 8:50 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > > On Mon, 6 Jul 2020 at 18:02, Geert Uytterhoeven <geert+renesas@xxxxxxxxx> wrote: > > > > > > > Currently, the start address of physical memory is obtained by masking > > > > > > > the program counter with a fixed mask of 0xf8000000. This mask value > > > > > > > was chosen as a balance between the requirements of different platforms. > > > > > > > However, this does require that the start address of physical memory is > > > > > > > a multiple of 128 MiB, precluding booting Linux on platforms where this > > > > > > > requirement is not fulfilled. > > > > > > > > > > > > > > Fix this limitation by obtaining the start address from the DTB instead, > > > > > > > if available (either explicitly passed, or appended to the kernel). > > > > > > > Fall back to the traditional method when needed. > > > > > > > > > > > > > > This allows to boot Linux on r7s9210/rza2mevb using the 64 MiB of SDRAM > > > > > > > on the RZA2MEVB sub board, which is located at 0x0C000000 (CS3 space), > > > > > > > i.e. not at a multiple of 128 MiB. > > > > > > > > > > > > > > Suggested-by: Nicolas Pitre <nico@xxxxxxxxxxx> > > > > > > > Signed-off-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx> > > > > > > > Reviewed-by: Nicolas Pitre <nico@xxxxxxxxxxx> > > > > > > > Reviewed-by: Ard Biesheuvel <ardb@xxxxxxxxxx> > > > > > > > Tested-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> > > > > > > > Tested-by: Dmitry Osipenko <digetx@xxxxxxxxx> > > > > > > > Cc: Lukasz Stelmach <l.stelmach@xxxxxxxxxxx> > > > > > > > --- > > > > > > > Marked as RFC, because: > > > > > > > 1. This is known to break crashkernel support, as the memory used by > > > > > > > the crashkernel is not marked reserved in DT (yet), > > > > > > > 2. Russell won't apply this for v5.9 anyway, > > > > > > > > > > > > > > > > > > > Would it help if we make this behavior dependent on a simple heuristic, e.g., > > > > > > > > > > > > if (round_up(load_address, 128M) >= dram_end) > > > > > > use dram_start from DT > > > > > > else > > > > > > use round_up(load_address, 128M) > > > > > > > > > > > > That way, the fix is guaranteed to only take effect for systems that > > > > > > cannot even boot otherwise, which fixes the crashkernel case, as well > > > > > > as other potential regressions due to the load address of the core > > > > > > kernel changing for existing boards. > > > > > > > > > > Thanks for your suggestion! > > > > > 1. Shouldn't the calculation use round_down() instead of round_up()? > > > > > 2. Likewise, "round_down(load_address, 128M) < dram_start from DT"? > > > > > > > > > > > > > No. > > > > > > > > What the code does today is round *up* to a multiple of 128 MB, and > > > > only when that leads to a problem, we should use the DT provided > > > > memory regions. > > > > > > mov r4, pc > > > and r4, r4, #0xf8000000 > > > > > > Surely this is rounding down, isn't it? > > > > > > > Yes you are right. > > > > > add r4, r4, #TEXT_OFFSET > > > > > > Followed by adding a small number (typically 0x00008000). > > > > > > On RZA2MEVB with 64 MiB of RAM, the result lies below dram_start. > > > > Yes, but in the general case, this is not true. Platforms that manage > > to boot using the current arrangement will do so by putting the > > decompressor above the first 128 MB aligned boundary covered by DRAM > > (and lose access to any memory below it via the linear mapping, but > > this memory could still be used via a no-map reserved-memory node > > AFAIK.) > > > > > BTW, how to obtain dram_end? From DT again? Do we trust it, as we > > > apparently cannot trust dram_start in some configurations. > > > > > > Do I need more coffee? > > > > > > > Maybe we both do :-) > > > > AIUI, the reason we cannot trust dram_start is because of the > > crashkernel case, i.e., the kernel may have deliberately been put high > > up in memory, and the expectation is that the load address is derived > > by rounding down the load address of the decompressor. > > > > Hence my suggestion to round *up* and compare with dram_end: if > > round_up(load_address, 128M) >= dram_end holds, it is guaranteed that > > no address exists in memory from which we could round down and arrive > > at a valid DRAM address. This would mean that your change will only > > affect platforms that were unable to boot to begin with, and not > > affect any other weird configurations including crashkernels etc > > Uhm maybe not ... > > Time to get that coffee... OK so we know that the memory base should be a 16 MB aligned value >= dram_start. This holds for the crashkernel as well, although in that case, the memory base is much higher than dram_start, and not right above it. So how about if we *only* use the DT dram_start as the memory base if it is aligned to 16 MB, and if rounding down from the load_address produces an address that is below it? That should cover your use case, but very conservatively, reducing the likelihood of regressions.