On Tue, Nov 22, 2022 at 09:41:41AM +0100, Alexandre Ghiti wrote:
> During the early page table creation, we used to set the mapping for
> PAGE_OFFSET to the kernel load address: but the kernel load address is
> always offseted by PMD_SIZE which makes it impossible to use PUD/P4D/PGD
> pages as this physical address is not aligned on PUD/P4D/PGD size (whereas
> PAGE_OFFSET is).
>
> But actually we don't have to establish this mapping (ie set va_pa_offset)
> that early in the boot process because:
>
> - first, setup_vm installs a temporary kernel mapping and among other
>   things, discovers the system memory,
> - then, setup_vm_final creates the final kernel mapping and takes
>   advantage of the discovered system memory to create the linear
>   mapping.
>
> During the first phase, we don't know the start of the system memory and
> then until the second phase is finished, we can't use the linear mapping at
> all and phys_to_virt/virt_to_phys translations must not be used because it
> would result in a different translation from the 'real' one once the final
> mapping is installed.
>
> So here we simply delay the initialization of va_pa_offset to after the
> system memory discovery. But to make sure noone uses the linear mapping
> before, we add some guard in the DEBUG_VIRTUAL config.
>
> Finally we can use PUD/P4D/PGD hugepages when possible, which will result
> in a better TLB utilization.
>
> Note that we rely on the firmware to protect itself using PMP.
>
> Signed-off-by: Alexandre Ghiti <alexghiti@xxxxxxxxxxxx>
> ---
>
> Note that this patch is rebased on top of:
> [PATCH v1 1/1] riscv: mm: call best_map_size many times during linear-mapping
>
>  arch/riscv/include/asm/page.h | 16 ++++++++++++++++
>  arch/riscv/mm/init.c          | 25 +++++++++++++++++++------
>  arch/riscv/mm/physaddr.c      | 16 ++++++++++++++++
>  drivers/of/fdt.c              |  5 ++++-
>  4 files changed, 55 insertions(+), 7 deletions(-)

[...]

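
For illustration, the alignment problem described in the commit message can be modelled in a few lines of user-space C. This is only a sketch, not the kernel's best_map_size(): the size constants assume 4 KiB pages with 2 MiB PMD and 1 GiB PUD mappings, and fits(), pick_map_size() and the EX_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes: 4 KiB pages, 2 MiB PMD mappings, 1 GiB PUD mappings. */
#define EX_PAGE_SIZE (UINT64_C(1) << 12)
#define EX_PMD_SIZE  (UINT64_C(1) << 21)
#define EX_PUD_SIZE  (UINT64_C(1) << 30)

/*
 * A mapping of map_size can be used only if both addresses are aligned
 * on map_size and the region is at least that large.
 */
static int fits(uint64_t pa, uint64_t va, uint64_t size, uint64_t map_size)
{
	return !(pa & (map_size - 1)) && !(va & (map_size - 1)) &&
	       size >= map_size;
}

/* Hypothetical helper: pick the largest usable mapping size. */
static uint64_t pick_map_size(uint64_t pa, uint64_t va, uint64_t size)
{
	if (fits(pa, va, size, EX_PUD_SIZE))
		return EX_PUD_SIZE;
	if (fits(pa, va, size, EX_PMD_SIZE))
		return EX_PMD_SIZE;
	return EX_PAGE_SIZE;
}
```

With an example virtual base of 0xffffffd800000000 (chosen only because it is 1 GiB aligned), a physical address of 0x80200000, which sits 2 MiB past a 1 GiB boundary, gets a PMD-sized mapping at best, whereas 0x80000000 allows a PUD-sized one.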
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index 7b571a631639..04e3ecb51722 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -895,8 +895,11 @@ static void __early_init_dt_declare_initrd(unsigned long start,
>  	 * enabled since __va() is called too early. ARM64 does make use
>  	 * of phys_initrd_start/phys_initrd_size so we can skip this
>  	 * conversion.
> +	 * On RISCV64, the usage of __va() before the linear mapping exists
> +	 * is wrong.

I assume the 'does make use of phys_initrd_start/phys_initrd_size so we
can skip this conversion' comment applies to RISC-V too. Or do you just
not care if the initrd addresses are not set up?

Please rework the comment to remove the list of platforms that can skip
this. Which platforms skip it is obvious from the code. 'Why' is not, so
that's what the comment is for.

>  	 */
> -	if (!IS_ENABLED(CONFIG_ARM64)) {
> +	if (!IS_ENABLED(CONFIG_ARM64) &&
> +	    !(IS_ENABLED(CONFIG_RISCV) && IS_ENABLED(CONFIG_64BIT))) {
>  		initrd_start = (unsigned long)__va(start);
>  		initrd_end = (unsigned long)__va(end);
>  		initrd_below_start_ok = 1;
> --
> 2.37.2
>
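
To make the 'why' concrete, here is a minimal user-space model of what goes wrong when __va() is used before the linear mapping exists. It is a sketch under stated assumptions, not the patch's DEBUG_VIRTUAL guard: linear_map_ready, set_linear_map() and checked_phys_to_virt() are hypothetical names, and the addresses used with them are examples only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state: the offset is unknown until RAM is discovered. */
static uint64_t va_pa_offset;
static bool linear_map_ready;

/* Called once the start of system memory is known (models the delayed init). */
static void set_linear_map(uint64_t page_offset, uint64_t phys_ram_base)
{
	va_pa_offset = page_offset - phys_ram_base;
	linear_map_ready = true;
}

/*
 * Models a guarded phys-to-virt translation: refuse to translate while
 * the offset is not initialized instead of returning a bogus address.
 */
static bool checked_phys_to_virt(uint64_t pa, uint64_t *va)
{
	if (!linear_map_ready)
		return false;
	*va = pa + va_pa_offset;
	return true;
}
```

In this model an early caller gets an explicit failure rather than a translation that silently differs from the final one, which is the kind of reasoning the comment in __early_init_dt_declare_initrd() should spell out.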