On 17.05.23 11:03, Sascha Hauer wrote:
> Up to now we use 1MiB sections to setup the page tables in PBL. There
> are two places where this leads to problems. First is OP-TEE, we have
> to map the OP-TEE area with PTE_EXT_XN to prevent the instruction
> prefetcher from speculating into that area. With the current section
> mapping we have to align OPTEE_SIZE to 1MiB boundaries. The second
> problem comes with SRAM where the PBL might be running. This SRAM has
> to be mapped executable, but at the same time we should map the
> surrounding areas non executable which is not always possible with
> 1MiB mapping granularity.
>
> We now have everything in place to use two level page tables from PBL,
> so use arch_remap_range() for the problematic cases.
>
> Signed-off-by: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
> ---
>  arch/arm/cpu/mmu_32.c | 31 +++++++------------------------
>  1 file changed, 7 insertions(+), 24 deletions(-)
>
> diff --git a/arch/arm/cpu/mmu_32.c b/arch/arm/cpu/mmu_32.c
> index 785b20c7fd..705d27a045 100644
> --- a/arch/arm/cpu/mmu_32.c
> +++ b/arch/arm/cpu/mmu_32.c
> @@ -111,8 +111,10 @@ void dma_flush_range(void *ptr, size_t size)
>          unsigned long end = start + size;
>
>          __dma_flush_range(start, end);
> +#ifndef __PBL__
>          if (outer_cache.flush_range)
>                  outer_cache.flush_range(start, end);
> +#endif

Meh. I see why this is ok (L2X0 currently initialized in initcall), but
this #ifdef looks a bit too fragile. Perhaps we could do this in
<asm/mmu.h> instead?

#ifdef __PBL__
/* Existing platforms with non-architected outer cache initialize it
 * outside PBL and new ones will likely only have architected caches,
 * so we provide a dummy here */
static __maybe_unused struct outer_cache_fns outer_cache;
#else
extern struct outer_cache_fns outer_cache;
#endif
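Untested, just to illustrate the idea: as the static dummy is
zero-initialized, the outer cache hooks are simply NULL in PBL and the
existing NULL checks in the callers already skip them, so e.g.
dma_flush_range() could stay as it is today, without any #ifdef:

void dma_flush_range(void *ptr, size_t size)
{
        unsigned long start = (unsigned long)ptr;
        unsigned long end = start + size;

        __dma_flush_range(start, end);
        /* in PBL, outer_cache is the all-zero dummy, so this is a no-op */
        if (outer_cache.flush_range)
                outer_cache.flush_range(start, end);
}

Same would apply to dma_inv_range() below.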
>  }
>
>  void dma_inv_range(void *ptr, size_t size)
> @@ -120,8 +122,10 @@
>          unsigned long start = (unsigned long)ptr;
>          unsigned long end = start + size;
>
> +#ifndef __PBL__
>          if (outer_cache.inv_range)
>                  outer_cache.inv_range(start, end);
> +#endif
>          __dma_inv_range(start, end);
>  }
>
> @@ -542,16 +546,6 @@ void *dma_alloc_writecombine(size_t size, dma_addr_t *dma_handle)
>          return dma_alloc_map(size, dma_handle, ARCH_MAP_WRITECOMBINE);
>  }
>
> -static inline void map_region(unsigned long start, unsigned long size,
> -                              uint64_t flags)
> -
> -{
> -        start = ALIGN_DOWN(start, SZ_1M);
> -        size = ALIGN(size, SZ_1M);
> -
> -        create_sections(start, start + size - 1, flags);
> -}
> -
>  void mmu_early_enable(unsigned long membase, unsigned long memsize)
>  {
>          uint32_t *ttb = (uint32_t *)arm_mem_ttb(membase + memsize);
> @@ -572,21 +566,10 @@ void mmu_early_enable(unsigned long membase, unsigned long memsize)
>           */
>          create_flat_mapping();
>
> -        /*
> -         * There can be SoCs that have a section shared between device memory
> -         * and the on-chip RAM hosting the PBL. Thus mark this section
> -         * uncachable, but executable.
> -         * On such SoCs, executing from OCRAM could cause the instruction
> -         * prefetcher to speculatively access that device memory, triggering
> -         * potential errant behavior.
> -         *
> -         * If your SoC has such a memory layout, you should rewrite the code
> -         * here to map the OCRAM page-wise.
> -         */
> -        map_region((unsigned long)_stext, _etext - _stext, PMD_SECT_DEF_UNCACHED);
> -
>          /* maps main memory as cachable */
> -        map_region(membase, memsize - OPTEE_SIZE, PMD_SECT_DEF_CACHED);
> +        arch_remap_range((void *)membase, memsize - OPTEE_SIZE, MAP_CACHED);
> +        arch_remap_range((void *)membase + memsize - OPTEE_SIZE, OPTEE_SIZE, MAP_UNCACHED);
> +        arch_remap_range(_stext, PAGE_ALIGN(_etext - _stext), MAP_CACHED);

Rest looks fine. With above point addressed:

Reviewed-by: Ahmad Fatoum <a.fatoum@xxxxxxxxxxxxxx>

>
>          __mmu_cache_on();
>  }

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |