On Mon, 30 Mar 2020 at 16:04, Will Deacon <will@xxxxxxxxxx> wrote:
>
> On Mon, Mar 30, 2020 at 03:53:04PM +0200, Ard Biesheuvel wrote:
> > On Mon, 30 Mar 2020 at 15:51, Will Deacon <will@xxxxxxxxxx> wrote:
> > >
> > > On Sun, Mar 29, 2020 at 04:12:58PM +0200, Ard Biesheuvel wrote:
> > > > When CONFIG_DEBUG_ALIGN_RODATA is enabled, kernel segments mapped with
> > > > different permissions (r-x for .text, r-- for .rodata, rw- for .data,
> > > > etc) are rounded up to 2 MiB so they can be mapped more efficiently.
> > > > In particular, it permits the segments to be mapped using level 2
> > > > block entries when using 4k pages, which is expected to result in less
> > > > TLB pressure.
> > > >
> > > > However, the mappings for the bulk of the kernel will use level 2
> > > > entries anyway, and the misaligned fringes are organized such that they
> > > > can take advantage of the contiguous bit, and use far fewer level 3
> > > > entries than would be needed otherwise.
> > > >
> > > > This makes the value of this feature dubious at best, and since it is not
> > > > enabled in defconfig or in the distro configs, it does not appear to be
> > > > in wide use either. So let's just remove it.
> > > >
> > > > Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > > > ---
> > > >  arch/arm64/Kconfig.debug                  | 13 -------------
> > > >  arch/arm64/include/asm/memory.h           | 12 +-----------
> > > >  drivers/firmware/efi/libstub/arm64-stub.c |  8 +++-----
> > > >  3 files changed, 4 insertions(+), 29 deletions(-)
> > >
> > > Acked-by: Will Deacon <will@xxxxxxxxxx>
> > >
> > > But I would really like to go a step further and rip out the block mapping
> > > support altogether so that we can fix non-coherent DMA aliases:
> > >
> > > https://lore.kernel.org/lkml/20200224194446.690816-1-hch@xxxxxx
> > >
> >
> > I'm not sure I follow - is this about mapping parts of the static
> > kernel Image for non-coherent DMA?
>
> Sorry, it's not directly related to your patch, just that if we're removing
> options relating to kernel mappings then I'd be quite keen on effectively
> forcing page-granularity on the linear map, as is currently done by default
> thanks to RODATA_FULL_DEFAULT_ENABLED, so that we can nobble cacheable
> aliases for non-coherent streaming DMA mappings by hooking into Christoph's
> series above.
>

Right. I don't remember seeing any complaints about
RODATA_FULL_DEFAULT_ENABLED, but maybe it's too early for that.

> This series just reminded me of it because it's another
> "off-by-default-behaviour-for-block-mappings-probably-because-of-performance-
> but-never-actually-measured" type of thing which really just gets in the
> way.
>

Well, even though I agree that the lack of actual numbers is a bit
disturbing here, I'd hate to penalize all systems even more than they
already are (due to ARCH_KMALLOC_MINALIGN == ARCH_DMA_MINALIGN) by
adding another workaround that is only needed on devices that have
non-coherent masters.
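
For reference, the rounding that the quoted patch description talks
about can be sketched in a few lines of userspace C. This is only an
illustration, not the kernel's code: the helper names are hypothetical,
and the 64 KiB value is an assumption about the alignment used when the
option is off (16 contiguous 4 KiB entries with 4k pages).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed alignments: 2 MiB level 2 block vs. 64 KiB contiguous range. */
#define SZ_64K 0x10000ULL
#define SZ_2M  0x200000ULL

/* Round x up to the next multiple of align; align must be a power of two. */
static inline uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Padding in bytes added when a segment of 'size' bytes is rounded up. */
static inline uint64_t segment_padding(uint64_t size, uint64_t align)
{
	return round_up_pow2(size, align) - size;
}
```

With a made-up segment size of 0x123000 bytes, segment_padding() gives
0xdd000 bytes of padding at 2 MiB alignment and 0xd000 bytes at 64 KiB.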