Re: [PATCH] arm64: errata: Minimize tlb flush due to vttbr writes on AmpereOne

Oliver Upton <oliver.upton@xxxxxxxxx> · Tue, 27 Feb 2024 20:26:33 +0000

On Tue, Feb 27, 2024 at 08:11:22PM +0000, Catalin Marinas wrote:
> On Wed, Feb 07, 2024 at 09:45:59AM +0000, Oliver Upton wrote:

[...]

> > Think of the precedent this would establish. What would stop
> > implementers from, say, changing out our memcpy implementation into a
> > hundred different uarch-specific routines? That isn't maintainable,
> > nor is it even testable as most folks don't have access to your
> > hardware.
> 
> I agree. FTR, I'm fine with uarch optimisations if (a) they don't
> run-time patch the kernel binary, (b) don't affect the existing hardware
> and (c) show significant gains on the targeted uarch in some meaningful
> benchmarks (definitely not a microbenchmark hammering a certain kernel
> path).

and (d) they have a minimal, maintainable code footprint :)

> So, if one wants an optimisation, it better benefits the other
> implementations or at least it doesn't make them worse. Now, we do have
> hardware from mobiles to large enterprise systems, so at some point we
> may have to make a call on different kernel behaviours, possibly even at
> run-time. We already do this at build-time, e.g. CONFIG_NUMA where it
> doesn't make much sense in a mobile (yet). But they should not be seen
> as uarch specific tweaks, more like higher-level classes of
> optimisations.

Agreed. I think the way we handled this case is a great example of how
this sort of thing should go -- a general improvement to how the stage-2
MMU gets loaded on VHE systems, which ought to benefit other
implementations too.

Only if we can't extract a generalization should we even think about
something implementation-specific, IMO.

-- 
Thanks,
Oliver