On Tue, 11 Jun 2024 at 13:22, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> On arm64 we have early ("boot") and late ("system-wide") alternatives.
> We apply the system-wide alternatives in apply_alternatives_all(), a few
> callees deep under smp_cpus_done(), after secondary CPUs are brought up,
> since that has to handle mismatched features in big.LITTLE systems.

Annoyingly, we don't have any generic model for this. Maybe that would
be a good thing regardless, but your point that you have big.LITTLE
issues does kind of reinforce the thing that different architectures
have different requirements for the alternatives patching.

On arm64, the late alternatives seem to be in

  kernel_init() ->
    kernel_init_freeable() ->
      smp_init() ->
        smp_cpus_done() ->
          setup_system_features() ->
            setup_system_capabilities() ->
              apply_alternatives_all()

which is nice and late - that's when the system is fully initialized,
and kernel_init() is already running as the first real thread.

On x86, the alternatives are finalized much earlier in

  start_kernel() ->
    arch_cpu_finalize_init() ->
      alternative_instructions()

which is quite early, much closer to the early arm64 case.

Now, even that early x86 timing is good enough for
vfs_caches_init_early(), which is also done from start_kernel() fairly
early on - and before the arch_cpu_finalize_init() code is run. But ...

> I had assumed that we could use late/system-wide alternatives here, since
> those get applied after vfs_caches_init_early(), but maybe that's too
> late?

So vfs_caches_init_early() is *one* case for the dcache init, but for
the NUMA case, we delay the dcache init until after the MM setup has
been completed, and do it relatively later in the init sequence at
vfs_caches_init().

See that horribly named 'hashdist' variable ('dist' is not 'distance',
it's 'distribute'). It's not dcache-specific, btw. There's a couple of
other hashes that do that whole "NUMA distribution or not" thing.

Annoying, yes.

I'm not sure that the dual init makes any actual sense - I think it's
entirely a historical oddity.

That "done conditionally in two different places" may be ugly, but
even if we fixed it, we'd fix it by doing it just once, and it would
be that later "NUMA has been initialized" vfs_caches_init() case.

Which is too late for the x86 alternatives.

The arm64 late case would seem to work fine. It's late enough to be
after all "core kernel init", but still early enough to be before the
"generic" initcalls that will start initializing filesystems etc (that
then need the vfs code to have been initialized).

So that "smp_init()" placement that arm64 has is actually a very good
place for at least the dcache case. It's just not what x86 does.

Note that my "just replace the constants" model avoids all the
ordering issues because it just does the constant initialization
synchronously when the constant is initialized.

So it doesn't depend on any other ordering at all, and there is no
worry about subtle differences in when alternatives are applied, or
when the uses happen.

(It obviously does have the same ordering requirement that the
variable initialization itself has: the dcache init itself has to
happen before any dcache use, but that's neither surprising nor a new
ordering imposed by the runtime constant case).

There's an advantage to just being self-sufficient and not tying into
random other subsystems that have random other constraints.

               Linus