On Sun, Feb 26, 2023 at 02:12:38PM -0800, Andrew Morton wrote:
> On Fri, 3 Feb 2023 17:18:37 +1000 Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
>
> > On a 16-socket 192-core POWER8 system, the context_switch1_threads
> > benchmark from will-it-scale (see earlier changelog), upstream can
> > achieve a rate of about 1 million context switches per second, due to
> > contention on the mm refcount.
> >
> > 64s meets the prerequisites for CONFIG_MMU_LAZY_TLB_SHOOTDOWN, so enable
> > the option. This increases the above benchmark to 118 million context
> > switches per second.
>
> Is that the best you can do ;)
>
> > This generates 314 additional IPI interrupts on a 144 CPU system doing
> > a kernel compile, which is in the noise in terms of kernel cycles.
> >
> > ...
> >
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -265,6 +265,7 @@ config PPC
> >  	select MMU_GATHER_PAGE_SIZE
> >  	select MMU_GATHER_RCU_TABLE_FREE
> >  	select MMU_GATHER_MERGE_VMAS
> > +	select MMU_LAZY_TLB_SHOOTDOWN		if PPC_BOOK3S_64
> >  	select MODULES_USE_ELF_RELA
> >  	select NEED_DMA_MAP_STATE		if PPC64 || NOT_COHERENT_CACHE
> >  	select NEED_PER_CPU_EMBED_FIRST_CHUNK	if PPC64
>
> Can we please have a summary of which other architectures might benefit
> from this, and what must they do?
>
> As this is powerpc-only, I expect it won't get a lot of testing in
> mm.git or in linux-next. The powerpc maintainers might choose to merge
> in the mm-stable branch at
> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm if this is a
> concern.

I haven't really had time to page all of this back in, but x86 is very
close to being able to use this; it mostly just needs some accidental
active_mm usage cleaned up.

I've got a branch here:

  https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=x86/lazy

That's mostly Nick's patches with a bunch of Andy's old patches stuck on
top. I also have a pile of notes, but alas, they are not finished in any
way.