> Oh gawd, that's terrible. Never, ever duplicate code like that.
What the patch does is:
- formally shift the loop one level down in the call graph, adding instances of __tlb_remove_tables()
exactly at the locations where instances of __tlb_remove_table() already exist,
- on architectures where __tlb_remove_tables() would end up calling free_page_and_swap_cache() in a loop,
call the batched free_pages_and_swap_cache_nolru() instead (see the sketch after this list),
- in other places, keep the loop as is, perhaps as a target for future optimizations.
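
Roughly, the shape of the change is sketched below. This is hand-written for illustration, not the
patch itself: __tlb_remove_table_free() and struct mmu_table_batch are the generic batching code as I
remember it, and the signatures of __tlb_remove_tables() and free_pages_and_swap_cache_nolru() are
approximations of what the patch introduces.

/*
 * Today the generic code frees a batch one table at a time:
 *
 *	for (i = 0; i < batch->nr; i++)
 *		__tlb_remove_table(batch->tables[i]);
 *
 * With the patch, that loop is pushed down into the arch hook:
 */

/* mm/mmu_gather.c */
static void __tlb_remove_table_free(struct mmu_table_batch *batch)
{
	__tlb_remove_tables(batch->tables, batch->nr);
	free_page((unsigned long)batch);
}

/* an arch whose __tlb_remove_table() was just free_page_and_swap_cache(): */
static inline void __tlb_remove_tables(void **tables, int nr)
{
	free_pages_and_swap_cache_nolru((struct page **)tables, nr);
}

/* other archs keep an equivalent loop inside their __tlb_remove_tables() */
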
The extra duplication added by this patch just highlights the already existing duplication of
__tlb_remove_table() implementations across architectures.
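
For reference, the per-arch hooks in question are small variations on the same theme, roughly
(quoting from memory, the exact file and casts differ per architecture):

/* <arch>/include/asm/tlb.h, approximately */
static inline void __tlb_remove_table(void *_table)
{
	free_page_and_swap_cache((struct page *)_table);
}
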
Ok, let's follow your suggestion instead. AFAIU, that is:
- remove the free_page_and_swap_cache()-based implementations of __tlb_remove_table() from the archs,
- instead, add that implementation to mm/mmu_gather.c, ifdef-ed on a new Kconfig option, and select
that option in the archs that use it,
- then, keep the optimization inside mm/mmu_gather.c (rough sketch below).
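
I.e. something along these lines; CONFIG_MMU_GATHER_TABLE_FREE_COMMON is a name I made up for the
example, and the helper names are approximate:

/* mm/mmu_gather.c, sketch */
static void __tlb_remove_table_free(struct mmu_table_batch *batch)
{
#ifdef CONFIG_MMU_GATHER_TABLE_FREE_COMMON
	/*
	 * Archs selecting this (hypothetical) option used to implement
	 * __tlb_remove_table() as free_page_and_swap_cache().  Do the
	 * same here, but batched.
	 */
	free_pages_and_swap_cache_nolru((struct page **)batch->tables,
					batch->nr);
#else
	int i;

	for (i = 0; i < batch->nr; i++)
		__tlb_remove_table(batch->tables[i]);
#endif
	free_page((unsigned long)batch);
}

Plus a one-line "select" in each such arch's Kconfig, replacing the per-arch __tlb_remove_table()
definition.
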
Indeed, the overall change will then become smaller. Thanks for the idea. Will post patches doing that soon.
Nikita