On Wed, Jan 29, 2020 at 11:39:44AM +0100, Peter Zijlstra wrote:
With the new page-table layout, using full (4k) pages for (256 byte)
pte-tables is immensely wasteful. Move the pte-tables over to the
same allocator already used for the (512 byte) higher level tables
(pgd/pmd).

This reduces the pte-table waste from 15x to 2x.

Due to no longer being bound to 16 consecutive tables, this might
actually already be more efficient than the old code for sparse
tables.

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
 arch/m68k/include/asm/motorola_pgalloc.h | 54 ++++++-------------------------
 arch/m68k/include/asm/motorola_pgtable.h |  8 ++++
 arch/m68k/include/asm/page.h             |  2 -
 3 files changed, 19 insertions(+), 45 deletions(-)

--- a/arch/m68k/include/asm/motorola_pgalloc.h
+++ b/arch/m68k/include/asm/motorola_pgalloc.h
@@ -10,60 +10,28 @@ extern int free_pointer_table(pmd_t *);
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
 {
-	pte_t *pte;
-
-	pte = (pte_t *)__get_free_page(GFP_KERNEL|__GFP_ZERO);
-	if (pte) {
-		__flush_page_to_ram(pte);
-		flush_tlb_kernel_page(pte);
-		nocache_page(pte);
-	}
-
-	return pte;
+	return (pte_t *)get_pointer_table();

Weirdly, get_pointer_table() seems to elide the __flush_page_to_ram() call, so you're missing that for ptes with this change. I think it's probably needed for the higher levels too (and kernel_page_table() does it for example) so I'd be inclined to add it unconditionally rather than predicate it on the allocation type introduced by your later patch.
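
To make the suggestion concrete, here is a rough user-space sketch (not the real m68k code, and untested against it) of an allocator that flushes every freshly allocated backing page no matter which table level asked for it. Everything named model_* is a hypothetical stand-in: the flush hook only counts calls instead of doing what __flush_page_to_ram() does, and the eight-slot bitmask is an assumption derived from carving a 4k page into the 512 byte tables mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE       4096u
#define MODEL_TABLE_SIZE      512u
#define MODEL_TABLES_PER_PAGE (MODEL_PAGE_SIZE / MODEL_TABLE_SIZE) /* 8 */

/* Hypothetical stand-in for __flush_page_to_ram(): only counts calls. */
static unsigned model_flush_count;

static void model_flush_page_to_ram(void *page)
{
	(void)page;
	model_flush_count++;
}

/* One backing page, carved into MODEL_TABLES_PER_PAGE tables. */
struct model_page {
	struct model_page *next;
	uint8_t free_mask;	/* bit i set => slot i is free */
	unsigned char *mem;	/* zeroed backing store for the slots */
};

static struct model_page *model_pages;

/* Hand out one 512 byte table, allocating a new page when needed. */
static void *model_get_pointer_table(void)
{
	struct model_page *p;
	unsigned slot = 0;

	/* Reuse the first page that still has a free slot. */
	for (p = model_pages; p; p = p->next)
		if (p->free_mask)
			break;

	if (!p) {
		p = malloc(sizeof(*p));
		if (!p)
			return NULL;
		p->mem = calloc(1, MODEL_PAGE_SIZE);
		if (!p->mem) {
			free(p);
			return NULL;
		}
		/* Flush unconditionally, whatever the table will be used for. */
		model_flush_page_to_ram(p->mem);
		p->free_mask = 0xff;
		p->next = model_pages;
		model_pages = p;
	}

	assert(p->free_mask != 0);

	/* Take the lowest free slot and mark it used. */
	while (!(p->free_mask & (1u << slot)))
		slot++;
	p->free_mask = (uint8_t)(p->free_mask & ~(1u << slot));

	return p->mem + (size_t)slot * MODEL_TABLE_SIZE;
}
```

In this model the flush happens once per backing page rather than once per table, so eight consecutive allocations share a single flush and the ninth triggers a second one. Whether that granularity is right for the real allocator is a separate question.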

--- a/arch/m68k/include/asm/page.h
+++ b/arch/m68k/include/asm/page.h
@@ -30,7 +30,7 @@ typedef struct { unsigned long pmd; } pm
 typedef struct { unsigned long pte; } pte_t;
 typedef struct { unsigned long pgd; } pgd_t;
 typedef struct { unsigned long pgprot; } pgprot_t;
-typedef struct page *pgtable_t;
+typedef pte_t *pgtable_t;

Urgh, this is a big (cross-arch) mess that we should fix later.

Will