On Fri, Jul 29, 2022 at 05:08:31PM +0200, Vlastimil Babka wrote:
> On 7/12/22 15:39, Hyeonggon Yoo wrote:
> > This is v3 of common kmalloc series.
> >
> > This series generalize most of kmalloc functions and move its
> > implementation into mm/slab_common.c.
> >
> > I believe this series give better maintainability of code for SLAB and SLUB.
> > Please consider applying.
> >
> > This series is based on slab/for-next and also available at
> > https://github.com/hygoni/linux/tree/slab-common-v3r0
> >
> > Any feedbacks/reviews will be appreciated.
>
> Hi, thanks for all your efforts. It's shaping up nicely so I think the next
> version will be ready to be added to -next after the 5.20 merge window.
> As I've finished the individual reviews, I'm looking at the result and see a
> bit more potential for cleanups, which could be perhaps incorporated to
> existing patches, or additionally:

Thank you for the reviews, and I too would like to add it to -next soon!

> - kmalloc_large_node_notrace() has only one caller, can be removed and the
> caller can call __kmalloc_large_node_notrace() directly, especially if it's
> not __always_inline as I've IIRC suggested.

Will adjust in the next version.

> - kmem_cache_alloc_trace() and kmem_cache_alloc_node_trace() are weird ones,
> they are in fact for kmalloc despite the name.

Yeah, I'm the one who would like to rename them to
kmalloc_trace() and kmalloc_node_trace().

> They depend on
> CONFIG_TRACING, yet if you look carefully, the !CONFIG_TRACING variant also
> goes through a trace_* function. The actual difference seems that
> slab_alloc() thus kasan_alloc() and kfence_alloc() don't get the orig_size
> that way, which is dubious. It might be worth trying to unify this as well?
> E.g.
> - use only the CONFIG_TRACING variant, discard the other

Sounds okay.

> - declare it in mm/slab.h, this is not for general usage

We can't completely remove it because it's needed in include/linux/slab.h
for inlined kmalloc.

> - single implementation in mm/slab_common.c that calls
> __kmem_cache_alloc_node() from SLAB/SLUB and does the trace

While I love the idea of a single implementation in mm/slab_common.c,
making use of __kmem_cache_alloc_node() and __kmem_cache_free() adds
a bit of overhead: it adds the cost of a function call and can't benefit
from inlining (including removing unnecessary parts of the function code).

So... what about including slab_common.c in sl?b.c, so that the compiler
can treat sl?b.c and slab_common.c as a single translation unit?
(Or split the kmalloc implementation into kmalloc.c and do a similar thing?)
A toy userspace sketch of the idea follows the diff below.

Thanks!

---
Hyeonggon

diff --git a/mm/Makefile b/mm/Makefile
index 9a564f836403..bcee8495b531 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -3,7 +3,6 @@
 # Makefile for the linux memory manager.
 #
 
-KASAN_SANITIZE_slab_common.o := n
 KASAN_SANITIZE_slab.o := n
 KASAN_SANITIZE_slub.o := n
 KCSAN_SANITIZE_kmemleak.o := n
@@ -11,7 +10,6 @@ KCSAN_SANITIZE_kmemleak.o := n
 # These produce frequent data race reports: most of them are due to races on
 # the same word but accesses to different bits of that word. Re-enable KCSAN
 # for these when we have more consensus on what to do about them.
-KCSAN_SANITIZE_slab_common.o := n
 KCSAN_SANITIZE_slab.o := n
 KCSAN_SANITIZE_slub.o := n
 KCSAN_SANITIZE_page_alloc.o := n
@@ -21,7 +19,6 @@ KCSAN_INSTRUMENT_BARRIERS := y
 # These files are disabled because they produce non-interesting and/or
 # flaky coverage that is not a function of syscall inputs. E.g. slab is out of
 # free pages, or a task is migrated between nodes.
-KCOV_INSTRUMENT_slab_common.o := n
 KCOV_INSTRUMENT_slob.o := n
 KCOV_INSTRUMENT_slab.o := n
 KCOV_INSTRUMENT_slub.o := n
@@ -51,8 +48,7 @@ obj-y := filemap.o mempool.o oom_kill.o fadvise.o \
 			   maccess.o page-writeback.o folio-compat.o \
 			   readahead.o swap.o truncate.o vmscan.o shmem.o \
 			   util.o mmzone.o vmstat.o backing-dev.o \
-			   mm_init.o percpu.o slab_common.o \
-			   compaction.o vmacache.o \
+			   mm_init.o percpu.o compaction.o vmacache.o \
 			   interval_tree.o list_lru.o workingset.o \
 			   debug.o gup.o mmap_lock.o $(mmu-y)
 
diff --git a/mm/slab.c b/mm/slab.c
index a5486ff8362a..a302a7c17b40 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -131,6 +131,8 @@
 
 #include "slab.h"
 
+#include "slab_common.c"
+
 /*
  * DEBUG	- 1 for kmem_cache_create() to honour; SLAB_RED_ZONE & SLAB_POISON.
  *		  0 for faster, smaller code (especially in the critical paths).
@@ -3541,6 +3543,7 @@ void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
 }
 EXPORT_SYMBOL(kmem_cache_alloc_node);
 
+static __always_inline
 void *__kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags,
 			      int nodeid, size_t orig_size,
 			      unsigned long caller)
@@ -3585,6 +3588,7 @@ void __do_kmem_cache_free(struct kmem_cache *cachep, void *objp,
 	local_irq_restore(flags);
 }
 
+static __always_inline
 void __kmem_cache_free(struct kmem_cache *cachep, void *objp,
 		       unsigned long caller)
 {
diff --git a/mm/slab.h b/mm/slab.h
index 65023f000d42..4e43ab717d99 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -273,11 +273,6 @@ void create_kmalloc_caches(slab_flags_t);
 
 /* Find the kmalloc slab corresponding for a certain size */
 struct kmem_cache *kmalloc_slab(size_t, gfp_t);
-
-void *__kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags,
-			      int node, size_t orig_size,
-			      unsigned long caller);
-void __kmem_cache_free(struct kmem_cache *s, void *x, unsigned long caller);
 #endif
 
 gfp_t kmalloc_fix_flags(gfp_t flags);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 83cfe91b6ab6..70d7393d31ae 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -898,6 +898,12 @@ void free_large_kmalloc(struct folio *folio, void *object)
 	__free_pages(folio_page(folio, 0), order);
 }
 
+static __always_inline
+void *__kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node,
+			      size_t orig_size, unsigned long caller);
+static __always_inline
+void __kmem_cache_free(struct kmem_cache *s, void *objp, unsigned long caller);
+
 static void *__kmalloc_large_node(size_t size, gfp_t flags, int node);
 static __always_inline
 void *__do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
diff --git a/mm/slob.c b/mm/slob.c
index 45a061b8ba38..656f4f8b77d9 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -74,6 +74,9 @@
 #include <linux/atomic.h>
 
 #include "slab.h"
+
+#include "slab_common.c"
+
 /*
  * slob_block has a field 'units', which indicates size of block if +ve,
  * or offset of next block if -ve (in SLOB_UNITs).
diff --git a/mm/slub.c b/mm/slub.c
index 8083a6ee5f15..bd47a30049ae 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -45,6 +45,8 @@
 
 #include "internal.h"
 
+#include "slab_common.c"
+
 /*
  * Lock order:
  *   1. slab_mutex (Global Mutex)
@@ -3261,6 +3263,7 @@ void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
 }
 EXPORT_SYMBOL(kmem_cache_alloc_lru);
 
+static __always_inline
 void *__kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags,
 			      int node, size_t orig_size,
 			      unsigned long caller)
@@ -3505,6 +3508,7 @@ void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr)
 }
 #endif
 
+static __always_inline
 void __kmem_cache_free(struct kmem_cache *s, void *x, unsigned long caller)
 {
 	slab_free(s, virt_to_slab(x), x, NULL, &x, 1, caller);
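
P.S. Here is a minimal, purely illustrative userspace sketch of the
single-translation-unit idea above (file and function names are made up;
this is not kernel code): the "common" file forward-declares a static
helper and is #included into the cache-specific file, which defines the
helper further down. Because the compiler then sees both in one
translation unit, it is free to inline the call.

/*
 * common.c - hypothetical shared code, meant to be #included into a
 * backend file rather than built on its own (loosely analogous to
 * slab_common.c in the diff above).
 */
#include <stdio.h>

/* Forward declaration; the definition lives in the backend file that
 * includes this one. */
static inline int backend_alloc(int size);

/* "Common" entry point that calls into the backend. Since everything
 * ends up in a single translation unit, the compiler may inline the
 * call instead of emitting an out-of-line function call. */
static int common_alloc(int size)
{
	return backend_alloc(size);
}

/*
 * backend.c - hypothetical cache-specific file (loosely analogous to
 * slab.c/slub.c/slob.c pulling in slab_common.c).
 */
#include "common.c"

/* Backend-specific implementation, visible to common_alloc() above
 * because common.c was textually included. */
static inline int backend_alloc(int size)
{
	return size * 2;	/* stand-in for a real allocation path */
}

int main(void)
{
	printf("%d\n", common_alloc(21));	/* prints 42 */
	return 0;
}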