On Fri, 27 May 2022 at 13:37, Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> As Linus explained [1], setting the stackdepot hash table size as a
> config option is suboptimal, especially as stackdepot becomes a
> dependency of less specialized subsystems than initially (e.g. DRM,
> networking, SLUB_DEBUG):
>
> : (a) it introduces a new compile-time question that isn't sane to ask
> : a regular user, but is now exposed to regular users.
>
> : (b) this by default uses 1MB of memory for a feature that didn't in
> : the past, so now if you have small machines you need to make sure you
> : make a special kernel config for them.
>
> Ideally we would employ rhashtable for fully automatic resizing, which
> should be feasible for many of the new users, but problematic for the
> original users with restricted context that call __stack_depot_save()
> with can_alloc == false, i.e. KASAN.
>
> However we can easily remove the config option and scale the hash table
> automatically with system memory. The STACK_HASH_MASK constant becomes
> stack_hash_mask variable and is used only in one mask operation, so the
> overhead should be negligible to none. For early allocation we can
> employ the existing alloc_large_system_hash() function and perform
> similar scaling for the late allocation.
>
> The existing limits of the config option (between 4k and 1M buckets)
> are preserved, and scaling factor is set to one bucket per 16kB memory
> so on 64bit the max 1M buckets (8MB memory) is achieved with 16GB
> system, while a 1GB system will use 512kB.

Hi Vlastimil,

We use KASAN with VMs that have 2GB of memory. If I did the math
correctly, this will result in 128K entries, while currently we have
CONFIG_STACK_HASH_ORDER=20 even on arm32. (I appended a quick
userspace check of this math after the quoted patch below.) I am
actually not sure how full the table gets, but we can fuzz a large
kernel for up to an hour, so we accumulate lots of stacks (we were
the only known users who routinely overflowed the default LOCKDEP
tables :)).

I am not opposed to this in general, and I understand that KASAN is
different from the other users. What do you think about allowing
CONFIG_STACK_HASH_ORDER=0 (or leaving it unset) to mean auto-size,
while keeping the ability to set an exact size as well? (There is a
hypothetical sketch of this after the patch, too.) Or, alternatively,
auto-size if KASAN is not enabled and use a large table otherwise?
But I am not sure if anybody has ever used CONFIG_STACK_HASH_ORDER to
reduce the default size together with KASAN...

> If needed, the automatic scaling could be complemented with a boot-time
> kernel parameter, but it feels pointless to add it without a specific
> use case.
>
> [1] https://lore.kernel.org/all/CAHk-=wjC5nS+fnf6EzRD9yQRJApAhxx7gRB87ZV+pAWo9oVrTg@xxxxxxxxxxxxxx/
>
> Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> ---
>  lib/Kconfig      |  9 ---------
>  lib/stackdepot.c | 47 ++++++++++++++++++++++++++++++++++++-----------
>  2 files changed, 36 insertions(+), 20 deletions(-)
>
> diff --git a/lib/Kconfig b/lib/Kconfig
> index 6a843639814f..1e7cf7c76ae6 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -682,15 +682,6 @@ config STACKDEPOT_ALWAYS_INIT
>  	bool
>  	select STACKDEPOT
>
> -config STACK_HASH_ORDER
> -	int "stack depot hash size (12 => 4KB, 20 => 1024KB)"
> -	range 12 20
> -	default 20
> -	depends on STACKDEPOT
> -	help
> -	  Select the hash size as a power of 2 for the stackdepot hash table.
> -	  Choose a lower value to reduce the memory impact.
> -
>  config REF_TRACKER
>  	bool
>  	depends on STACKTRACE_SUPPORT
> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> index 5ca0d086ef4a..f7b73ddfca77 100644
> --- a/lib/stackdepot.c
> +++ b/lib/stackdepot.c
> @@ -145,10 +145,15 @@ depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
>  	return stack;
>  }
>
> -#define STACK_HASH_SIZE (1L << CONFIG_STACK_HASH_ORDER)
> -#define STACK_HASH_MASK (STACK_HASH_SIZE - 1)
> +/* one hash table bucket entry per 16kB of memory */
> +#define STACK_HASH_SCALE	14
> +/* limited between 4k and 1M buckets */
> +#define STACK_HASH_ORDER_MIN	12
> +#define STACK_HASH_ORDER_MAX	20
>  #define STACK_HASH_SEED 0x9747b28c
>
> +static unsigned int stack_hash_mask;
> +
>  static bool stack_depot_disable;
>  static struct stack_record **stack_table;
>
> @@ -175,8 +180,6 @@ void __init stack_depot_want_early_init(void)
>
>  int __init stack_depot_early_init(void)
>  {
> -	size_t size;
> -
>  	/* This is supposed to be called only once, from mm_init() */
>  	if (WARN_ON(__stack_depot_early_init_passed))
>  		return 0;
> @@ -186,10 +189,15 @@ int __init stack_depot_early_init(void)
>  	if (!__stack_depot_want_early_init || stack_depot_disable)
>  		return 0;
>
> -	size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
> -	pr_info("Stack Depot early init allocating hash table with memblock_alloc, %zu bytes\n",
> -		size);
> -	stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
> +	stack_table = alloc_large_system_hash("stackdepot",
> +					      sizeof(struct stack_record *),
> +					      0,
> +					      STACK_HASH_SCALE,
> +					      HASH_EARLY | HASH_ZERO,
> +					      NULL,
> +					      &stack_hash_mask,
> +					      1UL << STACK_HASH_ORDER_MIN,
> +					      1UL << STACK_HASH_ORDER_MAX);
>
>  	if (!stack_table) {
>  		pr_err("Stack Depot hash table allocation failed, disabling\n");
> @@ -207,13 +215,30 @@ int stack_depot_init(void)
>
>  	mutex_lock(&stack_depot_init_mutex);
>  	if (!stack_depot_disable && !stack_table) {
> -		pr_info("Stack Depot allocating hash table with kvcalloc\n");
> -		stack_table = kvcalloc(STACK_HASH_SIZE, sizeof(struct stack_record *), GFP_KERNEL);
> +		unsigned long entries;
> +
> +		entries = nr_free_buffer_pages();
> +		entries = roundup_pow_of_two(entries);
> +
> +		if (STACK_HASH_SCALE > PAGE_SHIFT)
> +			entries >>= (STACK_HASH_SCALE - PAGE_SHIFT);
> +		else
> +			entries <<= (PAGE_SHIFT - STACK_HASH_SCALE);
> +
> +		if (entries < 1UL << STACK_HASH_ORDER_MIN)
> +			entries = 1UL << STACK_HASH_ORDER_MIN;
> +		if (entries > 1UL << STACK_HASH_ORDER_MAX)
> +			entries = 1UL << STACK_HASH_ORDER_MAX;
> +
> +		pr_info("Stack Depot allocating hash table of %lu entries with kvcalloc\n",
> +			entries);
> +		stack_table = kvcalloc(entries, sizeof(struct stack_record *), GFP_KERNEL);
>  		if (!stack_table) {
>  			pr_err("Stack Depot hash table allocation failed, disabling\n");
>  			stack_depot_disable = true;
>  			ret = -ENOMEM;
>  		}
> +		stack_hash_mask = entries - 1;
>  	}
>  	mutex_unlock(&stack_depot_init_mutex);
>  	return ret;
> @@ -386,7 +411,7 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
>  		goto fast_exit;
>
>  	hash = hash_stack(entries, nr_entries);
> -	bucket = &stack_table[hash & STACK_HASH_MASK];
> +	bucket = &stack_table[hash & stack_hash_mask];
>
>  	/*
>  	 * Fast path: look the stack trace up without locking.
> --
> 2.36.1
>
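
The math check, first. Below is a quick userspace sketch of the
patch's late-init sizing logic that I used to double-check the
numbers above. It is only an illustration: PAGE_SHIFT=12 (4kB pages)
and the local roundup_pow_of_two() helper are my assumptions; the
kernel side uses nr_free_buffer_pages() and the kernel's own
roundup_pow_of_two().

#include <stdio.h>

#define PAGE_SHIFT		12	/* assumption: 4kB pages */
#define STACK_HASH_SCALE	14	/* one bucket per 16kB, from the patch */
#define STACK_HASH_ORDER_MIN	12
#define STACK_HASH_ORDER_MAX	20

/* userspace stand-in for the kernel helper of the same name */
static unsigned long roundup_pow_of_two(unsigned long x)
{
	unsigned long r = 1;

	while (r < x)
		r <<= 1;
	return r;
}

/* mirrors the shift-and-clamp sequence in stack_depot_init() above */
static unsigned long scale_entries(unsigned long pages)
{
	unsigned long entries = roundup_pow_of_two(pages);

	if (STACK_HASH_SCALE > PAGE_SHIFT)
		entries >>= (STACK_HASH_SCALE - PAGE_SHIFT);
	else
		entries <<= (PAGE_SHIFT - STACK_HASH_SCALE);

	if (entries < 1UL << STACK_HASH_ORDER_MIN)
		entries = 1UL << STACK_HASH_ORDER_MIN;
	if (entries > 1UL << STACK_HASH_ORDER_MAX)
		entries = 1UL << STACK_HASH_ORDER_MAX;
	return entries;
}

int main(void)
{
	unsigned long gb;

	for (gb = 1; gb <= 32; gb <<= 1) {
		unsigned long pages = gb << (30 - PAGE_SHIFT);
		unsigned long entries = scale_entries(pages);

		/* table size assumes 8-byte pointers (64bit) */
		printf("%2luGB: %7lu buckets, %4lu kB table\n", gb, entries,
		       (unsigned long)(entries * sizeof(void *) / 1024));
	}
	return 0;
}

With these assumptions it prints 131072 buckets (1024 kB table) for
2GB, 65536 buckets (512 kB) for 1GB, and hits the 1M-bucket (8MB) cap
at 16GB, which matches both my 128K figure and the numbers in your
commit message. Since the page count is rounded up to a power of two
before scaling, reserved memory should not normally push a 2GB VM
into a lower bucket count.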
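
And to make the CONFIG_STACK_HASH_ORDER=0 idea concrete, here is a
hypothetical sketch, not something from your patch: the symbol stays
an int but gains 0 as the new "auto" default, and the sizing in
stack_depot_init() honors it, reusing your constants:

config STACK_HASH_ORDER
	int "stack depot hash size (0 => auto, 12 => 4KB, 20 => 1024KB)"
	range 0 20
	default 0
	depends on STACKDEPOT
	help
	  Select the hash size as a power of 2 for the stackdepot hash
	  table, or 0 to size it automatically based on system memory.

	/* hypothetical sizing branch in stack_depot_init() */
	unsigned long entries;

	if (CONFIG_STACK_HASH_ORDER != 0) {
		/* exact size pinned in the config, e.g. 20 for fuzzing */
		entries = 1UL << CONFIG_STACK_HASH_ORDER;
	} else {
		/* otherwise the memory-scaled sizing from your patch */
		entries = nr_free_buffer_pages();
		entries = roundup_pow_of_two(entries);

		if (STACK_HASH_SCALE > PAGE_SHIFT)
			entries >>= (STACK_HASH_SCALE - PAGE_SHIFT);
		else
			entries <<= (PAGE_SHIFT - STACK_HASH_SCALE);

		entries = clamp(entries, 1UL << STACK_HASH_ORDER_MIN,
				1UL << STACK_HASH_ORDER_MAX);
	}

One wrinkle: "range 0 20" would nominally admit 1..11 as well, so the
real thing would have to either clamp those to STACK_HASH_ORDER_MIN
or reject them.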