On Fri, Feb 09, 2024 at 09:00:40AM +0100, Marco Elver wrote: > > +/** > > + * stack_depot_get_next_stack - Returns all stacks, one at a time > > "Returns all stack_records" to be clear that this is returning the struct. Fixed. > > > + * > > + * @table: Current table we are checking > > + * @bucket: Current bucket we are checking > > + * @last_found: Last stack that was found > > + * > > + * This function finds first a non-empty bucket and returns the first stack > > + * stored in it. On consequent calls, it walks the bucket to see whether > > + * it contains more stacks. > > + * Once we have walked all the stacks in a bucket, we check > > + * the next one, and we repeat the same steps until we have checked all of them > > I think for this function it's important to say that no entry returned > from this function can be evicted. > > I.e. the easiest way to ensure this is that the caller makes sure the > entries returned are never passed to stack_depot_put() - which is > certainly the case for your usecase because you do not use > stack_depot_put(). > > > + * Return: A pointer a to stack_record struct, or NULL when we have walked all > > + * buckets. > > + */ > > +struct stack_record *stack_depot_get_next_stack(unsigned long *table, > > To keep consistent, I'd also call this > __stack_depot_get_next_stack_record(), so that we're clear this is > more of an internal function not for general usage. > > > + struct list_head **bucket, > > + struct stack_record **last_found); > > + > > /** > > * stack_depot_fetch - Fetch a stack trace from stack depot > > * > > diff --git a/lib/stackdepot.c b/lib/stackdepot.c > > index 197c355601f9..107bd0174cd6 100644 > > --- a/lib/stackdepot.c > > +++ b/lib/stackdepot.c > > @@ -782,6 +782,52 @@ unsigned int stack_depot_get_extra_bits(depot_stack_handle_t handle) > > } > > EXPORT_SYMBOL(stack_depot_get_extra_bits); > > > > +struct stack_record *stack_depot_get_next_stack(unsigned long *table, > > + struct list_head **curr_bucket, > > + struct stack_record **last_found) > > +{ > > + struct list_head *bucket = *curr_bucket; > > + unsigned long nr_table = *table; > > + struct stack_record *found = NULL; > > + unsigned long stack_table_entries = stack_hash_mask + 1; > > + > > + rcu_read_lock_sched_notrace(); > > We are returning pointers to stack_records out of the RCU-read > critical section, which are then later used to continue the iteration. > list_for_each_entry_continue_rcu() says this is fine if "... you held > some sort of non-RCU reference (such as a reference count) ...". > Updating the function's documentation to say none of these entries can > be evicted via a stack_depot_put() is required. Thinking about it some more, I think I made a mistake: I am walking all buckets, and within those buckets there are not only page_owner stack_records, which means that I could return a stack_record from e.g: KASAN (which I think can evict stack_records) and then everything goes off the rails. Which means I cannot walk the buckets like that. Actually, I think that having something like the following struct list_stack_records { struct stack_record *stack; struct list_stack_records *next; } in page_owner would make sense. Then the only thing I would have to do is to add a new record on every new stack_record, and then I could just walk the list like a linked list. Which means that the function stack_depot_get_next_stack() could be killed because everything would happen in page_owner code. e.g: static void inc_stack_record_count(depot_stack_handle_t handle) { struct stack_record *stack = __stack_depot_get_stack_record(handle); if (stack) { /* * New stack_record's that do not use STACK_DEPOT_FLAG_GET start * with REFCOUNT_SATURATED to catch spurious increments of their * refcount. * Since we do not use STACK_DEPOT_FLAG_{GET,PUT} API, let us * set a refcount of 1 ourselves. */ if (refcount_read(&stack->count) == REFCOUNT_SATURATED) { refcount_set(&stack->count, 1); add_new_stack_record_into_the_list(stack) } refcount_inc(&stack->count); } } and then just walk the list_stack_records list whenever we want to show the stacktraces and their counting. I think that overall this approach is cleaner and safer. -- Oscar Salvador SUSE Labs