On Thu, 5 Oct 2023 at 22:36, Andrey Konovalov <andreyknvl@xxxxxxxxx> wrote:
>
> On Wed, Sep 13, 2023 at 7:14 PM <andrey.konovalov@xxxxxxxxx> wrote:
> >
> > From: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
> >
> > Currently, the stack depot grows indefinitely until it reaches its
> > capacity. Once that happens, the stack depot stops saving new stack
> > traces.
> >
> > This creates a problem for using the stack depot for in-field testing
> > and in production.
> >
> > For such uses, an ideal stack trace storage should:
> >
> > 1. Allow saving fresh stack traces on systems with a large uptime while
> >    limiting the amount of memory used to store the traces;
> > 2. Have a low performance impact.
> >
> > Implementing #1 in the stack depot is impossible with the current
> > keep-forever approach. This series targets to address that. Issue #2 is
> > left to be addressed in a future series.
> >
> > This series changes the stack depot implementation to allow evicting
> > unneeded stack traces from the stack depot. The users of the stack depot
> > can do that via new stack_depot_save_flags(STACK_DEPOT_FLAG_GET) and
> > stack_depot_put APIs.
> >
> > Internal changes to the stack depot code include:
> >
> > 1. Storing stack traces in fixed-frame-sized slots; the slot size is
> >    controlled via CONFIG_STACKDEPOT_MAX_FRAMES (vs precisely-sized
> >    slots in the current implementation);
> > 2. Keeping available slots in a freelist (vs keeping an offset to the next
> >    free slot);
> > 3. Using a read/write lock for synchronization (vs a lock-free approach
> >    combined with a spinlock).
> >
> > This series also integrates the eviction functionality in the tag-based
> > KASAN modes.
> >
> > Despite wasting some space on rounding up the size of each stack record,
> > with CONFIG_STACKDEPOT_MAX_FRAMES=32, the tag-based KASAN modes end up
> > consuming ~5% less memory in stack depot during boot (with the default
> > stack ring size of 32k entries). The reason for this is the eviction of
> > irrelevant stack traces from the stack depot, which frees up space for
> > other stack traces.
> >
> > For other tools that heavily rely on the stack depot, like Generic KASAN
> > and KMSAN, this change leads to the stack depot capacity being reached
> > sooner than before. However, as these tools are mainly used in fuzzing
> > scenarios where the kernel is frequently rebooted, this outcome should
> > be acceptable.
> >
> > There is no measurable boot time performance impact of these changes for
> > KASAN on x86-64. I haven't done any tests for arm64 modes (the stack
> > depot without performance optimizations is not suitable for intended use
> > of those anyway), but I expect a similar result. Obtaining and copying
> > stack trace frames when saving them into stack depot is what takes the
> > most time.
> >
> > This series does not yet provide a way to configure the maximum size of
> > the stack depot externally (e.g. via a command-line parameter). This will
> > be added in a separate series, possibly together with the performance
> > improvement changes.
>
> Hi Marco and Alex,
>
> Could you PTAL at the not-yet-reviewed patches in this series when you
> get a chance?

There'll be a v3 with a few smaller still-pending fixes, right? I
think I looked at it a while back and the rest that I didn't comment
on looked fine, just waiting for v3.

Feel free to send a v3 by end of week. I'll try to have another look
today/tomorrow just in case I missed something, but if there are no
more comments please send v3 later in the week.
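
For readers following the thread, the scheme the cover letter describes (fixed-size slots, a freelist of available slots, and get/put reference counting that allows eviction) can be sketched in userspace C roughly as below. This is only an illustration under stated assumptions, not the kernel code: the names `struct slot`, `struct depot`, `depot_init`, `depot_save_get` and `depot_put` are hypothetical, `MAX_FRAMES` merely stands in for CONFIG_STACKDEPOT_MAX_FRAMES, lookup is a linear scan rather than whatever the stack depot uses internally, truncating over-long traces is an assumption of the sketch, and the read/write lock is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FRAMES 4	/* stand-in for CONFIG_STACKDEPOT_MAX_FRAMES */
#define POOL_SLOTS 2	/* tiny capacity so that exhaustion is easy to trigger */

/* One fixed-size slot: every trace occupies the same amount of space. */
struct slot {
	struct slot *next_free;		/* freelist link, meaningful only while free */
	unsigned int refcount;		/* 0 means the slot is free */
	unsigned int nr_frames;
	unsigned long frames[MAX_FRAMES];
};

struct depot {
	struct slot slots[POOL_SLOTS];
	struct slot *free_head;		/* head of the freelist */
};

/* Put every slot on the freelist. */
static void depot_init(struct depot *d)
{
	d->free_head = NULL;
	for (size_t i = 0; i < POOL_SLOTS; i++) {
		struct slot *s = &d->slots[i];

		memset(s, 0, sizeof(*s));
		s->next_free = d->free_head;
		d->free_head = s;
	}
}

/*
 * Save a trace and take a reference on it. An identical trace that is
 * already stored is reused; otherwise a slot is popped off the freelist.
 * Returns NULL when no free slot is left.
 */
static struct slot *depot_save_get(struct depot *d,
				   const unsigned long *frames, unsigned int n)
{
	struct slot *s;

	if (n > MAX_FRAMES)
		n = MAX_FRAMES;	/* assumption: over-long traces are truncated */

	/* Linear scan for an existing copy (illustrative only). */
	for (size_t i = 0; i < POOL_SLOTS; i++) {
		s = &d->slots[i];
		if (s->refcount > 0 && s->nr_frames == n &&
		    memcmp(s->frames, frames, n * sizeof(*frames)) == 0) {
			s->refcount++;
			return s;
		}
	}

	s = d->free_head;
	if (s == NULL)
		return NULL;	/* depot is full */

	d->free_head = s->next_free;
	s->next_free = NULL;
	s->refcount = 1;
	s->nr_frames = n;
	memcpy(s->frames, frames, n * sizeof(*frames));
	return s;
}

/* Drop a reference; the last put evicts the trace and frees the slot. */
static void depot_put(struct depot *d, struct slot *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0) {
		s->next_free = d->free_head;
		d->free_head = s;
	}
}
```

With two slots, saving two distinct traces fills the depot and a third save fails; after a put drops the last reference to one of them, the next save reuses that slot. That is the behaviour the keep-forever design could not offer.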