Re: [PATCH v2 0/4] kasan, arm64, scs, stacktrace: collect stack traces from Shadow Call Stack

Mark Rutland <mark.rutland@xxxxxxx> · Thu, 31 Mar 2022 13:39:01 +0100

On Thu, Mar 31, 2022 at 10:54:08AM +0100, Mark Rutland wrote:
> On Wed, Mar 23, 2022 at 04:32:51PM +0100, andrey.konovalov@xxxxxxxxx wrote:
> > From: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
> > 
> > kasan, arm64, scs, stacktrace: collect stack traces from Shadow Call Stack
> > 
> > Currently, KASAN always uses the normal stack trace collection routines,
> > which rely on the unwinder, when saving alloc and free stack traces.
> > 
> > Instead of invoking the unwinder, collect the stack trace by copying
> > frames from the Shadow Call Stack whenever it is enabled. This reduces
> > boot time by 30% for all KASAN modes when Shadow Call Stack is enabled.
> 
> That is an impressive number. TBH, I'm shocked that this has *that* much of an
> improvement, and I suspect this means we're doing something unnecessarily
> expensive in the regular unwinder.

I've had a quick look into this, to see what we could do to improve the regular
unwinder, but I can't reproduce that 30% number.

In local testing the worst case I could get to was 6-13% (with both the
stacktrace *and* stackdepot logic hacked out entirely).

I'm testing with clang 13.0.0 from the llvm.org binary releases, with defconfig
+ SHADOW_CALL_STACK + KASAN_<option>, using a very recent snapshot of mainline
(commit d888c83fcec75194a8a48ccd283953bdba7b2550). I'm booting a
KVM-accelerated QEMU VM on ThunderX2 with "init=/sbin/reboot -- -f" in the
kernel bootargs, timing the whole run from the outside with "perf stat --null".

The 6% figure is what I get if I count boot as a whole, including VM startup
and teardown (i.e. an under-estimate of the proportion); the 13% figure is what
I get if I subtract a baseline timing from a run without KASAN (i.e. an
over-estimate of the proportion).
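
For clarity, the arithmetic behind those two bounds can be sketched roughly as
below. This is only an illustration under one reading of the description above:
the function name, the parameter names and the wall-clock times are all
hypothetical placeholders, not the measurements from this thread.

```python
def overhead_bounds(t_kasan, t_kasan_no_traces, t_no_kasan):
    """Return (under, over) estimates of the share of time spent on stack traces.

    All three arguments are whole-run wall-clock times in seconds, e.g. as
    reported by "perf stat --null" around the VM run (hypothetical names):
      t_kasan           - KASAN kernel, unmodified
      t_kasan_no_traces - KASAN kernel with stacktrace + stackdepot hacked out
      t_no_kasan        - same config without KASAN (the baseline)
    """
    saved = t_kasan - t_kasan_no_traces

    # Denominator includes VM startup/teardown and all non-KASAN boot work,
    # so this under-estimates the proportion.
    under = saved / t_kasan

    # Denominator is only the extra time attributed to KASAN after
    # subtracting the baseline, so this over-estimates the proportion.
    over = saved / (t_kasan - t_no_kasan)

    return under, over


# Placeholder timings purely for illustration (NOT measured values).
under, over = overhead_bounds(50.0, 46.0, 20.0)
print(f"under-estimate: {under:.1%}, over-estimate: {over:.1%}")
```

The real proportion should lie somewhere between the two values, since the
first denominator is too large and the second is too small.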

Could you let me know how you're measuring this, and which platform+config
you're using?

I'll have a play with some configs in case there's a pathological
configuration, but if you could let me know how/what you're testing that'd be a
great help.

Thanks,
Mark.