Re: [PATCH 8/8] build: Decrease maximum stack frame size to 2048

Peter Krempa <pkrempa@xxxxxxxxxx> · Tue, 5 Sep 2023 11:10:26 +0200

On Tue, Sep 05, 2023 at 11:05:17 +0200, Michal Prívozník wrote:
> On 9/5/23 09:43, Peter Krempa wrote:
> > On Mon, Sep 04, 2023 at 16:09:00 +0200, Michal Prívozník wrote:
> >> On 8/30/23 13:59, Peter Krempa wrote:
> >>> After recent cleanups we can now restrict the maximum stack frame size
> >>> to 2k.
> >>>
> >>> Signed-off-by: Peter Krempa <pkrempa@xxxxxxxxxx>
> >>> ---
> >>>  meson.build | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/meson.build b/meson.build
> >>> index 965ada483b..e45f3e2c39 100644
> >>> --- a/meson.build
> >>> +++ b/meson.build
> >>> @@ -248,7 +248,7 @@ alloc_max = run_command(
> >>>  )
> >>>
> >>>  # sanitizer instrumentation may enlarge stack frames
> >>> -stack_frame_size = get_option('b_sanitize') == 'none' ? 4096 : 32768
> >>> +stack_frame_size = get_option('b_sanitize') == 'none' ? 2048 : 32768
> >>>
> >>>  # array_bounds=2 check triggers false positive on some GCC
> >>>  # versions when using sanitizers. Seen on Fedora 34 with
> >>
> >>
> >> I know this is already pushed but to be honest, I don't really
> >> understand why this is needed. I mean, we certainly do not want large
> >> frames, but IIUC only frames larger than a page are problem (i.e. 4KiB).
> >> Can you please point me to some docs where I can learn more?
> > 
> > The main idea of this is to ensure that we don't have massive stack
> > frames which could cause a stack overflow.
> 
> I'm failing to see what stack size and buffer overflow have in common.

I've specifically said "stack overflow". The issue where you put so much
stuff on the stack that you run out of stack space.

> To overwrite the return address you can has as small buffer as 1 byte
> and not check the boundaries (obviously). And yet, you can have 2K
> buffer on stack but if you check the boundaries when reading into it, no
> exploit is possible.
> 
> > 
> > But to be honest, it doesn't matter too much whether the actual limit
> > size is 4k or 2k. It still allows us to have very deep call stack, and a
> > factor of 2 in the size will not make a real difference.
> 
> Yeah. If we care about the actual stack usage (e.g. like kernel does -
> it has limit of 8KiB for the whole stack of a kernel thread), then
> restricting individual functions is not enough since we have recursive
> functions.

Normal programs have 8MiB per stack. We do have recursive functions but
by limiting the stack size to 2k now we can get very deep.

That is the reason I've also said that it really doesn't matter that
much whether it's 4k or 2k. It's just a factor of 2 improvement.

> > At one point I was looking at which functions have massive stack frames
> > and tried avoiding such state. Now this last set of patches was what I
> > had in my local branch for a long time and decided to finally get rid of
> > them. As you can see, there were multiple cases of 2k buffers being
> > stack allocated, so the end result IMO made sense.
> 
> Indeed and this reason alone is good enough. It made the code cleaner
> too. I was just wondering whether there is some deeper sense.
> 
> > 
> > The decreasing of the actual limit to 2k isn't as important as I've
> > noted but similarly to when we add a syntax check after a full-repo
> > cleanup it is a way to prevent regressions after a cleanup.
> > 
> 
> BTW: if you (or anybody else) want to identify functions which have
> massive stack usage, I've found clever perl script from kernel.git:
> 
> libvirt.git $ objdump -d _build/src/libvirt.so.0.9008.0 | \
> kernel.git/scripts/checkstack.pl | \
> head -n30
> 
> some very interesting functions appear on the list
> (virDomainDefFeaturesCheckABIStability()) - but when looking at the
> disassembled code directly, it's not just about initial sub 0x68, %rsp;
> it's also about how many arguments are passed to functions (esp. with
> variable arguments)
> 
> Michal
>