On Fri, Oct 7, 2022 at 3:54 PM Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote: > > This seems to be affected by -fno-conserve-stack, a currently gcc-only > command line flag. If I remove that, then i386 defconfig will inline > do_select but x86_64 defconfig will not. > > I have a sneaking suspicion that -fno-conserve-stack and > -Wframe-larger-than conspire in GCC to avoid inlining when doing so > would trip `-Wframe-larger-than` warnings, but it's just a conspiracy > theory; I haven't read the source. Probably should implement exactly > that behavior in LLVM. Sorry, that should have read `-fconserve-stack` (not `-fno-conserve-stack`). Playing with: https://godbolt.org/z/hE67j1Y9G experimentally, it looks like irrespective of -Wframe-larger-than, -fconserve-stack will try to avoid inlining callees if their frame size is 512B or greater for x86-64 targets, and 410B or greater for 32b targets. aarch64 is 410B though, perhaps that's leftover from the 32b ARM port. There's probably more to the story there though. > I'll triple check 32b+64b arm configs next week to verify. But if GCC > is not inlining do_select into core_sys_select then I think my patch > https://lore.kernel.org/llvm/20221007201140.1744961-1-ndesaulniers@xxxxxxxxxx/ > is on the right track; probably could drop the 32b-only condition and > make a note of GCC in the commit message. arm64 does not inline do_select into core_sys_select with aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110 for defconfig. $ CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 make -j128 defconfig fs/select.o $ llvm-objdump -Dr --disassemble-symbols=core_sys_select fs/select.o | grep do_select 1a48: 2e fb ff 97 bl 0x700 <do_select> Same for 32b ARM. arm-linux-gnueabi-gcc (Debian 10.2.1-6) 10.2.1 20210110 $ CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm make -j128 defconfig fs/select.o $ llvm-objdump -Dr --disassemble-symbols=core_sys_select fs/select.o | grep do_select 1620: 07 fc ff eb bl #-4068 <do_select> Is there a set of configs or different compiler version for which that's not the case? Perhaps. But it doesn't look like marking do_select noinline_for_stack changes the default behavior for GCC builds, which is good. So it looks like it's just clang being aggressive with inlining since it doesn't have -fconserve-stack. I think https://lore.kernel.org/lkml/20221007201140.1744961-1-ndesaulniers@xxxxxxxxxx/ is still on the right track, though I'd remove the 32b only guard for v2. Christophe mentioned something about KASAN and GCC. I failed to reproduce, and didn't see any reports on lore that seemed relevant. https://lore.kernel.org/lkml/20221010074409.GA20998@xxxxxx/ I'll wait a day to see if there's more info (a config that reproduces) before sending a v2 though. > Also, my colleague Paul just whipped up a neat tool to help debug > -Wframe-larger-than. > https://reviews.llvm.org/D135488 > See the output from my run here: > https://paste.debian.net/1256338/ > It's a very early WIP, but I think it would be incredibly helpful to > have this, and will probably help us improve Clang's stack usage. Paul also mentioned that -finline-max-stacksize is a thing, at least for clang. https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-finline-max-stacksize Though this only landed recently https://reviews.llvm.org/rG8564e2fea559 and wont ship until clang-16. That feels like a large hammer for core_sys_select/do_select; I think we can use a fine scalpel. But it might be interesting to use that with KASAN. -- Thanks, ~Nick Desaulniers