On Mon, Oct 16, 2023 at 11:33:18PM +0100, Sergei Trofimovich wrote: > On Mon, 16 Oct 2023 14:50:05 -0700 > Kees Cook <keescook@xxxxxxxxxxxx> wrote: > > > On Mon, Oct 16, 2023 at 10:25:07PM +0100, Sergei Trofimovich wrote: > > > Before the change linux allowed individual execve() arguments or > > > environment variable entries to be only as big as 32 pages. > > > > > > Histroically before b6a2fea3931 "mm: variable length argument support" > > > MAX_ARG_STRLEN used to be full allowed size `argv[] + envp[]`. > > > > > > When full limit was abandoned individual parameters were still limited > > > by a safe limit of 128K. > > > > > > Nowadays' linux allows `argv[]+envp[]` to be as laerge as 6MB (3/4 > > > `_STK_LIM`). > > > > See bprm_stack_limits() for the details, but yes, 3/4 _STK_LIM tends to > > be the max, unless the rlim_stack is set smaller. > > > > > Some build systems like `autoconf` use a single environment variable > > > to pass `CFLAGS` environment variable around. It's not a bug problem > > > if the argument list is short. > > > > > > But some packaging systems prefer installing each package into > > > individual directory. As a result that requires quite long string of > > > parameters like: > > > > > > CFLAGS="-I/path/to/pkg1 -I/path/to/pkg2 ..." > > > > > > This can easily overflow 128K and does happen for `NixOS` and `nixpkgs` > > > repositories on a regular basis. > > > > That's ... alarming. What are you doing currently to work around this? > > We usually try to stay under the threshold. When the problem arises due > to organic growth of inputs over time we either suffer build failures > without any reasonable workarounds or if the change was recent and > inflated command line options we revert the change and add hacks into > other places (like patching `gcc` directly to apply the > transformations). > > Longer term it would be nice for `gcc` to allow unbounded output via > response files, but it will take some time to sort out as current flags > rewriting expands all flags and response files into a single huge > variable and hits the 128K limit: > > https://gcc.gnu.org/pipermail/gcc/2023-October/242641.html > > Medium term dropping kernel's arbitrary small limit (this change) sounds > like the simplest solution. > > > > > > > Similar pattern is exhibited by `gcc` which converts it's input command > > > line into a single environment variable (https://gcc.gnu.org/PR111527): > > > > > > $ big_100k_var=$(printf "%0*d" 100000 0) > > > > > > # this works: 200KB of options for `printf` external command > > > $ $(which printf) "%s %s" $big_100k_var $big_100k_var >/dev/null; echo $? > > > 0 > > > > > > # this fails: 200KB of options for `gcc`, fails in `cc1` > > > $ touch a.c; gcc -c a.c -DA=$big_100k_var -DB=$big_100k_var > > > gcc: fatal error: cannot execute 'cc1': execv: Argument list too long > > > compilation terminated. > > > > > > I would say this 128KB limitation is arbitrary. > > > The change raises the limit of `MAX_ARG_STRLEN` from 32 pakes (128K > > > n `x86_64`) to the maximum limit of stack allowed by Linux today. > > > > > > It has a minor chance of overflowing userspace programs that use > > > `MAX_ARG_STRLEN` to allocate the strings on stack. It should not be a > > > big problem as such programs are already are at risk of overflowing > > > stack. > > > > > > Tested as: > > > $ V=$(printf "%*d" 1000000 0) ls > > > > > > Before the change it failed with `ls: Argument list too long`. After the > > > change `ls` executes as expected. > > > > > > WDYT of abandoning the limit and allow user to fill entire environment > > > with a single command or a single variable? > > > > Changing this value shouldn't risk any vma collisions, since exec is > > still checking the utilization before starting the actual process > > replacement steps (see bprm->argmin). > > > > It does seem pathological to set this to the full 6MB, though, since > > that would _immediately_ get rejected by execve() with an -E2BIG, but > > ultimately, there isn't really any specific limit to the length of > > individual strings as long as the entire space is less than > > bprm->argmin. > > > > Perhaps MAX_ARG_STRLEN should be removed entirely and the kernel can > > just use bprm->argmin? I haven't really looked at that to see if it's > > sane, though. It just feels like MAX_ARG_STRLEN isn't a meaningful > > limit. > > Removing the limit entirely in favour of 'bprm->argmin' sounds good. > I'll try to make it so and will send v2. > > What should be the fate of userspace-exported `MAX_ARG_STRLEN` value in > that case? Should it stay `(PAGE_SIZE * 32)`? Should it be removed > entirely? (a chance of user space build failures). > > If we are to remove it entirely what should be the reasonable limit in > kernel for other subsystems that still use it? > > fs/binfmt_elf.c: len = strnlen_user((void __user *)p, MAX_ARG_STRLEN); > fs/binfmt_elf_fdpic.c: len = strnlen_user(p, MAX_ARG_STRLEN); > fs/binfmt_flat.c: len = strnlen_user(p, MAX_ARG_STRLEN); > kernel/auditsc.c: len_full = strnlen_user(p, MAX_ARG_STRLEN) - 1; > > Keeping it at an "arbitrary" 6MB limit looked safe :) Yeah, I think leaving MAX_ARG_STRLEN totally unchanged but adjust where it is used only for the ELF loader is probably the least risky thing to do here. -Kees > > > -Kees > > > > > > > > CC: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > > > CC: Eric Biederman <ebiederm@xxxxxxxxxxxx> > > > CC: Kees Cook <keescook@xxxxxxxxxxxx> > > > CC: linux-mm@xxxxxxxxx > > > CC: linux-kernel@xxxxxxxxxxxxxxx > > > Signed-off-by: Sergei Trofimovich <slyich@xxxxxxxxx> > > > --- > > > include/uapi/linux/binfmts.h | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/include/uapi/linux/binfmts.h b/include/uapi/linux/binfmts.h > > > index c6f9450efc12..4e828515a22e 100644 > > > --- a/include/uapi/linux/binfmts.h > > > +++ b/include/uapi/linux/binfmts.h > > > @@ -8,11 +8,11 @@ struct pt_regs; > > > > > > /* > > > * These are the maximum length and maximum number of strings passed to the > > > - * execve() system call. MAX_ARG_STRLEN is essentially random but serves to > > > - * prevent the kernel from being unduly impacted by misaddressed pointers. > > > + * execve() system call. MAX_ARG_STRLEN is as large as Linux allows new > > > + * stack to grow. Currently it's `_STK_LIM / 4 * 3 = 6MB` (see fs/exec.c). > > > * MAX_ARG_STRINGS is chosen to fit in a signed 32-bit integer. > > > */ > > > -#define MAX_ARG_STRLEN (PAGE_SIZE * 32) > > > +#define MAX_ARG_STRLEN (6 * 1024 * 1024) > > > #define MAX_ARG_STRINGS 0x7FFFFFFF > > > > > > /* sizeof(linux_binprm->buf) */ > > > -- > > > 2.42.0 > > > > > > > -- > > Kees Cook > > > -- > > Sergei -- Kees Cook