On Tue, May 12, 2020 at 11:46 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote: > > I am still thinking about this one, but here is where I am at. At a > practical level passing the file descriptor of the script to interpreter > seems like something we should encourage in the long term. It removes > races and it is cheaper because then the interpreter does not have to > turn around and open the script itself. Yeah, I think we should continue to support it, because I think it's the right thing to do (and we might just end up having compatibility issues if we don't). How about trying to move the logic to the common code, out of binfmt_misc? IOW, how about something very similar to your "brpm->preserve_creds" thing that you did for the credentials (also for binfmt_misc, which shouldn't surprise anybody: binfmt_misc is simply the "this is the generic thing for letting user mode do the final details"). > Calling fd_install in binfmt_misc still seems wrong, as that exposes > the new file descriptor to user space with the old creds. Right. And it really would be good to simply not have these kinds of very special cases inside the low-level binfmt code: I'd much rather have the special cases in the generic code, so that we see what the ordering is etc. One of the big problems with all these binfmt callbacks has been the fact that it makes it so hard to think about and change the generic code, because the low-level binfmt handlers all do their own special thing. So moving it to generic code would likely simplify things from that angle, even if the actual complexity of the feature itself remains. Besides, we really have exposed this to other code anyway thanks to that whole bprm->interp_data thing, and the AT_EXECFD AUX entries that we have. So it's not really "internal" to binfmt_misc _anyway_. So how about we just move the fd_binary logic to the generic execve code, and just binfmt_misc set the flag for "yes, please do this", exactly like "preserve_creds"? > It is possible although unlikely for userspace to find the file > descriptor without consulting AT_EXECFD so just to be conservative I > think we should install the file descriptor in begin_new_exec even if > the next interpreter does not support AT_EXECFD. Ack. I think the AT_EXECFD thing is a sign that this isn't internal to binfmt_misc, but it also shouldn't be gating this issue. In reality, ELF is the only real binary format that matters - the script/misc binfmts are just indirection entries - and it supports AT_EXECFD, so let's just ignore the theoretical case of "maybe nobody exposes it". So yes, just make it part of begin_new_exec(), and there's no reason to support more than a single fd. No stacks or arrays of these things required, I feel. It's not like AT_EXECFD supports the notion of multiple fd's being reported anyway, nor does it make any sense to have some kind of nested misc->misc binfmt nesting. So making that whole interp_data and fd_binary thing be a generic layer thing would make the search_binary_handler() code in binfmt_misc be a pure tailcall too, and then the conversion to a loop ends up working and being the right thing. No? Linus