On Thu, May 18, 2023 at 09:44:33PM -0700, Alexei Starovoitov wrote:

> That footgun was removed from folly in 2021, but we still see this issue from time to time.
> My point that the kernel can help here.
> Since folks don't like sysctl to control FD assignment how about something like this:
>
> diff --git a/fs/file.c b/fs/file.c
> index 7893ea161d77..896e79433f61 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -554,9 +554,15 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags)
>  	return error;
>  }
>
> +__weak noinline u32 get_start_fd(void)
> +{
> +	return 0;
> +}
> +/* mark it as BPF_MODIFY_RETURN to let bpf progs adjust return value */
> +
>  int __get_unused_fd_flags(unsigned flags, unsigned long nofile)
>  {
> -	return alloc_fd(0, nofile, flags);
> +	return alloc_fd(get_start_fd(), nofile, flags);
>  }
>
> Then we can enforce fd >= 3 for a certain container or for a particular app.

[an extremely belated reply - had been net.dead since mid-May, just got
to that thread]

As far as I'm concerned, the main conclusion is that BPF handling of
file descriptors needs a fairly hostile code review, regarding the
interactions with dup2(), shared descriptor tables, SCM_RIGHTS, etc.

Rationale: demonstrated utter lack of clue about the nature of file
descriptors, along with a weird mental model of how they are used,
complete with "if they are used not in the way we expect, let's shove
a hook somewhere to enforce The Right Way(tm)".

Will do...