On Fri, Apr 16, 2021 at 10:01:43PM -0700, Alexei Starovoitov wrote:
> On Fri, Apr 16, 2021 at 9:04 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Apr 16, 2021 at 08:46:05PM -0700, Alexei Starovoitov wrote:
> > > On Fri, Apr 16, 2021 at 8:42 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Fri, Apr 16, 2021 at 08:32:20PM -0700, Alexei Starovoitov wrote:
> > > > > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > > >
> > > > > Add bpf_sys_close() helper to be used by the syscall/loader program to close
> > > > > intermediate FDs and other cleanup.
> > > >
> > > > Conditional NAK. In a lot of contexts close_fd() is very much unsafe.
> > > > In particular, anything that might call it between fdget() and fdput()
> > > > is Right Fucking Out(tm).
> > > > In which contexts can that thing be executed?
> > >
> > > user context only.
> > > It's not for all of bpf _obviously_.
> >
> > Let me restate the question: what call chains could lead to bpf_sys_close()?
>
> Already answered. User context only. It's all safe.

Not only is sys_close safe to call. Literally all syscalls are safe to call.
The current allowlist contains two syscalls. It may get extended as use cases
come up.

The following two pieces of code are equivalent:

1.
bpf_prog.c:
    SEC("syscall")
    int bpf_prog(struct args *ctx)
    {
        bpf_sys_close(1);
        bpf_sys_close(2);
        bpf_sys_close(3);
        return 0;
    }

main.c:
    int main(int ac, char **av)
    {
        bpf_prog_load_and_run("bpf_prog.o");
    }

2.
main.c:
    int main(int ac, char **av)
    {
        close(1);
        close(2);
        close(3);
    }

The kernel will perform the same work with FDs. The same locks are held and
the execution conditions are the same in both cases. The LSM hooks, fsnotify,
etc. will be called the same way.

It's no different than if a new syscall "sys_foo(int num)" were introduced
that did { return close_fd(num); }. It would operate in the same user context.
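
For anyone who wants to poke at the user-space side of case 2 by hand, here is a minimal sketch. It is only an illustration and not part of the patch: fd_is_open(), close_pipe_pair() and run_demo() are hypothetical helper names, and a pipe is used purely as a convenient way to get two fresh FDs instead of closing 1, 2 and 3.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: 1 if fd refers to an open file in this process. */
static int fd_is_open(int fd)
{
    /* F_GETFD fails with EBADF when fd is not an open descriptor. */
    return fcntl(fd, F_GETFD) != -1;
}

/*
 * Hypothetical helper: create two FDs, close them with plain close(2)
 * from ordinary user context, and check what the FD table looks like
 * afterwards. Returns 0 if everything behaved as expected, -1 otherwise.
 */
static int close_pipe_pair(void)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;
    if (!fd_is_open(fds[0]) || !fd_is_open(fds[1]))
        return -1;

    if (close(fds[0]) != 0 || close(fds[1]) != 0)
        return -1;

    /* Both descriptors must be gone from the FD table now. */
    if (fd_is_open(fds[0]) || fd_is_open(fds[1]))
        return -1;

    /* Closing an already-closed FD must fail with EBADF. */
    errno = 0;
    if (close(fds[0]) != -1 || errno != EBADF)
        return -1;

    return 0;
}

/* Hypothetical entry point: call this from main() to run the check. */
static void run_demo(void)
{
    assert(close_pipe_pair() == 0);
}
```

Calling run_demo() from main() exercises only the plain close() path. It says nothing by itself about the BPF helper; the claim in this mail is that bpf_sys_close() ends up doing the same FD work from the same user context.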