In a previous commit (1), the BPF preload process was switched from a
user mode process to an in-kernel light skeleton. However, in the kernel
context the available FDs start from 0, instead of the usual 3 for a user
mode process. The preload process also left two FDs open, taking over
FD 0 and 1. This later caused issues when the kernel tries to set up
stdin/stdout/stderr for the init process, assuming FD 0, 1 and 2 are
available.

As seen here:

Before fix:
ls -lah /proc/1/fd/*
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/0 -> /dev/null
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/1 -> /dev/null
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/2 -> /dev/console
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/6 -> /dev/console
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/7 -> /dev/console

After fix / Normal:
ls -lah /proc/1/fd/*
lrwx------ 1 root root 64 Feb 24 21:23 /proc/1/fd/0 -> /dev/console
lrwx------ 1 root root 64 Feb 24 21:23 /proc/1/fd/1 -> /dev/console
lrwx------ 1 root root 64 Feb 24 21:23 /proc/1/fd/2 -> /dev/console

In this patch:
  - skel_closenz was changed to skel_closegez to correctly handle the
    FD=0 case.
  - various places detecting FD > 0 were changed to FD >= 0.
  - Call the iterators_skel__detach() function to release FDs after
    links are obtained.

1: commit cb80ddc ("bpf: Convert bpf_preload.ko to use light skeleton.")

Fixes: commit cb80ddc ("bpf: Convert bpf_preload.ko to use light skeleton.")
Signed-off-by: Yucong Sun <fallentree@xxxxxx>

V3 -> V1: removed all changes related to handle fd=0.
V2 -> V1: rename skel_closenez to skel_closegez, added comment as requested.
---
 kernel/bpf/preload/bpf_preload_kern.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/bpf/preload/bpf_preload_kern.c b/kernel/bpf/preload/bpf_preload_kern.c
index 30207c048d36..13cd0d146dd7 100644
--- a/kernel/bpf/preload/bpf_preload_kern.c
+++ b/kernel/bpf/preload/bpf_preload_kern.c
@@ -54,6 +54,16 @@ static int load_skel(void)
 		err = PTR_ERR(progs_link);
 		goto out;
 	}
+	/* Avoid taking over stdin/stdout/stderr of init process. This also
+	   makes skel_closenz() no-op later in free_links_and_skel(). */
+	if (skel->links.dump_bpf_map_fd < 3) {
+		close_fd(skel->links.dump_bpf_map_fd);
+		skel->links.dump_bpf_map_fd = 0;
+	}
+	if (skel->links.dump_bpf_prog_fd < 3) {
+		close_fd(skel->links.dump_bpf_prog_fd);
+		skel->links.dump_bpf_prog_fd = 0;
+	}
 	return 0;
 out:
 	free_links_and_skel();
-- 
2.30.2
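
For readers who want to reproduce the FD numbering effect outside the
kernel, the following is a rough user-space sketch, not part of the
patch. It assumes a POSIX system, where open() returns the lowest-numbered
unused descriptor. The helper names release_if_stdio() and
open_with_stdin_free() are hypothetical and exist only for illustration;
the patch itself uses the in-kernel close_fd() on the skeleton's link FDs.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): user-space analogue of the
 * "fd < 3" check added to load_skel(). If *fd occupies a stdio slot,
 * close it and reset it to 0, so that a later "close if nonzero" helper
 * would skip it. Returns 1 if the descriptor was released, 0 otherwise.
 * Unlike the patch, negative values are left untouched.
 */
static int release_if_stdio(int *fd)
{
	if (*fd >= 0 && *fd < 3) {
		close(*fd);
		*fd = 0;
		return 1;
	}
	return 0;
}

/*
 * Hypothetical helper: make FD 0 available, then open /dev/null.
 * Because open() hands out the lowest unused descriptor, the new file
 * lands on FD 0, which is the situation the preload code runs into
 * when no descriptors are open yet.
 */
static int open_with_stdin_free(void)
{
	close(STDIN_FILENO);
	return open("/dev/null", O_RDWR);
}
```

Calling open_with_stdin_free() and then release_if_stdio() on the result
frees FD 0 again, while a descriptor numbered 3 or higher is left alone.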