On 2/25/22 9:52 AM, Yucong Sun wrote:
In a previous commit (1), BPF preload process was switched from user
For a commit, you can just say:
In commit cb80ddc67152 ("bpf: Convert bpf_preload.ko to use light
skeleton."), BPF preload process ...
People normally use references such as [1] for web links.
mode process to use in-kernel light skeleton instead. However, in the
kernel context the available FDs start from 0, instead of the usual 3 for
a user mode process, and the preload process leaked two FDs, taking over
FD 0 and 1. This later caused issues when the kernel tries to set up
stdin/stdout/stderr for the init process, assuming FDs 0, 1 and 2 are available.
As seen here:
Before fix:
ls -lah /proc/1/fd/*
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/0 -> /dev/null
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/1 -> /dev/null
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/2 -> /dev/console
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/6 -> /dev/console
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/7 -> /dev/console
After Fix / Normal:
ls -lah /proc/1/fd/*
lrwx------ 1 root root 64 Feb 24 21:23 /proc/1/fd/0 -> /dev/console
lrwx------ 1 root root 64 Feb 24 21:23 /proc/1/fd/1 -> /dev/console
lrwx------ 1 root root 64 Feb 24 21:23 /proc/1/fd/2 -> /dev/console
In this patch:
- skel_closenz was changed to skel_closegez to correctly handle
the FD=0 case.
- Various places checking for FD > 0 were changed to check for FD >= 0.
- Call the iterators_skel__detach() function to release FDs after links
are obtained.
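To illustrate the FD=0 case from the list above, here is a minimal user-space sketch. It is not the kernel or libbpf code: close_if_positive(), close_if_valid(), fd_is_open() and leaks_fd_zero() are hypothetical names made up for this example, and the sketch only assumes the POSIX rule that open() returns the lowest free descriptor. It shows that a "fd > 0" guard never closes a descriptor that landed on 0, while "fd >= 0" does.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical old-style helper: treats 0 as "no descriptor",
 * so a file that happens to sit on FD 0 is never closed. */
static void close_if_positive(int fd)
{
	if (fd > 0)
		close(fd);
}

/* Hypothetical fixed-style helper: every non-negative value
 * is treated as a valid descriptor. */
static void close_if_valid(int fd)
{
	if (fd >= 0)
		close(fd);
}

/* Returns 1 if fd is currently open, 0 otherwise. */
static int fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}

/*
 * Free FD 0, open a file so that it lands on FD 0 (mimicking a context
 * where no descriptors are in use yet), then report whether the given
 * helper actually released it.
 * Returns 1 if the descriptor leaked, 0 if it was closed.
 */
static int leaks_fd_zero(void (*closer)(int))
{
	int fd;

	close(0);                          /* make 0 the lowest free descriptor */
	fd = open("/dev/null", O_RDONLY);
	assert(fd == 0);                   /* open() picks the lowest free FD */
	closer(fd);
	return fd_is_open(0);
}
```

Calling leaks_fd_zero(close_if_positive) should report a leak, while leaks_fd_zero(close_if_valid) should not. Note that the sketch closes the caller's stdin, so it is only suitable for a throwaway test program.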
1: commit cb80ddc67152 ("bpf: Convert bpf_preload.ko to use light skeleton.")
You don't need the above line.
Fixes: cb80ddc67152 ("bpf: Convert bpf_preload.ko to use light skeleton.")
Signed-off-by: Yucong Sun <fallentree@xxxxxx>
LGTM. One comment below.
Acked-by: Yonghong Song <yhs@xxxxxx>
V1 -> V2: renamed skel_closenez to skel_closegez, added comment as
requested.
---
kernel/bpf/preload/bpf_preload_kern.c | 4 ++++
kernel/bpf/preload/iterators/iterators.lskel.h | 16 +++++++++-------
tools/bpf/bpftool/gen.c | 9 +++++----
tools/lib/bpf/skel_internal.h | 8 ++++----
4 files changed, 22 insertions(+), 15 deletions(-)
diff --git a/kernel/bpf/preload/bpf_preload_kern.c b/kernel/bpf/preload/bpf_preload_kern.c
index 30207c048d36..3cc8bbfd15b1 100644
--- a/kernel/bpf/preload/bpf_preload_kern.c
+++ b/kernel/bpf/preload/bpf_preload_kern.c
@@ -14,6 +14,8 @@ static void free_links_and_skel(void)
bpf_link_put(maps_link);
if (!IS_ERR_OR_NULL(progs_link))
bpf_link_put(progs_link);
+ /* __detach() was already called before this, __destroy() will call it again, but
+ with no effect. */
iterators_bpf__destroy(skel);
This is not the right place for the comment, as free_links_and_skel()
is also called from the failure path of load_skel().
}
@@ -54,6 +56,8 @@ static int load_skel(void)
err = PTR_ERR(progs_link);
goto out;
}
+ /* Release all FDs */
+ iterators_bpf__detach(skel);
How about moving the comment from free_links_and_skel() to here? The
comment can be something like:
/* Release all FDs to avoid impacting stdin/stdout/stderr setup
 * in the init process. The later call to this function from
 * iterators_bpf__destroy() will be a no-op.
 */
return 0;
out:
free_links_and_skel();
diff --git a/kernel/bpf/preload/iterators/iterators.lskel.h b/kernel/bpf/preload/iterators/iterators.lskel.h
index 70f236a82fe1..6a93538fa69f 100644
--- a/kernel/bpf/preload/iterators/iterators.lskel.h
+++ b/kernel/bpf/preload/iterators/iterators.lskel.h
@@ -28,7 +28,7 @@ iterators_bpf__dump_bpf_map__attach(struct iterators_bpf *skel)
int prog_fd = skel->progs.dump_bpf_map.prog_fd;
int fd = skel_link_create(prog_fd, 0, BPF_TRACE_ITER);
[...]