On 6/30/20 1:39 AM, Lorenz Bauer wrote:
On Tue, 30 Jun 2020 at 06:48, Yonghong Song <yhs@xxxxxx> wrote:
Since bpf_iter is mentioned here, I would like to provide a little
context on how target_fd in link_create is treated there.
Thanks!
Currently, target_fd is always 0 as it is not used. This is
just easier if we want to use it in the future.
In the future, bpf_iter can maintain that target_fd must be 0
or it may not so. For example, it can add a flag value in
link_create such that when flag is set it will take whatever
value in target_fd and use it. Or it may just take a non-0
target_fd as an indication of the flag is set. I have not
finalized patches yet. I intend to do the latter, i.e.,
taking a non-0 target_fd. But we will see once my bpf_iter
patches for map elements are out.
I had a piece of code for sockmap which did something like this:
prog = bpf_prog_get(attr->attach_bpf_fd)
if (IS_ERR(prog))
if (!attr->attach_bpf_fd)
// fall back to old behaviour
else
return PTR_ERR(prog)
else if (prog->type != TYPE)
return -EINVAL
The benefit is that it continues to work if a binary is invoked with
stdin closed, which could lead to a BPF program with fd 0.
For bpf_iter, there is no legacy. So I will have something like
// somecondition could be new attr->flags, or some kernel internal
checking
if (somecondition) {
/* not accepting fd 0 */
if (attr->attach_bpf_fd == 0)
return -EINVAL;
prog = bpf_prog_get(attr->attach_bpf_fd)
if (IS_ERR(prog))
return PTR_ERR(prog)
} else if (attr->attach_bpf_fd != 0)
return -EINVAL;
or I could have
if (somecondition) {
/* accepting any fd */
prog = bpf_prog_get(attr->attach_bpf_fd)
if (IS_ERR(prog))
return PTR_ERR(prog)
} else if (attr->attach_bpf_fd != 0)
return -EINVAL;
This "somecondition" is false for the current bpf_iter, so existing
behavior attr->attach_bpf_fd == 0 is still enforced.
Could this work for bpf_iter as well?
There is another example where 0 and non-0 prog_fd make a difference.
The attach_prog_fd field when doing prog_load.
When attach_prog_fd is 0, it means attaching to vmlinux through
attach_btf_id. If attach_prog_fd is not 0, it means attaching to
another bpf program (replace). So user space (libbpf) may
already need to pay attention to this.
That is unfortunate. What was the reason to use 0 instead of -1 to
attach to vmlinux?
attaching to vmlinux happens first and at that time attach_prog_fd
does not exist. Later when replace prog feature is introduced,
attach_prog_fd is added. This field is used to differentiate
between vmlinux func attachment vs. bpf_prog attachment. A little
bit unfortunate, but using 0 is easier as we have check_attr
in the kernel to ensure all kernel-unsupported fields must be 0.
using -1 will break that.
work around for fd 0 should we need to in the future.
The detach case is more problematic: both cgroups and lirc2 verify
that attach_bpf_fd matches the currently attached program. This
way you need access to the program fd to be able to remove it.
Neither sockmap nor flow_dissector do this. flow_dissector even
has a check for CAP_NET_ADMIN because of this. The patch set
addresses this by implementing the desired behaviour.
There is a possibility for user space breakage: any callers that
don't provide the correct fd will fail with ENOENT. For sockmap
the risk is low: even the selftests assume that sockmap works
the way I described. For flow_dissector the story is less
straightforward, and the selftests use a variety of arguments.
I've includes fixes tags for the oldest commits that allow an easy
backport, however the behaviour dates back to when sockmap and
flow_dissector were introduced. What is the best way to handle these?
This set is based on top of Jakub's work "bpf, netns: Prepare
for multi-prog attachment" available at
https://lore.kernel.org/bpf/87k0zwmhtb.fsf@xxxxxxxxxxxxxx/T/
Since v1:
- Adjust selftests
- Implement detach behaviour
Lorenz Bauer (6):
bpf: flow_dissector: check value of unused flags to BPF_PROG_ATTACH
bpf: flow_dissector: check value of unused flags to BPF_PROG_DETACH
bpf: sockmap: check value of unused args to BPF_PROG_ATTACH
bpf: sockmap: require attach_bpf_fd when detaching a program
selftests: bpf: pass program and target_fd in flow_dissector_reattach
selftests: bpf: pass program to bpf_prog_detach in flow_dissector
include/linux/bpf-netns.h | 5 +-
include/linux/bpf.h | 13 ++++-
include/linux/skmsg.h | 13 +++++
kernel/bpf/net_namespace.c | 22 ++++++--
kernel/bpf/syscall.c | 6 +--
net/core/sock_map.c | 53 +++++++++++++++++--
.../selftests/bpf/prog_tests/flow_dissector.c | 4 +-
.../bpf/prog_tests/flow_dissector_reattach.c | 12 ++---
8 files changed, 103 insertions(+), 25 deletions(-)