On Tue, 30 Jun 2020 at 06:48, Yonghong Song <yhs@xxxxxx> wrote: > > Since bpf_iter is mentioned here, I would like to provide a little > context on how target_fd in link_create is treated there. Thanks! > Currently, target_fd is always 0 as it is not used. This is > just easier if we want to use it in the future. > > In the future, bpf_iter can maintain that target_fd must be 0 > or it may not so. For example, it can add a flag value in > link_create such that when flag is set it will take whatever > value in target_fd and use it. Or it may just take a non-0 > target_fd as an indication of the flag is set. I have not > finalized patches yet. I intend to do the latter, i.e., > taking a non-0 target_fd. But we will see once my bpf_iter > patches for map elements are out. I had a piece of code for sockmap which did something like this: prog = bpf_prog_get(attr->attach_bpf_fd) if (IS_ERR(prog)) if (!attr->attach_bpf_fd) // fall back to old behaviour else return PTR_ERR(prog) else if (prog->type != TYPE) return -EINVAL The benefit is that it continues to work if a binary is invoked with stdin closed, which could lead to a BPF program with fd 0. Could this work for bpf_iter as well? > > There is another example where 0 and non-0 prog_fd make a difference. > The attach_prog_fd field when doing prog_load. > When attach_prog_fd is 0, it means attaching to vmlinux through > attach_btf_id. If attach_prog_fd is not 0, it means attaching to > another bpf program (replace). So user space (libbpf) may > already need to pay attention to this. That is unfortunate. What was the reason to use 0 instead of -1 to attach to vmlinux? > > > work around for fd 0 should we need to in the future. > > > > The detach case is more problematic: both cgroups and lirc2 verify > > that attach_bpf_fd matches the currently attached program. This > > way you need access to the program fd to be able to remove it. > > Neither sockmap nor flow_dissector do this. flow_dissector even > > has a check for CAP_NET_ADMIN because of this. The patch set > > addresses this by implementing the desired behaviour. > > > > There is a possibility for user space breakage: any callers that > > don't provide the correct fd will fail with ENOENT. For sockmap > > the risk is low: even the selftests assume that sockmap works > > the way I described. For flow_dissector the story is less > > straightforward, and the selftests use a variety of arguments. > > > > I've includes fixes tags for the oldest commits that allow an easy > > backport, however the behaviour dates back to when sockmap and > > flow_dissector were introduced. What is the best way to handle these? > > > > This set is based on top of Jakub's work "bpf, netns: Prepare > > for multi-prog attachment" available at > > https://lore.kernel.org/bpf/87k0zwmhtb.fsf@xxxxxxxxxxxxxx/T/ > > > > Since v1: > > - Adjust selftests > > - Implement detach behaviour > > > > Lorenz Bauer (6): > > bpf: flow_dissector: check value of unused flags to BPF_PROG_ATTACH > > bpf: flow_dissector: check value of unused flags to BPF_PROG_DETACH > > bpf: sockmap: check value of unused args to BPF_PROG_ATTACH > > bpf: sockmap: require attach_bpf_fd when detaching a program > > selftests: bpf: pass program and target_fd in flow_dissector_reattach > > selftests: bpf: pass program to bpf_prog_detach in flow_dissector > > > > include/linux/bpf-netns.h | 5 +- > > include/linux/bpf.h | 13 ++++- > > include/linux/skmsg.h | 13 +++++ > > kernel/bpf/net_namespace.c | 22 ++++++-- > > kernel/bpf/syscall.c | 6 +-- > > net/core/sock_map.c | 53 +++++++++++++++++-- > > .../selftests/bpf/prog_tests/flow_dissector.c | 4 +- > > .../bpf/prog_tests/flow_dissector_reattach.c | 12 ++--- > > 8 files changed, 103 insertions(+), 25 deletions(-) > > -- Lorenz Bauer | Systems Engineer 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK www.cloudflare.com