This patch series adds the capability to destroy sockets in BPF. We plan
to use the capability in Cilium to force client sockets to reconnect when
their remote load-balancing backends are deleted. The other use case is
on-the-fly policy enforcement, where existing socket connections
prevented by policies need to be terminated.

The use cases, and more details around the selected approach, were
presented at LPC 2022 -
https://lpc.events/event/16/contributions/1358/.
RFC discussion -
https://lore.kernel.org/netdev/CABG=zsBEh-P4NXk23eBJw7eajB5YJeRS7oPXnTAzs=yob4EMoQ@xxxxxxxxxxxxxx/T/#u.
v2 patch series -
https://lore.kernel.org/bpf/20230223215311.926899-1-aditi.ghag@xxxxxxxxxxxxx/T/#t

v3 highlights:
- Martin's review comments:
  - UDP iterator batching patch supports resume operation.
  - Removed "full_sock" check from the destroy kfunc.
  - Reset of metadata in case of rebatching.
  - Extended selftests to cover cases for destroying listening sockets.
  - Fixes for destroying listening TCP and UDP sockets.
- Stan's review:
  - Refactored selftests to use ASSERT_* in lieu of CHECK.
  - Free leaking afinfo in fini_udp.
- Restructured test cases per Andrii's comment.

Notes to the reviewers:
- There are two RFC commits for being able to destroy listening TCP and
  UDP sockets. The TCP commit isn't quite correct, as inet_unhash could
  be invoked from BPF context for cases other than the iterator. The UDP
  commit seems reasonable based on my understanding of the code, but it
  may lead to unintended behavior when there are sockets listening on
  wildcard and specific addresses with a common port. I would appreciate
  insights into both commits, as I'm not intimately familiar with some
  of the overall code path.

(The notes below are the same as in the v2 patch series.)
- I hit a snag while writing the kfunc where the verifier complained
  about the `sock_common` type passed from the TCP iterator. With kfuncs,
  there don't seem to be any options available to pass BTF type hints to
  the verifier (the equivalent of `ARG_PTR_TO_BTF_ID_SOCK_COMMON`, as was
  the case with the helper). As a result, I changed the argument type of
  the sock_destroy kfunc to `sock_common`.
- The `vmlinux.h` import in the selftest prog unexpectedly led to libbpf
  failing to load the program. As it turns out, the libbpf kfunc-related
  code doesn't seem to handle the BTF `FWD` type for structs. I've
  attached debug information about the issue in case the loader logic
  can accommodate such gotchas, although the error in this case was
  specific to the test imports.

Aditi Ghag (5):
  bpf: Implement batching in UDP iterator
  bpf: Add bpf_sock_destroy kfunc
  [RFC] net: Skip taking lock in BPF context
  [RFC] udp: Fix destroying UDP listening sockets
  selftests/bpf: Add tests for bpf_sock_destroy

 include/net/udp.h                             |   1 +
 net/core/filter.c                             |  54 ++++
 net/ipv4/inet_hashtables.c                    |   9 +-
 net/ipv4/tcp.c                                |  16 +-
 net/ipv4/udp.c                                | 283 +++++++++++++++++-
 .../selftests/bpf/prog_tests/sock_destroy.c   | 190 ++++++++++++
 .../selftests/bpf/progs/sock_destroy_prog.c   | 151 ++++++++++
 7 files changed, 684 insertions(+), 20 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_destroy.c
 create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_prog.c

-- 
2.34.1