Hi Stanislav, On 28/07/2023 18:51, Stanislav Fomichev wrote: > On 07/28, Matthieu Baerts wrote: >> Hi Stanislav, >> >> On 27/07/2023 20:01, Stanislav Fomichev wrote: >>> On 07/27, Matthieu Baerts wrote: >>>> Hi Paul, Stanislav, >>>> >>>> On 18/07/2023 18:14, Paul Moore wrote: >>>>> On Tue, Jul 18, 2023 at 11:21 AM Geliang Tang <geliang.tang@xxxxxxxx> wrote: >>>>>> >>>>>> As is described in the "How to use MPTCP?" section in MPTCP wiki [1]: >>>>>> >>>>>> "Your app can create sockets with IPPROTO_MPTCP as the proto: >>>>>> ( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be >>>>>> forced to create and use MPTCP sockets instead of TCP ones via the >>>>>> mptcpize command bundled with the mptcpd daemon." >>>>>> >>>>>> But the mptcpize (LD_PRELOAD technique) command has some limitations >>>>>> [2]: >>>>>> >>>>>> - it doesn't work if the application is not using libc (e.g. GoLang >>>>>> apps) >>>>>> - in some envs, it might not be easy to set env vars / change the way >>>>>> apps are launched, e.g. on Android >>>>>> - mptcpize needs to be launched with all apps that want MPTCP: we could >>>>>> have more control from BPF to enable MPTCP only for some apps or all the >>>>>> ones of a netns or a cgroup, etc. >>>>>> - it is not in BPF, we cannot talk about it at netdev conf. >>>>>> >>>>>> So this patchset attempts to use BPF to implement functions similer to >>>>>> mptcpize. >>>>>> >>>>>> The main idea is add a hook in sys_socket() to change the protocol id >>>>>> from IPPROTO_TCP (or 0) to IPPROTO_MPTCP. >>>>>> >>>>>> [1] >>>>>> https://github.com/multipath-tcp/mptcp_net-next/wiki >>>>>> [2] >>>>>> https://github.com/multipath-tcp/mptcp_net-next/issues/79 >>>>>> >>>>>> v5: >>>>>> - add bpf_mptcpify helper. >>>>>> >>>>>> v4: >>>>>> - use lsm_cgroup/socket_create >>>>>> >>>>>> v3: >>>>>> - patch 8: char cmd[128]; -> char cmd[256]; >>>>>> >>>>>> v2: >>>>>> - Fix build selftests errors reported by CI >>>>>> >>>>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79 >>>>>> Signed-off-by: Geliang Tang <geliang.tang@xxxxxxxx> >>>>>> --- >>>>>> include/linux/bpf.h | 1 + >>>>>> include/linux/lsm_hook_defs.h | 2 +- >>>>>> include/linux/security.h | 6 +- >>>>>> include/uapi/linux/bpf.h | 7 + >>>>>> kernel/bpf/bpf_lsm.c | 2 + >>>>>> net/mptcp/bpf.c | 20 +++ >>>>>> net/socket.c | 4 +- >>>>>> security/apparmor/lsm.c | 8 +- >>>>>> security/security.c | 2 +- >>>>>> security/selinux/hooks.c | 6 +- >>>>>> tools/include/uapi/linux/bpf.h | 7 + >>>>>> .../testing/selftests/bpf/prog_tests/mptcp.c | 128 ++++++++++++++++-- >>>>>> tools/testing/selftests/bpf/progs/mptcpify.c | 17 +++ >>>>>> 13 files changed, 187 insertions(+), 23 deletions(-) >>>>>> create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c >>>>> >>>>> ... >>>>> >>>>>> diff --git a/security/security.c b/security/security.c >>>>>> index b720424ca37d..bbebcddce420 100644 >>>>>> --- a/security/security.c >>>>>> +++ b/security/security.c >>>>>> @@ -4078,7 +4078,7 @@ EXPORT_SYMBOL(security_unix_may_send); >>>>>> * >>>>>> * Return: Returns 0 if permission is granted. >>>>>> */ >>>>>> -int security_socket_create(int family, int type, int protocol, int kern) >>>>>> +int security_socket_create(int *family, int *type, int *protocol, int kern) >>>>>> { >>>>>> return call_int_hook(socket_create, 0, family, type, protocol, kern); >>>>>> } >>>>> >>>>> Using the LSM to change the protocol family is not something we want >>>>> to allow. I'm sorry, but you will need to take a different approach. >>>> >>>> @Paul: Thank you for your feedback. It makes sense and I understand. >>>> >>>> @Stanislav: Despite the fact the implementation was smaller and reusing >>>> more code, it looks like we cannot go in the direction you suggested. Do >>>> you think what Geliang suggested before in his v3 [1] can be accepted? >>>> >>>> (Note that the v3 is the same as the v1, only some fixes in the selftests.) >>> >>> We have too many hooks in networking, so something that doesn't add >>> a new one is preferable :-( >> >> Thank you for your reply and the explanation, I understand. >> >>> Moreover, we already have a 'socket init' hook, but it runs a bit late. >> >> Indeed. And we cannot move it before the creation of the socket. >> >>> Is existing cgroup/sock completely unworkable? Is it possible to >>> expose some new bpf_upgrade_socket_to(IPPROTO_MPTCP) kfunc which would >>> call some new net_proto_family->upgrade_to(IPPROTO_MPTCP) to do the surgery? >>> Or is it too hacky? >> >> I cannot judge if it is too hacky or not but if you think it would be >> OK, please tell us :) > > Maybe try and see how it goes? Doing the surgery to convert from tcp > to mptcp is probably hard, but it seems that we should be able to > do something like: > > int upgrade_to(sock, sk) { > if (sk is not a tcp one) return -EINVAL; > > sk_common_release(sk); > return inet6_create(net, sock, IPPROTO_MPTCP, false); > } > > ? > > The only thing I'm not sure about is whether you can call inet6_create > on a socket that has seen sk_common_release'd... Oh sorry, now I better understand your suggestion and the fact it is hacky. Good workaround, we can keep this in mind if there is no other solutions to avoid these create-release-create operations. >>> Another option Alexei suggested is to add some fentry-like thing: >>> >>> noinline int update_socket_protocol(int protocol) >>> { >>> return protocol; >>> } >>> /* TODO: ^^^ add the above to mod_ret set */ >>> >>> int __sys_socket(int family, int type, int protocol) >>> { >>> ... >>> >>> protocol = update_socket_protocol(protocol); >>> >>> ... >>> } >>> >>> But it's also too problem specific it seems? And it's not cgroup-aware. >> >> It looks like it is what Geliang did in his v6. If it is the only >> acceptable solution, I guess we can do without cgroup support. We can >> continue the discussions in his v6 if that's easier. > > Ack, that works too, let's see how other people feel about it. I'm > assuming in the bpf program we can always do bpf_get_current_cgroup_id() > to filter by cgroup. Good point, that works too and looks enough! Cheers, Matt -- Tessares | Belgium | Hybrid Access Solutions www.tessares.net