Re: [RFC bpf-next v5] bpf: Force to MPTCP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Stanislav,

On 28/07/2023 18:51, Stanislav Fomichev wrote:
> On 07/28, Matthieu Baerts wrote:
>> Hi Stanislav,
>>
>> On 27/07/2023 20:01, Stanislav Fomichev wrote:
>>> On 07/27, Matthieu Baerts wrote:
>>>> Hi Paul, Stanislav,
>>>>
>>>> On 18/07/2023 18:14, Paul Moore wrote:
>>>>> On Tue, Jul 18, 2023 at 11:21 AM Geliang Tang <geliang.tang@xxxxxxxx> wrote:
>>>>>>
>>>>>> As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
>>>>>>
>>>>>> "Your app can create sockets with IPPROTO_MPTCP as the proto:
>>>>>> ( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
>>>>>> forced to create and use MPTCP sockets instead of TCP ones via the
>>>>>> mptcpize command bundled with the mptcpd daemon."
>>>>>>
>>>>>> But the mptcpize (LD_PRELOAD technique) command has some limitations
>>>>>> [2]:
>>>>>>
>>>>>>  - it doesn't work if the application is not using libc (e.g. GoLang
>>>>>> apps)
>>>>>>  - in some envs, it might not be easy to set env vars / change the way
>>>>>> apps are launched, e.g. on Android
>>>>>>  - mptcpize needs to be launched with all apps that want MPTCP: we could
>>>>>> have more control from BPF to enable MPTCP only for some apps or all the
>>>>>> ones of a netns or a cgroup, etc.
>>>>>>  - it is not in BPF, we cannot talk about it at netdev conf.
>>>>>>
>>>>>> So this patchset attempts to use BPF to implement functions similer to
>>>>>> mptcpize.
>>>>>>
>>>>>> The main idea is add a hook in sys_socket() to change the protocol id
>>>>>> from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/multipath-tcp/mptcp_net-next/wiki
>>>>>> [2]
>>>>>> https://github.com/multipath-tcp/mptcp_net-next/issues/79
>>>>>>
>>>>>> v5:
>>>>>>  - add bpf_mptcpify helper.
>>>>>>
>>>>>> v4:
>>>>>>  - use lsm_cgroup/socket_create
>>>>>>
>>>>>> v3:
>>>>>>  - patch 8: char cmd[128]; -> char cmd[256];
>>>>>>
>>>>>> v2:
>>>>>>  - Fix build selftests errors reported by CI
>>>>>>
>>>>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
>>>>>> Signed-off-by: Geliang Tang <geliang.tang@xxxxxxxx>
>>>>>> ---
>>>>>>  include/linux/bpf.h                           |   1 +
>>>>>>  include/linux/lsm_hook_defs.h                 |   2 +-
>>>>>>  include/linux/security.h                      |   6 +-
>>>>>>  include/uapi/linux/bpf.h                      |   7 +
>>>>>>  kernel/bpf/bpf_lsm.c                          |   2 +
>>>>>>  net/mptcp/bpf.c                               |  20 +++
>>>>>>  net/socket.c                                  |   4 +-
>>>>>>  security/apparmor/lsm.c                       |   8 +-
>>>>>>  security/security.c                           |   2 +-
>>>>>>  security/selinux/hooks.c                      |   6 +-
>>>>>>  tools/include/uapi/linux/bpf.h                |   7 +
>>>>>>  .../testing/selftests/bpf/prog_tests/mptcp.c  | 128 ++++++++++++++++--
>>>>>>  tools/testing/selftests/bpf/progs/mptcpify.c  |  17 +++
>>>>>>  13 files changed, 187 insertions(+), 23 deletions(-)
>>>>>>  create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
>>>>>
>>>>> ...
>>>>>
>>>>>> diff --git a/security/security.c b/security/security.c
>>>>>> index b720424ca37d..bbebcddce420 100644
>>>>>> --- a/security/security.c
>>>>>> +++ b/security/security.c
>>>>>> @@ -4078,7 +4078,7 @@ EXPORT_SYMBOL(security_unix_may_send);
>>>>>>   *
>>>>>>   * Return: Returns 0 if permission is granted.
>>>>>>   */
>>>>>> -int security_socket_create(int family, int type, int protocol, int kern)
>>>>>> +int security_socket_create(int *family, int *type, int *protocol, int kern)
>>>>>>  {
>>>>>>         return call_int_hook(socket_create, 0, family, type, protocol, kern);
>>>>>>  }
>>>>>
>>>>> Using the LSM to change the protocol family is not something we want
>>>>> to allow.  I'm sorry, but you will need to take a different approach.
>>>>
>>>> @Paul: Thank you for your feedback. It makes sense and I understand.
>>>>
>>>> @Stanislav: Despite the fact the implementation was smaller and reusing
>>>> more code, it looks like we cannot go in the direction you suggested. Do
>>>> you think what Geliang suggested before in his v3 [1] can be accepted?
>>>>
>>>> (Note that the v3 is the same as the v1, only some fixes in the selftests.)
>>>
>>> We have too many hooks in networking, so something that doesn't add
>>> a new one is preferable :-(
>>
>> Thank you for your reply and the explanation, I understand.
>>
>>> Moreover, we already have a 'socket init' hook, but it runs a bit late.
>>
>> Indeed. And we cannot move it before the creation of the socket.
>>
>>> Is existing cgroup/sock completely unworkable? Is it possible to
>>> expose some new bpf_upgrade_socket_to(IPPROTO_MPTCP) kfunc which would
>>> call some new net_proto_family->upgrade_to(IPPROTO_MPTCP) to do the surgery?
>>> Or is it too hacky?
>>
>> I cannot judge if it is too hacky or not but if you think it would be
>> OK, please tell us :)
> 
> Maybe try and see how it goes? Doing the surgery to convert from tcp
> to mptcp is probably hard, but it seems that we should be able to
> do something like:
> 
> int upgrade_to(sock, sk) {
> 	if (sk is not a tcp one) return -EINVAL;
> 
> 	sk_common_release(sk);
> 	return inet6_create(net, sock, IPPROTO_MPTCP, false);
> }
> 
> ?
> 
> The only thing I'm not sure about is whether you can call inet6_create
> on a socket that has seen sk_common_release'd...

Oh sorry, now I better understand your suggestion and the fact it is
hacky. Good workaround, we can keep this in mind if there is no other
solutions to avoid these create-release-create operations.

>>> Another option Alexei suggested is to add some fentry-like thing:
>>>
>>> noinline int update_socket_protocol(int protocol)
>>> {
>>> 	return protocol;
>>> }
>>> /* TODO: ^^^ add the above to mod_ret set */
>>>
>>> int __sys_socket(int family, int type, int protocol)
>>> {
>>> 	...
>>>
>>> 	protocol = update_socket_protocol(protocol);
>>>
>>> 	...
>>> }
>>>
>>> But it's also too problem specific it seems? And it's not cgroup-aware.
>>
>> It looks like it is what Geliang did in his v6. If it is the only
>> acceptable solution, I guess we can do without cgroup support. We can
>> continue the discussions in his v6 if that's easier.
> 
> Ack, that works too, let's see how other people feel about it. I'm
> assuming in the bpf program we can always do bpf_get_current_cgroup_id()
> to filter by cgroup.

Good point, that works too and looks enough!

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux