Re: [PATCH bpf-next/net v2 4/7] bpf: Add mptcp_subflow bpf_iter

On 12/19/24 7:46 AM, Matthieu Baerts (NGI0) wrote:
From: Geliang Tang <tanggeliang@xxxxxxxxxx>

It's necessary to traverse all subflows on the conn_list of an MPTCP
socket and then call kfunc to modify the fields of each subflow. In
kernel space, mptcp_for_each_subflow() helper is used for this:

	mptcp_for_each_subflow(msk, subflow)
		kfunc(subflow);

But MPTCP BPF programs have no equivalent yet. As Martin suggested
recently, this conn_list walking + modify-by-kfunc usage fits the
bpf_iter use case.

So this patch adds a new bpf_iter type named "mptcp_subflow" to do
this and implements its helpers bpf_iter_mptcp_subflow_new()/_next()/
_destroy(). It also registers these helpers in the MPTCP common kfunc
set. bpf_for_each() for mptcp_subflow can then be used in a BPF
program like this:

	bpf_for_each(mptcp_subflow, subflow, msk)
		kfunc(subflow);

Suggested-by: Martin KaFai Lau <martin.lau@xxxxxxxxxx>
Signed-off-by: Geliang Tang <tanggeliang@xxxxxxxxxx>
Reviewed-by: Mat Martineau <martineau@xxxxxxxxxx>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@xxxxxxxxxx>
---
Notes:
  - v2:
    - Add BUILD_BUG_ON() checks, similar to the ones done with other
      bpf_iter_(...) helpers.
    - Replace msk_owned_by_me() by sock_owned_by_user_nocheck() and
      !spin_is_locked() (Martin).

A few versions of this single patch have been previously posted to the
BPF mailing list by Geliang, before continuing to the MPTCP mailing list
only, with other patches of this series. The version of the whole series
has been reset to 1, but here is the ChangeLog for the previous ones:
  - v2: remove msk->pm.lock in _new() and _destroy() (Martin)
        drop DEFINE_BPF_ITER_FUNC, change opaque[3] to opaque[2] (Andrii)
  - v3: drop bpf_iter__mptcp_subflow
  - v4: if msk is NULL, initialize kit->msk to NULL in _new() and check
        it in _next() (Andrii)
  - v5: use list_is_last() instead of list_entry_is_head(), add
        KF_ITER_NEW/NEXT/DESTROY flags, add msk_owned_by_me() in _new()
  - v6: add KF_TRUSTED_ARGS flag (Andrii, Martin)
---
  net/mptcp/bpf.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
  1 file changed, 53 insertions(+)

diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index c5bfd84c16c43230d9d8e1fd8ff781a767e647b5..e39f0e4fb683c1aa31ee075281daee218dac5878 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -35,6 +35,15 @@ static const struct btf_kfunc_id_set bpf_mptcp_fmodret_set = {
  	.set   = &bpf_mptcp_fmodret_ids,
  };
+struct bpf_iter_mptcp_subflow {
+	__u64 __opaque[2];
+} __aligned(8);
+
+struct bpf_iter_mptcp_subflow_kern {
+	struct mptcp_sock *msk;
+	struct list_head *pos;
+} __aligned(8);
+
  __bpf_kfunc_start_defs();
__bpf_kfunc static struct mptcp_subflow_context *
@@ -47,10 +56,54 @@ bpf_mptcp_subflow_ctx(const struct sock *sk)
  	return NULL;
  }
+__bpf_kfunc static int
+bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
+			   struct mptcp_sock *msk)
+{
+	struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
+	struct sock *sk = (struct sock *)msk;
+
+	BUILD_BUG_ON(sizeof(struct bpf_iter_mptcp_subflow_kern) >
+		     sizeof(struct bpf_iter_mptcp_subflow));
+	BUILD_BUG_ON(__alignof__(struct bpf_iter_mptcp_subflow_kern) !=
+		     __alignof__(struct bpf_iter_mptcp_subflow));
+
+	kit->msk = msk;
+	if (!msk)

The NULL check is not needed. The verifier should already have rejected a NULL msk because of KF_TRUSTED_ARGS.

+		return -EINVAL;
+
+	if (!sock_owned_by_user_nocheck(sk) &&
+	    !spin_is_locked(&sk->sk_lock.slock))

I could have missed something. If this is meant to catch a bug, should it be sock_owned_by_me(), which has the lockdep splat? For the cgroup get/setsockopt hook here, the lock should already have been taken earlier in the kernel.

This set only shows the cgroup sockopt bpf prog and is missing the major struct_ops piece, so it is hard to comment. I assume the lock situation is the same for struct_ops, i.e. the lock will be held before the struct_ops prog is called?

+		return -EINVAL;
+
+	kit->pos = &msk->conn_list;
+	return 0;
+}
+
+__bpf_kfunc static struct mptcp_subflow_context *
+bpf_iter_mptcp_subflow_next(struct bpf_iter_mptcp_subflow *it)
+{
+	struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
+
+	if (!kit->msk || list_is_last(kit->pos, &kit->msk->conn_list))
+		return NULL;
+
+	kit->pos = kit->pos->next;
+	return list_entry(kit->pos, struct mptcp_subflow_context, node);
+}
+
+__bpf_kfunc static void
+bpf_iter_mptcp_subflow_destroy(struct bpf_iter_mptcp_subflow *it)
+{
+}
+
  __bpf_kfunc_end_defs();
BTF_KFUNCS_START(bpf_mptcp_common_kfunc_ids)
  BTF_ID_FLAGS(func, bpf_mptcp_subflow_ctx, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_new, KF_ITER_NEW | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_next, KF_ITER_NEXT | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_destroy, KF_ITER_DESTROY)
  BTF_KFUNCS_END(bpf_mptcp_common_kfunc_ids)
static const struct btf_kfunc_id_set bpf_mptcp_common_kfunc_set = {
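
For anyone following the _new()/_next() termination logic, here is a minimal userspace sketch of the same pattern. It is not kernel code: msk_model, subflow_model, iter, iter_kern, iter_new(), iter_next() and demo_sum_ids() are hypothetical names made up for illustration, the list helpers are simplified stand-ins for the kernel ones, and locking is left out entirely. It only models how starting at the list head and stopping on list_is_last() visits every entry once, and how the size/alignment checks tie the opaque state to the real one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Mirrors list_is_last(): true when pos->next is the list head. */
static int list_is_last(const struct list_head *pos,
			const struct list_head *head)
{
	return pos->next == head;
}

/* Hypothetical models of mptcp_sock and mptcp_subflow_context. */
struct msk_model { struct list_head conn_list; };
struct subflow_model { int id; struct list_head node; };

/* Opaque state seen by the caller, and the real layout behind it. */
struct iter { uint64_t opaque[2]; } __attribute__((aligned(8)));
struct iter_kern {
	struct msk_model *msk;
	struct list_head *pos;
} __attribute__((aligned(8)));

/* Same checks as the BUILD_BUG_ON() pair in the patch. */
_Static_assert(sizeof(struct iter_kern) <= sizeof(struct iter), "state too big");
_Static_assert(_Alignof(struct iter_kern) == _Alignof(struct iter), "alignment");

/* The kernel casts the pointer; memcpy keeps this model strict-aliasing safe. */
static struct iter_kern iter_load(const struct iter *it)
{
	struct iter_kern k;

	memcpy(&k, it, sizeof(k));
	return k;
}

static void iter_store(struct iter *it, const struct iter_kern *k)
{
	memcpy(it, k, sizeof(*k));
}

/* Start at the list head so the first iter_next() yields the first entry. */
int iter_new(struct iter *it, struct msk_model *msk)
{
	struct iter_kern k = { .msk = msk, .pos = msk ? &msk->conn_list : NULL };

	iter_store(it, &k);
	return msk ? 0 : -1;
}

/* Advance one entry; NULL once the current position is the last one. */
struct subflow_model *iter_next(struct iter *it)
{
	struct iter_kern k = iter_load(it);

	if (!k.msk || list_is_last(k.pos, &k.msk->conn_list))
		return NULL;

	k.pos = k.pos->next;
	iter_store(it, &k);
	return (struct subflow_model *)((char *)k.pos -
					offsetof(struct subflow_model, node));
}

/* Walks three entries and returns the sum of their ids. */
int demo_sum_ids(void)
{
	struct msk_model msk;
	struct subflow_model sf[3] = { { .id = 1 }, { .id = 2 }, { .id = 3 } };
	struct subflow_model *cur;
	struct iter it;
	int sum = 0;

	list_init(&msk.conn_list);
	for (int i = 0; i < 3; i++)
		list_add_tail(&sf[i].node, &msk.conn_list);

	if (iter_new(&it, &msk))
		return -1;
	while ((cur = iter_next(&it)))
		sum += cur->id;
	return sum;
}
```

Because the position starts at the head itself, an empty list (head->next == head) is already "last" and the first iter_next() returns NULL, so no separate empty check is needed in this model.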





