Re: [PATCH v7 net-next 14/15] net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.

On 18/06/2024 09.13, Sebastian Andrzej Siewior wrote:
The XDP redirect process has two stages:
- bpf_prog_run_xdp() is invoked to run an eBPF program which inspects the
   packet and makes decisions. While doing that, the per-CPU variable
   bpf_redirect_info is used.

- Afterwards xdp_do_redirect() is invoked and accesses bpf_redirect_info
   and it may also access other per-CPU variables like xskmap_flush_list.

At the very end of the NAPI callback, xdp_do_flush() is invoked which
does not access bpf_redirect_info but will touch the individual per-CPU
lists.

The per-CPU variables are only used in the NAPI callback, hence disabling
bottom halves is the only protection mechanism. Users from preemptible
context (like cpu_map_kthread_run()) explicitly disable bottom halves
for protection.
Without locking in local_bh_disable() on PREEMPT_RT, this data structure
requires explicit locking.

PREEMPT_RT has forced-threaded interrupts enabled and every
NAPI-callback runs in a thread. If each thread has its own data
structure then locking can be avoided.

Create a struct bpf_net_context which contains struct bpf_redirect_info.
Define the variable on the stack, use bpf_net_ctx_set() to save a pointer
to it, and use bpf_net_ctx_clear() to remove it again.
bpf_net_ctx_set() may nest. For instance, a function can be used from
within NET_RX_SOFTIRQ/net_rx_action, which uses bpf_net_ctx_set(), and
from NET_TX_SOFTIRQ, which does not. Therefore only the first invocation
updates the pointer.
Use bpf_net_ctx_get_ri() as a wrapper to retrieve the current struct
bpf_redirect_info. The returned data structure is zero initialized to
ensure nothing is leaked from stack. This is done on first usage of the
struct. bpf_net_ctx_set() sets bpf_redirect_info::kern_flags to 0 to
note that initialisation is required. First invocation of
bpf_net_ctx_get_ri() will memset() the data structure and update
bpf_redirect_info::kern_flags.
bpf_redirect_info::nh is excluded from memset because it is only used
once BPF_F_NEIGH is set which also sets the nh member. The kern_flags is
moved past nh to exclude it from memset.
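
The set/clear nesting and the lazy zeroing described above can be
illustrated with a small userspace sketch. It is not the code from the
patch: the struct layout is simplified, the names redirect_info,
net_context, net_ctx_set(), net_ctx_clear(), net_ctx_get_ri() and RI_F_INIT
are hypothetical stand-ins, and a thread-local pointer takes the place of
the pointer stored in task_struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bit meaning "the redirect info has been zeroed". */
#define RI_F_INIT 0x1u

/*
 * Simplified stand-in for struct bpf_redirect_info. The fields are
 * illustrative; only the ordering matters: nh and kern_flags sit at the
 * end so that the memset() below can stop in front of them.
 */
struct redirect_info {
	unsigned long long tgt_index;
	void *tgt_value;
	unsigned int flags;
	unsigned int map_id;
	struct {
		unsigned int nh_family;
		unsigned char addr[16];
	} nh;
	unsigned int kern_flags;
};

/* Stand-in for struct bpf_net_context. */
struct net_context {
	struct redirect_info ri;
};

/* Assumption: models the per-task pointer kept in task_struct. */
static _Thread_local struct net_context *cur_net_ctx;

/* Returns ctx if this call installed it, NULL if a context was already set. */
static struct net_context *net_ctx_set(struct net_context *ctx)
{
	if (cur_net_ctx != NULL)
		return NULL;		/* nested call: the outer context stays */
	ctx->ri.kern_flags = 0;		/* mark "needs initialisation" */
	cur_net_ctx = ctx;
	return ctx;
}

/* Only the caller that actually installed the context removes it. */
static void net_ctx_clear(struct net_context *ctx)
{
	if (ctx)
		cur_net_ctx = NULL;
}

/* Zeroes the redirect info on first use, leaving nh and kern_flags alone. */
static struct redirect_info *net_ctx_get_ri(void)
{
	struct net_context *ctx = cur_net_ctx;

	assert(ctx != NULL);	/* a context must have been set beforehand */
	if (!(ctx->ri.kern_flags & RI_F_INIT)) {
		memset(&ctx->ri, 0, offsetof(struct redirect_info, nh));
		ctx->ri.kern_flags |= RI_F_INIT;
	}
	return &ctx->ri;
}
```

In this sketch a caller keeps the return value of net_ctx_set() and hands
it to net_ctx_clear(). A nested caller receives NULL from net_ctx_set(), so
its net_ctx_clear(NULL) leaves the outer context in place.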

The pointer to bpf_net_context is saved in the task's task_struct. Always
using the bpf_net_context approach has the advantage that there is
almost no difference between PREEMPT_RT and non-PREEMPT_RT builds.

Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
Cc: Eduard Zingerman <eddyz87@xxxxxxxxx>
Cc: Hao Luo <haoluo@xxxxxxxxxx>
Cc: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: John Fastabend <john.fastabend@xxxxxxxxx>
Cc: KP Singh <kpsingh@xxxxxxxxxx>
Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
Cc: Song Liu <song@xxxxxxxxxx>
Cc: Stanislav Fomichev <sdf@xxxxxxxxxx>
Cc: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
Cc: Yonghong Song <yonghong.song@xxxxxxxxx>
Cc: bpf@xxxxxxxxxxxxxxx
Acked-by: Alexei Starovoitov <ast@xxxxxxxxxx>
Reviewed-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>

Acked-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>

---
  include/linux/filter.h | 56 ++++++++++++++++++++++++++++++++++--------
  include/linux/sched.h  |  3 +++
  kernel/bpf/cpumap.c    |  3 +++
  kernel/bpf/devmap.c    |  9 ++++++-
  kernel/fork.c          |  1 +
  net/bpf/test_run.c     | 11 ++++++++-
  net/core/dev.c         | 26 +++++++++++++++++++-
  net/core/filter.c      | 44 +++++++++------------------------
  net/core/lwt_bpf.c     |  3 +++
  9 files changed, 111 insertions(+), 45 deletions(-)
