Re: [PATCH net-next 12/24] seg6: Use nested-BH locking for seg6_bpf_srh_states.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2023-12-15 at 18:07 +0100, Sebastian Andrzej Siewior wrote:
> The access to seg6_bpf_srh_states is protected by disabling preemption.
> Based on the code, the entry point is input_action_end_bpf() and
> every other function (the bpf helper functions bpf_lwt_seg6_*()), that
> is accessing seg6_bpf_srh_states, should be called from within
> input_action_end_bpf().
> 
> input_action_end_bpf() accesses seg6_bpf_srh_states first at the top of
> the function and then disables preemption. This looks wrong because if
> preemption needs to be disabled as part of the locking mechanism then
> the variable shouldn't be accessed beforehand.
> 
> Looking at how it is used via test_lwt_seg6local.sh then
> input_action_end_bpf() is always invoked from softirq context. If this
> is always the case then the preempt_disable() statement is superfluous.
> If this is not always invoked from softirq then disabling only
> preemption is not sufficient.
> 
> Replace the preempt_disable() statement with nested-BH locking. This is
> not an equivalent replacement as it assumes that the invocation of
> input_action_end_bpf() always occurs in softirq context and thus the
> preempt_disable() is superfluous.
> Add a local_lock_t the data structure and use local_lock_nested_bh() in
> guard notation for locking. Add lockdep_assert_held() to ensure the lock
> is held while the per-CPU variable is referenced in the helper functions.
> 
> Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
> Cc: David Ahern <dsahern@xxxxxxxxxx>
> Cc: Hao Luo <haoluo@xxxxxxxxxx>
> Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> Cc: KP Singh <kpsingh@xxxxxxxxxx>
> Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
> Cc: Song Liu <song@xxxxxxxxxx>
> Cc: Stanislav Fomichev <sdf@xxxxxxxxxx>
> Cc: Yonghong Song <yonghong.song@xxxxxxxxx>
> Cc: bpf@xxxxxxxxxxxxxxx
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> ---
>  include/net/seg6_local.h |  1 +
>  net/core/filter.c        |  3 ++
>  net/ipv6/seg6_local.c    | 59 ++++++++++++++++++++++------------------
>  3 files changed, 36 insertions(+), 27 deletions(-)
> 
> diff --git a/include/net/seg6_local.h b/include/net/seg6_local.h
> index 3fab9dec2ec45..0f22771359f4c 100644
> --- a/include/net/seg6_local.h
> +++ b/include/net/seg6_local.h
> @@ -20,6 +20,7 @@ extern bool seg6_bpf_has_valid_srh(struct sk_buff *skb);
>  
>  struct seg6_bpf_srh_state {
>  	struct ipv6_sr_hdr *srh;
> +	local_lock_t bh_lock;
>  	u16 hdrlen;
>  	bool valid;
>  };
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 1737884be52f8..c8013f762524b 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -6384,6 +6384,7 @@ BPF_CALL_4(bpf_lwt_seg6_store_bytes, struct sk_buff *, skb, u32, offset,
>  	void *srh_tlvs, *srh_end, *ptr;
>  	int srhoff = 0;
>  
> +	lockdep_assert_held(&srh_state->bh_lock);
>  	if (srh == NULL)
>  		return -EINVAL;
>  
> @@ -6440,6 +6441,7 @@ BPF_CALL_4(bpf_lwt_seg6_action, struct sk_buff *, skb,
>  	int hdroff = 0;
>  	int err;
>  
> +	lockdep_assert_held(&srh_state->bh_lock);
>  	switch (action) {
>  	case SEG6_LOCAL_ACTION_END_X:
>  		if (!seg6_bpf_has_valid_srh(skb))
> @@ -6516,6 +6518,7 @@ BPF_CALL_3(bpf_lwt_seg6_adjust_srh, struct sk_buff *, skb, u32, offset,
>  	int srhoff = 0;
>  	int ret;
>  
> +	lockdep_assert_held(&srh_state->bh_lock);
>  	if (unlikely(srh == NULL))
>  		return -EINVAL;
>  
> diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
> index 24e2b4b494cb0..ed7278af321a2 100644
> --- a/net/ipv6/seg6_local.c
> +++ b/net/ipv6/seg6_local.c
> @@ -1380,7 +1380,9 @@ static int input_action_end_b6_encap(struct sk_buff *skb,
>  	return err;
>  }
>  
> -DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states);
> +DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states) = {
> +	.bh_lock	= INIT_LOCAL_LOCK(bh_lock),
> +};
>  
>  bool seg6_bpf_has_valid_srh(struct sk_buff *skb)
>  {
> @@ -1388,6 +1390,7 @@ bool seg6_bpf_has_valid_srh(struct sk_buff *skb)
>  		this_cpu_ptr(&seg6_bpf_srh_states);
>  	struct ipv6_sr_hdr *srh = srh_state->srh;
>  
> +	lockdep_assert_held(&srh_state->bh_lock);
>  	if (unlikely(srh == NULL))
>  		return false;
>  
> @@ -1408,8 +1411,7 @@ bool seg6_bpf_has_valid_srh(struct sk_buff *skb)
>  static int input_action_end_bpf(struct sk_buff *skb,
>  				struct seg6_local_lwt *slwt)
>  {
> -	struct seg6_bpf_srh_state *srh_state =
> -		this_cpu_ptr(&seg6_bpf_srh_states);
> +	struct seg6_bpf_srh_state *srh_state;
>  	struct ipv6_sr_hdr *srh;
>  	int ret;
>  
> @@ -1420,41 +1422,44 @@ static int input_action_end_bpf(struct sk_buff *skb,
>  	}
>  	advance_nextseg(srh, &ipv6_hdr(skb)->daddr);
>  
> -	/* preempt_disable is needed to protect the per-CPU buffer srh_state,
> -	 * which is also accessed by the bpf_lwt_seg6_* helpers
> +	/* The access to the per-CPU buffer srh_state is protected by running
> +	 * always in softirq context (with disabled BH). On PREEMPT_RT the
> +	 * required locking is provided by the following local_lock_nested_bh()
> +	 * statement. It is also accessed by the bpf_lwt_seg6_* helpers via
> +	 * bpf_prog_run_save_cb().
>  	 */
> -	preempt_disable();
> -	srh_state->srh = srh;
> -	srh_state->hdrlen = srh->hdrlen << 3;
> -	srh_state->valid = true;
> +	scoped_guard(local_lock_nested_bh, &seg6_bpf_srh_states.bh_lock) {
> +		srh_state = this_cpu_ptr(&seg6_bpf_srh_states);
> +		srh_state->srh = srh;
> +		srh_state->hdrlen = srh->hdrlen << 3;
> +		srh_state->valid = true;

Here the 'scoped_guard' usage adds a lot of noise to the patch, due to
the added indentation. What about using directly
local_lock_nested_bh()/local_unlock_nested_bh() ?

Cheers,

Paolo






[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux