Re: [PATCH bpf-next] bpf: use O(log(N)) binary search to find line info record

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 13, 2024 at 04:23:11PM -0800, Andrii Nakryiko wrote:
> Real-world BPF applications keep growing in size. Medium-sized production
> application can easily have 50K+ verified instructions, and its line
> info section in .BTF.ext has more than 3K entries.
> 
> When verifier emits log with log_level>=1, it annotates assembly code
> with matched original C source code. Currently it uses linear search
> over line info records to find a match. As complexity of BPF
> applications grows, this O(K * N) approach scales poorly.
> 
> So, let's instead of linear O(N) search for line info record use faster
> equivalent O(log(N)) binary search algorithm. It's not a plain binary
> search, as we don't look for exact match. It's an upper bound search
> variant, looking for rightmost line info record that starts at or before
> given insn_off.
> 
> Some unscientific measurements were done before and after this change.
> They were done in VM and fluctuate a bit, but overall the speed up is
> undeniable.
> 
> BASELINE
> ========
> File                              Program           Duration (us)   Insns
> --------------------------------  ----------------  -------------  ------
> katran.bpf.o                      balancer_ingress        2497130  343552
> pyperf600.bpf.linked3.o           on_event               12389611  627288
> strobelight_pyperf_libbpf.o       on_py_event              387399   52445
> --------------------------------  ----------------  -------------  ------
> 
> BINARY SEARCH
> =============
> 
> File                              Program           Duration (us)   Insns
> --------------------------------  ----------------  -------------  ------
> katran.bpf.o                      balancer_ingress        2339312  343552
> pyperf600.bpf.linked3.o           on_event                5602203  627288
> strobelight_pyperf_libbpf.o       on_py_event              294761   52445
> --------------------------------  ----------------  -------------  ------
> 
> While Katran's speed up is pretty modest (about 105ms, or 6%), for
> production pyperf BPF program (on_py_event) it's much greater already,
> going from 387ms down to 295ms (23% improvement).
> 
> Looking at BPF selftests's biggest pyperf example, we can see even more
> dramatic improvement, shaving more than 50% of time, going from 12.3s
> down to 5.6s.

nice speedup

Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx>

jirka

> 
> Different amount of improvement is the function of overall amount of BPF
> assembly instructions in .bpf.o files (which contributes to how much
> line info records there will be and thus, on average, how much time linear
> search will take), among other things:
> 
> $ llvm-objdump -d katran.bpf.o | wc -l
> 3863
> $ llvm-objdump -d strobelight_pyperf_libbpf.o | wc -l
> 6997
> $ llvm-objdump -d pyperf600.bpf.linked3.o | wc -l
> 87854
> 
> Granted, this only applies to debugging cases (e.g., using veristat, or
> failing verification in production), but seems worth doing to improve
> overall developer experience anyways.
> 
> Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> ---
>  kernel/bpf/log.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
> index 594a234f122b..2dfac246a27d 100644
> --- a/kernel/bpf/log.c
> +++ b/kernel/bpf/log.c
> @@ -333,7 +333,8 @@ find_linfo(const struct bpf_verifier_env *env, u32 insn_off)
>  {
>  	const struct bpf_line_info *linfo;
>  	const struct bpf_prog *prog;
> -	u32 i, nr_linfo;
> +	u32 nr_linfo;
> +	int l, r, m;
>  
>  	prog = env->prog;
>  	nr_linfo = prog->aux->nr_linfo;
> @@ -342,11 +343,30 @@ find_linfo(const struct bpf_verifier_env *env, u32 insn_off)
>  		return NULL;
>  
>  	linfo = prog->aux->linfo;
> -	for (i = 1; i < nr_linfo; i++)
> -		if (insn_off < linfo[i].insn_off)
> -			break;
> +	/* Loop invariant: linfo[l].insn_off <= insns_off.
> +	 * linfo[0].insn_off == 0 which always satisfies above condition.
> +	 * Binary search is searching for rightmost linfo entry that satisfies
> +	 * the above invariant, giving us the desired record that covers given
> +	 * instruction offset.
> +	 */
> +	l = 0;
> +	r = nr_linfo - 1;
> +	while (l < r) {
> +		/* (r - l + 1) / 2 means we break a tie to the right, so if:
> +		 * l=1, r=2, linfo[l].insn_off <= insn_off, linfo[r].insn_off > insn_off,
> +		 * then m=2, we see that linfo[m].insn_off > insn_off, and so
> +		 * r becomes 1 and we exit the loop with correct l==1.
> +		 * If the tie was broken to the left, m=1 would end us up in
> +		 * an endless loop where l and m stay at 1 and r stays at 2.
> +		 */
> +		m = l + (r - l + 1) / 2;
> +		if (linfo[m].insn_off <= insn_off)
> +			l = m;
> +		else
> +			r = m - 1;
> +	}
>  
> -	return &linfo[i - 1];
> +	return &linfo[l];
>  }
>  
>  static const char *ltrim(const char *s)
> -- 
> 2.39.3
> 
> 




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux