Re: [PATCH v2 0/9] tracing: Improbe BTF support on probe events

Alan Maguire <alan.maguire@xxxxxxxxxx> · Thu, 20 Jul 2023 22:50:18 +0100

On 19/07/2023 17:01, Masami Hiramatsu (Google) wrote:
> On Wed, 19 Jul 2023 10:02:06 +0100
> Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> 
>> On 17/07/2023 16:23, Masami Hiramatsu (Google) wrote:
>>> Hi,
>>>
>>> Here is the 2nd version of series to improve the BTF support on probe events.
>>> The previous series is here:
>>>
>>> https://lore.kernel.org/linux-trace-kernel/168699521817.528797.13179901018528120324.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>>>
>>> In this version, I added a NULL check fix patch [1/9] (which will go to
>>> fixes branch) and move BTF related API to kernel/bpf/btf.c [2/9] and add
>>> a new BTF API [3/9] so that anyone can reuse it.
>>> Also I decided to use '$retval' directly instead of 'retval' pseudo BTF
>>> variable for field access at [5/9] because I introduced an idea to choose
>>> function 'exit' event automatically if '$retval' is used [7/9]. With that
>>> change, we can not use 'retval' because if a function has 'retval'
>>> argument, we can not decide 'f func retval' is function exit or entry.
>>
>> this is fantastic work! (FWIW I ran into the retval argument issue with
>> ksnoop as well; I got around it by using "return" to signify the return
>> value since as a reserved word it won't clash with a variable name.
>> However in the trace subsystem context retval is used extensively so
>> makes sense to stick with that).
> 
> Thanks!
> 
>>
>> One thing we should probably figure out is a common approach to handling
>> ambiguous static functions that will work across ftrace and BPF.  A few
>> edge cases that are worth figuring out:
>>
>> 1. a static function with the same name exists in multiple modules,
>> either with different or identical function signatures
>> 2. a static function has .isra.0 and other gcc suffixes applied to
>> static functions during optimization
>>
>> As Alexei mentioned, we're still working on 1, so it would be good
>> to figure out a naming scheme that works well in both ftrace and BPF
>> contexts. There are a few hundred of these ambiguous functions. My
>> reading of the fprobe docs seems to suggest that there is no mechanism
>> to specify a specific module for a given symbol (as in ftrace filters),
>> is that right?
> 
> Yes, it doesn't have module specificaiton at this moment. I'll considering
> to fix this. BTW, for the same-name functions, we are discussing another
> approach. We also need to sync this with BTF. 
> 
> https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@xxxxxxxxx/
> 
>>
>> Jiri led a session on this topic at LSF/MM/BPF ; perhaps we should
>> carve out some time at Plumbers to discuss this?
> 
> Yeah, good idea.
> 
>>
>> With respect to 2, pahole v1.25 will generate representations for these
>> "."-suffixed functions in BTF via --btf_gen_optimized [1]. (BTF
>> representation is skipped if the optimizations impact on the registers
>> used for function arguments; if these don't match calling conventions
>> due to optimized-out params, we don't represent the function in BTF,
>> as the tracing expectations are violated).
> 
> Correct. But can't we know which argument is skipped by the optimization
> from the DWARF? At least the function parameters will be changed.
>

Yep; we use the expected registers to spot cases where something
has been optimized out.

>> However the BTF function name - in line with DWARF representation -
>> will not have the .isra suffix. So the thing to bear in mind is if
>> you use the function name with suffix as the fprobe function name,
>> a BTF lookup of that exact ("foo.isra.0") name will not find anything,
>> while a lookup of "foo" will succeed. I'll add some specifics in your
>> patch doing the lookups, but just wanted to highlight the issue at
>> the top-level.
> 
> So, what about adding an index sorted list of the address and BTF entry
> index as an expansion of the BTF? It allowed us to easily map the suffixed
> symbol address (we can get it from kallsyms) to BTF quickly.
> So the module will have
> 
> [BTF data][array length][BTF index array]
> 
> Index array member will be like this.
> 
> struct btf_index {
> 	u32	offset;	// offset from the start text
> 	u32	id:		// BTF type id
> };
> 
> We can do binary search the function type id from the symbol address.
> 

Yeah, I wonder if a representation that bridged between kallsyms and BTF
might be valuable? I don't _think_ it's as much of an issue for your
case though since you only need to do the BTF lookup once on fprobe
setup, right? Thanks!

Alan

> Thank you,
> 
>>
>> Thanks!
>>
>> Alan
>>
>> [1]
>> https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@xxxxxxxxxx/
>>
>>> Selftest test case [8/9] and document [9/9] are also updated according to
>>> those changes.
>>>
>>> This series can be applied on top of "v6.5-rc2" kernel.
>>>
>>> You can also get this series from:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git topic/fprobe-event-ext
>>>
>>>
>>> Thank you,
>>>
>>> ---
>>>
>>> Masami Hiramatsu (Google) (9):
>>>       tracing/probes: Fix to add NULL check for BTF APIs
>>>       bpf/btf: tracing: Move finding func-proto API and getting func-param API to BTF
>>>       bpf/btf: Add a function to search a member of a struct/union
>>>       tracing/probes: Support BTF based data structure field access
>>>       tracing/probes: Support BTF field access from $retval
>>>       tracing/probes: Add string type check with BTF
>>>       tracing/fprobe-event: Assume fprobe is a return event by $retval
>>>       selftests/ftrace: Add BTF fields access testcases
>>>       Documentation: tracing: Update fprobe event example with BTF field
>>>
>>>
>>>  Documentation/trace/fprobetrace.rst                |   50 ++
>>>  include/linux/btf.h                                |    7 
>>>  kernel/bpf/btf.c                                   |   83 ++++
>>>  kernel/trace/trace_fprobe.c                        |   58 ++-
>>>  kernel/trace/trace_probe.c                         |  402 +++++++++++++++-----
>>>  kernel/trace/trace_probe.h                         |   12 +
>>>  .../ftrace/test.d/dynevent/add_remove_btfarg.tc    |   11 +
>>>  .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |    6 
>>>  8 files changed, 503 insertions(+), 126 deletions(-)
>>>
>>> --
>>> Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
>>>
> 
>