Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 12, 2020 at 04:08:02PM -0800, Andrii Nakryiko wrote:
> On Thu, Nov 12, 2020 at 1:14 PM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> >
> > On Thu, Nov 12, 2020 at 11:54:41AM -0800, Andrii Nakryiko wrote:
> >
> > SNIP
> >
> > > > @@ -624,32 +644,46 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
> > > >                 has_index_type = true;
> > > >         }
> > > >
> > > > -       cu__for_each_function(cu, core_id, fn) {
> > > > -               /*
> > > > -                * The functions_cnt != 0 means we parsed all necessary
> > > > -                * kernel symbols and we are using ftrace location filter
> > > > -                * for functions. If it's not available keep the current
> > > > -                * dwarf declaration check.
> > > > -                */
> > > > -               if (functions_cnt) {
> > > > +       /*
> > > > +        * The functions_cnt != 0 means we parsed all necessary
> > > > +        * kernel symbols and we are using ftrace location filter
> > > > +        * for functions. If it's not available keep the current
> > > > +        * dwarf declaration check.
> > > > +        */
> > > > +       if (functions_cnt) {
> > > > +               cu__for_each_function(cu, core_id, fn) {
> > > > +                       struct elf_function *p;
> > > > +                       struct elf_function key = { .name = function__name(fn, cu) };
> > > > +                       int args_cnt = 0;
> > > > +
> > > >                         /*
> > > > -                        * We check following conditions:
> > > > -                        *   - argument names are defined
> > > > -                        *   - there's symbol and address defined for the function
> > > > -                        *   - function address belongs to ftrace locations
> > > > -                        *   - function is generated only once
> > > > +                        * Collect functions that match ftrace filter
> > > > +                        * and pick the one with proper argument names.
> > > > +                        * The BTF generation happens at the end in
> > > > +                        * btf_encoder__encode function.
> > > >                          */
> > > > -                       if (!has_arg_names(cu, &fn->proto))
> > > > +                       p = bsearch(&key, functions, functions_cnt,
> > > > +                                   sizeof(functions[0]), functions_cmp);
> > > > +                       if (!p)
> > > >                                 continue;
> > > > -                       if (!should_generate_function(btfe, function__name(fn, cu)))
> > > > +
> > > > +                       if (!has_arg_names(cu, &fn->proto, &args_cnt))
> > >
> > > So I can't unfortunately reproduce that GCC bug with DWARF info. What
> > > was exactly the symptom? Maybe you can also share readelf -wi dump for
> > > your problematic vmlinux?
> >
> > hum, can't see -wi working for readelf, however I placed my vmlinux
> > in here:
> >   http://people.redhat.com/~jolsa/vmlinux.gz
> >
> > the symptom is that resolve_btfids will fail kernel build:
> >
> >   BTFIDS  vmlinux
> > FAILED unresolved symbol vfs_getattr
> >
> > because BTF data contains multiple FUNC records for same function
> >
> > and the problem is that declaration tag itself is missing:
> >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> > so pahole won't skip them
> >
> > the first workaround was to workaround that and go for function
> > records that have code address attached, but that's buggy as well:
> >   https://bugzilla.redhat.com/show_bug.cgi?id=1890107
> >
> > then after some discussions we ended up using ftrace addresses
> > as filter for what we want to display.. plus we got static functions
> > as well
> >
> > however for this way we failed to get proper arguments ;-)
> 
> Right, I followed along overall, but forgot the details of the initial
> problem. Thanks for the refresher. See below for my current thoughts
> on dealing with all this.
> 
> >
> > >
> > > The reason I'm asking is because I wonder if we should still ignore
> > > functions if fn->declaration is set. E.g., for the issue we
> > > investigated yesterday, the function with no arguments has declaration
> > > set to 1, so just ignoring it would solve the problem. I'm wondering
> > > if it's enough to do just that instead of doing this whole delayed
> > > function collection/processing.
> > >
> > > Also, I'd imagine the only expected cases where we can override  the
> > > function (args_cnt > p->args_cnt) would be if p->args_cnt == 0, no?
> >
> > I don't know, because originally I'd expect that we would not see
> > function record with zero args when it actualy has some
> >
> > > All other cases are either newly discovered "bogusness" of DWARF (and
> > > would be good to know about this) or it's a name collision for
> > > functions. Basically, before we go all the way to rework this again,
> > > let's see if just skipping declarations would be enough?
> >
> > so there's actualy new developement today in:
> >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060#c14
> >
> > gcc might actualy get fixed, so I think we could go back to
> > using declaration tag and use ftrace as additional filter..
> > at least this exercise gave us static functions ;-)
> >
> > however we still have fedora with disabled disabled CONFIG_DEBUG_INFO_BTF
> > and will need to wait for that fix to enable that back
> 
> Right, we better have a more robust approach not relying on
> not-yet-released GCC.
> 
> >
> > >
> > > >                                 continue;
> > > > -               } else {
> > > > +
> > > > +                       if (!p->fn || args_cnt > p->args_cnt) {
> > > > +                               p->fn = fn;
> > > > +                               p->cu = cu;
> > > > +                               p->args_cnt = args_cnt;
> > > > +                               p->type_id_off = type_id_off;
> > > > +                       }
> > > > +               }
> > > > +       } else {
> > > > +               cu__for_each_function(cu, core_id, fn) {
> > > >                         if (fn->declaration || !fn->external)
> > > >                                 continue;
> > > > +                       if (generate_func(btfe, cu, fn, type_id_off))
> > > > +                               goto out;
> > > >                 }
> > >
> > > I'm trending towards disliking this completely different fallback
> > > mechanism. It saved bpf-next accidentally, but otherwise obscured the
> > > issue and generally makes testing pahole with artificial binary BTFs
> > > (from test programs) harder. How about we unify approaches, but just
> > > use mcount symbols opportunistically, as an additional filter, if it's
> > > available?
> >
> > ok
> >
> > >
> > > With that, testing that we still handle functions with duplicate names
> > > properly would be trivial (which I suspect we don't and we'll just
> > > keep the one with more args now, right?) And it makes static functions
> > > available for non-vmlinux binaries automatically (might be good or
> > > bad, but still...).
> >
> > if we keep just the ftrace filter and rely on declaration tag,
> > the args checking will go away right?
> 
> So I looked at your vmlinux image. I think we should just keep
> everything mostly as it it right now (without changes in this patch),
> but add just two simple checks:
> 
> 1. Skip if fn->declaration (ignore correctly marked func declarations)
> 2. Skip if DW_AT_inline: 1 (ignore inlined functions).
> 
> I'd keep the named arguments check as is, I think it's helpful. 1)
> will skip stuff that's explicitly marked as declaration. 2) inline

ok, that should fix the fail on gccs that still generate
declaration tag, so you should be fine and our version of
gcc should continue work as well 

> check will partially mitigate dropping of fn->external check (and we
> can't really attach to inlined functions). So will the named args
> check as well. Then we should do mcount checks, **if** mcount symbols
> are present (which should always be the case for any reasonable
> vmlinux image that's supposed to be used with BPF, I think).
> 
> So together, it should cover all known issues, I think. And then we'll
> just have to watch out for any new ones. GCC bugfix is good, but it's
> too late: the harm is done and there are compilers out there that we
> should deal with.

hopefuly with this change we can enable BTF back before the gcc fix

thanks,
jirka




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux