Re: [PATCH bpf-next v4 4/8] libbpf: Support BTF.ext loading and output in either endianness

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 2, 2024 at 1:19 AM Tony Ambardar <tony.ambardar@xxxxxxxxx> wrote:
>
> On Fri, Aug 30, 2024 at 02:14:19PM -0700, Andrii Nakryiko wrote:
> > On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar <tony.ambardar@xxxxxxxxx> wrote:
> > >
> > > Support for handling BTF data of either endianness was added in [1], but
> > > did not include BTF.ext data for lack of use cases. Later, support for
> > > static linking [2] provided a use case, but this feature and later ones
> > > were restricted to native-endian usage.
> > >
> > > Add support for BTF.ext handling in either endianness. Convert BTF.ext data
> > > to native endianness when read into memory for further processing, and
> > > support raw data access that restores the original byte-order for output.
> > > Add internal header functions for byte-swapping func, line, and core info
> > > records.
> > >
> > > Add new API functions btf_ext__endianness() and btf_ext__set_endianness()
> > > for query and setting byte-order, as already exist for BTF data.
> > >
> > > [1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness")
> > > [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
> > >
> > > Signed-off-by: Tony Ambardar <tony.ambardar@xxxxxxxxx>
> > > ---
> > >  tools/lib/bpf/btf.c             | 192 +++++++++++++++++++++++++++++---
> > >  tools/lib/bpf/btf.h             |   3 +
> > >  tools/lib/bpf/libbpf.map        |   2 +
> > >  tools/lib/bpf/libbpf_internal.h |  33 ++++++
> > >  4 files changed, 214 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > index f5081de86ee0..064cfe126c09 100644
> > > --- a/tools/lib/bpf/btf.c
> > > +++ b/tools/lib/bpf/btf.c
> > > @@ -3022,25 +3022,102 @@ static int btf_ext_setup_core_relos(struct btf_ext *btf_ext)
> > >         return btf_ext_setup_info(btf_ext, &param);
> > >  }
> > >
> > > -static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
> > > +/* Swap byte-order of BTF.ext header with any endianness */
> > > +static void btf_ext_bswap_hdr(struct btf_ext *btf_ext, __u32 hdr_len)
> > >  {
> > > -       const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
> > > +       struct btf_ext_header *h = btf_ext->hdr;
> > >
> > > -       if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
> > > -           data_size < hdr->hdr_len) {
> > > -               pr_debug("BTF.ext header not found\n");
> > > +       h->magic = bswap_16(h->magic);
> > > +       h->hdr_len = bswap_32(h->hdr_len);
> > > +       h->func_info_off = bswap_32(h->func_info_off);
> > > +       h->func_info_len = bswap_32(h->func_info_len);
> > > +       h->line_info_off = bswap_32(h->line_info_off);
> > > +       h->line_info_len = bswap_32(h->line_info_len);
> > > +
> > > +       if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
> > > +               return;
> > > +
> > > +       h->core_relo_off = bswap_32(h->core_relo_off);
> > > +       h->core_relo_len = bswap_32(h->core_relo_len);
> > > +}
> > > +
> > > +/* Swap byte-order of a generic info subsection */
> > > +static void info_subsec_bswap(const struct btf_ext_header *hdr, bool native,
> > > +                             __u32 off, __u32 len, anon_info_bswap_fn_t bswap)
> >
> > ok, so I'm not a fan of this bswap callback, tbh. Also, we don't
> > really enforce that each kind of record has exact size we expect
> > (i.e., bpf_line_info_min and bpf_func_info_min shouldn't be "min" for
> > byte-swapped case, it should be exact).
> >
> > How about this slight modification: split byte swapping of
> > sections/subsection metadata, so we adjust record size, sec_name_off
> > and num_info separately from adjusting each record.
>
> Hmmm, the bulk of code needed is to parse the metadata, with only 2 lines
> used to go through records. Splitting per above would add unnecessary
> duplication it seems, no?
>
> >
> > Once this swapping is done we:
> >
> > a) validate record size for each section is expected (according to its
> > type, of course)
>
> This is a good point I overlooked, and needs doing in any case.
>
> > b) we can then use for_each_btf_ext_sec() and for_each_btf_ext_rec()
> > macro (which assume proper in-memory metadata byte order) and then
> > hard-code swapping of each record fields
>
> How easily can we use these macros? Consider the current call chain:

Not that easily, turns out, because a) it acquires data pointer
implicitly, which makes it hard for btf_ext_raw_data() and b) it
accesses sec->num_info and doesn't cache it, so we'd need an extra
local variable to keep it if we were to swap it in raw data.

>
> btf_ext__new
>      btf_ext_parse
>           btf_ext_bswap_hdr     (1)
>           btf_ext_bswap_info    (2)
>      btf_ext_setup_func_info
>      btf_ext_setup_line_info
>      btf_ext_setup_core_relos   (3)
>
> btf_ext__raw_data
>      btf_ext_bswap_info         (4)
>      btf_ext_bswap_hdr
>
> The macros iterate on 'struct btf_ext_info' instances in 'struct btf_ext'
> but these are only set up after (3) it seems and unavailable at (2). I
> suppose they could be used with some sort of kludge but unsure how well
> they'll work.
>
> >
> > No callbacks.
> >
> > This has also a benefit of not needing this annoying "bool native"
> > flag when producing raw bytes. We just ensure proper order of
> > operation:
> >
> > a) swap records
> > b) swap metadata (so just mirrored order from initialization)
>
> How does that work? If we split up btf_ext_bswap_info(), after (1)
> btf_ext->swapped_endian is set and btf_ext->hdr->magic is swapped, so at
> (2) it's not possible to tell the current info data byte order without
> some hinting.
>
> But maybe if we defer setting btf_ext->swapped_endian until after (b) we
> can drop the "bool native" thanks to symmetry breaking. Let me check.
>
> >
> > WDYT?
>
> Adding a record_size check is definitely needed.
>
> But I have trouble seeing how splitting bswap of info metadata/records
> would yield something simpler and cleaner than the callbacks. What if
> they were passed via a descriptor, as in btf_ext_setup_func_info()? I
> think I need to play around with this a while and see..
>
> It would also help me if you'd elaborate on the drawbacks you see of
> using callbacks, given I see then in other parts of libbpf.

I replied to your latest patches. I generally dislike callbacks as
they make following the code harder. If it's possible to not use
callbacks with reasonable simplicity, I'll always go for that. But
it's ok, given those existing iteration macros are a bit assuming
about data and its endianness, it's hard to use them.

>
> >
> > pw-bot: cr
> >
> > > +{
> > > +       __u32 left, i, *rs, rec_size, num_info;
> > > +       struct btf_ext_info_sec *si;
> > > +       void *p;
> > > +
> > > +       if (len == 0)
> > > +               return;
> > > +
> > > +       rs = (void *)hdr + hdr->hdr_len + off;  /* record size */
> > > +       si = (void *)rs + sizeof(__u32);        /* sec info #1 */
> > > +       rec_size = native ? *rs : bswap_32(*rs);
> > > +       *rs = bswap_32(*rs);
> > > +       left = len - sizeof(__u32);
> > > +       while (left > 0) {
> > > +               num_info = native ? si->num_info : bswap_32(si->num_info);
> > > +               si->sec_name_off = bswap_32(si->sec_name_off);
> > > +               si->num_info = bswap_32(si->num_info);
> > > +               left -= offsetof(struct btf_ext_info_sec, data);
> > > +               p = si->data;
> > > +               for (i = 0; i < num_info; i++)  /* list of records */
> > > +                       p += bswap(p);
> > > +               si = p;
> > > +               left -=  rec_size * num_info;
> >
> > nit: extra space here
>
> Fixed, thanks.
>
> >
> > > +       }
> > > +}
> > > +
> >
> > [...]





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux