Re: Question: missing vmlinux BTF variable declarations

On Wed, Apr 27, 2022 at 11:43 AM Stephen Brennan
<stephen.s.brennan@xxxxxxxxxx> wrote:
> >
> > I think this quirk of using kallsyms strings is a no-go. But we should
> > experiment and see how much bigger BTF becomes when including all the
> > variables. Can you try to prototype pahole's support for this?
>
> Hi Andrii,
>
> Sorry for such a delay here. I tried to prototype this last month but
> encountered some issues I couldn't resolve. But recently I picked it up
> again and created a prototype [1] which outputs all variables. (It's
> quite a bad prototype: it strips out some useful logic regarding the
> BTF_VAR_DATASEC for percpu variables. But I think it's good enough.)
>
> On my 5.4-based kernel I saw an increase in BTF section size from 3.7
> MiB all the way to 6.1 MiB, or more precisely:
>
> BTF section before: 3905938 bytes
> BTF section after:  6391989 bytes (+2486051, +63.6%)
>
> So almost a 2.5 MiB increase. My prototype no longer outputs the
> btf_var_secinfo structs for percpu variables, which probably
> breaks some BPF programs and shrinks the BTF slightly. But it also
> outputs a few thousand "dwarf variables" which were correctly
> filtered out before, so I think it's a wash and this is a pretty
> good comparison.
>
> Clearly it can't be added without a configuration option, as 2.5 MiB is
> pretty huge for a kernel memory addition. But I don't think it's so huge
> that nobody would enable it. I know I would :)
>
> [1]: https://github.com/brenns10/dwarves/tree/remove_percpu_restriction_1
>
> > As you
> > said, we can guard this extra information with KConfig and pahole
> > flags, so distros can always opt-out of bigger BTF if that's too
> > prohibitive. As it is right now, without firm understanding how big
> > the final BTF is it's hard to make a good decision about go or no-go
> > for this.
>
> Hopefully this comparison sheds some light on that now!
>
> >
> > As for including source code itself, it's going to be prohibitively
> > huge, so it's probably out of the question for now as well.
>
> Yeah, I wouldn't advocate for that.
>
> Now, to share some of the cool possibilities that this enables. I have:
> - prototype pahole [1] used for the kernel build,
> - a prototype drgn with BTF+kallsyms support [2],
> - some small kernel patches which add symbols to vmcoreinfo, so that
>   drgn can find the kallsyms section. I'm happy to share these, I just
>   haven't sent them anywhere yet.
>
> [2]: https://github.com/brenns10/drgn/tree/kallsyms_plus_btf
>
> Combining these three things, I've got a debugger which can open up a
> vmcore _without DWARF debuginfo_ and allow you to print out typed
> variable values. It just relies on BTF + kallsyms.
>
> So the concept is proven, and I'm quite excited about it!

Exciting indeed. This is pretty cool.

I'm afraid we cannot justify a 2.5 MB kernel memory increase
purely for debugging. The existing vmlinux BTF is used
by the kernel itself to validate bpf prog access.
bpf progs cannot access normal global vars.
If/when they can, we can reconsider.

As an alternative path I think we could introduce hierarchical
split BTF.
Currently vmlinux BTF and the BTFs of kernel modules form a tree
of depth 2.
We can keep that representation of BTFs and
introduce a fake kernel module that contains kernel global vars.
drgn can parse vmlinux BTF plus the BTFs of all ko-s, including the
fake one, and obtain the same amount of debug info as if global vars
were part of vmlinux BTF.
Consuming 2.5 MB on demand via a ko would be acceptable
in some scenarios, whereas unconditionally burning
that much memory in vmlinux BTF (even if optional via Kconfig)
is probably not.
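
To get a feel for what a consumer like drgn would have to walk, here
is a minimal Python sketch (not drgn's actual code) that lists the BTF
objects exposed under /sys/kernel/btf and reads each header. It
assumes little-endian BTF data and the 24-byte struct btf_header
layout; the function names are made up for illustration. A fake module
carrying the global vars would simply show up as one more entry.

```python
import os
import struct

BTF_MAGIC = 0xEB9F
# struct btf_header: magic (u16), version (u8), flags (u8), then
# hdr_len, type_off, type_len, str_off, str_len (u32 each).
# Assumption: little-endian data ("<"); big-endian hosts would need ">".
BTF_HEADER = struct.Struct("<HBBIIIII")


def parse_btf_header(data):
    """Decode the fixed-size BTF header from raw bytes (hypothetical helper)."""
    if len(data) < BTF_HEADER.size:
        raise ValueError("truncated BTF header")
    (magic, version, flags, hdr_len,
     type_off, type_len, str_off, str_len) = BTF_HEADER.unpack_from(data)
    if magic != BTF_MAGIC:
        raise ValueError("bad BTF magic: %#x" % magic)
    return {
        "version": version,
        "flags": flags,
        "hdr_len": hdr_len,
        "type_off": type_off,
        "type_len": type_len,
        "str_off": str_off,
        "str_len": str_len,
    }


def btf_sizes(btf_dir="/sys/kernel/btf"):
    """Map each BTF object name to (total bytes, type bytes, string bytes)."""
    sizes = {}
    for name in sorted(os.listdir(btf_dir)):
        with open(os.path.join(btf_dir, name), "rb") as f:
            data = f.read()
        hdr = parse_btf_header(data)
        sizes[name] = (len(data), hdr["type_len"], hdr["str_len"])
    return sizes
```

Calling btf_sizes() on a kernel with module BTF enabled returns one
entry for vmlinux plus one per loaded module, which makes it easy to
see how much memory each BTF object costs.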

Ideally we would structure BTFs as a multi-level tree, where BTF
with global vars and other non-essential BTF info
can be added to vmlinux BTF at run-time. BTFs of kernel mods
can be layered on top, and mods can have split BTF too.
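
A rough Python model of how type IDs could resolve in such a
multi-level tree is below. This is only a sketch: the BtfNode class
and the "vmlinux_vars" level are hypothetical, and the type lists are
placeholder strings rather than real BTF records. It follows the
split BTF convention that a child's type IDs continue where its base
BTF's IDs end, with ID 0 reserved for void.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BtfNode:
    """One BTF object in the tree (hypothetical model, not a kernel API)."""
    name: str
    types: List[str]                  # placeholder descriptions, one per type
    base: Optional["BtfNode"] = None  # parent BTF, None for vmlinux

    def start_id(self) -> int:
        # The root starts at 1 because ID 0 is void; every split BTF
        # continues numbering right after the last ID of its base.
        if self.base is None:
            return 1
        return self.base.start_id() + len(self.base.types)

    def resolve(self, type_id: int) -> str:
        """Walk up the chain until the level owning type_id is found."""
        if type_id == 0:
            return "void"
        node = self
        while node is not None:
            start = node.start_id()
            if type_id >= start:
                idx = type_id - start
                if idx >= len(node.types):
                    raise KeyError(type_id)
                return "%s:%s" % (node.name, node.types[idx])
            node = node.base
        raise KeyError(type_id)


# A tree of depth 3: vmlinux -> optional global vars -> a module.
vmlinux = BtfNode("vmlinux", ["int", "struct task_struct"])
extra = BtfNode("vmlinux_vars", ["VAR jiffies"], base=vmlinux)
mod = BtfNode("foo_mod", ["struct foo"], base=extra)
```

With this layout a module's BTF can refer to types at any level below
it, and the optional level only costs memory once it is loaded.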


