Re: Packaging bpftool and libbpf: GitHub or kernel?

On Wed, Apr 19, 2023 at 2:23 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> writes:
>
> > On Wed, Apr 19, 2023 at 7:14 AM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote:
> >>
> >> On Tue, Apr 18, 2023 at 07:41:32PM +0200, Michal Suchánek wrote:
> >> > On Tue, Apr 18, 2023 at 09:38:20AM -0700, Andrii Nakryiko wrote:
> >> > > On Tue, Apr 18, 2023 at 4:24 AM Michal Suchánek <msuchanek@xxxxxxx> wrote:
> >> > > >
> >> > > > On Mon, Apr 17, 2023 at 05:20:03PM -0700, Andrii Nakryiko wrote:
> >> > > > > On Fri, Apr 14, 2023 at 9:15 AM Michal Suchánek <msuchanek@xxxxxxx> wrote:
> >> > > > > > On Fri, Apr 14, 2023 at 01:30:02PM +0100, Quentin Monnet wrote:
> >> > > > > > > 2023-04-14 11:50 UTC+0200 ~ Michal Suchánek <msuchanek@xxxxxxx>
> >> > > > > > > > Hello,
> >> > > > > > > >
> >> > > > > > > > On Fri, Apr 14, 2023 at 01:35:20AM +0100, Quentin Monnet wrote:
> >> > > > > > > >> Hi Shung-Hsi,
> >> > > > > > > >>
> >> > > > > > > >> On Thu, 13 Apr 2023 at 10:23, Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote:
> >> > > > > > > >>>
> >> > > > > > > >>> Hi,
> >> > > > > > > >>>
> >> > > > > > > >>> I'm considering switching to bpftool's mirror on GitHub for packaging
> >> > > > > > > >>> (instead of using the source found in the kernel), but realized that it
> >> > > > > > > >>> should go hand-in-hand with how libbpf is packaged, which eventually
> >> > > > > > > >>> leads to these questions:
> >> > > > > > > >>>
> >> > > > > > > >>>   What is the suggested approach for packaging bpftool and libbpf?
> >> > > > > > > >>>   Which source is preferred, GitHub or kernel?
> >> > > > > > > >>
> >> > > > > > > >> As you can see from the previous discussions, the suggested approach
> >> > > > > > > >> would be to package from the GitHub mirror, with libbpf and bpftool in
> >> > > > > > > >> sync.
> >> > > > > > > >>
> >> > > > > > > >> My main argument for the mirror is that it keeps things simpler, and
> >> > > > > > > >> there's no need to deal with the rest of the kernel sources for these
> >> > > > > > > >> packages. Download from the mirrors, build, ship. But then I have
> >> > > > > > > >> limited experience at packaging for distros, and I can understand
> >> > > > > > > >> Toke's point of view, too. So ultimately, the call is yours.
> >> > > > > > > >
> >> > > > > > > > Things get only ever more complex when submodules are involved.
> >> > > > > > >
> >> > > > > > > I understand the generic pain points from your other email. But could
> >> > > > > > > you be more specific for the case of bpftool? It's not like we're
> >> > > > > > > shipping all lib dependencies as submodules. Sync-ups are specifically
> >> > > > > > > aligned to the same commit used to sync the libbpf mirror, so that it's
> >> > > > > > > pretty much as if we had the right version of the library shipped in the
> >> > > > > > > repository - only, it's one --recurse-submodules away.
> >> > > > > >
> >> > > > > > It's so in every project that uses submodules. Except git does not
> >> > > > > > recurse into submodules by default, you have to fix it up by hand.
> >> > > > > > Forges don't support submodules so you will not get the submodule when
> >> > > > > > downloading the project archive, and won't see it in the project tree.
> >> > > > >
> >> > > > > git submodule update --init --recursive didn't work?
> >> > > >
> >> > > > That's one part of the manual fixup.
> >> > > >
> >> > > > The other part is after each git operation that could possibly cause the
> >> > > > submodules to go out of sync, basically any operation that changes the
> >> > > > checked-out commit.
> >> > > >
> >> > > > Of course, you can make some shell aliases that append whatever submodule
> >> > > > chicanery to whatever git command you might issue, and tell everyone
> >> > > > else to do that, and then it will work in that one shell, and not in any
> >> > > > other shell nor any tool that invokes git directly.
> >> > >
> >> > > Are we discussing a *standard* Git submodule feature and argue that
> >> > > because it might be cumbersome or unfamiliar to some engineers that
> >> > > projects should avoid using Git submodules?
> >> >
> >> > As far as I am aware they are unfamiliar to *most* engineers, and for
> >> > good reasons.
> >> >
> >> > > For one, I don't have any special aliases for dealing with Git
> >> > > submodules and it works fine. If I jump between branches or tags which
> >> > > update Git submodule reference, I do above `git submodule update
> >> > > --init --recursive` explicitly if I see that Git status shows
> >> > > out-of-sync Git submodule state. If I want to update a Git submodule,
> >> > > I update the submodule's Git repo, and then git add it in the repo
> >> > > that uses this submodule. I haven't run into any other issues with
> >> > > this.
> >> >
> >> > You know, git could just handle submodules automagically. As you say,
> >> > it's not rocket science. For historical reasons it does not.
> >> >
> >> > Given that, working with submodules is cumbersome, and it's an additional
> >> > thing that can break and that the engineer needs to be constantly aware
> >> > of, increasing the mental overhead of working with such projects.
> >> >
> >> > It may not be much of a problem for people who work with such projects
> >> > daily but not everyone does. Those who don't need to do the mental
> >> > switch whenever submodules are encountered, and are prone to getting
> >> > issues when they forget that they have to go that extra mile for this
> >> > specific project.
> >>
> >> For me it's less about having to go through the extra loop. It's that
> >> submodules would require git to be installed, network access, which all adds
> >> extra moving parts compared to a tarball...
> >>
> >> > > > > > After previous experience with submodules I did not even try, I just
> >> > > > > > patched the makefile to use system libbpf before attempting anything
> >> > > > > > else.
> >> > > > >
> >> > > > > Quentin mentioned that he's packaging (or will package) libbpf sources
> >> > > > > as part of bpftool release on Github. I've been doing this for other
> >> > > > > libbpf-using tools as well, and it works pretty well (at least for
> >> > > > > Fedora and ArchLinux). E.g., srcs-full-* archives for veristat ([0])
> >>
> >> and having libbpf included in the bpftool release means the complaint above
> >> no longer holds. Though I have yet to test-build the mirror version of
> >> libbpf and bpftool like Michal has done.
> >
> > Great. This seems to work well for other tools that use libbpf through
> > submodule (anakryiko/retsnoop and libbpf/veristat on Github)
> >
> >>
> >> > > > > By switching up actual libbpf used to compile with bpftool, you are
> >> > > > > potentially introducing subtle problems that your users will be quite
> >> > > > > unhappy about, if they run into them. Let's work together to make it
> >> > > > > easier for you to package bpftool properly. We can't switch bpftool to
> >> > > > > reliably use system-wide libbpf (either static or shared, doesn't
> >> > > > > matter) because of dependency on internal functionality.
> >> > > > >
> >> > > > >
> >> > > > >   [0] https://github.com/libbpf/veristat/releases/tag/v0.1
> >> > > >
> >> > > > So how many copies of libbpf do I need for having a CO-RE toolchain?
> >> > >
> >> > > What do you mean by "CO-RE toolchain"? bpftool, veristat, retsnoop,
> >> > > etc are tools. The fact they are using statically linked libbpf
> >> > > through Git submodule is irrelevant to end users. You need one libbpf
> >> > > in the system (for those who link dynamically against libbpf), the
> >> > > rest are just tools.
> >> > >
> >> > > >
> >> > > > Will different tools have different view of the kernel because they each
> >> > > > use different private copy of libbpf with different features?
> >> > >
> >> > > That's up to tools, not libbpf. You are over pivoting on libbpf here.
> >> > > There is one view of the kernel, it depends on what features the
> >> > > kernel supports. If the tool requires some specific functionality of
> >> > > libbpf, it will update its Git submodule reference to get a version of
> >> > > libbpf that provides that feature. That's the point, an
> >> > > application/tool is in control of what kind of features it gets from
> >> > > libbpf.
> >>
> >> Since libbpf has a stable API & ABI, is it theoretically possible for
> >> bpftool, veristat, retsnoop, etc. all share the same version of libbpf?
> >
> > No, because libbpf is not just a set of APIs. Newer libbpf versions
> > support more BPF-side features, more kernel features, etc, etc. Libbpf
> > is not a typical user-space library, it is a BPF loader, and even if
> > user-visible API doesn't change, libbpf's support for various BPF-side
> > features is extended. Which is important for tools like bpftool,
> > retsnoop, veristat which rely on loading and working with BPF object
> > files.
>
> The converse of this is also true: if your system is upgraded to a new
> kernel version with new BPF features, the libbpf version should follow
> it, and all applications linked against it will automatically take
> advantage of any bugfixes without having to wait for each
> application to be updated.

No, if my application was not developed to take advantage of a new
kernel feature, newer libbpf will do nothing for me. If my application
wants to support that feature, I'll update my application and
correspondingly update libbpf embedded in it. If my application is
affected by some bug fix, I'll update libbpf even faster than distros
will get to it.

I've heard all such arguments over the last few years. They are not
convincing, and my own practical experience shows the above argument
to be irrelevant.

>
> Libbpf is really no different from any other library here, and I really
> don't get why you keep insisting it's "special"...

It's special in the sense that it provides two sets of APIs -- one for
user space (as in a typical library) and one for BPF object files.
Besides that, on the BPF side it's not just a set of APIs (headers,
helpers, etc.); it also provides functionality that can improve or be
extended over time. E.g., libbpf used to not support non-inlined BPF
subprograms, and then it started supporting them. In terms of API/ABI
-- nothing changed. Yet the change is very important.

Now, I build a tool that is using libbpf and some BPF functionality,
e.g., retsnoop. Libbpf just got SEC("ksyscall") support. Retsnoop
wants to take advantage of it. I just go and use SEC("ksyscall")
programs in .bpf.c files that are embedded inside retsnoop. I don't
have to *and don't want to* do feature detection of whether a
particular libbpf version that happens to be installed/packaged on the
system supports this feature. I *know* it does, because I control it,
through a submodule. That's what I care about.
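
To make that burden concrete, here is a minimal sketch (in Python, purely
for illustration) of the kind of gate a tool would need if it were built
against whatever libbpf the system happens to ship. The feature table, the
minimum version assigned to SEC("ksyscall"), and both function names are
assumptions invented for this sketch; none of them is a libbpf API.

```python
# Hypothetical table: minimum libbpf version assumed to provide each
# BPF-side feature. The entry below is an illustrative assumption.
MIN_LIBBPF_FOR_FEATURE = {
    'SEC("ksyscall")': (1, 0),
}


def parse_version(text):
    """Turn a version string such as '1.2.0' or 'v1.2.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def missing_features(libbpf_version, wanted):
    """Return the wanted features that the given libbpf version is too old for."""
    have = parse_version(libbpf_version)
    return [name for name in wanted if have < MIN_LIBBPF_FOR_FEATURE[name]]


if __name__ == "__main__":
    wanted = ['SEC("ksyscall")']
    # With a system-provided libbpf, the answer depends on what is installed.
    for system_libbpf in ("0.8.1", "1.2.0"):
        print(system_libbpf, missing_features(system_libbpf, wanted))
```

With libbpf pinned through a submodule, the version passed to such a check
is a constant chosen by the tool's developer, so the check (and every
fallback path behind it) is unnecessary.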

Whether some distro insists on libbpf being shared across any
libbpf-using application or not is none of my concern. Libbpf is an
implementation detail of my application (retsnoop), it's not for the
packager to decide how I develop and structure my tool.

>
> >> What I'd like to do is build libbpf and bpftool out of the bpftool GitHub
> >> mirror's release tarball (w/ submodule included, which exists now for
> >> snapshot). For the rest of the tools that do not depend on libbpf private
> >> functions, have them dynamically link to the libbpf built from bpftool's
> >> source, just like how libelf is dynamically linked.
> >
> > Please don't do it, let applications control which libbpf versions
> > they are using. It's not just about user space APIs, I can't emphasize
> > this enough. Don't think you know better than developers of respective
> > applications, don't try to dictate how those applications should be
> > organized and developed.
>
> A well-behaved application will detect which features are available in

No, a well-behaved application will provide reliable functionality
without necessarily paying the maintenance and development cost of a
maze of #ifdef-ery just to satisfy arbitrary distro requirements of
linking with some shared library ("because security").

> the system version of the libraries they use, and if something is
> missing that it needs, either work around it or refuse to build. We do
> this with libbpf in xdp-tools and the only issues we've had with it have
> been the changing API in pre-1.0 libbpf...
>
> > One good example is iproute2, which chose to link (or not) with libbpf
> > dynamically. Now users periodically report various issues where their
> > BPF object files are not loaded, and it often comes down to unexpected
> > version of libbpf (or lack of libbpf support altogether) with which
> > iproute2 was built/deployed. This is just putting a burden on iproute2
> > users, and accidentally libbpf maintainers, for no good reason.
>
> How would this have been any different if iproute2 was statically linked
> against libbpf?

iproute2 version would determine what BPF features are supported, and
it would be consistent across distros and end user systems, regardless
of what libbpf shared library happens to be packaged and installed.
And users would know that starting from version X iproute2 is
libbpf-1.0+ compatible in terms of which BPF object file features it
supports when loading BPF programs.

>
> >> I'm not saying that those tools should not have libbpf as submodule; as
> >> submodule do look useful. But for packaging I really would like to have the
> >> option of choosing the exact version of libbpf being used.
> >
> > The exact version of libbpf used by bpftool, retsnoop, veristat, etc
> > *is not relevant* to you as a packager. If you want happy users, build
> > them with the *exact* version of libbpf from the submodule, the one with
> > which the application was developed, tested, and advertised as supporting
> > its BPF features. There is no reuse to be done here, they all can be on
> > different (and sometimes not yet released) libbpf version. For good
> > reasons, which are outside of your control as a packager.
>
> This is... just not how distributions work. As a user I trust my
> distribution to provide me with a coherent system where critical system
> libraries are maintained and receive timely updates. And I absolutely
> trust the distribution more to do this over application developers who
> just vendor in some version as a submodule and leave it there until they
> need a new feature...

Ok.

>
> -Toke
>



