Re: F36 Change: Package information on ELF objects (System-Wide Change proposal)

On Mon, Oct 25, 2021 at 03:09:00PM -0400, Ben Cotton wrote:
> https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
> 
> == Summary ==
> All binaries (executables and shared libraries) are annotated with an
> ELF note that identifies the rpm for which this file was built. This
> allows binaries to be identified when they are distributed without any
> of the rpm metadata. `systemd-coredump` uses this to log package
> versions when reporting crashes.

This is a resubmission of the F35 proposal, which was (narrowly)
rejected at the time. We have added more detailed descriptions of the
motivation for the change, an analysis of the impact on upgrades, and
more links to documentation.

Zbyszek

> == Owner ==
> * Name: [[User:Zbyszek|Zbigniew Jędrzejewski-Szmek]]
> * Email: zbyszek@xxxxxxxxx
> * Name: Lennart Poettering
> * Email: mzsrqben@xxxxxxxxxxxx
> 
> 
> == Detailed Description ==
> People mix binaries (programs and libraries) from different
> distributions (for example using Fedora containers on Debian or vice
> versa), and distribute binaries without packaging metadata (for
> example by stripping everything except the binary from a container
> image, also removing `/usr/lib/.build-id/*`), compile their own rpm
> packages (for internal distribution and installation), and compile and
> distribute their own binaries. Sometimes we need to introspect a
> binary and figure out its provenance, for example when a program
> crashes and we are looking at a core dump, but also when we have a
> binary without the packaging metadata. When the need to introspect a
> binary arises, we have some very good mechanisms to show the
> provenance: when a file is installed through the package manager we
> can directly list the providing package, but even without this we can
> use build-ids embedded in the binary to uniquely identify the
> originating build. But those mechanisms work best when we're in the
> realm of a single distribution. In particular, build-ids can be easily
> tied to a source rpm, but only when the source rpm is part of
> the distribution and the build-id was registered in the appropriate
> database which maps build-ids to real package names. When we move
> outside of the realm of a single distribution, it can be hard to
> figure out where a given binary originates from. If we know that a
> binary is from a given distribution, we may be able to use some
> distro-specific mechanism to figure out this information. But those
> mechanisms will be different for different distributions and will
> often require network access. With this change we aim to provide a
> mechanism that is very simple, provides "human-readable" origin
> information without further processing, is portable across distros,
> and works without network access.
> 
> The directly motivating use case is display of core dumps. Right now
> we have build-ids, but those are just opaque hexadecimal numbers that
> are not meaningful to users. We would like to immediately list
> versions of packages involved in the crash (including both the program
> and any libraries it links to). It is not enough to query the rpm
> database to do the equivalent of `rpm -qf …`: very often programs
> crash after some packages have been upgraded and the binaries loaded
> into memory are not the binaries that are currently present on disk,
> or when through some mishap, the binaries on disk do not match the
> installed rpms.  A mechanism that works without rpm database lookup or
> network access allows this information to be shown immediately in
> `coredumpctl` listings and journal entries about the crash. This
> includes crashes that happen in the initrd and sandboxed containers.
> 
> A second motivating use case is when users distribute their own
> binaries and would like to collect crash information. Build-ids are a
> solution that is technically possible, but easy to get wrong in
> practice: users would need to immediately record the build-id after
> the build and store the mapping to program names, versions, and build
> number in some database. It's much easier to be able to record
> something during the build in the build product itself.
> 
> A third motivating use case is the general mixing of Fedora binaries
> with programs and libraries from different distributions, both with
> our binaries being used as the base for foreign binaries, and the
> other way around. Whilst most distributions provide some mechanism to
> figure out the source build information, those mechanisms vary by
> distribution and may not be easy to access from a "foreign" system.
> Such mixing is expected with containers, flatpaks, snaps, Python
> binary wheels, anaconda packages, and quite often when somebody
> compiles a binary and puts it up on the web for other people to
> download.
> 
> We propose a new mechanism which is designed to be very simple but
> extensible: a small JSON document is embedded in a section in the ELF
> binary. This document can be easily read by a human if necessary, but
> it is also well-defined and can be processed programmatically. For
> example, `systemd-coredump` will immediately make use of this to
> display package ''nevra'' information for crashes. The format is also
> easy to generate, so it can be added to any build system, either using
> the helpers that we provide or even reimplemented from scratch.
> 
> For the case where we mix binaries from different distros (the third
> motivating use case above), this approach is the most useful when this
> system is used by all distros and even non-distro builds. The more
> widely it is used, the more useful it becomes. The specification was
> developed in collaboration with Debian developers, and we hope that
> Fedora and Debian will lead the way for this to become as widely used
> as build-ids. But even if the information is only available from some
> distros, it is still useful, except that fallback mechanisms need to
> be implemented.
> 
> === Existing system: `.note.gnu.build-id` ===
> 
> We already have build-ids: every ELF object has a `.note.gnu.build-id`
> note, and given a core file, we can read the build-id and look it up
> in the rpm database (`dnf repoquery --whatprovides debuginfo(build-id)
> = …`) to map it to a package name.
> Build-ids are unique, compact, and very generic, and they generally
> work as expected. But they have some downsides:
> * build-ids are not very informative for users. Before the build-id is
> converted back to the appropriate package, it's completely opaque.
> * build-ids require a working rpm database or an internet connection
> to map to the package name.
> 
> Three important cases:
> * minimal containers: the rpm database is not installed in the
> containers. The information about build-ids needs to be stored
> externally, so package name information is not available immediately,
> but only after offline processing. The new note doesn't depend on the
> rpm db in any way.
> * handling of a core from a container, where the container and host
> have different distros
> * self-built and external packages: unless a lot of care is taken to
> keep access to the debuginfo packages, this information may be lost.
> The new note is available even if the repository metadata gets lost.
> Users can easily provide equivalent information in a format that makes
> sense in their own environment. It should work even when rpms and debs
> and other formats are mixed, e.g. during container image creation.
> 
> === New system: `.note.package` ===
> 
> The new note is created and propagated similarly to
> `.note.gnu.build-id`. The difference is that we inject the information
> about package ''nevra'' from the build system.
> 
> The implementation is very simple: `%{build_ldflags}` are extended
> with a command to insert a custom note as a separate section in an ELF
> object. See [https://github.com/systemd/package-notes/blob/main/hello.spec
> hello.spec] for an example. This is done in the default macros, so all
> packages that use the prescribed link flags will be affected.
> 
> The note is a compact JSON string. This makes the format trivially
> extensible (new fields can be added at will) and easy to process
> (JSON is extremely popular and parsers are widely available).
> Using a single field rather than a set of separate notes is more
> space-efficient: with multiple fields, the padding and alignment
> requirements cause unnecessary overhead.
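
To make the layout and the padding argument concrete, here is a minimal Python sketch. It is not the project's generate-package-notes.py; the owner name, note type, and trailing NUL are simply read off the hexdump in the Examples section, and the "one note per field" variant is a hypothetical alternative used only for comparison.

```python
import json
import struct

NOTE_OWNER = b"FDO\0"      # owner name, as seen in the hexdump in Examples
NOTE_TYPE = 0xCAFE1A7E     # note type, as seen in the hexdump in Examples


def pad4(data: bytes) -> bytes:
    """Pad with NUL bytes up to the next multiple of 4 bytes."""
    return data + b"\0" * (-len(data) % 4)


def build_note(fields: dict) -> bytes:
    """Pack `fields` as compact JSON into one little-endian ELF note."""
    desc = json.dumps(fields, separators=(",", ":")).encode() + b"\0"
    header = struct.pack("<III", len(NOTE_OWNER), len(desc), NOTE_TYPE)
    return header + pad4(NOTE_OWNER) + pad4(desc)


meta = {
    "type": "rpm",
    "name": "hello",
    "version": "0-1.fc35.x86_64",
    "osCpe": "cpe:/o:fedoraproject:fedora:33",
}

# One note carrying all fields, as proposed.
single = build_note(meta)

# Hypothetical alternative: one note per field. Each note pays for its
# own 12-byte header, its own owner name, and its own padding.
separate = b"".join(build_note({key: value}) for key, value in meta.items())

print(len(single), len(separate))
```

The single note comes to 116 bytes, which is the size of the `.note.package` section in the hexdump; the per-field variant is larger because the fixed header and owner name are repeated for every field.
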
> 
> The system was designed with cross-distro collaboration and is
> flexible enough to identify binaries from different packaging formats
> and build systems (rpms, debs, custom binaries).
> 
> See https://systemd.io/COREDUMP_PACKAGE_METADATA/ for a detailed
> description of the format.
> 
> One of the advantages of using an ELF note, as opposed to say a series
> of extended attributes on the binary itself, is that the ELF note gets
> automatically captured and copied into a core file by the kernel.
> Extended attributes would have to be copied manually, which might not
> even be possible because the binary on disk may have been removed by
> the time the crash is analyzed.
> 
> The overhead is about 200 bytes for each ELF object.
> Overall, we have about 33200 files in `/usr/s?bin/` and about 36600
> `.so` files (F35, single architecture; results from
> `dnf repoquery -l 2>/dev/null | rg '^/usr/s?bin/' | sort -u | wc -l` and
> `dnf repoquery -l 2>/dev/null | rg '^/usr/lib64/.*\.so$' | sort -u | wc -l`).
> If we do this for the whole distro, we get 69800 × 200 B ≈ 14 MB.
> For a typical installation, we can expect about 300–400 kB.
> Thus the additional space used is negligible (also see the
> Feedback section for more discussion).
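
The estimate can be re-derived from the figures quoted above. This is only a back-of-the-envelope sketch: the 200-byte figure is the approximate per-object overhead from the text, and the object counts implied for a "typical installation" are inferred from the 300–400 kB range, not measured.

```python
NOTE_BYTES = 200      # approximate per-object overhead quoted in the text
BIN_FILES = 33200     # files in /usr/s?bin/ (F35, one architecture)
SO_FILES = 36600      # .so files under /usr/lib64/ (F35, one architecture)

total_files = BIN_FILES + SO_FILES
total_bytes = total_files * NOTE_BYTES

print(f"{total_files} files -> {total_bytes / 1e6:.1f} MB "
      f"({total_bytes / 2**20:.1f} MiB)")

# The 300-400 kB range for a typical installation corresponds to roughly
# this many ELF objects at 200 bytes each (inferred, not measured).
implied_objects = {kb: kb * 1000 // NOTE_BYTES for kb in (300, 400)}
for kb, count in implied_objects.items():
    print(f"{kb} kB ~ {count} ELF objects")
```
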
> 
> Precise measurements TBD once this is turned on and we have real
> measurements for a larger number of builds.
> 
> === Examples ===
> <pre>
> $ objdump -s -j .note.package build/libhello.so
> 
> build/libhello.so:     file format elf64-x86-64
> 
> Contents of section .note.package:
>  02ec 04000000 63000000 7e1afeca 46444f00  ....c...~...FDO.
>  02fc 7b227479 7065223a 2272706d 222c226e  {"type":"rpm","n
>  030c 616d6522 3a226865 6c6c6f22 2c227665  ame":"hello","ve
>  031c 7273696f 6e223a22 302d312e 66633335  rsion":"0-1.fc35
>  032c 2e783836 5f363422 2c226f73 43706522  .x86_64","osCpe"
>  033c 3a226370 653a2f6f 3a666564 6f726170  :"cpe:/o:fedorap
>  034c 726f6a65 63743a66 65646f72 613a3333  roject:fedora:33
>  035c 227d0000                             "}..
> </pre>
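
The section contents above can also be decoded by hand. The following Python sketch parses exactly the bytes shown in the hexdump, assuming a single little-endian note with 4-byte alignment; `parse_note` is an illustrative helper, not part of any existing tool.

```python
import json
import struct

# Section contents copied from the objdump output above.
SECTION_HEX = """
04000000 63000000 7e1afeca 46444f00
7b227479 7065223a 2272706d 222c226e
616d6522 3a226865 6c6c6f22 2c227665
7273696f 6e223a22 302d312e 66633335
2e783836 5f363422 2c226f73 43706522
3a226370 653a2f6f 3a666564 6f726170
726f6a65 63743a66 65646f72 613a3333
227d0000
"""


def parse_note(section: bytes):
    """Split one ELF note into (owner, note type, decoded JSON payload)."""
    namesz, descsz, ntype = struct.unpack_from("<III", section, 0)
    name_start = 12                                   # after the three 32-bit fields
    desc_start = name_start + (namesz + 3) // 4 * 4   # owner is padded to 4 bytes
    owner = section[name_start:name_start + namesz].rstrip(b"\0").decode()
    desc = section[desc_start:desc_start + descsz].rstrip(b"\0")
    return owner, ntype, json.loads(desc)


raw = bytes.fromhex("".join(SECTION_HEX.split()))
owner, ntype, payload = parse_note(raw)
print(owner, hex(ntype), payload["name"], payload["version"])
```
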
> 
> <pre>
> $ readelf --notes build/hello | grep "description data" |
>     sed -e "s/\s*description data: //g" -e "s/ //g" | xxd -p -r | jq
> readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10de
> readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10af
> readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x119f
> {
>   "type": "rpm",
>   "name": "hello",
>   "version": "0-1.fc35.x86_64",
>   "osCpe": "cpe:/o:fedoraproject:fedora:33"
> }
> </pre>
> 
> <pre>
> $ coredumpctl info
>            PID: 44522 (fsverity)
> ...
>        Package: fsverity-utils/1.3-1
>       build-id: ac89bf7175b04d7eec7f6544a923f45be111f0be
>        Message: Process 44522 (fsverity) of user 1000 dumped core.
> 
>                 Found module /home/bluca/git/fsverity-utils/libfsverity.so.0 with build-id: fa40fdfb79aea84167c98ca8a89add9ac4f51069
>                 Metadata for module /home/bluca/git/fsverity-utils/libfsverity.so.0 owned by FDO found: {
>                 "packageType" : "deb",
>                 "package" : "fsverity-utils",
>                 "packageVersion" : "1.3-1"
>                 }
> 
>                 Found module linux-vdso.so.1 with build-id: aba08e06103f725e26f1d7c178fb6b76a564a35d
>                 Found module libpthread.so.0 with build-id: e91114987a0147bd050addbd591eb8994b29f4b3
>                 Found module libdl.so.2 with build-id: d3583c742dd47aaa860c5ae0c0c5bdbcd2d54f61
>                 Found module ld-linux-x86-64.so.2 with build-id: f25dfd7b95be4ba386fd71080accae8c0732b711
>                 Found module libcrypto.so.1.1 with build-id: 749142d5ee728a76e7cdc61fd79d2311a77405a2
>                 Found module libc.so.6 with build-id: 18b9a9a8c523e5cfe5b5d946d605d09242f09798
>                 Found module fsverity with build-id: ac89bf7175b04d7eec7f6544a923f45be111f0be
>                 Metadata for module fsverity owned by FDO found: {
>                 "packageType" : "deb",
>                 "package" : "fsverity-utils",
>                 "packageVersion" : "1.3-1"
>                 }
> 
>                 Stack trace of thread 44522:
>                 #0  0x00007fe7c8af26f4 __GI___nanosleep (libc.so.6 + 0xc66f4)
>                 #1  0x00007fe7c8af262a __sleep (libc.so.6 + 0xc662a)
>                 #2  0x00005608481407dd main (fsverity + 0x27dd)
>                 #3  0x00007fe7c8a5009b __libc_start_main (libc.so.6 + 0x2409b)
>                 #4  0x000056084814094a _start (fsverity + 0x294a)
> </pre>
> 
> == Feedback ==
> See [https://github.com/systemd/systemd/issues/18433 systemd issue
> #18433] for upstream discussion and implementation proposals.
> 
> === Concerns about additional changes to files ===
> 
> <pre>
> 17:32:30 <Eighth_Doctor> I think zbyszek underestimates how much of a
> problem it is to stamp every ELF binary with ''nevra'' data
> 17:32:44 <mhroncok> zbyszek: so, assuming python has ~100 ELF .so
> files and I change one text file
> 17:33:22 <mhroncok> (ignore for the time being that the .so files
> often changed because of toolchain updates and assume they are stable)
> </pre>
> 
> I tested this with python3.10. So far there are 13 builds of that
> package in F35:
> `python3.10-3.10.0-1.fc35`,
> `python3.10-3.10.0~a6-1.fc35`,
> `python3.10-3.10.0~a6-2.fc35`,
> `python3.10-3.10.0~a7-1.fc35`,
> `python3.10-3.10.0~b1-1.fc35`,
> `python3.10-3.10.0~b2-2.fc35`,
> `python3.10-3.10.0~b2-3.fc35`,
> `python3.10-3.10.0~b3-1.fc35`,
> `python3.10-3.10.0~b4-1.fc35`,
> `python3.10-3.10.0~b4-2.fc35`,
> `python3.10-3.10.0~b4-3.fc35`,
> `python3.10-3.10.0~rc1-1.fc35`,
> `python3.10-3.10.0~rc2-1.fc35`.
> I extracted the builds (for `.x86_64`) and made a list of all `.so`
> files (1368 files), and calculated sha256 hashes for them. No two
> files repeat, there are 1368 distinct hashes. So the files are
> '''already''' different between builds and the additional proposed
> metadata will not make a significant difference.
> 
> Note that this range of Python versions encompasses periods when the
> package is under development and undergoes significant changes (alpha
> versions), and when it's only undergoing small changes (rc versions).
> 
> The fact that we get different files in each build is not surprising,
> because files embed build-ids which differ between builds. But even if
> we ignore those, binaries generally differ between builds. Even sizes
> tend to vary between builds: there are 636 distinct `.so` file sizes,
> i.e. on average any given size only repeats twice (presumably most
> often for the same file). Running `diffoscope` on `.so` files from
> different builds shows minor changes in the assembly which I did not
> analyze further.
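
A comparison of this kind could be reproduced roughly as follows. This is a sketch, not the script used for the numbers above: it assumes the builds have already been unpacked into a directory tree, and the function name and the `*.so` glob are illustrative choices.

```python
import hashlib
from pathlib import Path


def summarize_shared_objects(root):
    """Return (file count, distinct sha256 digests, distinct sizes) for *.so files."""
    digests = set()
    sizes = set()
    count = 0
    for path in sorted(Path(root).rglob("*.so")):
        # Skip symlinks so that each real file is counted only once.
        if path.is_symlink() or not path.is_file():
            continue
        data = path.read_bytes()
        digests.add(hashlib.sha256(data).hexdigest())
        sizes.add(len(data))
        count += 1
    return count, len(digests), len(sizes)
```

Calling `summarize_shared_objects()` on a directory holding all extracted builds gives the three numbers to compare: if the number of distinct digests equals the number of files, no `.so` file is byte-identical between builds.
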
> 
> If people have specific questions, for example about overhead in some
> scenario, I'd be happy to answer them. Until now, the issues that were
> raised were very vague, so it's impossible to answer them.
> 
> === Why not just use the rpm database? ===
> 
> <pre>
> 17:34:33 <dcantrell> The main reason for this appears to be that we
> need the RPM db locally to resolve build-ids to package names. But
> since containers wipe /var/lib/rpm, we can't do that. So the solution
> is to put the ''nevra'' in ELF metadata?
> 17:34:39 <dcantrell> That feels like the wrong approach.
> </pre>
> 
> First, there are legitimate reasons to strip packaging metadata from
> images. For example, for an initrd image from rpms, I get 117 MB of
> files (without compression), and out of this `/var/lib/rpm` is 5.9 MB,
> and `/var/lib/dnf` is 4.2 MB. This is an overhead of 9%. This is ''not
> much'', but still too much to keep in the image unless necessary.
> Similar ratios will happen for containers of similar size. Reducing
> image size by one tenth is important. There is no `rpm` or `dnf` in
> the image, so the package database is not even usable without external
> tools.
> 
> As discussed on IRC
> (https://meetbot.fedoraproject.org/teams/fesco/fesco.2021-05-11-17.01.log.html),
> the containers ''we'' build don't wipe this metadata, but custom
> Dockerfiles do that.
> 
> Second, as described in the Detailed Description section above, not everybody and
> everything uses rpm. The Fedora motto is "we make an operating system
> and we make it easy for you to do useful stuff with it" (and yes, this
> is an actual quote from the official docs), and this stuff involves
> reusing our binaries in containers and custom installations and
> whatnot, not just straightforward installations with `dnf`. And in the
> other direction, people will build their own binaries that are not
> packaged as rpms. But it is still important to be able to figure out
> the exact version of a binary, especially after it crashes.
> 
> === Why do this in Fedora? ===
> 
> <pre>
> 17:36:49 <mhroncok> I don't understand how non-rpm distros and custom
> built binaries are affected by our rpm-build environment :/
> </pre>
> 
> The idea is that we inject this into our build system, and Debian
> injects this into their build system, and so on… As mentioned, this is
> a cross-distro effort. Also, people can use it in their custom build
> systems if they build and distribute binaries internally. The scheme
> would obviously be most useful if used comprehensively, but it's still
> useful when available partially. We hope that Fedora can lead the way.
> (This is similar to build-ids: when initially adopted, they were used
> only by some distros, but were useful even then. Nowadays, with
> comprehensive adoption, they are even more useful.)
> 
> https://hpc.guix.info/blog/2021/09/whats-in-a-package/ contains a nice
> description of a pathological case of packaging hacks and binary
> redistribution. When trying to unravel something like this,
> information embedded directly in the binaries would be quite useful.
> 
> 
> == Benefit to Fedora ==
> A simple and reliable way to gather information about package versions
> of programs is added.
> It enhances, instead of replacing, the existing mechanisms.
> It is particularly useful when reporting crash dumps, but can also be
> used for image introspection and forensics, license checks and
> version scans on containers, etc.
> 
> If we adopt this in Fedora, Fedora leads the way on implementing the
> standard. Fedora binaries used in any context can be easily
> recognized. Fedora binaries provide a better basis to build things.
> 
> If other distros adopt this, we can introspect and report on those
> binaries easily within the Fedora context. For example, when somebody
> is using a container with some programs that originate in the Debian
> ecosystem, we would be able to identify those programs without tools
> like `apt` or `dpkg-query`. Core dump analysis executed in the Fedora
> host can easily provide useful information about programs from foreign
> builds.
> 
> == Implementation in Other Distributions ==
> === Microsoft CBL-Mariner ===
> [https://en.wikipedia.org/wiki/CBL-Mariner CBL-Mariner] is an
> [https://github.com/microsoft/CBL-Mariner open source] Linux
> distribution created by Microsoft, targeted at first-party and
> container workloads on Azure. It is used both as a container runner
> host and a base container image.
> Mariner adopted the ELF stamping packaging metadata spec in
> [https://github.com/microsoft/CBL-Mariner/blob/1.0/SPECS/mariner-rpm-macros/gen-ld-script.sh
> version 1.0], initially to add OS metadata, and package-level metadata
> will be added in a following release.
> === Debian ===
> A package-level proof-of-concept is included in the
> [https://github.com/systemd/package-notes/blob/main/dh_package_notes
> package-notes] repository.
> A [https://salsa.debian.org/bluca/debhelper/-/tree/notes_metadata
> system-level proof-of-concept] that enables ELF stamping by default in
> all builds implicitly will be proposed for adoption in the future.
> 
> == Scope ==
> * Proposal owners:
> ** create a specification (First version DONE:
> [https://systemd.io/COREDUMP_PACKAGE_METADATA
> COREDUMP_PACKAGE_METADATA]. We might need to make some adjustments
> based on the deployment in Fedora, but no big changes are expected.)
> ** write a script to generate the package note (First version DONE:
> [https://github.com/systemd/package-notes/blob/main/generate-package-notes.py
> generate-package-notes.py])
> ** provide a patch for `redhat-rpm-config` to insert appropriate
> compilation options
> ** extend systemd's coredumpctl to extract and display this
> information (DONE: [https://github.com/systemd/systemd/pull/19135 PR
> #19135], available in systemd-249)
> ** submit pull request to Packaging Guidelines
> 
> * Other developers:
> ** possibly add support in abrt?
> 
> * Release engineering: There should be no impact.
> 
> * Policies and guidelines:
> The new flags should be mentioned in Packaging Guidelines.
> 
> * Trademark approval: N/A (not needed for this Change)
> 
> * Alignment with Objectives:
> It might be relevant for Minimization. Even though it increases the
> image size a tiny bit, it makes minimized images work a bit better.
> 
> == Upgrade/compatibility impact ==
> No impact.
> 
> == How To Test ==
> <pre>
> $ bash -c 'kill -SEGV $$'
> $ coredumpctl
> TIME                            PID  UID  GID SIG     COREFILE EXE            SIZE PACKAGE
> Mon 2021-03-01 14:37:22 CET  855151 1000 1000 SIGSEGV present  /usr/bin/bash 51.7K bash-5.1.0-2.fc34.x86_64
> </pre>
> 
> == User Experience ==
> `coredumpctl` should display information about package versions.
> 
> `readelf --notes` or similar tools can be used on `.so` files and
> compiled programs
> to extract the JSON blurb that describes the originating package.
> 
> == Dependencies ==
> None.
> 
> == Contingency Plan ==
> 
> * Contingency mechanism: Remove the new compilation flags. Rebuild any
> packages that were built with the new flags.
> * Contingency deadline: Beta freeze.
> * Blocks release? No.
> 
> == Documentation ==
> * https://systemd.io/COREDUMP_PACKAGE_METADATA/
> * https://github.com/systemd/package-notes
> 
> See also [[Changes/DebuginfodByDefault]].
> 
> 
> 
> -- 
> Ben Cotton
> He / Him / His
> Fedora Program Manager
> Red Hat
> TZ=America/Indiana/Indianapolis
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx
> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure