On Mon, Oct 25, 2021 at 07:26:47PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
>On Mon, Oct 25, 2021 at 03:09:00PM -0400, Ben Cotton wrote:
>>https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
>>
>>== Summary ==
>>All binaries (executables and shared libraries) are annotated with an
>>ELF note that identifies the rpm for which this file was built. This
>>allows binaries to be identified when they are distributed without any
>>of the rpm metadata. `systemd-coredump` uses this to log package
>>versions when reporting crashes.
>
>This is a resubmission of the proposal for F35 which was (narrowly)
>rejected at the time. We added copious descriptions of motivations
>for the change, and analysis of impact on upgrades, and more links
>to documentation.
>
>Zbyszek
>
>>== Owner ==
>>* Name: [[User:Zbyszek|Zbigniew Jędrzejewski-Szmek]]
>>* Email: zbyszek@xxxxxxxxx
>>* Name: Lennart Poettering
>>* Email: mzsrqben@xxxxxxxxxxxx
>>
>>
>>== Detailed Description ==
>>People mix binaries (programs and libraries) from different
>>distributions (for example using Fedora containers on Debian or vice
>>versa), and distribute binaries without packaging metadata (for
>>example by stripping everything except the binary from a container
>>image, also removing `/usr/lib/.build-id/*`), compile their own rpm
>>packages (for internal distribution and installation), and compile and
>>distribute their own binaries. Sometimes we need to introspect a
>>binary and figure out its provenance, for example when a program
>>crashes and we are looking at a core dump, but also when we have a
>>binary without the packaging metadata. When the need to introspect a
>>binary arises, we have some very good mechanisms to show the
>>provenance: when a file is installed through the package manager we
>>can directly list the providing package, but even without this we can
>>use build-ids embedded in the binary to uniquely identify the
>>originating build. But those mechanisms work best when we're in the
>>realm of a single distribution. In particular, build-ids can be easily
>>tied to a source rpm, but only when the source rpm is part of
>>the distribution and the build-id was registered in the appropriate
>>database which maps build-ids to real package names. When we move
>>outside of the realm of a single distribution, it can be hard to
>>figure out where a given binary originates from. If we know that a
>>binary is from a given distribution, we may be able to use some
>>distro-specific mechanism to figure out this information. But those
>>mechanisms will be different for different distributions and will
>>often require network access. With this change we aim to provide a
>>mechanism that is very simple, provides "human-readable" origin
>>information without further processing, is portable across distros,
>>and works without network access.
>>
>>The directly motivating use case is display of core dumps. Right now
>>we have build-ids, but those are just opaque hexadecimal numbers that
>>are not meaningful to users. We would like to immediately list
>>versions of packages involved in the crash (including both the program
>>and any libraries it links to). It is not enough to query the rpm
>>database to do the equivalent of `rpm -qf …`: very often programs
>>crash after some packages have been upgraded and the binaries loaded
>>into memory are not the binaries that are currently present on disk,
>>or when through some mishap, the binaries on disk do not match the
>>installed rpms. A mechanism that works without rpm database lookup or
>>network access allows this information to be shown immediately in
>>`coredumpctl` listings and journal entries about the crash. This
>>includes crashes that happen in the initrd and sandboxed containers.
>>
>>A second motivating use case is when users distribute their own
>>binaries and would like to collect crash information. Build-ids are a
>>solution that is technically possible, but easy to get wrong in
>>practice: users would need to immediately record the build-id after
>>the build and store the mapping to program names, versions, and build
>>number in some database. It's much easier to be able to record
>>something during the build in the build product itself.
>>
>>A third motivating use case is the general mixing of Fedora binaries
>>with programs and libraries from different distributions, both with
>>our binaries being used as the base for foreign binaries, and the
>>other way around. Whilst most distributions provide some mechanism to
>>figure out the source build information, those mechanisms vary by
>>distribution and may not be easy to access from a "foreign" system.
>>Such mixing is expected with containers, flatpaks, snaps, Python
>>binary wheels, anaconda packages, and quite often when somebody
>>compiles a binary and puts it up on the web for other people to
>>download.
>>
>>We propose a new mechanism which is designed to be very simple but
>>extensible: a small JSON document is embedded in a section in the ELF
>>binary. This document can be easily read by a human if necessary, but
>>it is also well-defined and can be processed programmatically. For
>>example, `systemd-coredump` will immediately make use of this to
>>display package ''nevra'' information for crashes. The format is also
>>easy to generate, so it can be added to any build system, either using
>>the helpers that we provide or even reimplemented from scratch.
>>
>>For the case where we mix binaries from different distros (the third
>>motivating use case above), this approach is the most useful when this
>>system is used by all distros and even non-distro builds. The more
>>widely it is used, the more useful it becomes. The specification was
>>developed in collaboration with Debian developers, and we hope that
>>Fedora and Debian will lead the way for this to become as widely used
>>as build-ids. But even if the information is only available from some
>>distros, it is still useful, except that fallback mechanisms need to
>>be implemented.
>>
>>=== Existing system: `.note.gnu.build-id` ===
>>
>>We already have build-ids: every ELF object has a `.note.gnu.build-id`
>>note, and given a core file, we can read the build-id and look it up
>>in the rpm database (`dnf repoquery --whatprovides debuginfo(build-id)
>>= …`) to map it to a package name.
>>Build-ids are unique and compact and very generic and work as expected
>>in general. But they have some downsides:
>>* build-ids are not very informative for users. Before the build-id is
>>converted back to the appropriate package, it's completely opaque.
>>* build-ids require a working rpm database or an internet connection
>>to map to the package name.
>>
>>Three important cases:
>>* minimal containers: the rpm database is not installed in the
>>containers. The information about build-ids needs to be stored
>>externally, so package name information is not available immediately,
>>but only after offline processing. The new note doesn't depend on the
>>rpm db in any way.
>>* handling of a core from a container, where the container and host
>>have different distros
>>* self-built and external packages: unless a lot of care is taken to
>>keep access to the debuginfo packages, this information may be lost.
>>The new note is available even if the repository metadata gets lost.
>>Users can easily provide equivalent information in a format that makes
>>sense in their own environment. It should work even when rpms and debs
>>and other formats are mixed, e.g. during container image creation.
>>
>>=== New system: `.note.package` ===
>>
>>The new note is created and propagated similarly to
>>`.note.gnu.build-id`. The difference is that we inject the information
>>about package ''nevra'' from the build system.
>>
>>The implementation is very simple: `%{build_ldflags}` are extended
>>with a command to insert a custom note as a separate section in an ELF
>>object. See [https://github.com/systemd/package-notes/blob/main/hello.spec
>>hello.spec] for an example. This is done in the default macros, so all
>>packages that use the prescribed link flags will be affected.
>>
>>The note is a compact JSON string. This makes the format trivially
>>extensible (new fields can be added at will) and easy to process
>>(JSON is extremely popular and parsers are widely available).
>>Using a single field rather than a set of separated notes is more
>>space-efficient. With multiple fields the padding and alignment
>>requirements cause unnecessary overhead.
>>
>>The system was designed with cross-distro collaboration and is
>>flexible enough to identify binaries from different packaging formats
>>and build systems (rpms, debs, custom binaries).
>>
>>See https://systemd.io/COREDUMP_PACKAGE_METADATA/ for detailed
>>description of the format.
>>
>>One of the advantages of using an ELF note, as opposed to say a series
>>of extended attributes on the binary itself, is that the ELF note gets
>>automatically captured and copied into a core file by the kernel.
>>Extended attributes would have to be copied manually, which might not
>>even be possible because the binary on disk may have been removed by
>>the time the crash is analyzed.
>>
>>The overhead is about 200 bytes for each ELF object.
>>Overall, we have about 33200 files in `/usr/s?bin/` and about 36600
>>`.so` files (F35, single architecture,
>>results from `dnf repoquery -l 2>/dev/null | rg '^/usr/s?bin/' | sort
>>-u | wc -l`,
>>`dnf repoquery -l 2>/dev/null | rg '^/usr/lib64/.*\.so$' |sort -u|wc -l`).
>>If we do this for the whole distro, we get 69800 × 200 = 13 MB.
>>For a typical installation, we can expect about 300–400 kB.
>>Thus the overhead of additionally used space is negligible (also see the
>>Feedback section for more discussion).
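
As a rough check of the size estimate quoted above, here is a minimal sketch that redoes the arithmetic. The file counts and the 200-byte figure are taken from the proposal; the function names are made up for illustration, and the result is only as good as the per-object estimate.

```python
NOTE_OVERHEAD_BYTES = 200   # per-object estimate quoted in the proposal
BIN_FILES = 33200           # files in /usr/s?bin/ (F35, single architecture)
SO_FILES = 36600            # .so files under /usr/lib64/


def total_overhead_bytes(n_objects: int, per_object: int = NOTE_OVERHEAD_BYTES) -> int:
    """Extra bytes added when every ELF object carries one note."""
    return n_objects * per_object


def objects_for_overhead(total_bytes: int, per_object: int = NOTE_OVERHEAD_BYTES) -> int:
    """How many ELF objects a given total overhead corresponds to."""
    return total_bytes // per_object


if __name__ == "__main__":
    distro_objects = BIN_FILES + SO_FILES
    distro_bytes = total_overhead_bytes(distro_objects)
    print(f"{distro_objects} objects -> {distro_bytes} bytes "
          f"({distro_bytes / 2**20:.1f} MiB)")
    # The 300-400 kB figure for a typical installation implies this many objects:
    print(objects_for_overhead(300_000), objects_for_overhead(400_000))
```

With these inputs the whole-distro total is 13,960,000 bytes, which is about 13.3 MiB, and the 300–400 kB range corresponds to roughly 1500–2000 ELF objects per installation.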
>>
>>Precise measurements TBD once this is turned on and we have real
>>measurements for a larger number of builds.
>>
>>=== Examples ===
>><pre>
>>$ objdump -s -j .note.package build/libhello.so
>>
>>build/libhello.so: file format elf64-x86-64
>>
>>Contents of section .note.package:
>> 02ec 04000000 63000000 7e1afeca 46444f00 ....c...~...FDO.
>> 02fc 7b227479 7065223a 2272706d 222c226e {"type":"rpm","n
>> 030c 616d6522 3a226865 6c6c6f22 2c227665 ame":"hello","ve
>> 031c 7273696f 6e223a22 302d312e 66633335 rsion":"0-1.fc35
>> 032c 2e783836 5f363422 2c226f73 43706522 .x86_64","osCpe"
>> 033c 3a226370 653a2f6f 3a666564 6f726170 :"cpe:/o:fedorap
>> 034c 726f6a65 63743a66 65646f72 613a3333 roject:fedora:33
>> 035c 227d0000 "}..
>></pre>
>>
>><pre>
>>$ readelf --notes build/hello | grep "description data" | sed -e
>>"s/\s*description data: //g" -e "s/ //g" | xxd -p -r | jq
>>readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10de
>>readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10af
>>readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x119f
>>{
>> "type": "rpm",
>> "name": "hello",
>> "version": "0-1.fc35.x86_64",
>> "osCpe": "cpe:/o:fedoraproject:fedora:33"
>>}
>></pre>
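
The layout visible in the objdump output above (three 32-bit words, the owner string `FDO`, then the JSON payload) can be sketched in a few lines. This is only an illustration, not the code from package-notes: the function names are hypothetical, and little-endian encoding with 4-byte padding is assumed because that is what the x86-64 dump shows.

```python
import json
import struct

NOTE_OWNER = b"FDO\x00"   # owner name from the dump above, NUL-terminated
NOTE_TYPE = 0xCAFE1A7E    # note type from the dump above (bytes 7e 1a fe ca)


def _pad4(data: bytes) -> bytes:
    """Pad with NUL bytes to a multiple of 4 (assumed note alignment)."""
    return data + b"\x00" * (-len(data) % 4)


def build_note(fields: dict) -> bytes:
    """Wrap compact JSON in a note: namesz, descsz, type, owner, payload."""
    desc = json.dumps(fields, separators=(",", ":")).encode() + b"\x00"
    header = struct.pack("<III", len(NOTE_OWNER), len(desc), NOTE_TYPE)
    return header + _pad4(NOTE_OWNER) + _pad4(desc)


def parse_note(raw: bytes) -> dict:
    """Inverse of build_note: check owner and type, then decode the JSON."""
    namesz, descsz, ntype = struct.unpack_from("<III", raw, 0)
    owner = raw[12:12 + namesz]
    if owner != NOTE_OWNER or ntype != NOTE_TYPE:
        raise ValueError("not a package metadata note")
    desc_start = 12 + len(_pad4(owner))
    desc = raw[desc_start:desc_start + descsz]
    return json.loads(desc.rstrip(b"\x00"))


if __name__ == "__main__":
    note = build_note({
        "type": "rpm",
        "name": "hello",
        "version": "0-1.fc35.x86_64",
        "osCpe": "cpe:/o:fedoraproject:fedora:33",
    })
    print(note[:16].hex())
    print(parse_note(note))
```

For the `hello` example this produces the same 16-byte header as the dump (`descsz` is 0x63, i.e. 98 bytes of JSON plus a terminating NUL), which suggests the sketch matches the layout at least for this case.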
>>
>><pre>
>>$ coredumpctl info
>> PID: 44522 (fsverity)
>>...
>> Package: fsverity-utils/1.3-1
>> build-id: ac89bf7175b04d7eec7f6544a923f45be111f0be
>> Message: Process 44522 (fsverity) of user 1000 dumped core.
>>
>> Found module
>>/home/bluca/git/fsverity-utils/libfsverity.so.0 with build-id:
>>fa40fdfb79aea84167c98ca8a89add9ac4f51069
>> Metadata for module
>>/home/bluca/git/fsverity-utils/libfsverity.so.0 owned by FDO found: {
>> "packageType" : "deb",
>> "package" : "fsverity-utils",
>> "packageVersion" : "1.3-1"
>> }
>>
>> Found module linux-vdso.so.1 with build-id:
>>aba08e06103f725e26f1d7c178fb6b76a564a35d
>> Found module libpthread.so.0 with build-id:
>>e91114987a0147bd050addbd591eb8994b29f4b3
>> Found module libdl.so.2 with build-id:
>>d3583c742dd47aaa860c5ae0c0c5bdbcd2d54f61
>> Found module ld-linux-x86-64.so.2 with build-id:
>>f25dfd7b95be4ba386fd71080accae8c0732b711
>> Found module libcrypto.so.1.1 with build-id:
>>749142d5ee728a76e7cdc61fd79d2311a77405a2
>> Found module libc.so.6 with build-id:
>>18b9a9a8c523e5cfe5b5d946d605d09242f09798
>> Found module fsverity with build-id:
>>ac89bf7175b04d7eec7f6544a923f45be111f0be
>> Metadata for module fsverity owned by FDO found: {
>> "packageType" : "deb",
>> "package" : "fsverity-utils",
>> "packageVersion" : "1.3-1"
>> }
>>
>> Stack trace of thread 44522:
>> #0 0x00007fe7c8af26f4 __GI___nanosleep (libc.so.6 + 0xc66f4)
>> #1 0x00007fe7c8af262a __sleep (libc.so.6 + 0xc662a)
>> #2 0x00005608481407dd main (fsverity + 0x27dd)
>> #3 0x00007fe7c8a5009b __libc_start_main (libc.so.6 + 0x2409b)
>> #4 0x000056084814094a _start (fsverity + 0x294a)
>></pre>
>>
>>== Feedback ==
>>See [https://github.com/systemd/systemd/issues/18433 systemd issue
>>#18433] for upstream discussion and implementation proposals.
>>
>>=== Concerns about additional changes to files ===
>>
>><pre>
>>17:32:30 <Eighth_Doctor> I think zbyszek underestimates how much of a
>>problem it is to stamp every ELF binary with ''nevra'' data
>>17:32:44 <mhroncok> zbyszek: so, assuming python has ~100 ELF .so
>>files and I change one text file
>>17:33:22 <mhroncok> (ignore for the time being that the .so files
>>often changed because of toolchain updates and assume they are stable)
>></pre>
>>
>>I tested this with python3.10. So far there are 13 builds of that
>>package in F35:
>>`python3.10-3.10.0-1.fc35`,
>>`python3.10-3.10.0~a6-1.fc35`,
>>`python3.10-3.10.0~a6-2.fc35`,
>>`python3.10-3.10.0~a7-1.fc35`,
>>`python3.10-3.10.0~b1-1.fc35`,
>>`python3.10-3.10.0~b2-2.fc35`,
>>`python3.10-3.10.0~b2-3.fc35`,
>>`python3.10-3.10.0~b3-1.fc35`,
>>`python3.10-3.10.0~b4-1.fc35`,
>>`python3.10-3.10.0~b4-2.fc35`,
>>`python3.10-3.10.0~b4-3.fc35`,
>>`python3.10-3.10.0~rc1-1.fc35`,
>>`python3.10-3.10.0~rc2-1.fc35`.
>>I extracted the builds (for `.x86_64`) and made a list of all `.so`
>>files (1368 files), and calculated sha256 hashes for them. No two
>>files repeat, there are 1368 distinct hashes. So the files are
>>'''already''' different between builds and the additional proposed
>>metadata will not make a significant difference.
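
For anyone who wants to repeat the comparison described above on another package, a minimal sketch of the hashing step follows. It assumes the builds were already extracted into subdirectories of one root directory; the function names and the directory layout are hypothetical, not what was used for the numbers in the proposal.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large libraries do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_distinct_shared_objects(root: Path) -> tuple[int, int]:
    """Return (number of .so files, number of distinct sha256 hashes) under root."""
    files = [p for p in root.rglob("*.so") if p.is_file() and not p.is_symlink()]
    hashes = {sha256_of(p) for p in files}
    return len(files), len(hashes)


if __name__ == "__main__":
    import sys
    total, distinct = count_distinct_shared_objects(Path(sys.argv[1]))
    print(f"{total} .so files, {distinct} distinct hashes")
```

If the two numbers are equal, as reported for python3.10 (1368 and 1368), no `.so` file is byte-identical across builds even before any package note is added.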
>>
>>Note that this range of Python versions encompasses periods when the
>>package is under development and undergoes significant changes (alpha
>>versions), and when it's only undergoing small changes (rc versions).
>>
>>The fact that we get different files in each build is not surprising,
>>because files embed build-ids which differ between builds. But even if
>>we ignore those, binaries generally differ between builds. Even sizes
>>tend to vary between builds: there are 636 distinct `.so` file sizes,
>>i.e. on average any given size only repeats twice (presumably most
>>often for the same file). Running `diffoscope` on `.so` files from
>>different builds shows minor changes in the assembly which I did not
>>analyze further.
>>
>>If people have specific questions, for example about overhead in some
>>scenario, I'd be happy to answer them. Until now, the issues that were
>>raised were very vague, so it's impossible to answer them.
>>
>>=== Why not just use the rpm database? ===
>>
>><pre>
>>17:34:33 <dcantrell> The main reason for this appears to be that we
>>need the RPM db locally to resolve build-ids to package names. But
>>since containers wipe /var/lib/rpm, we can't do that. So the solution
>>is to put the ''nevra'' in ELF metadata?
>>17:34:39 <dcantrell> That feels like the wrong approach.
>></pre>
>>
>>First, there are legitimate reasons to strip packaging metadata from
>>images. For example, for an initrd image from rpms, I get 117 MB of
>>files (without compression), and out of this `/var/lib/rpm` is 5.9 MB,
>>and `/var/lib/dnf` is 4.2 MB. This is an overhead of 9%. This is ''not
>>much'', but still too much to keep in the image unless necessary.
>>Similar ratios will happen for containers of similar size. Reducing
>>image size by one tenth is important. There is no `rpm` or `dnf` in
>>the image, so the package database is not even usable without external
>>tools.
>>
>>As discussed on IRC
>>(https://meetbot.fedoraproject.org/teams/fesco/fesco.2021-05-11-17.01.log.html),
>>the containers ''we'' build don't wipe this metadata, but custom
>>Dockerfiles do that.
>>
>>Second, as described in the Detailed Description section above, not everybody and
>>everything uses rpm. The Fedora motto is "we make an operating system
>>and we make it easy for you to do useful stuff with it" (and yes, this
>>is an actual quote from the official docs), and this stuff involves
>>reusing our binaries in containers and custom installations and
>>whatnot, not just straightforward installations with `dnf`. And in the
>>other direction, people will build their own binaries that are not
>>packaged as rpms. But it is still important to be able to figure out
>>the exact version of a binary, especially after it crashes.
>>
>>=== Why do this in Fedora? ===
>>
>><pre>
>>17:36:49 <mhroncok> I don't understand how non-rpm distros and custom
>>built binaries are affected by our rpm-build environment :/
>></pre>
>>
>>The idea is that we inject this into our build system, and Debian
>>injects this into their build system, and so on… As mentioned, this is
>>a cross-distro effort. Also, people can use it in their custom build
>>systems if they build and distribute binaries internally. The scheme
>>would obviously be most useful if used comprehensively, but it's still
>>useful when available partially. We hope that Fedora can lead the way.
>>(This is similar to build-ids: when initially adopted, they were used
>>only by some distros, but were useful even then. Nowadays, with
>>comprehensive adoption, they are even more useful.)
>>
>>https://hpc.guix.info/blog/2021/09/whats-in-a-package/ contains a nice
>>description of a pathological case of packaging hacks and binary
>>redistribution. When trying to unravel something like this,
>>information embedded directly in the binaries would be quite useful.
>>
>>
>>== Benefit to Fedora ==
>>A simple and reliable way to gather information about package versions
>>of programs is added.
>>It enhances, instead of replacing, the existing mechanisms.
>>It is particularly useful when reporting crash dumps, but can also be
>>used for image introspection and forensics, license checks and
>>version scans on containers, etc.
>>
>>If we adopt this in Fedora, Fedora leads the way on implementing the
>>standard. Fedora binaries used in any context can be easily
>>recognized. Fedora binaries provide a better basis to build things.
>>
>>If other distros adopt this, we can introspect and report on those
>>binaries easily within the Fedora context. For example, when somebody
>>is using a container with some programs that originate in the Debian
>>ecosystem, we would be able to identify those programs without tools
>>like `apt` or `dpkg-query`. Core dump analysis executed in the Fedora
>>host can easily provide useful information about programs from foreign
>>builds.
>>
>>== Implementation in Other Distributions ==
>>=== Microsoft CBL-Mariner ===
>>[https://en.wikipedia.org/wiki/CBL-Mariner CBL-Mariner] is an
>>[https://github.com/microsoft/CBL-Mariner open source] Linux
>>distribution created by Microsoft, targeted at first-party and
>>container workloads on Azure. It is used both as a container runner
>>host and a base container image.
>>Mariner adopted the ELF stamping packaging metadata spec in
>>[https://github.com/microsoft/CBL-Mariner/blob/1.0/SPECS/mariner-rpm-macros/gen-ld-script.sh
>>version 1.0], initially to add OS metadata, and package-level metadata
>>will be added in a following release.
>>=== Debian ===
>>A package-level proof-of-concept is included in the
>>[https://github.com/systemd/package-notes/blob/main/dh_package_notes
>>package-notes] repository.
>>A [https://salsa.debian.org/bluca/debhelper/-/tree/notes_metadata
>>system-level proof-of-concept] that enables ELF stamping by default in
>>all builds implicitly will be proposed for adoption in the future.
>>
>>== Scope ==
>>* Proposal owners:
>>** create a specification (First version DONE:
>>[https://systemd.io/COREDUMP_PACKAGE_METADATA
>>COREDUMP_PACKAGE_METADATA]. We might need to make some adjustments
>>based on the deployment in Fedora, but no big changes are expected.)
>>** write a script to generate the package note (First version DONE:
>>[https://github.com/systemd/package-notes/blob/main/generate-package-notes.py
>>generate-package-notes.py])
>>** provide a patch for `redhat-rpm-config` to insert appropriate
>>compilation options
>>** extend systemd's coredumpctl to extract and display this
>>information (DONE: [https://github.com/systemd/systemd/pull/19135 PR
>>#19135], available in systemd-249)
>>** submit pull request to Packaging Guidelines
>>
>>* Other developers:
>>** possibly add support in abrt?
>>
>>* Release engineering: There should be no impact.
>>
>>* Policies and guidelines:
>>The new flags should be mentioned in Packaging Guidelines.
>>
>>* Trademark approval: N/A (not needed for this Change)
>>
>>* Alignment with Objectives:
>>It might be relevant for Minimization. Even though it increases the
>>image size a tiny bit, it makes minimized images work a bit better.
>>
>>== Upgrade/compatibility impact ==
>>No impact.
>>
>>== How To Test ==
>><pre>
>>$ bash -c 'kill -SEGV $$'
>>$ coredumpctl
>>TIME PID UID GID SIG COREFILE EXE
>> SIZE PACKAGE
>>
>>Mon 2021-03-01 14:37:22 CET 855151 1000 1000 SIGSEGV present
>>/usr/bin/bash 51.7K bash-5.1.0-2.fc34.x86_64
>></pre>
>>
>>== User Experience ==
>>`coredumpctl` should display information about package versions.
>>
>>`readelf --notes` or similar tools can be used on `.so` files and
>>compiled programs
>>to extract the JSON blurb that describes the originating package.
>>
>>== Dependencies ==
>>None.
>>
>>== Contingency Plan ==
>>
>>* Contingency mechanism: Remove the new compilation flags. Rebuild any
>>packages that were built with the new flags.
>>* Contingency deadline: Beta freeze.
>>* Blocks release? No.
>>
>>== Documentation ==
>>* https://systemd.io/COREDUMP_PACKAGE_METADATA/
>>* https://github.com/systemd/package-notes
>>
>>See also [[Changes/DebuginfodByDefault]].
Thanks for revising the change proposal and filling in more details.
After reading through it, I have some questions:
1) The proposal notes that users tend to combine built packages from
different distributions. Even in the current environment, do we care
about those use cases without also getting a reproducer for Fedora?
[...] So while these scenarios are described in the proposal, are
they something that Fedora is trying to support?