Re: F41 Change Proposal - Reproducible Package Builds (System-Wide)

On Sat, Apr 13, 2024 at 04:12:09AM -0500, Neal Gompa wrote:
> On Sat, Apr 13, 2024 at 3:59 AM Richard W.M. Jones <rjones@xxxxxxxxxx> wrote:
> >
> > On Fri, Apr 12, 2024 at 10:41:43PM +0100, Aoife Moloney wrote:
> > > [https://github.com/keszybz/add-determinism add-determinism] is a Rust
> > > program which, as its name suggests, adds determinism to files that
> > > are given as input by attempting to standardize metadata contained in
> > > binary or source files to ensure consistency and clamping to
> > > $SOURCE_DATE_EPOCH in all instances. `add-determinism` is the "Fedora
> > > version" of [https://salsa.debian.org/reproducible-builds/strip-nondeterminism
> > > strip-nondeterminism] from the Debian project. Since
> > > strip-nondeterminism is written in perl, it is undesirable for use in
> > > Fedora, as we don't want to pull perl in the buildroot for every
> > > package.

The proposal explicitly states that we don't want Perl in all buildroots.
It doesn't say so explicitly, but we also don't want Python there.
In the past we made an effort to remove gcc and make from the buildroot
[https://fedoraproject.org/wiki/Changes/Remove_make_from_BuildRoot,
https://fedoraproject.org/wiki/Changes/Remove_GCC_from_BuildRoot],
and Python would be a fairly big addition.

If we don't want to pull in an additional language framework, the
options are either a compiled language or a scripting language that is
already installed anyway, i.e. bash or awk. Considering that we want
to do multiprocessing and/or multithreading to make things quick, a
compiled language seems better. And among the compiled languages, I
think Rust should be the default choice nowadays.

The resulting binary is a single file of 2.6 MB, which is more than
it'd be if written in C, but still reasonably small.

> > https://github.com/keszybz/add-determinism looks like a package with a
> > lot of Rust dependencies just to make some small changes to four
> > different file types. Isn't there an easier way to do this?  I would
> > have thought a Python library would be more suitable as the most
> > complicated bit is the *.pyc change which is done using Python code.

The program currently has just four "handlers", but I expect that we'll
need to add some more. Those four address issues that were apparent
after a "mini mass rebuild" of ~2k packages. That rebuild covered only
a subset of packages, it's possible that the most obvious issues
masked other ones, and we didn't analyze all failures in detail.
The program is structured so that it should be simple to add
additional handlers.
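
To illustrate what "simple to add additional handlers" can look like in
Rust, here is a minimal sketch. It is not taken from add-determinism:
the `Handler` trait, the `ToyStamp` handler, the `.stamp` file format
and `run_handlers` are all hypothetical names invented for this example.
The idea shown is only that each file type gets one small type
implementing a shared trait, and the driver picks the first handler
that claims a file.

```rust
use std::path::Path;

/// Hypothetical interface, NOT add-determinism's real API.
trait Handler {
    /// Short name used in log messages.
    fn name(&self) -> &'static str;
    /// Does this handler want to look at the file?
    fn matches(&self, path: &Path) -> bool;
    /// Return `Some(new_bytes)` if the file had to be changed, `None` otherwise.
    fn process(&self, data: &[u8], epoch: u32) -> Option<Vec<u8>>;
}

/// Made-up format: the first four bytes are a little-endian timestamp.
struct ToyStamp;

impl Handler for ToyStamp {
    fn name(&self) -> &'static str {
        "toy-stamp"
    }

    fn matches(&self, path: &Path) -> bool {
        path.extension().map_or(false, |e| e == "stamp")
    }

    fn process(&self, data: &[u8], epoch: u32) -> Option<Vec<u8>> {
        let h = data.get(..4)?; // too short: nothing to do
        let stamp = u32::from_le_bytes([h[0], h[1], h[2], h[3]]);
        if stamp <= epoch {
            return None; // already at or below the clamp value
        }
        let mut out = data.to_vec();
        out[..4].copy_from_slice(&epoch.to_le_bytes());
        Some(out)
    }
}

/// Apply the first handler that matches `path`.
fn run_handlers(
    handlers: &[Box<dyn Handler>],
    path: &Path,
    data: &[u8],
    epoch: u32,
) -> Option<Vec<u8>> {
    handlers
        .iter()
        .find(|h| h.matches(path))
        .and_then(|h| h.process(data, epoch))
}
```

With this layout, supporting another file type means adding one more
`impl Handler` and pushing it onto the list passed to `run_handlers`.
Here `epoch` stands for the parsed value of $SOURCE_DATE_EPOCH.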

Now I'll try to channel Fabio V.: all the crates that are dependencies
are commonly used, i.e. they're in the common set of crates that are
used by modern Rust code and we need them to be packaged anyway. The
packaging effort for add-determinism was minuscule.
(I made a mistake by packaging a snapshot in a way that the guidelines
disallow, but when doing it correctly, it's really just a matter of
running rust2rpm and filling in some blanks.)

> Considering Debian's version is in Perl, yes, it's quite reasonable to
> consider that. It could have been written in Python, C++, or even
> shell (if you truly hated yourself). I'm not a big fan of Rust for
> this either. But unless someone offers to make another version that is
> more appealing, this is what we have.

Yes. But actually I think Rust is the optimal choice here. Writing
this in Python would possibly be slightly nicer, but we don't want
to pull the interpreter and packages into the buildroot. Python
also has the problem (challenge?) that it needs to be bootstrapped
once per year. The fewer packages are involved in the bootstrap, the
easier it is. And if the brp was written in Python, we'd need to
deal with that, and it would probably increase the number of builds
which are done without the cleanup. Having this as an independent
binary avoids some of the issues with bootstrap.

Please note that add-determinism has:
- configurable logging
- unit tests
- integration tests
- multiprocess support
- atomic file replacements (for files without hardlinks)
- rewriting of files in place (for files with hardlinks)

Doing this in bash would be doable, but not worth it, IMO.

Zbyszek