On Wed, Apr 14, 2021 at 11:30 AM Zbigniew Jędrzejewski-Szmek <zbyszek@xxxxxxxxx> wrote:
>
> On Tue, Apr 13, 2021 at 12:44:42AM +0000, Matthew Almond via devel wrote:
> > On Mon, 2021-04-12 at 23:10 +0200, Lennart Poettering wrote:
> > > Or in other words: packaging metadata are sources too. If they change
> > > (and a version bump constitutes a change) the output might change,
> > > and that's expected. What's key really is that the only things that
> > > can affect generated output are the build/packaging environment and
> > > the sources, but not parameters outside of that, such as the actual
> > > wallclock.
> >
> > The main way that packaging "interferes" with the source is when
> > patches are applied - the original timestamp of a tarball (for example)
> > isn't complete enough to use for $SOURCE_DATE_EPOCH. That's fair.
> >
> > > > My concern centers around the Copy on Write (CoW) use case - when
> > > > packages are updated, some files change, and some may stay the
> > > > same. Where they are the same, we can save I/O and possibly
> > > > download time long term.
> > >
> > > Reproducible builds the way they are defined do not address such
> > > file-level CoW optimization so much. They do address CoW optimization
> > > on a package level much more however: i.e. the same package build
> > > will have the same files in them, no matter what.
> > >
> > > Or to say this differently: if you want reproducible to work the way
> > > you think it should work, you'd have to start by convincing the
> > > upstream maintainers to kill $SOURCE_DATE_EPOCH and similar concepts,
> > > but good luck with that.
> >
> > I think we should be careful to de-couple these two things. Just
> > because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
> > proof that all binaries will.
> > I remain concerned that this proposal forces the issue: every single
> > version of every single ELF binary *must* be different, even if it
> > really didn't change. The pattern I see is more automation and faster,
> > smaller release cycles, and this forces downloads and writes of
> > binaries that really didn't change their code.
>
> Yeah, that's definitely something to think about.
>
> The proposed change indeed "forces the issue". This could be a big drawback
> or not, depending on how often identical binary builds happen for different
> package versions. If it turns out that the answer is "only rarely", then
> I wouldn't consider it too important. If the answer is "quite often", we
> would lose a chance for a nice optimization.
>
> I wanted to investigate this, but unfortunately, it's hard to check
> right now, because all builds are non-reproducible (in the sense of
> reproducible-builds.org), because we include the mtime of build
> products in rpm metadata, so pretty much all binary rpms are
> different. And in general other things make builds non-reproducible,
> and it's not obvious if *this* change makes things worse. I didn't
> want to dig into individual rpms to compare binaries. I *think* most
> packages are not actually rebuilt that often without changes…, but real
> data is definitely needed.
>

We could start clamping times by default by adding the following to
redhat-rpm-config:

    %clamp_mtime_to_source_date_epoch 1

> > I have just thought of an alternative proposition: for ELF objects (and
> > ELF objects only), rpm could automatically and systematically record
> > the metadata in an xattr. This would work on images without rpmdb,
> > work on most filesystem types, and be serialized in archives. Most
> > interestingly this could be implemented as an rpm plugin, and would
> > work retroactively for packages that were built before this proposal.
> > It could also be made to work for other packaging systems, and the
> > tooling that reads it wouldn't need to know the original packaging
> > system.
>
> Unfortunately this doesn't work for two important cases:
> - when a binary or shared library has been replaced on disk. E.g.
>   it is fairly common for packages to crash on upgrade, and the crash
>   could be in the _old_ code. When the metadata is loaded in a section,
>   we get it all nice and dandy in the coredump. If it's in an xattr,
>   we don't, or even worse, we get outdated info.
> - it doesn't work for non-rpm stuff.
>
> Zbyszek
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx
> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure

--
真実はいつも一つ!/ Always, there's only one truth!
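
P.S. For anyone unsure what "clamping times" means in practice, here is a
rough sketch of the idea in Python. It is not rpm's implementation, and
the helper names (source_date_epoch, clamp_mtimes) are made up for
illustration. The assumption, as I understand the macro, is that any file
whose mtime is newer than $SOURCE_DATE_EPOCH gets its mtime set to
$SOURCE_DATE_EPOCH, while older files are left alone.

```python
import os
import stat


def source_date_epoch(default=None):
    """Return $SOURCE_DATE_EPOCH as an int, or `default` if it is unset."""
    value = os.environ.get("SOURCE_DATE_EPOCH")
    return int(value) if value is not None else default


def clamp_mtimes(root, epoch):
    """Clamp the mtime of regular files under `root` to at most `epoch`.

    Hypothetical helper: files already older than `epoch` are not touched,
    so only timestamps produced during the build are rewritten.
    Returns the list of paths whose mtime was changed.
    """
    clamped = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and so on
            if st.st_mtime > epoch:
                # Keep the access time, replace only the modification time.
                os.utime(path, (st.st_atime, epoch))
                clamped.append(path)
    return clamped
```

With something like this applied to the build root, the wall-clock time
of the build no longer leaks into file mtimes; only the sources and the
packaging metadata that determine $SOURCE_DATE_EPOCH do.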