On Sat, Aug 01, 2020 at 02:03:40PM -0600, Jeff Law wrote:
> On Sat, 2020-08-01 at 12:12 +0200, Kevin Kofler wrote:
> > Hi,
> >
> > seeing the amount of fallout from LTO, I really think that this feature
> > ought to be dropped from F33, and evaluated carefully for F34 (i.e., can it
> > be done without breaking the build of or miscompiling a large part of the
> > distribution, once the bugs such as the ld bug discussed in this thread are
> > fixed, or is it just unsafe to enable by default to begin with?). I.e.,
> > revert it for F33 for sure, then decide whether to retry it for F34 or can
> > it permanently.
> Most of the fallout has been Nick pushing through binutils builds that are
> broken. Seriously, there's been at least 4 builds pushed through that kept
> bringing back the *same* problem.
>
> And just to be clear, this has been 6+ months of behind the scenes work to find
> and identify issues, fix broken packages, put global mitigations of broken crap
> in place, opt-out packages that do things that are fundamentally
> incompatible with LTO, etc. In fact it was that behind-the-scenes work that
> pushed this feature from F32 to F33 as it just wasn't ready to go in F32.
>
> I think the chances of a serious mis-compilation of large parts of the
> distribution are small. The one mis-compilation we know about was a latent
> linker bug that just happened to be triggered by LTO, and for that particular
> bug we know how to identify any packages that might have been broken.
>
> Frankly, there's been more fallout from infrastructure breakage and cmake issues
> than anything. I went through the first ~1000 failures proactively looking for
> things that were potentially LTO related and fixing the half-dozen or so I
> found, but by far the s390 infrastructure and cmake changes have caused more
> failures than anything.
>
> As has always been the case, I'm here to address any problems that arise and use
> my 30 years of experience with GCC development as well as distribution mass
> rebuilds to make informed decisions about the best course of action for any
> particular problem.

Yeah, the s390x failures were annoying.

I have several ideas to make things more robust that hopefully we can do
before the next mass rebuild:

* Move the cache host from a z/vm instance to a kvm one.

* We have the kvm ones oversubscribed on cpus, so I'd like to drop all of
  them from 4 cpus to 3.

* We might play with the weight on them so koji doesn't run as many jobs at
  a time as it does now.

* Make sure ci/koji-simple-ci/koschei isn't doing any long running builds
  when the mass rebuild starts. A gcc or libreoffice build can take up a
  builder for a long time.

* Run the mass rebuild with --fail-fast so if something fails on some other
  arch, it never even needs to run on s390x.

Anyhow, the mass rebuild is over and tagged in. Rawhide compose is running
and should hopefully finish later today.

The second pass took failures from 4162 to 2833, so that helped a lot:

https://kojipkgs.fedoraproject.org/mass-rebuild/f33-failures.html

kevin