On Tue, Mar 8, 2022 at 7:40 PM Alexander Sosedkin <asosedkin@xxxxxxxxxx> wrote:
>
> Hello, community, I need your wisdom for planning a disruptive change.
>
> Fedora 28 had https://fedoraproject.org/wiki/Changes/StrongCryptoSettings
> Fedora 33 had https://fedoraproject.org/wiki/Changes/StrongCryptoSettings2
> I believe we should start planning
> for the next cryptographic defaults tightening.
> And next time it's gonna be even more disruptive because of SHA-1 (again).
>
> SHA-1 is a hash function from 1995,
> which collision resistance is no longer to be relied upon for security [1].
> At the same time, it's not like software has successfully migrated off it,
> not even close.
> It's not a question of "if" the world should migrate from it,
> sooner or later we must part ways with it.
> (Technically, some acute energy crisis or a collapse of civilization
> forever raising the costs of computations thousandfold would also do,
> but let's agree that migrating to a more modern hash is the way =)
>
> We've been disabling it in TLS, but its usage is much wider than TLS.
> The next agonizing step is to restrict its usage for signatures
> on the cryptographic libraries level, with openssl being the scariest one.
>
> Good news is, RHEL-9 is gonna lead the way
> and thus will take a lot of the hits first.
> Fedora doesn't have to pioneer it.
> Bad news is, Fedora has to follow suit someday anyway,
> and this brings me to how does one land such a change.
>
> ---
>
> Fedora is a large distribution with short release cycles, and
> the only realistic way to weed out its reliance on SHA-1 signatures
> from all of its numerous dark corners is to break them.
> Make creation and verification fail in default configuration.
> But it's unreasonable to just wait for, say, Fedora 37 branch-off
> and break it in Rawhide for Fedora 38.
> The fallout will just be too big.
>
> Maintainers need time to get bugs, look into them, think,
> analyze, react and test --- and that's just if it fails correctly!
> Unfortunately, it's not just that the error paths are as dusty as they get
> because the program counter has never set foot on them before.
> Some maintainers might even find that
> picking a different hash function renders their code non-interoperable,
> or even that protocols they implement have SHA-1 hardcoded in the spec.
> Or that everything is ready, but real world deployments need another decade.
> Or that on-disk formats are just hard to change and migrate.
> Took git years to migrate from SHA-1, and some others haven't even started.
> There are gonna be investigations, planning, exceptions, upstream changes,
> opt-out mechanisms, arguing, compromises, waiting out, all kinds of things.
> It's gonna be big. Too big for a single release cycle.
>
> ---
>
> But how does one land something and give the distribution
> the extra cycles needed to react? That's not really clear to me.
>
> An obvious thing is to announce it in one cycle and land in another one.
> The downsides are well-documented
> in "The Hitchhiker's Guide to the Galaxy":
> announcements are one weak measure, and then it's too late.
>
> A second scheme I can come up with is a "jump scare".
> Break the functionality in Fedora 37 Rawhide,
> make most of the affected people realize the depth of the problem,
> then unbreak it. Break again for Fedora 38 and never fix.
>
> This could also be extended into "let one stable release slide'.
> Break in 37 Rawhide, unbreak on branched off 37,
> but never in Rawhide.
>
> But these are all rather... crude?
> Sure there should be better ways,
> preferably something explored before.
> I'm all for pulling this tooth out smoothly,
> but I need hints on how to do it.
> I hope that together we can devise a better plan than these.
>
> So, how does one land a change that's bigger than a release cycle?
>
> [1] https://eprint.iacr.org/2020/014

"You know these lights in the theaters that go out gradually?
When the guy ve-ery slo-o-owly pulls the plug out?"
- a joke from my childhood.

Hello,

it's been quiet for a while, and I've been busy but kept thinking
about all the useful feedback you folks gave me.
Not that it made me flesh out a perfect plan,
but hopefully at least a less terrible one.

Regarding smudging the change in time,
how does the following three-phaser sound?

Phase 1 ("Wake-up Call"):

In Fedora 37, disable SHA-1 signature verification/creation
in the FUTURE policy, i.e. opt-in only.
Come up with some logging solution;
I'd prefer something non-invasive like eBPF USDT probes [2],
but maybe even stderr could work, you've been moderately convincing.
(The FUTURE change is *maybe* doable in F36, but the logging is not.)
Announce it as a system-wide change anyway for visibility,
call for Test Days to report which apps/workflows
rely on SHA-1 signatures, either from the logs
or from opting into blocking operations
and seeing what starts failing hard.
That'd have to be very actively called for to make an impact,
an impact that'd mostly be just maintainers
thinking about what they will do in...

Phase 2 ("Jump Scare"):

As soon as the f37 branch-off happens,
disable SHA-1 signature verification in DEFAULT in *38 rawhide*.
Cue an influx of bug reports
because things get broken for all testers
and not just the ones who opt in.
I anticipate this to be the most eye-opening step
even if we test a lot in the previous phase,
so to smooth it out more
we then *revert* the change in 38 before the release,
so the released Fedora 38 behaves just like 37
and whatever wasn't sorted out in time gets an extra cycle.
A second Fedora change should be filed for visibility,
but clearly stating that this will not affect f38 as released.

Phase 3 ("Return of the Panik"):

And then Fedora 39 comes, where the revert hasn't happened.
It goes through the whole release cycle,
but this time the change goes through and reaches stable.
Again, a system-wide change, a third one for the same thing.

With the 37-38-39 numbers,
that'd mean the change reaching the users in autumn 2023,
with lead times of:
~ 3.5 cycles for the most proactive developers to see this thread and panic
~ 3 cycles for the testers to proactively report bugs (logging/opting in)
~ 2 cycles to address everything else coming from rawhide testing
before it reaches stable, by either
switching to some other algorithm,
making the users explicitly opt into trusting SHA-1 signatures somehow,
or, in the most high-profile cases,
having a widely publicised exception (and some plan for the future).

Questions:

* Do you find this smudging reasonable?
* Should the usual tightening of the other, less controversial algorithms
  follow the same smudging/reverting plan
  since we're going through all that hassle anyway?
* Does the 37-38-39 timeframe feel right?
* Do I need to first run this contraption of a plan by FESCo
  or some other smart folks?
* Is there a better signalling mechanism than filing 3 system-wide changes?
* What'd be the right mechanism for others to take the wheel
  if everything goes sideways
  and the need arises to revise the plan mid-execution?

Other kinds of input are, of course, also appreciated.
Even calls that offer no actual solutions
and just ask to magically attain the mutually exclusive goals
of offering secure defaults while not breaking insecure workflows
might serve as a useful mood check.

I know it ain't the best plan. Let's figure out the right thing to do.
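
P.S. To make the three phases above easier to argue about, here is a rough
sketch of the proposed schedule as a tiny lookup. It only restates the plan:
the function name, its parameters and the "released" flag are hypothetical
and made up for illustration, nothing here is an actual crypto-policies
interface, and treating releases before 37 as untouched is an assumption.

```python
# Hypothetical model of the proposed 37-38-39 rollout; not a real API.

def blocked_by_plan(release: int, released: bool, policy: str) -> bool:
    """Return True if the plan blocks SHA-1 signatures for this combination.

    release  -- Fedora release number (e.g. 38)
    released -- True for the final release, False for its Rawhide stage
    policy   -- "DEFAULT" or "FUTURE" (the only policies the plan mentions)
    """
    if policy not in ("DEFAULT", "FUTURE"):
        raise ValueError(f"policy not covered by the plan: {policy!r}")
    if release < 37:
        # Assumption: the plan changes nothing before Fedora 37.
        return False
    if policy == "FUTURE":
        # Phase 1: opt-in blocking via FUTURE from Fedora 37 onwards.
        return True
    if release == 37:
        # Phase 1: DEFAULT is untouched in Fedora 37.
        return False
    if release == 38:
        # Phase 2: blocked in Rawhide, reverted before the release.
        return not released
    # Phase 3: Fedora 39 and later, no revert.
    return True


if __name__ == "__main__":
    for rel in (37, 38, 39):
        for is_released in (False, True):
            stage = "released" if is_released else "rawhide"
            state = blocked_by_plan(rel, is_released, "DEFAULT")
            print(f"F{rel} {stage:8} DEFAULT blocked: {state}")
```

The only surprising row is Fedora 38: blocked while in Rawhide, allowed
again once released. That row is the whole "jump scare".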
[2] https://www.youtube.com/watch?v=CLlxl7OCPfs