On Mon, May 15, 2023 at 10:40:48AM +0200, Thorsten Leemhuis wrote:
> This basically rewrites the 'Prioritize work on fixing regressions'
> section of Documentation/process/handling-regressions.rst for various
> reasons. Among them: some things were too demanding, some didn't align
> well with the usual workflows, and some apparently were not clear enough
> -- and of course a few things were missing that would be good to have in
> there.
>
> Linus for example recently stated that regressions introduced during the
> past year should be handled similarly to regressions from the current
> cycle, if it's a clear fix with no semantic subtlety. His exact
> wording[1] didn't fit well into the text structure, but the author tried
> to stick close to the apparent intention.
>
> It was a noble goal from the original author to state "[prevent
> situations that might force users to] continue running an outdated and
> thus potentially insecure kernel version for more than two weeks after a
> regression's culprit was identified"; this directly led to the goal "fix
> regression in mainline within one week, if the issue made it into a
> stable/longterm kernel", because the stable team needs time to pick up
> and prepare a new release. But apparently all that was a bit too
> demanding.
>
> That "one week" target for example doesn't align well with the usual
> habits of the subsystem maintainers, who normally send their fixes to
> Linus once a week; and it doesn't align too well with stable/longterm
> releases either, which often enter a -rc phase on Mondays or Tuesdays
> and then are released two to three days later. And asking developers to
> create, review, and mainline fixes within one week might be too much to
> ask for in general. Hence tone the general goal down to three weeks and
> use an approach that better aligns with the usual merging and release
> habits.
>
> While at it, also make the rules of thumb a bit easier to follow by
> grouping them by topic (e.g. generic things, timing, procedures, ...).
>
> Also add text for a few cases where recent discussions showed they need
> covering. Among them are multiple points that better explain the
> relations to stable and longterm kernels and the team that manages them;
> they and the group separators are the primary reason why this whole
> section sadly grew somewhat in the rewrite.
>
> The group about those relations led to one addition the author came up
> with without any precedent from Linus: the text now tells developers to
> add a stable tag for any regression that made it into a proper mainline
> release during the past 12 months. This is meant to ensure the stable
> team will definitely notice any fixes for recent regressions. That
> includes those introduced shortly before a new mainline release and
> found right after it; without such a rule the stable team might miss the
> fix, which then would only reach users after weeks or months with later
> releases.
>
> Note, the aspect "Do not consider regressions from the current cycle as
> something that can wait till the cycle's end [...]" might look like an
> addition, but kinda was in the old text as well -- but only
> indirectly. That apparently was too subtle, as many developers seem to
> assume waiting till the end of the cycle is fine (even for build
> fixes).
>
> In practice this was especially problematic when a cause of a regression
> made it into a proper release (either directly or through a backport). A
> revert performed by Linus shortly before the 6.3 release illustrated
> that[2], as the developer of the culprit had been willing to revert the
> culprit about three weeks earlier already -- but didn't do so when a fix
> came into sight and a maintainer suggested it can wait. Due to that the
> issue in the end plagued users of 6.2.y at least two weeks longer than
> necessary, as the fix in the end didn't become ready in time. This issue
> in fact could have been resolved one or two additional weeks earlier, if
> the developer had reverted the culprit shortly after it had been
> identified (which even the old version of the text suggests doing in
> such cases).
>
> [1] https://lore.kernel.org/all/CAHk-=wis_qQy4oDNynNKi5b7Qhosmxtoj1jxo5wmB6SRUwQUBQ@xxxxxxxxxxxxxx/
>
> [2] https://lore.kernel.org/all/CAHk-=wgD98pmSK3ZyHk_d9kZ2bhgN6DuNZMAJaV0WTtbkf=RDw@xxxxxxxxxxxxxx/
>
> CC: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> CC: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
> CC: Lukas Bulwahn <lukas.bulwahn@xxxxxxxxx>
> Signed-off-by: Thorsten Leemhuis <linux@xxxxxxxxxxxxx>

Acked-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>