Re: Rethinking bug fixing in stable branches (was: 15.2.0 is pushed...)

Lenz Grimmer <lgrimmer@xxxxxxxx> · Tue, 24 Mar 2020 14:47:00 +0100

Hi Sage, all,

On 2020-03-24 02:11, Sage Weil wrote:

> 15.2.0 is pushed to download.ceph.com.  I merged the last 2 doc PRs 
> targeted for octopus and merged that back into master.  Your PR targets
> master, so that'll need a cherry-pick/backport to octopus branch after it 
> merges.

Now that the octopus branch is fully merged into master again: would
this be a good time to rethink our approach of applying bug fixes to
stable versions?

I'm aware that at this point we won't be able to fix the process for
older stable releases like nautilus or mimic, but maybe we could
consider introducing a different approach starting with octopus, to
slowly establish a new practice?

As a recap: if you currently need to fix a bug in a stable release
branch, you first need to create a pull request against the master
branch that fixes the issue. Once that PR has been merged, the fix is
eventually applied to the desired stable branch as a "backport", using
"git cherry-pick -x". This work is usually done by members of the
"backporting team" (but not necessarily so).
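
For illustration, the current flow can be sketched in a throwaway
repository. This is only a minimal sketch: the branch names follow this
thread, while the file names, commit messages and identity settings are
made up for the example.

```shell
#!/bin/sh
# Sketch of the current "fix on master, backport later" flow.
# Everything here is hypothetical except the branch names.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name used below

echo "base" > file.txt
git add file.txt
git commit -q -m "initial content"
git branch octopus                        # stable branch forks from master here

# master keeps moving after the fork.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature: hypothetical master-only work"

# 1. The fix is developed, reviewed and merged on master first.
echo "fix" >> file.txt
git commit -q -am "fix: hypothetical bug"
fix_sha=$(git rev-parse HEAD)

# 2. Later, the fix is backported to the stable branch.
git checkout -q octopus
git cherry-pick -x "$fix_sha" >/dev/null
backport_sha=$(git rev-parse HEAD)

# The backport is a new commit; "-x" appends a reference to the original.
git log -1 --format=%B
```

The last command prints the backport's commit message, which ends with
a "(cherry picked from commit ...)" line pointing at the master commit.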

In my view, this approach has several disadvantages:

* Bugs are usually not fixed in the actual branch/release that a bug
  has been reported against, but rather on top of the current
  development branch (master).
* It can cause long delays in getting critical bugs fixed. Testing
  and review on master take time. Once the master PR has been merged,
  there is a further delay before the backport is picked up, and the
  backport then has to wait again for its own review and testing.
  From a downstream perspective, this is a cumbersome process: vendors
  that follow an "upstream first" policy face long turnaround times
  for delivering fixes to their customers.
* Performing backports from the development branch into the stable
  branch comes with a much higher risk of introducing regressions
  than fixing the bug on the stable branch directly. A fix that was
  primarily developed and tested on master may still apply cleanly
  using "git cherry-pick". But the further the code bases diverge, the
  more likely it is that the stable branch lacks related changes from
  other PRs that the fix needs to be fully functional. This has
  already happened several times and isn't always caught by our tests
  right away.
* If a backport does not apply cleanly, it usually requires attention
  from the original developer, either to help resolve the conflicts
  or to take over the backport work.
* It generates a new changeset in the git history. Sure, by using the
  "-x" flag we preserve a reference to the original changeset, but we
  still apply the same change twice, working against git's built-in
  mechanisms for keeping track of changes and its ability to merge
  them automatically. Over time, the stable branch diverges more and
  more.
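
The last point can be made visible in a scratch repository. The sketch
below is only an illustration under assumed names (file, commit
messages and identity are hypothetical): after a cherry-pick, the
original fix is not part of the stable branch's history, and git can
only relate the two commits by comparing their patches.

```shell
#!/bin/sh
# Sketch: a cherry-picked fix is a separate commit with no ancestry link.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name used below
echo "base" > file.txt
git add file.txt
git commit -q -m "initial content"
git branch octopus

# Fix lands on master, then gets backported with cherry-pick -x.
echo "fix" >> file.txt
git commit -q -am "fix: hypothetical bug"
fix_sha=$(git rev-parse HEAD)
git checkout -q octopus
git cherry-pick -x "$fix_sha" >/dev/null

# Ancestry check: is the original fix commit reachable from octopus?
if git merge-base --is-ancestor "$fix_sha" octopus; then
    in_history=yes
else
    in_history=no
fi
echo "original fix reachable from octopus: $in_history"

# Patch comparison: "git cherry" prefixes the backport with "-" because
# a commit with an equivalent patch exists on master.
cherry_out=$(git cherry master octopus)
echo "$cherry_out"
```

The ancestry check reports "no", while "git cherry" still recognizes
the backport as a duplicate of the master commit. That equivalence is
derived from the patch content, not from the commit graph.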

How can these challenges be addressed?

Starting with Octopus and going forward, I would like to propose the
following approach for submitting and merging bug fixes:

* For bugs that affect the octopus branch, the initial PR that fixes
  them targets the octopus branch. The developer creates and tests the
  fix on a local copy of the octopus branch and, when ready, submits
  the PR against octopus first.
* Once the fix has been reviewed/tested and has been merged into the
  octopus branch, the developer performs a local merge of that fix into
  the master branch (or rather a local working branch of it), using the
  canonical "git merge" process. If a fix needs further modifications
  or amendments, they could be applied via follow-up commits on top of
  that merge changeset. Once the fix has been tested locally, a PR of
  that merge against the master branch is submitted, going through the
  usual testing and review process.
* In case this particular fix still has to be applied to older stable
  branches, the usual backporting process would kick in at this
  stage.
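
The first two steps of this proposal could look roughly like the
following in a scratch repository. This is a sketch under assumptions,
not a worked-out procedure: the working branch name
"wip-merge-octopus-fix", the file names and the commit messages are
all hypothetical.

```shell
#!/bin/sh
# Sketch of the proposed "fix on octopus, merge forward to master" flow.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name used below
echo "base" > file.txt
git add file.txt
git commit -q -m "initial content"
git branch octopus

# master keeps moving after the stable branch forks.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature: hypothetical master-only work"

# 1. The fix is developed and merged on octopus first.
git checkout -q octopus
echo "fix" >> file.txt
git commit -q -am "fix: hypothetical bug"
fix_sha=$(git rev-parse HEAD)

# 2. The fix is merged forward on a local working branch cut from master;
#    this branch would then be submitted as a PR against master.
git checkout -q -b wip-merge-octopus-fix master
git merge -q -m "merge octopus fix into master" octopus

# The original fix commit is now part of the history, not a copy of it.
if git merge-base --is-ancestor "$fix_sha" HEAD; then
    merged=yes
else
    merged=no
fi
only_on_stable=$(git rev-list --count HEAD..octopus)
echo "fix in history: $merged, commits only on octopus: $only_on_stable"
git log --oneline --graph
```

Note that "git merge octopus" merges the whole tip of the stable
branch, so the sketch assumes that everything on octopus is meant to
reach master.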

Benefits of this approach:

* First and foremost, a fix is developed and tested on the actual
  branch that is currently used by the community, by the developer who
  is most familiar with the component/code base.
* Turnaround times for bug fixes on stable releases can be drastically
  improved, making downstream consumers happy.
* Propagating fixes is done using git's native merge methods, avoiding
  the duplication of changesets that perform the same code changes.
* The deviation of the code base is limited to changes on the master
  branch; there are no additional changesets that are only contained
  in the stable branch.

I know this is completely against our current process and is a drastic
change in how things have been done around here for ages. But I'd like
to get your thoughts and concerns on this approach. Do you agree with
my reasoning? Where do you see obstacles that make this impractical?
Would it help if I visualized my proposal?

Looking forward to your feedback,

Lenz

-- 
SUSE Software Solutions Germany GmbH - Maxfeldstr. 5 - 90409 Nuernberg
GF: Felix Imendörffer, HRB 36809 (AG Nürnberg)
