Re: avoiding incomplete backports

Nathan Cutler <ncutler@xxxxxxx> · Thu, 26 Jul 2018 05:59:51 +0200

When I first got involved in backporting, IIRC we had a "no feature 
backports unless expressly approved by lead" policy.

Over time this seems to have morphed towards the opposite extreme (no 
holds barred, backport whatever you want).

Obviously, that policy was untenable in practice, but maybe it could be 
tweaked into something viable?

I'm not sure if the "feature/bugfix" dichotomy always works. Some 
features are not high-risk, while some bugfixes *are*. Still, by and 
large, it's the feature backports that bite us, so I agree with Sage 
that these should get closer attention from the start.

One thing to keep in mind throughout all of this is that we are 
backporting from master to the two most recent stable releases - let's 
call them "S" and "S-1". For example, today, mimic is "S" and luminous 
is "S-1".

S-1 is arguably the more important of the two, since it will always have 
a larger installed base. Changes to master are often motivated by a 
desire to get something into S-1.

Now, S-1 is the older, more mature, and hopefully more stable of the two 
Thus, the "bar" for accepting feature backports into S-1 should be 
higher than for S.

Yet, the current backporting workflow allows just about anyone to:

1. open a tracker ticket, mark it "Backport: mimic, luminous"
2. open a master PR
3. once the master PR is merged, open the backport PRs
4. those get merged in due course, unless they break integration test suites

In other words, nobody is *required* to say "wait, this is going into 
S-1, needs closer examination, is it really needed, what are the risks, 
etc."

How could the process be tweaked to address this? It's hard because all 
the solutions I can think of involve making it more difficult to get 
stuff into S-1.

One idea: instead of taking a string of major release names, the 
"Backport" field in Redmine could be turned into a binary on/off switch 
meaning "Backport to S". Backports to "S-1" could only be undertaken 
after the feature/bugfix reached S. (Instead of cherry-picking from 
master, for S-1 we would cherry-pick from S.)

Nathan

On 07/22/2018 04:03 PM, Sage Weil wrote:
Backports are always hard, but I think there are two main categories of
reasons why they can break things:

  (1) is the pattern we saw with 12.2.6, where a feature is introduced with
a bug, a bugfix follows a bit later when it is discovered and fixed, but
the feature backport doesn't realize there is a later fix.
  (2) is when a feature is backported completely but there is some
functional second order interaction with the rest of the system that
prevents it from working correctly.

The latter scenario will always require either a developer who is paying
close attention or a lot of testing to prevent, which is why we should
always be very mindful with backports, especially non-bugfix ones, and
especially ones in complicated code.  The first case, though, is IMO a
failure in tooling and process.

A few possible solutions come to mind:

1- Always reference the commit that introduced the bug/regression in the
bugfix commit.  Then, when backporting, always search git for the sha1 of
the patch being cherry-pick -x'd.  This should turn up the follow-on fix.
(Perhaps some tooling could streamline this?)

2- Always mention the follow-on fix as a comment on the pull request that
originally introduced the new code/feature.  This requires checking the
original pull request when doing the backport.

3- Always open a tracker ticket, even for recent, just-introduced bugfixes
in master.  Reference that... somewhere?

I think the core problem is that when doing feature backports there is not
always a (feature) ticket or other single reference to look at that
captures everything about that feature.  Even #2 and using the PR doesn't
seem like a complete solution because often there are multiple pull
requests to introduce a feature.

It seems like there are two main options: lean on the git history, or lean
on external feature tickets.

I'm worried about the latter because we frequently have a series of
incremental pieces for a single feature, or related features that build on
one another, so there isn't always a 1:1 mapping between the PRs and a
(hypothetic) feature ticket.  We also don't always maintain feature
tickets, so this would be a net increase of process overhead.

This makes me lean toward leaning on git, and being more pedantic about
ensure any bug fix references the git commit where the issue was
reproduced.  The kernel uses a simple

  Fixes: <sha1>

tag in commits.  With that, I think, the missing piece is some
tooling/help to make sure we pay attention to these tags.  I see two
scenarios:

  - You are backport foo and need to notice commit bar that has a Fixes:
foo line.
  - You already backported foo.  Later you commit bar with Fixes: foo, and
need to noticed that foo was already backported and the backport branch
also needs bar.

What do you think?
sage

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html