On Thu, Apr 22, 2021 at 8:16 PM Nathan Cutler <ncutler@xxxxxxxx> wrote:
>
> Hi Ernesto:
>
> I fully concur with what Loic wrote, and just to add to that:
>
> Years ago, a lead developer gave us some general principles that all
> backports should ideally follow. These are codified:
>
> https://github.com/ceph/ceph/blob/master/SubmittingPatches-backports.rst#general-principles
>
> but I think it's worth quoting them here. Each backport is supposed to
> specify:
>
> 1. what bug it is fixing
> 2. why this fix is the minimal way to do it
> 3. why this needs to be fixed in <release>
>
> Now, how good we are, as a project, at adhering to these principles is
> already pretty questionable. How will introducing more automation help
> us improve? Or maybe we should change the principles to say: "the Ceph
> project encourages commits to be backported from master indiscriminately,
> without any justification or risk analysis"?
>
> I guess we would not change the stated principles as suggested, but I
> still think that, when deciding what kind of automation to introduce, we
> should ask ourselves questions like:
>
> How stable are our "stable" releases?
> Do we value stability over features, or vice versa?
> How often does the drive to backport stuff introduce regressions?
> How do we gauge the riskiness of a given backport?
> Do the answers to these questions vary from one component to another, or
> can they be formulated on a project-wide basis?
>
> Backporting stuff is necessary, but also risky. Automation, I think, can
> actually increase the frequency with which we unintentionally introduce
> regressions into stable releases, because:
>
> * automation, if successful, might tend to increase the overall number
>   of backports
> * automation cannot provide any justification or estimate of risk, so it
>   might also increase the number of backports that lack any justification
>   or estimation of risk
>
> I'm skeptical of the value of labels, but I do think it would be useful
> to have Jenkins jobs checking:
>
> 1. whether the commits being cherry-picked are really in master
> 2. whether the master commits cherry-picked cleanly
> 3. whether the backport PR contains the same number of commits as the
>    master PR
>
> These couldn't be "mandatory" checks because there are plenty of
> exceptions, but I think having this information there would be useful
> for reviewers (but I don't review backports, so I don't know for sure).

Your 1., 2. and 3. are exactly what I meant by "fully clean backports" in
my earlier reply. Whether it's a GitHub Action or a Jenkins job, and
whether the output is a set of labels or a comment on the PR, this
information is very useful to someone reviewing backport PRs (especially
if you do it in batches), because it allows them to concentrate on the
actual changes and not worry about checking the boilerplate.

Even just "backport:no-conflicts" and "backport:has-conflicts" labels or
comments would be a great help, because sometimes people simply forget to
uncomment "Conflicts" in the commit message, and anything that conflicted
deserves a second look even if the resolution was trivial. There are cases
when a clean cherry-pick needs adjustments, but those are rare, and
hopefully the check can catch them too (e.g. if the backport modifies a
file that wasn't modified in the original, or if the final diffstat is too
different from the original).
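Just to illustrate what I have in mind, a check for 1. and the conflict
labels could be a short script roughly along these lines. This is only a
sketch: the remote/branch names and the commit range argument are
assumptions rather than existing Ceph tooling, and the commit-count
comparison from 3. would additionally need the GitHub API:

#!/usr/bin/env python3
# Rough sketch of a backport sanity check -- not existing Ceph tooling.
# Assumes the backport PR is checked out locally, "origin/master" is up
# to date, and the commit range to examine is passed as the only
# argument, e.g. "origin/pacific..HEAD".
import re
import subprocess
import sys

MASTER = "origin/master"   # assumed remote-tracking branch for master
pr_range = sys.argv[1]

def git(*args):
    return subprocess.run(["git", *args], check=True,
                          capture_output=True, text=True).stdout

problems = []
for sha in git("rev-list", "--reverse", pr_range).split():
    body = git("show", "-s", "--format=%B", sha)

    # Check 1: every "(cherry picked from commit ...)" trailer must point
    # at a commit that is reachable from master.
    picked = re.findall(r"cherry picked from commit ([0-9a-f]{7,40})", body)
    if not picked:
        problems.append(f"{sha[:12]}: no cherry-pick trailer")
    for orig in picked:
        in_master = subprocess.run(
            ["git", "merge-base", "--is-ancestor", orig, MASTER],
            capture_output=True).returncode == 0
        if not in_master:
            problems.append(f"{sha[:12]}: {orig[:12]} not found in {MASTER}")

    # Flag commits that carry an uncommented "Conflicts:" section, i.e.
    # the cherry-pick was not clean and deserves a closer look.
    if re.search(r"^Conflicts:", body, re.MULTILINE):
        problems.append(f"{sha[:12]}: cherry-pick had conflicts")

print("\n".join(problems) if problems else "backport looks clean")
sys.exit(1 if problems else 0)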
Otherwise, I share your and Loic's concerns about further automating the
backports themselves. Many projects use bots for this, which results in
very aggressive and often careless backporting, because all it takes is a
comment addressed to a bot or a label being set. I agree that we don't
want to go there.

Thanks,

                Ilya
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx