Re: [PATCH v3] doc-diff: don't `cd_to_toplevel`

Johannes Schindelin <Johannes.Schindelin@xxxxxx> · Thu, 7 Feb 2019 22:42:51 +0100 (STD)

Hi Peff,

On Thu, 7 Feb 2019, Jeff King wrote:

> On Thu, Feb 07, 2019 at 04:41:57PM +0100, Johannes Schindelin wrote:
> 
> > > I think this can be limited to the tests that failed, which makes things
> > > much faster. I.e., we run the tests at the tip of topic X and see that
> > > t1234 fails. We then go back to the fork point and we just need to run
> > > t1234 again. If it succeeds, then we blame X for the failure. If it
> > > fails, then we consider it a false positive.
> > 
> > If you mean merge bases by fork points, I wrote an Azure Pipeline to do
> > that (so that I could use the cloud as kind of a fast computer), but that
> > was still too slow.
> > 
> > Even when there are even only as much as 12 merge bases to test (which is
> > the current number of merge bases between `next` and `pu`), a build takes
> > roughly 6 minutes on Windows, and many tests take 1 minute or more to run
> > (offenders like t7003 and t7610 take over 400 seconds, i.e. roughly 6
> > minutes), we are talking about roughly 1.5h *just* to test the merge
> > bases.
> 
> I was assuming you're testing individual topics from gitster/git here
> (which admittedly is more CPU in total than just the integration
> branches, but it at least parallelizes well).

Indeed. And there, I can usually figure out really quickly myself (but
manually) what is going wrong.

Hopefully we have Azure Pipelines enabled on https://github.com/git/git
soon, with PR builds that include Windows (unlike our current Travis
builds), so that contributors have an easier time to test their code in an
automated fashion.

I also have on my backlog a task to include `sparse` in the Azure
Pipelines jobs. That should take care of even more things in a purely
automated fashion, as long as the contributors look at those builds.

> So with that assumption, I was thinking that you'd just look for the
> merge-base of HEAD and master, which should give you a single point for
> most topics. For inter-twined topics there may be more merge bases, but
> I actually think for our purposes here, just testing the most recent one
> is probably OK. I.e., we're just trying to have a vague sense of whether
> the test failure is due to new commits or old.

Oh, I am typically looking only at the latest commits up until I hit a
merge commit. Usually that already tells me enough, and if not, a bisect
is really quick on a linear history.

I guess my dumb branch^{/^Merge} to find the next merge commit works, but
it is a bit unsatisfying that we do not have a more robust way to say
"traverse the commit history until finding a merge commit, then use that".

> I think Junio's suggestion to just pick some common release points would
> work OK in practice, too. It's possible that some other topic made it to
> master with a breakage, but in most cases, I think these sorts of
> failures are often more coarsely-grained (especially if Junio pays
> attention to the CI results before merging).

If Junio wants to experiment with that, sure, I'm all for it.

Ciao,
Dscho