Re: time needed to rebase shortend by using --onto?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 27, 2021 at 2:59 PM Uwe Kleine-König
<u.kleine-koenig@xxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> On Wed, May 26, 2021 at 07:38:08AM -0700, Elijah Newren wrote:
> > On Wed, May 26, 2021 at 3:13 AM Uwe Kleine-König
> > <u.kleine-koenig@xxxxxxxxxxxxxx> wrote:
> > > I have a kernel topic branch containing 4 patches on top of Linux v5.4.
> > > (I didn't speak to the affected customer, so I cannot easily share the
> > > patch stack. If need be I can probably anonymize it or ask if I can
> > > publish the patches.)
> > >
> > > It rebases clean on v5.10:
> > >
> > >         $ time git rebase v5.10
> > >         Performing inexact rename detection: 100% (36806539/36806539), done.
> > >         Performing inexact rename detection: 100% (36806539/36806539), done.
> > >         Performing inexact rename detection: 100% (36806539/36806539), done.
> > >         Performing inexact rename detection: 100% (36806539/36806539), done.
> > >         Successfully rebased and updated detached HEAD.
> > >
> > >         real    3m47.841s
> > >         user    1m25.706s
> > >         sys     0m11.181s
> > >
> > > If I start with the same rev checked out and explicitly specify the
> > > merge base, the rebase process is considerably faster:
> > >
> > >         $ time git rebase --onto v5.10 v5.4
> > >         Performing inexact rename detection: 100% (36806539/36806539), done.
> > >         Performing inexact rename detection: 100% (36806539/36806539), done.
> > >         Performing inexact rename detection: 100% (36806539/36806539), done.
> > >         Performing inexact rename detection: 100% (36806539/36806539), done.
> > >         Successfully rebased and updated detached HEAD.
> > >
> > >         real    1m20.588s
> > >         user    1m12.645s
> > >         sys     0m6.733s

Note: In your original report you had rename detection and it clearly
took a significant amount of time...

> > >
> > > Is there some relevant complexity in the first invocation I'm not seeing
> > > that explains it takes more than the double time? I would have expected
> > > that
> > >
> > >         git rebase v5.10
> > >
> > > does the same as:
> > >
> > >         git rebase --onto v5.10 $(git merge-base HEAD v5.10)
> > >
> > > . (FTR:
> > >
> > >         $ time git merge-base HEAD v5.10
> > >         219d54332a09e8d8741c1e1982f5eae56099de85
> > >
> > >         real    0m0.158s
> > >         user    0m0.105s
> > >         sys     0m0.052s
> > >
> > > , 219d5433 is v5.4 as expected.
> >
> > That does seem surprising, though if an automatic gc completed between
> > the two commands that could certainly explain it.  If that theory is
> > correct, it would suggest that it'd be difficult for you to reproduce;
>
> This reproduces just fine. The repository is quite big and it is slow at
> times. With the same tree on a different machine, the rebase is quicker,
> but the factor 2 between the two different commands is visible there,
> too:
>
> uwe@taurus:~/gsrc/linux$ git checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
> HEAD is now at bc2e99c9c9e0 [...]
>
> uwe@taurus:~/gsrc/linux$ time git rebase v5.10
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
> Successfully rebased and updated detached HEAD.
>
> real    0m20.737s
> user    0m14.188s
> sys     0m3.767s
>
> uwe@taurus:~/gsrc/linux$ git checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
> HEAD is now at bc2e99c9c9e0 [...]
>
> uwe@taurus:~/gsrc/linux$ time git rebase --onto v5.10 v5.4
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
> Successfully rebased and updated detached HEAD.
>
> real    0m12.129s
> user    0m7.196s
> sys     0m3.141s
>
> (This is with a slightly newer git: 2.30.2-1 from Debian)

And here, there was no rename detection so this isn't the same thing
anymore.  You could try setting merge.renameLimit higher.

However, the 7-8 second difference (and the likely large differences
between 5.4 and 5.10) do suggest that Junio's hunch that fork-point
behavior being at play could be an issue in these two commands.

> Then I repeated the test with git 2.32.0-rc1 (wgit is just calling
> bin-wrappers/git in my git working copy):
>
> uwe@taurus:~/gsrc/linux$ wgit version
> git version 2.32.0.rc1
>
> uwe@taurus:~/gsrc/linux$ wgit checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
> HEAD is now at bc2e99c9c9e0 [...]
>
> uwe@taurus:~/gsrc/linux$ time wgit rebase v5.10
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
> Successfully rebased and updated detached HEAD.
>
> real    0m19.438s
> user    0m13.629s
> sys     0m3.299s
>
> uwe@taurus:~/gsrc/linux$ wgit checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
> HEAD is now at bc2e99c9c9e0 [...]
>
> uwe@taurus:~/gsrc/linux$ time wgit rebase --onto v5.10 v5.4
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
> warning: inexact rename detection was skipped due to too many files.
> warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
> Successfully rebased and updated detached HEAD.
>
> real    0m13.848s
> user    0m8.315s
> sys     0m3.182s
>
> So the surprise persists.

Yeah, with no rename detection, the newer git version isn't going to
make a bit of difference.

> > running again with either command would give you something closer to
> > the lower time both times.  Is that the case?  (Also, what's the
> > output of "git count-objects -v"?)
>
> After the above commands I have:
>
>         count: 3203
>         size: 17664
>         in-pack: 4763753
>         packs: 11
>         size-pack: 1273957
>         prune-packable: 19
>         garbage: 0
>         size-garbage: 0

So, not freshly packed, but not in need of an automatic gc either.

>         alternate: /home/uwe/var/gitstore/linux.git/objects

You've got an alternate?  How well packed is it?  (What does "git
count-objects -v" in that other repo show?)

>
> (On the repository I did this initially I have:
>
>         warning: garbage found: .git/objects/pack/pack-864148a84c0524073ed8c8aa1a76155d5c677879.pack.temp
>         warning: garbage found: /ptx/src/git/linux.git/objects/pack/tmp_pack_X9gHnq
>         count: 2652
>         size: 14640
>         in-pack: 2117015
>         packs: 8
>         size-pack: 574167
>         prune-packable: 856
>         garbage: 2
>         size-garbage: 1114236
>         alternate: /ptx/src/git/linux.git/objects
>
> (Is the garbage a reason this is so slow? Can I just remove the two
> files pointed out?)

If there isn't some still-running git operation that is fetching and
writing to these files, then yes they can be cleaned out.  I doubt
they'd make too much of a difference, though.  I was more curious if
you went from say 10000 loose objects to ~0, or from 50+ packs down to
1 between operations due to an automatic gc completing.

> > I'd love to try this with git-2.32.0-rc1 (or even my not-yet-upstream
> > patches that optimize even further) with adding "--strategy=ort" to
> > your rebase command to see how much of a timing difference it makes.
> > Any chance the patches could either be published, or you could retry
> > with git-2.32.0-rc1 and add the --strategy=ort command line option to
> > your rebase command(s)?
>
> With --strategy=ort added I have:
>
> uwe@taurus:~/gsrc/linux$ time wgit rebase --strategy=ort v5.10
> Successfully rebased and updated detached HEAD.
>
> real    0m19.202s
> user    0m12.724s
> sys     0m2.961s
>
> [...]
>
> uwe@taurus:~/gsrc/linux$ time wgit rebase --strategy=ort --onto v5.10 v5.4
> Successfully rebased and updated detached HEAD.
>
> real    0m12.395s
> user    0m6.638s
> sys     0m3.284s
>
> So the warnings about inexact rename detection don't appear and it's a
> bit faster, but I still see the timing difference between these two
> commands.

Right, this says that --strategy=ort WITH rename detection is as fast
as the default --strategy=recursive WITHOUT rename detection.

It's not a fair comparison (you'd need to set merge.renameLimit higher
and re-run the cases where you had warnings), but is interesting
nonetheless.  It basically suggests that rename detection comes for
free with the ort strategy.

> I assume you are still interested in seeing this branch? I think
> anonymising it shouldn't be so hard, the patches are not so big. I'll
> modify the branch to make it shareable and assuming the problem still
> reproduces with it will share it with you.

Thanks.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux