Re: [PATCH 00/83] libify apply and use lib in am

Christian Couder <christian.couder@xxxxxxxxx> · Mon, 25 Apr 2016 11:57:44 +0200

On Mon, Apr 25, 2016 at 11:02 AM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Sun, Apr 24, 2016 at 8:33 PM, Christian Couder
> <christian.couder@xxxxxxxxx> wrote:
>> Sorry if this patch series is a bit long. I can split it into two or
>> more series if it is prefered.
>
> I suspect you deliberately made the series long just to show off how
> good new git-rebase is ;-)

Yeah, and `git am` too :-)

>> Performance numbers:
>>
>>   - A few days ago Ævar did a huge many-hundred commit rebase on the
>>     kernel with untracked cache.
>>
>> command: git rebase --onto 1993b17 52bef0c 29dde7c
>>
>> Vanilla "next" without split index:                1m54.953s
>> Vanilla "next" with split index:                   1m22.476s
>> This series on top of "next" without split index:  1m12.034s
>> This series on top of "next" with split index:     0m15.678s
>
> I was a bit puzzled why split-index helped so much. It shouldn't have
> in my opinion unless you paired it with index-helper, to cut down both
> read and write time. So I ran some tests. Long story short, I think we
> can achieve this gain (and a little more) even without split-index.

Yeah, perhaps. For now though Ævar and myself would be happy to just
have a config option for split-index :-)

The other performance numbers I mentioned show that now the `git am`
part of a 13 commit long rebase is reduced from 58% to 19% of the
whole rebase time. So further improvements in speeding up `git am`
could only make such a rebase at most 19% faster.

> I ran my measurement patch [1] with and without your series used this
> series as rebase material. Without the series, the picture is not so
> surprising. We run git-apply 80+ times, each consists of this sequence
>
> read index
> write index (cache tree updates only)
> read index again
> optionally initialize name hash (when new entries are added, I guess)
> read packed-refs
> write index
>
> With this series, we run a single git-apply which does
>
> read index (and sharedindex too if in split-index mode)
> initialize name hash
> write index 80+ times
>
> This explains why I guessed it wrong: when you write only, not read
> back, of course index-helper is not required. And with split-index you
> only write as many entries as you touch (which is usually a small
> number compared to worktree's size), so writing 80+ times with
> split-index is a lot faster than write 80+ times with whole index.

Yeah, I tried to explain in the cover letter and in the last patch why
this series enables split-index to give great results, but I think you
are much better at explaining it.

> But why write so many times when nobody reads it? We only need to
> write before git-apply exits,

You mean `git am` here I think.

> either after successfully applying the
> whole series, or after it stops at conflicts, and maybe even at die()
> and SIGINT. Yes if git-apply segfaults,

Here too.

> then the index update is lost,
> but in such a case, it's usually a good idea to restart fresh anyway.
> When you only write index once (or twice) it won't matter if
> split-index is used.

Yeah I agree, but it would need further work, that can be done after
this series is merged.
And I am not sure if the potential gains on a typical rebase would be worth it.

Thanks,
Christian.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html