Re: [PATCHv5 00/14] git notes

Junio C Hamano <gitster@xxxxxxxxx> · Tue, 08 Sep 2009 13:31:12 -0700

Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:

> Hi,
>
> On Tue, 8 Sep 2009, Johan Herland wrote:
>
>> Algorithm / Notes tree   git log -n10 (x100)   git log --all
>> ------------------------------------------------------------
>> next / no-notes                4.77s              63.84s
>> 
>> before / no-notes              4.78s              63.90s
>> before / no-fanout            56.85s              65.69s
>> 
>> 16tree / no-notes              4.77s              64.18s
>> 16tree / no-fanout            30.35s              65.39s
>> 16tree / 2_38                  5.57s              65.42s
>> 16tree / 2_2_36                5.19s              65.76s
>> 
>> flexible / no-notes            4.78s              63.91s
>> flexible / no-fanout          30.34s              65.57s
>> flexible / 2_38                5.57s              65.46s
>> flexible / 2_2_36              5.18s              65.72s
>> flexible / ym                  5.13s              65.66s
>> flexible / ym_2_38             5.08s              65.63s
>> flexible / ymd                 5.30s              65.45s
>> flexible / ymd_2_38            5.29s              65.90s
>> flexible / y_m                 5.11s              65.72s
>> flexible / y_m_2_38            5.08s              65.67s
>> flexible / y_m_d               5.06s              65.50s
>> flexible / y_m_d_2_38          5.07s              65.79s
>
> It's good to see that the no-notes behaves roughly like baseline.
>
> I can see that some people may think that date-based fan-out is the cat's 
> ass,

Actually, my knee-jerk reaction was that 4.77 (next) vs 5.57 (16tree with
2_38) is already a good enough performance/simplicity tradeoff, and 5.57
vs 5.08 (flexible with ym_2_38) probably does not justify the risk of
worst-case behaviour that can come from a possible mismatch between the
access pattern and the date-optimized tree layout.
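
To put numbers on that tradeoff, here is a rough sketch that turns the
"git log -n10 (x100)" column into relative overhead against 'next'.  The
timings are copied from the quoted table; the helper name overhead_pct is
only illustrative and not part of any patch.

```python
# Timings in seconds from the "git log -n10 (x100)" column of the table.
timings = {
    "next / no-notes": 4.77,
    "16tree / 2_38": 5.57,
    "flexible / ym_2_38": 5.08,
}


def overhead_pct(baseline, candidate):
    """Relative slowdown of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


base = timings["next / no-notes"]
for name in ("16tree / 2_38", "flexible / ym_2_38"):
    print(f"{name}: +{overhead_pct(base, timings[name]):.1f}% over next")
```

That works out to roughly 17% for 2_38 and 6.5% for ym_2_38, so the
date-based layout buys back about ten percentage points on this particular
access pattern.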

But that only argues against supporting _only_ the date-optimized layout.

Support of "flexible layout" is not as flexible as its name suggests;
a single notes tree needs to have a uniform fanout strategy.  But it is
not unusably rigid either; you only need to be extra careful when merging
two notes trees.  We can leave the heuristics for choosing the optimum
layout to later rounds.
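
As an illustration of what "uniform fanout" means for a merge, here is a
minimal sketch.  It assumes that a label such as 2_38 means a 2-hex-digit
directory followed by a 38-hex-digit entry name, and 2_2_36 two such
directory levels; that reading of the labels, and all function names, are
assumptions for illustration and not the code in the series.  Date-based
layouts are left out on purpose.

```python
def note_path(sha1_hex, fanout):
    """Split a 40-char hex object name into path components.

    fanout is a tuple of directory-name widths: () for no fanout,
    (2,) for "2_38", (2, 2) for "2_2_36" (assumed meaning of the labels).
    """
    if len(sha1_hex) != 40:
        raise ValueError("expected a 40-character hex object name")
    parts, pos = [], 0
    for width in fanout:
        parts.append(sha1_hex[pos:pos + width])
        pos += width
    parts.append(sha1_hex[pos:])
    return "/".join(parts)


def layout_of(path):
    """Describe a path by its component lengths, e.g. "2_38"."""
    return "_".join(str(len(part)) for part in path.split("/"))


def uniform_layout(paths):
    """Return the one layout shared by all paths, or None if they differ."""
    layouts = {layout_of(p) for p in paths}
    return layouts.pop() if len(layouts) == 1 else None
```

A merge helper could run uniform_layout() over the paths of both notes
trees and refuse (or rewrite one side) when it returns None, which is the
"extra careful" part.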

> - I find the restriction to commits rather limiting.

Yeah, we would not want to be surprised to find that many people want to
annotate non-commits with this mechanism.

> - most of the performance difference between the date-based and the SHA-1 
>   based fan-out looks to me as if the issue was the top-level tree.  
>   Basically, this tree has to be read _every_ time _anybody_ wants to read 
>   a note.

A comparison between 'next' and another algorithm that opens the top-level
notes tree object and returns "I did not find any note" without doing
anything else would reveal that cost.  But when you are doing "log -n10"
(or "log --all"), you would read the top-level notes tree once, and it is
likely to be cached in obj_hash[] (or in the delta_base cache) already for
the remaining invocations, even if the notes mechanism does not do its own
caching, which I think it does, no?
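
A toy model of that argument, to show why the top-level tree should cost
one read per process rather than one read per note lookup.  Everything
here (CountingStore, CachedStore, lookup_note) is a hypothetical stand-in
and not how git's object layer is actually written.

```python
class CountingStore:
    """Hypothetical object database that counts raw reads."""

    def __init__(self, objects):
        self.objects = objects
        self.reads = 0

    def read(self, oid):
        self.reads += 1
        return self.objects[oid]


class CachedStore:
    """Wraps a store and keeps every object it has already read."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def read(self, oid):
        if oid not in self.cache:
            self.cache[oid] = self.store.read(oid)
        return self.cache[oid]


def lookup_note(store, root_oid, commit):
    """Open the top-level notes tree and look up one commit in it."""
    tree = store.read(root_oid)
    return tree.get(commit)
```

With the cache in front, ten lookups against the same root hit the backing
store once; without it they hit it ten times.  The experiment suggested
above would measure the cost of that single read.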

> - I'd love to see performance numbers for less than 157118 notes.  Don't 
>   get me wrong, it is good to see the worst-case scenario in terms of 
>   notes/commits ratio.  But it will hardly be the common case, and I 
>   very much would like to optimize for the common case.
>
>   So, I'd appreciate if you could do the tests with something like 500 
>   notes, randomly spread over the commits (rationale: my original 
>   understanding was that the notes could amend commit messages, and that 
>   is much more likely to be done with relatively old commits that you 
>   cannot change anymore).

Hmph, is that a typical use case?  How does it relate to CC's object
replacement mechanism?

Also Gitney talked about annotating commits in the code-review thing.
What's the expected notes density and distribution in that application?