Re: [PATCHv5 00/14] git notes

On Tuesday 08 September 2009, Johan Herland wrote:
> On Tuesday 08 September 2009, Johannes Schindelin wrote:
> > On Tue, 8 Sep 2009, Johan Herland wrote:
> > > Algorithm / Notes tree   git log -n10 (x100)   git log --all
> > > ------------------------------------------------------------
> > > next / no-notes                4.77s              63.84s
> > >
> > > before / no-notes              4.78s              63.90s
> > > before / no-fanout            56.85s              65.69s
> > >
> > > 16tree / no-notes              4.77s              64.18s
> > > 16tree / no-fanout            30.35s              65.39s
> > > 16tree / 2_38                  5.57s              65.42s
> > > 16tree / 2_2_36                5.19s              65.76s
> > >
> > > flexible / no-notes            4.78s              63.91s
> > > flexible / no-fanout          30.34s              65.57s
> > > flexible / 2_38                5.57s              65.46s
> > > flexible / 2_2_36              5.18s              65.72s
> > > flexible / ym                  5.13s              65.66s
> > > flexible / ym_2_38             5.08s              65.63s
> > > flexible / ymd                 5.30s              65.45s
> > > flexible / ymd_2_38            5.29s              65.90s
> > > flexible / y_m                 5.11s              65.72s
> > > flexible / y_m_2_38            5.08s              65.67s
> > > flexible / y_m_d               5.06s              65.50s
> > > flexible / y_m_d_2_38          5.07s              65.79s

[snip]

> > - I'd love to see performance numbers for less than 157118 notes.
> > Don't get me wrong, it is good to see the worst-case scenario in
> > terms of notes/commits ratio.  But it will hardly be the common case,
> > and I very much would like to optimize for the common case.
> >
> >   So, I'd appreciate if you could do the tests with something like
> > 500 notes, randomly spread over the commits (rationale: my original
> > understanding was that the notes could amend commit messages, and
> > that is much more likely to be done with relatively old commits that
> > you cannot change anymore).
> 
> Ok. I will try to test that.

Here are the results of the 500-notes-in-kernel-repo test:

Algorithm / Notes tree   git log -n10 (x100)   git log --all
------------------------------------------------------------
next / no-notes                 4.83s             64.78s

before / no-notes               4.84s             64.76s
before / no-fanout              4.98s             64.89s

16tree / no-notes               4.84s             64.61s
16tree / no-fanout              4.92s             64.68s
16tree / 2_38                   4.85s             64.45s
16tree / 2_2_36                 4.85s             64.63s

flexible / no-notes             4.84s             64.82s
flexible / no-fanout            4.91s             65.01s
flexible / 2_38                 4.85s             64.93s
flexible / 2_2_36               4.85s             64.63s
flexible / ym                   4.83s             64.63s
flexible / ym_2_38              4.86s             64.72s
flexible / ymd                  4.91s             64.74s
flexible / ymd_2_38             4.91s             64.56s
flexible / y_m                  4.86s             64.76s
flexible / y_m_2_38             4.86s             64.71s
flexible / y_m_d                4.86s             64.73s
flexible / y_m_d_2_38           4.84s             64.50s

I don't like the noise level in the second column ('git log --all'). Then 
again, I don't find that column very interesting (it's mostly there to 
verify that we don't have any abysmal worst-case behaviours in the notes 
code).

The first column is fairly nice and tidy, though. At first glance it shows 
pretty much the same results as the 157000-notes table previously posted. 
Obviously the abysmal performance of no-fanout is gone (500 notes in a 
single tree object is not _that_ bad), although a 2/38-fanout is still a 
better choice for 500 notes (but 2/2/36 does not provide any additional 
improvement).

From this we can start to guess that the threshold for moving from no fanout 
to 2/38 is somewhere below 500 notes, while the threshold for moving from 
2/38 to 2/2/36 is between 500 and ~157000 notes (probably much closer to 
157000 than to 500; I wouldn't be surprised if ~256 entries per level turns 
out to be a good threshold).
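
To make the fanout names concrete, here is a minimal Python sketch (not code 
from the patch series) of how a SHA1-based fanout splits a note's object name 
into path components, and of the average number of entries per leaf tree one 
would expect if object names are spread uniformly. The function names 
fanout_path and avg_entries_per_leaf are made up for illustration.

```python
def fanout_path(sha1_hex, levels):
    """Split a 40-char hex object name into `levels` two-char
    directory components followed by the remaining hex digits.

    levels=0 -> no fanout, levels=1 -> 2/38, levels=2 -> 2/2/36.
    """
    if len(sha1_hex) != 40:
        raise ValueError("expected a 40-character hex object name")
    parts = [sha1_hex[2 * i:2 * i + 2] for i in range(levels)]
    parts.append(sha1_hex[2 * levels:])
    return "/".join(parts)


def avg_entries_per_leaf(n_notes, levels):
    """Average notes per leaf tree, ASSUMING uniformly distributed
    object names (each two-hex-digit level has 256 possible values)."""
    return n_notes / 256 ** levels


name = "0123456789abcdef0123456789abcdef01234567"
print(fanout_path(name, 1))
print(fanout_path(name, 2))
print(avg_entries_per_leaf(500, 1))
print(avg_entries_per_leaf(157118, 1))
print(avg_entries_per_leaf(157118, 2))
```

Under that uniformity assumption, 500 notes with a 2/38 fanout average about 
2 entries per leaf tree, while 157118 notes average about 614 entries per 
leaf tree with 2/38 and about 2.4 with 2/2/36.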

The date-based fanout performs on par with the SHA1-based fanout, although 
it's hard to say anything conclusive when the numbers are as close as 
this. However, the ymd and ymd_2_38 fanouts probably show signs of too much 
overhead (too many levels) at only 500 notes. This is not surprising.

My gut feeling tells me that moving from 'no-fanout' to either '2_38' or 
'ym' is a good idea at ~256 notes. Then, if we went with '2_38', we'd have 
to switch to '2_2_36' at ~64K notes (i.e. when each /38 level reaches ~256 
notes). However, it seems that with 'ym', we could stick with it for much 
longer before having to consider switching to a different fanout alternative 
(probably 'ym_2_38' or 'y_m_d').
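
The SHA1-based part of that heuristic could be sketched as follows. This is 
only an illustration of the rule of thumb, not code from the patches: the 
~256 limit is a gut feeling rather than a measured value, and the names 
sha1_fanout_levels and fanout_name are hypothetical. The date-based schemes 
are left out because their switch points are not pinned down above.

```python
def sha1_fanout_levels(n_notes, max_entries=256):
    """Return how many two-hex-digit fanout levels to use so that the
    expected number of notes per leaf tree stays at or below
    `max_entries`, assuming uniformly distributed object names."""
    levels = 0
    # A 40-digit name leaves room for at most 19 two-digit levels.
    while levels < 19 and n_notes > max_entries * 256 ** levels:
        levels += 1
    return levels


def fanout_name(levels):
    """Map a level count to the labels used in the tables:
    0 -> 'no-fanout', 1 -> '2_38', 2 -> '2_2_36'."""
    if levels == 0:
        return "no-fanout"
    return "_".join(["2"] * levels + [str(40 - 2 * levels)])


for n in (200, 500, 65536, 65537, 157118):
    print(n, fanout_name(sha1_fanout_levels(n)))
```

With the default limit of 256, the switch to '2_38' happens above 256 notes 
and the switch to '2_2_36' above 65536 (64K) notes, so both the 500-note and 
the 157118-note test cases land on the scheme that measured fastest among 
the SHA1-based ones.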


Have fun! :)

...Johan

-- 
Johan Herland, <johan@xxxxxxxxxxx>
www.herland.net