notes, was Re: What's cooking in git.git (Jul 2009, #01; Mon, 06)

Hi,

On Tue, 7 Jul 2009, Shawn O. Pearce wrote:

> Junio C Hamano <gitster@xxxxxxxxx> wrote:
> > "Shawn O. Pearce" <spearce@xxxxxxxxxxx> writes:
> > >> 
> > >> > * jh/notes (Sat May 16 13:44:17 2009 +0200) 5 commits
> > >
> > > I was thinking about this the other day.  We could use a hash of the 
> > > commit timestamp as the top level directory.  E.g. if we take the 
> > > commit time of the commit and convert it to a date string, we could 
> > > make the note path e.g.:
> > >
> > >   YYYY/MM/COMMITSHA1
> > 
> > Is the idea to make the tree object we need to scan for that 
> > particular SHA-1 hash smaller?
> 
> No, the idea was to avoid needing to create a massive hash of all
> commit notes just to answer `git log -10` on the current branch.
> I remember that was a concern last time we were talking about this.
> By putting the notes under a timestamped path we can scan only a
> small percentage of the notes before we have sufficient data to
> output the first few commits.
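
For reference, the YYYY/MM/COMMITSHA1 layout quoted above could be 
sketched roughly like this.  The function name and the choice of UTC are 
my assumptions; the proposal does not say which timezone to use.

```python
from datetime import datetime, timezone


def note_path_by_date(commit_time, sha1):
    """Build a YYYY/MM/<sha1> note path from a commit's Unix timestamp.

    Hypothetical helper: interpreting the timestamp in UTC is an
    assumption, not something the proposal specifies.
    """
    when = datetime.fromtimestamp(commit_time, tz=timezone.utc)
    return "{}/{}".format(when.strftime("%Y/%m"), sha1)
```

Notes for recent commits then cluster under a few recent YYYY/MM 
directories, which is the property the proposal relies on.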

The problem is that you end up with possibly _very_ large root trees in 
the notes, and the whole idea was to keep the root tree small and load 
the subtrees only on demand.  That way, outputting a couple of commits 
(or a single one) is still cheap.

To recapitulate mugwump's idea: allow not only blobs in the root tree of 
the notes, but also tree objects.  That allows for fan-out -- if you want 
it.
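
A rough sketch of such an on-demand lookup, only to illustrate the idea.  
The in-memory `trees` dict, the function name, and the two-hex-digit 
directory width are all assumptions made for this example; this is not 
how git stores or reads tree objects.

```python
def find_note(trees, root_id, sha1):
    """Look up the note for `sha1`, loading subtrees only when needed.

    `trees` is a toy stand-in for the object store (an assumption): it
    maps a tree id to {entry name: ("blob", text) or ("tree", tree id)}.
    Returns (note text or None, list of tree ids that were loaded).
    """
    loaded = [root_id]
    tree = trees[root_id]
    rest = sha1
    while rest:
        entry = tree.get(rest)
        if entry is not None and entry[0] == "blob":
            return entry[1], loaded   # remaining name matches a blob
        entry = tree.get(rest[:2])
        if entry is None or entry[0] != "tree":
            return None, loaded       # no blob and no fan-out directory
        loaded.append(entry[1])       # descend: load exactly one subtree
        tree = trees[entry[1]]
        rest = rest[2:]
    return None, loaded
```

Only the subtrees along the path of the requested commit are ever 
loaded; sibling subtrees are never touched.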

Example:

The note for commit 0123456789abcdef0123456789abcdef01234567 can be in 
refs/notes:0123456789abcdef0123456789abcdef01234567 or in
refs/notes:01/23456789abcdef0123456789abcdef01234567 or in
refs/notes:01/23/456789abcdef0123456789abcdef01234567, and so on.

My idea was to let shorter paths (in terms of characters used) take 
precedence (and longer prefixes).  There was also the idea to always show 
all of them, but that does not appeal to me from a performance angle.
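
To make the ordering concrete, here is a minimal sketch that lists the 
candidate paths with the shortest one first.  The function name, the 
`max_levels` parameter, and the fixed two-hex-digit directory width are 
assumptions for illustration; it only models the "shorter path first" 
part of the rule.

```python
def candidate_note_paths(sha1, max_levels=2):
    """Return possible note paths for an object, least fan-out first.

    Hypothetical helper: each fan-out level splits off two hex digits,
    so every extra level makes the path one character ('/') longer.
    """
    if len(sha1) != 40 or any(c not in "0123456789abcdef" for c in sha1):
        raise ValueError("expected a 40-character lowercase hex SHA-1")
    paths = []
    for levels in range(max_levels + 1):
        # Split off `levels` two-hex-digit directory components.
        dirs = [sha1[2 * i:2 * i + 2] for i in range(levels)]
        paths.append("/".join(dirs + [sha1[2 * levels:]]))
    return paths
```

A lookup would try these in order and stop at the first path that 
exists, rather than collecting all of them.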

> > If so, I am not sure how it would help over another approach of say 
> > taking the first four hexdigits from the SHA-1 to use as the initial 
> > fan-out YYYY, then two hexdigits for the secondary fan-out MM.
> 
> See above, the idea is to avoid scanning all notes at once on startup.  
> SHA-1 is bad at this as a fanout because it is too good at uniform 
> distribution of the names.

The problem is the unpacking of the tree object.

> > Besides, trees and blobs cannot be annotated with that approach.
> 
> True.  But I didn't realize that was a goal.  :-|

It would be a nice-to-have, I guess.

Ciao,
Dscho
