Re: [PATCHv5 00/14] git notes

Hi,

On Tue, 8 Sep 2009, Johan Herland wrote:

> On Tuesday 08 September 2009, Johannes Schindelin wrote:

> > I can see that some people may think that date-based fan-out is the 
> > cat's ass, but I have to warn that we have no idea how notes will be 
> > used,
> 
> I don't agree. Although we will certainly see many more use cases for 
> notes, I believe that the vast majority of them can be placed in one of 
> two categories:

My experience with Git is that my beliefs about how my work would be used 
have been a constant source of surprise.

> > - I find the restriction to commits rather limiting.
> 
> I see your point, but I don't agree until I see a compelling case for 
> annotating a non-commit.

My point is that it is too late by then, if you don't allow for a flexible 
and still efficient scheme.

> > - most of the performance difference between the date-based and the 
> >   SHA-1 based fan-out looks to me as if the issue was the top-level 
> >   tree. Basically, this tree has to be read _every_ time _anybody_ 
> >   wants to read a note.
> 
> Not sure what you're trying to say here. The top-level notes tree is 
> read (as in fill_tree_descriptor()) exactly _once_. After that, it is 
> cached by the internal data structure (until free_commit_notes() or 
> end-of-process).

By that reasoning, we do not need any fan-out scheme.

Keep in mind: reading a large tree object takes a long time.  That's why 
we started fan-out.  Reading a large number of tree objects also takes a 
long time.  That's why I advocated a flexible fan-out that is only read in 
on demand.
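
To illustrate what "read in on demand" could look like, here is a minimal 
Python sketch (not the actual notes code).  The object store is a plain 
dict, and the class name, the `reads` counter and the entry layout are all 
made up for the example.  Only the subtrees on the path to the requested 
note are ever parsed:

```python
class LazyNotesTree:
    """Toy notes tree whose subtrees are read only when a lookup needs them."""

    def __init__(self, store, root_id):
        # Hypothetical object store: tree id -> {entry name: (kind, value)},
        # where kind is "tree" (value = tree id) or "blob" (value = note text).
        self.store = store
        self.root_id = root_id
        self.cache = {}   # tree id -> entries already read
        self.reads = 0    # number of tree objects actually read

    def _load(self, tree_id):
        # Read a tree object at most once; later lookups hit the cache.
        if tree_id not in self.cache:
            self.reads += 1
            self.cache[tree_id] = self.store[tree_id]
        return self.cache[tree_id]

    def lookup(self, sha1):
        """Return the note for a 40-char hex SHA-1, or None if there is none."""
        tree_id, rest = self.root_id, sha1
        while True:
            entries = self._load(tree_id)
            for name, (kind, value) in entries.items():
                if kind == "blob" and name == rest:
                    return value
                if kind == "tree" and rest.startswith(name):
                    # Descend, consuming the part of the SHA-1 this level covers.
                    tree_id, rest = value, rest[len(name):]
                    break
            else:
                return None
```

Note that the matching is by prefix, so the sketch does not care how many 
characters a given level consumes, and a blob sitting directly in the 
top-level tree is found without reading any subtree at all.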

> > But I think that having a dynamic fan-out that can even put blobs into 
> > the top-level tree (nothing prevents us from doing that, right?)
> 
> Well, the "flexible" code does add the new requirement that all entries 
> in a notes (sub)tree object must follow the same scheme, i.e. you 
> cannot have:
> 
>   /12/34567890123456789012345678901234567890
>   /2345/678901234567890123456789012345678901
> 
> but you can have
> 
>   /12/34567890123456789012345678901234567890
>   /23/45/678901234567890123456789012345678901

Umm, why?  Is there any good technical reason?
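
At least the mapping from path to SHA-1 does not depend on where the 
slashes are.  A small Python sketch (the helper name is made up, and this 
says nothing about what the current parsing code can or cannot handle):

```python
import string

HEX_DIGITS = set(string.hexdigits.lower())


def sha1_from_notes_path(path):
    """Join the path components of a notes entry back into a 40-char SHA-1."""
    name = path.replace("/", "")
    if len(name) != 40 or not set(name) <= HEX_DIGITS:
        raise ValueError("not a valid notes path: %r" % path)
    return name


# The two layouts of the same note from the example above decode identically.
same = (sha1_from_notes_path("/2345/678901234567890123456789012345678901")
        == sha1_from_notes_path("/23/45/678901234567890123456789012345678901"))
```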

> > The real question for me, therefore, is: what is the optimal way to 
> > strike the balance between size of the tree objects (which we want to 
> > be small, so that unpacking them is fast)  and depth of the fan-out 
> > (which we want to be shallow to avoid reading worst-case 39 tree 
> > objects to get at one single note).
> 
> s/39/19/ (each fanout must use at least 2 chars of the 40-char SHA1)

That is another unnecessary restriction that could cost you dearly.  Just 
think what happens if it turns out that the optimal number of tree items 
is closer to 16 than to 255...
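
To put numbers on the trade-off, here is a rough Python sketch.  It assumes 
SHA-1s are spread uniformly, that every level uses the same number of hex 
characters, and that we want no tree to hold more than some target number 
of entries; the function names are made up for the example:

```python
SHA1_HEX_LEN = 40


def max_entries_per_level(chars_per_level):
    """Upper bound on entries in a fan-out tree using this many hex chars."""
    return 16 ** chars_per_level


def max_subtree_levels(chars_per_level):
    """Deepest possible fan-out: the last component names the note blob."""
    return SHA1_HEX_LEN // chars_per_level - 1


def levels_needed(num_notes, target_entries, chars_per_level):
    """Smallest fan-out depth so the leaf trees stay within target_entries."""
    fan = max_entries_per_level(chars_per_level)
    depth = 0
    # Each extra level divides the notes per leaf tree by `fan` on average.
    while num_notes > target_entries * fan ** depth:
        depth += 1
    return depth
```

With one character per level this gives the worst case of 39 levels quoted 
above, and with two characters it gives 19.  The point is that forcing two 
characters fixes the per-level fan-out at up to 256 entries, whether or not 
that turns out to be the sweet spot.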

> Yes, the challenge is indeed striking the correct balance. I believe 
> that the notes code should be taught to write (and automatically 
> re-organize) the notes tree so that it is optimized for the current 
> collection of notes.

Of course!  I never thought that the user should be allowed to choose 
how to organize the notes.

Ciao,
Dscho
