Re: Git Notes idea.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 17, 2008 at 3:38 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Tue, Dec 16, 2008 at 12:43:55PM -0600, Govind Salinas wrote:
>
>> I was thinking I would do my first implementation in pyrite and if I find
>> that it works well I will port it.
>
> OK, though your performance will probably suck unless you dump the notes
> tree into a local hash at the beginning of your program. And looking up
> every commit's note during revision traversal is one of the intended
> uses (e.g., decorating git-log output, or filtering commits based on a
> particular note).

Yes, I was thinking that this is the natural way to do things, save that I
would be lazy loading the trees into a cache instead of caching them
all up front.  This is one of the reasons that I think the fan out will
help.

> And as Dscho mentioned, most of what you need is already there in C.
> You are welcome to implement whatever you want in pyrite, of course, but
> there is a desire to have this accessible to the revision traversal
> machinery. And that means if you want your version in pyrite to be
> compatible with what ends up in git, the data structure design needs to
> be suitable for both.

Yes, I completely agree that I want it to have the same scheme as what
git will use.  That is the reason I posted this here.  Since no method
has been formally accepted (checked into master) I wanted to see if
I could nudge things along.  I wasn't aware that you and Dscho had
a (very similar) plan.  Please, if you guys are decided on the format
then I can just go off and start working on it.  But it sounds like there
isn't consensus yet.

<snip>
>> root/
>>      12/
>>          34567890123456789012345678901234567890/
>>              abcdef7890123456789012345678901234567890
>>
>> This way all the notes are attached to a tree, so that gc won't
>> think they are unreferenced objects.
>
> But you have lost the ordering in your list, then, since they will not
> be ordered by sha1 of the note contents. I don't know if you care. The
> second sha1 is pointless, anyway, since nobody will know that number as
> a reference; why not just name them monotonically starting at 1?

In a later mail I suggested that this be the type or name of the note.  Which
I hear is similar to what you suggested.

> One of the things I don't like about having several notes is that it
> introduces an extra level of indirection that every user has to pay for,
> whether they want it or not. If a note can be a blob _or_ a tree, then
> those who want to use blobs can reap the performance benefit. Those who
> want multiple named notes in a hierarchy can pay the extra indirection
> cost.
>
> I haven't measured how big a cost that is (but bearing in mind that we
> might want to do this lookup once per revision in a traversal, even one
> extra object lookup can have an impact).

That seems reasonable.

> I'm also still not convinced the fan-out is worthwhile, but I can see
> how it might be. It would be nice to see numbers for both.
>
<snip>
> Also, how large do you expect the list to be under reasonable
>> circumstances.
>
> As many notes as there are commits is my goal (e.g., it is not hard to
> imagine an automated process to add notes on build status). Ideally, we
> could handle as many notes as there are objects; I see no reason not to
> allow annotating arbitrary sha1's (I don't know if there is a use for
> that, but the more scalable the implementation, the better).

Ah, that is in line with what I was thinking as well.

>> >  Some thoughts from me on naming issues:
>> >  http://article.gmane.org/gmane.comp.version-control.git/100402
>>
>> On naming.  I strongly support a ref/notes/sha1/sha1 approach.  If
>> having a type to the note is important, then perhaps the first line of
>> a note could be considered a type or a set of "tags".  This way you
>
> I don't think we are talking about the same thing. What I mean by naming
> is "here is a shorthand for referring to notes" that is not necessarily
> coupled with the implementation. That is, I would like to do something
> like:
>
>  git log --notes-filter="foo:bar == 1"
>
> and have that "foo:bar" as a shorthand on each commit for:
>
>  refs/notes/foo:$COMMIT/bar
>
> Without a left-hand side (e.g., "bar"), we get:
>
>  refs/notes/default:$COMMIT/bar
>
> Or without a right-hand side (e.g., "foo:"), we get:
>
>  refs/notes/foo:$COMMIT
>

I like the overall plan, but I would suggest that --notes[=default] and
--note-type=whatever would be a little friendlier and less error prone.

Thanks for helping me think through this.

-Govind
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux