On Sat, Nov 08, 2008 at 04:41:25AM +0100, Johan Herland wrote:

> > * Discussion on notes
>
> Can someone elaborate on this? AFAIK, notes have popped up on this list
> often enough that I'm convinced it would be a _really_ useful feature. The
> only drawback I was aware of, was the lack of an efficient implementation,
> but then Jeff comes out of the blue and posts some interesting numbers [1]
> a week or so ago. Does this mean there are no remaining obstacles?
>
> [1]: http://article.gmane.org/gmane.comp.version-control.git/99415

The discussion was along the lines of "here are some more cool things we
could do, if we had notes." I don't remember the specifics of the cool
things, but they were related to annotating patches with review
information. Shawn can probably elaborate more.

That led to "notes as a tree are nice, but too slow because looking up a
tree entry is linear" (and obviously you do a ton of lookups in the notes
tree during "git log").

Dscho had posted an implementation with a persistent notes cache long
ago. Since I failed to actually look at that, I started on a slightly
different approach, which is simply using an in-memory hash table to
speed up lookups in the notes tree. Those are the numbers and patch I
posted.

My eventual plan was to re-work Dscho's patches with this performance
approach. But it is not at the top of my queue, so if somebody else
wanted to pick it up, I would be very happy. Everything I have done so
far is in the post you referenced.

The only other thing I remember discussing was notes namespaces. The two
obvious approaches are:

  1. a separate ref for each notes namespace, with each note ending up
     as a blob in a tree. So you might have refs/notes/acked-by:$SHA1 as
     a blob.

  2. one notes ref, with the notes tree pointing to a sub-tree that has
     named entries, one for each note type. So you might have
     refs/notes:$SHA1/acked-by as a blob.

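To make the two layouts concrete, here is a small Python sketch of the
object name you would resolve for a given commit and note type under each
approach. The function names are made up for illustration and none of
this is existing git code; it only builds the "ref:path" strings used in
the two examples above.

```python
def note_path_separate_refs(sha1, note_type):
    """Approach 1: one ref per note type; the note is a blob named by the commit.

    Hypothetical helper, not a git API.
    """
    return "refs/notes/%s:%s" % (note_type, sha1)


def note_path_single_ref(sha1, note_type):
    """Approach 2: one notes ref; each commit maps to a sub-tree of named notes.

    Hypothetical helper, not a git API.
    """
    return "refs/notes:%s/%s" % (sha1, note_type)


if __name__ == "__main__":
    sha1 = "$SHA1"  # placeholder, as in the examples above
    print(note_path_separate_refs(sha1, "acked-by"))
    print(note_path_single_ref(sha1, "acked-by"))
```
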
The advantage of '1' is that it keeps your different note types separate,
which means it is easy to distribute one type but not the other. The
advantage of '2' is that I do one lookup per commit, and then I can see
all of the notes, which keeps performance nice when you want to annotate
with several note types.

After some discussion, I think Dscho and I came to the conclusion that
supporting both might be desirable. And it should be pretty
straightforward. You can just have multiple note refs (but default to a
"main" one), and within each one, a commit's entry can point to either a
tree or a blob (we check which it is and handle it accordingly). And then
depending on which notes the user wants, they can refer to them
appropriately.

My suggestion for naming (and this wasn't discussed earlier, so Dscho has
not endorsed this) would be something like "$X:$Y", which would mean "to
get the notes for $SHA1, look at the tree in refs/notes/$X for the file
$SHA1/$Y". If $Y is empty, then expect $SHA1 to be a blob (if it's a
tree, maybe look at $SHA1/default). If "$X" is empty, then use
"refs/notes/default". If there is no colon, assume we have "$Y".

So you could have a bunch of notes in some "main" namespace just by
calling them some name; without a name, you get some "default" note. But
if you wanted a separate database (say, for SVN information), you could
use "svn:" or "svn:name".

-Peff
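
The "$X:$Y" naming rule described above could be sketched roughly like
this in Python. It is only an illustration of the rule as stated, not
code from any posted patch: parse_notes_spec() and note_object_name() are
hypothetical names, and the "if it's a tree, maybe look at
$SHA1/default" fallback is left out because it needs the object type.

```python
def parse_notes_spec(spec):
    """Split a "$X:$Y" spec into (notes ref, note name or None).

    Hypothetical helper illustrating the proposed rule.
    """
    if ":" in spec:
        x, y = spec.split(":", 1)
    else:
        # No colon: the whole spec is the note name ($Y).
        x, y = "", spec
    # Empty $X falls back to refs/notes/default.
    ref = "refs/notes/" + (x or "default")
    return ref, (y or None)


def note_object_name(spec, sha1):
    """Build the "ref:path" name to resolve for one commit."""
    ref, name = parse_notes_spec(spec)
    if name is None:
        # Empty $Y: expect $SHA1 itself to be a blob in the notes tree.
        return "%s:%s" % (ref, sha1)
    return "%s:%s/%s" % (ref, sha1, name)


if __name__ == "__main__":
    for spec in ("", "acked-by", "svn:", "svn:name"):
        print(repr(spec), "->", note_object_name(spec, "$SHA1"))
```

With these rules, "acked-by" and ":acked-by" resolve to the same note in
the default ref, while "svn:" selects the unnamed note in a separate
refs/notes/svn database.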