On Tue, Dec 16, 2008 at 02:15:47AM -0600, Govind Salinas wrote: > I was thinking about possible ideas for my little pet project and I > had and idea for way to tack on notes to a commit, or any object > really. I know that the idea has been flying around for a long time > but there has never been any implementation or a concept that people > liked enough to use (unless I have missed something). I think you look at the previous suggestions, because yours is very similar. Which is good, I think, because the current status is that the design is good, but nobody has gotten around to working on it yet. So maybe you can fix that. :) > .git/refs/notes contains a tree-id (assuming that using a tree-id > will not cause any problems, otherwise a commit object can be used. > it does not *need* a history, but it *could* have one). That is the same as the current proposal, except: - the proposal is to use a commit, so your notes are version-controlled - I have suggested supporting multiple note "bases" in the refs/notes namespace. This would allow you to share some notes but not others (e.g., if you had some automated notes related to a build/test system, you might not want to mix those with your human-written notes). > That tree has a structure similar to the layout of .git/objects, where > it is 2 letter subdirectories for the notes objects. I don't think this has been suggested yet, but I'm not sure it is a good idea. The usual reason for this split is that many filesystems handle large directories badly; that isn't a problem here. It does reduce the size of the resulting tree objects when a note is modified (we make updates to two smaller trees instead of one big tree). I don't know if this really matters all that much, since the trees will end up deltified in a pack anyway. And it does make the implementation slightly less simple, since we have to deal with two levels of trees. > Given a git object (commit, tree, blob, tag), use its sha as the > path/filename in this tree. > If I have a commit 1234567890123456789012345678901234567890 then > the notes tree will have a file > 12/34567890123456789012345678901234567890 > > That file has a list of sha1s (one per line). These shas are object > IDs for blobs that have the notes or whatever that you want attached > to the item. This is slightly different than the current proposal. You are proposing that each item have a "list of notes". My thinking was to have "named notes" using a tree for each entry full of blobs. So you could look up the "foo" note for a given commit, but that note would be a single scalar (which could, of course, be interpreted according to its name). > I think you get the idea. When looking up an item, it should be > fairly easy to have the notes tree and subtrees available for doing > lookups. And as far as I know stuff under .git/refs can be It is easy, but it's slow because we have to do a linear search in the (potentially huge) notes tree. And that's what held up the initial implementation. I did a proof-of-concept a month or so ago that pre-seeded an in-memory hash using the tree contents and got pretty reasonable performance. > pushed/pulled even if its not under heads or remotes or tags using > already existing machinery. I am not sure, but I think that would > satisfy gc operations as well. Also, these trees and blobs never have > to be put in the working directory. Right, though I think one of the benefits of this approach is that you _could_ do a checkout on the notes tree if you wanted to do very flexible editing. > Does this sound like something that is workable? I thought it might > appeal since it uses only features that are already present. Yes, it sounds workable, though if you diverge from what has already been discussed, I think you should make an argument about why your approach is better. > This could be extended so that you have different sets of notes under > .git/refs/notes/<my note set> or whatever. So that you can have some > notes you keep private and some that you publish or whatever. Oops, I should have read your whole mail. Yes, that is a good idea. :) For reference, here are the previous discussions that I think are relevant: Johan Herland's original notes proposal (which I think is largely dead, replaced by the one below): http://thread.gmane.org/gmane.comp.version-control.git/46770 Johannes Schindelin's notes proposal (which is more or less the current proposal, but I think the on-disk notes index was not well liked): http://thread.gmane.org/gmane.comp.version-control.git/52598 My test with using a hash to speed it up: http://article.gmane.org/gmane.comp.version-control.git/99415 Some discussion of the interaction of notes and rebase: http://thread.gmane.org/gmane.comp.version-control.git/100533 Some thoughts from me on naming issues: http://article.gmane.org/gmane.comp.version-control.git/100402 Some thoughts from me on the tree speedup: http://article.gmane.org/gmane.comp.version-control.git/101460 which I think should bring you up to speed. :) -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html