Junio C Hamano <junkio@xxxxxxx> wrote:
> Johan Herland <johan@xxxxxxxxxxx> writes:
> > Ok. But the reverse mapping will help with this, won't it?
> > We'll look up the interesting commits and find their associated
> > note objects directly.
>
> The issue Linus brought up worries me, too.
>
> The "efficient reverse mapping" is still handwaving at this
> stage. What it needs to do is an equivalent to your
> implementation with "refs/notes/<a dir per commit>/<note>". The
> "efficient" one might do a flat file that says "notee note" per
> line sorted by notee, or it might use BDB or sqlite, but the
> amount of the data and complexity of the look-up is really the
> same. A handful notes per each commit in the history (I think
> Linus's "Acked-by after the fact" example a very sensible thing
> to want from this subsystem).

Please, don't use BDB or sqlite. I really don't trust either.
I've lost data to both. I've *never* lost data to a Git packfile. ;-)

I'm actually thinking pack v4. OK, I know it's just a virtual
hand-waving thing still, but there's really no reason Nico and I
cannot get the damn thing finished before we both wind up buying
the farm.

What if we use a "slow" storage by "refs/notes/$objname/$notename",
and we also allow them to appear in the packed-refs file. But
during a repack we instead stick the annotations into the same
packfile as $objname, and we also include a list of $notename
after $objname's other data.

This way we have quick access to the $notename(s) of all notes of
$objname through the pack, and we can lazily go get the notes' raw
data if we need them. This isn't too different from what we do
with parent fields. We initialize the commit_list when we parse
the commit but we don't parse the parents until we really need them.

Once packed we delete the note ref (if loose), and during a repack
of the packed-refs file we delete the note if $notename exists in
the packfile.

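To make the not-yet-packed side of this concrete, here is a rough
sketch (not Git code) of cataloging note refs in memory by splitting
ref names of the form "refs/notes/$objname/$notename" as they appear
in a packed-refs file. The function names are made up for
illustration, and it assumes the usual packed-refs line shape of
"<sha1> <refname>", with "#" header lines and "^" peeled lines
skipped.

```python
from collections import defaultdict

NOTES_PREFIX = "refs/notes/"


def split_note_ref(refname):
    """Return (objname, notename) for a note ref, or None for any other ref."""
    if not refname.startswith(NOTES_PREFIX):
        return None
    objname, sep, notename = refname[len(NOTES_PREFIX):].partition("/")
    if not sep or not objname or not notename:
        return None
    return objname, notename


def catalog_packed_refs(lines):
    """Map objname -> sorted list of notenames found in packed-refs lines."""
    catalog = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blanks, the header comment, and peeled-tag lines ("^<sha1>").
        if not line or line[0] in "#^":
            continue
        # Each remaining line is "<sha1> <refname>"; only the name matters here.
        _value, _, refname = line.partition(" ")
        parsed = split_note_ref(refname)
        if parsed:
            objname, notename = parsed
            catalog[objname].append(notename)
    return {obj: sorted(names) for obj, names in catalog.items()}
```

The same split_note_ref() would apply to names obtained by walking
the loose refs/notes directory, since the target $objname is just the
first path component after the prefix.
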
If someone wants notes we can check to see if refs/notes exists;
if it does then we enumerate all refs and catalog the notes we
found in memory. Note search then works off the in-memory list
and off the packfiles. If refs/notes doesn't exist (and we should
delete it when we prune away those ref files or prune them out of
packed-refs) then we can skip the ref enumeration and just go
straight to the packfile(s).

Most notes will be in the packfiles. I think most people repack
often enough that the handful of unpacked notes before the next
repack won't be a major bottleneck. Especially since we can get
the target $objname directly from a readdir() call, or by splitting
the string in the packed-refs file.

From an object enumeration standpoint during packfile generation,
the notes for a given object are treated like the parent fields in
a commit; they come after the object itself, but unlike the parent
fields they are always output if the object itself was output.
(Hence an --objects-edge enumeration would include the notes only
if the commit itself had been included.)

Unfortunately that doesn't cover the case of a note being added
months later and needing to distribute it to clients that already
have the object the note is attached to.

I haven't been following this discussion very closely, but I'd
also like to suggest that if annotated tags are being used for
notes, the "tag <name>" field be left out of them. I don't see
why a note should be given a specific name that sits in a (roughly)
global namespace.

--
Shawn.