On Wed, Nov 9, 2011 at 18:26, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> - "git notes" is represented as a commit that records a tree that holds
>   the entire mapping from commit to its annotations, and the only way to
>   transfer it is to send it together with its history as a whole. It
>   does not have the nice auto-following property that transfers only the
>   relevant annotations.

True. However, consider these mitigating factors:

- The annotations in question (the "signing" of commits) are all
  intended to be merged eventually (i.e. there is no reason for a
  developer to sign, after the fact, a commit that will never end up in
  the public record). Therefore, most or all of the notes in the notes
  tree are already relevant, or will become relevant in the near future
  (when the associated commits are merged).

- Additionally, you could organize these notes into two (or more) notes
  trees, one for merged/official annotations, and one for
  unmerged/pending annotations. Then make the relevant tools (e.g. "git
  merge") transfer notes from one tree to the other, thereby making sure
  that the "official" record only contains notes that are relevant to
  the merged history.

- Finally, there's always "git notes prune" to purge annotations for
  commits that ended up never being merged.

My point is that although "notes" might end up transferring more
annotations than strictly necessary, I believe that in practice all the
notes being transferred are already (or will soon become) relevant.

> + "git notes" maps the commits to their annotations in the right direction;
>   the object name of an annotated object to its annotation.
>
> In the longer term, I think we would need to extend the system in the
> following way:
>
> - Introduce a mapping mechanism that can be locally used to map names of
>   the objects being annotated to names of other objects (most likely
>   blobs but there is nothing that fundamentally prevents you from
>   annotating a commit with a tree).
>   The current "git notes" might be a
>   perfectly suitable representation of this, or it may turn out to be
>   lacking (I haven't thought things through), but the important point is
>   that this "mapping store" is _local_. fsck, repack and prune need to be
>   told that objects that store the annotation are reachable from the
>   annotated objects.

IMHO this is precisely what "git notes" does today.

> - Introduce a protocol extension to transfer this mapping information for
>   objects being transferred in an efficient way. When "rev-list --objects
>   have..want" tells us that the receiving end (in either fetch/push
>   direction) would have an object at the end of the primary transfer
>   (note that I did not say "an object will be sent in this transfer
>   transaction"; "have" does not come into the picture), we make sure that
>   missing annotations attached to the object are also transferred, and the
>   new mapping is registered at the receiving end.
>
> The detailed design for the latter needs more thought. The auto-following
> of tags works even if nothing is being fetched in the primary transfer
> (i.e. "git fetch" && "git fetch" back to back to update our origin/master
> with the master at the origin) when a new tag is added to an ancient part
> of the history that leads to the master at the origin, but this is exactly
> because the sending end advertises all the available tags and the objects
> they point at so that we can tell which new tags added to an old object
> are missing from the receiving end. This obviously would not scale well
> when we have tens of thousands of objects to annotate. Perhaps an entry in
> the "mapping store" would record:
>
>  - The object name of the object being annotated;
>
>  - The object name of the annotation;
>
>  - The "timestamp", i.e. when the association between the above two was
>    made--this can be local to the repository and a simple counter would
>    do.
>
> and also maintain the last "timestamp" this repository sent annotations to
> the remote (one timestamp per remote repository). When we push, we would
> send annotations pertaining to the objects reachable from what we are
> pushing (not limited by what they already have, as the whole point of this
> exercise is to allow us to transfer annotations added to an object long
> after the object was created and sent to the remote) that are newer than
> that "timestamp". Similarly, when fetching, we would send the "timestamp"
> this repository last fetched annotations from the other end (which means
> we would need one such "timestamp" per remote repository) and let the
> remote side decide the set of new annotations they added since we last
> synched that are on objects reachable from what we "want".
>
> Or something like that.

You would also have to keep track of deleted annotations, to enable the
local side to delete an annotation corresponding to an already-deleted
annotation on the remote side.

Pretty soon, you end up having to record something similar to a DAG,
describing the history of manipulating these annotations.

At that point, your "timestamp" calculation starts to look very similar
to the "have..want" calculation already done when transferring "regular"
refs.

At which point you have a system that is very similar to what "git
notes" does today...


...Johan

--
Johan Herland, <johan@xxxxxxxxxxx>
www.herland.net
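
To illustrate the "mapping store" idea quoted above, together with the
deletion tracking it would need, here is a minimal Python sketch. It is
not how git stores or transfers anything; every name in it (Entry,
MappingStore, record, delete, to_send, mark_sent) is hypothetical, and
representing a deletion as an entry with no annotation is an assumption
made only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Entry:
    obj: str                   # name of the annotated object
    annotation: Optional[str]  # name of the annotation; None marks a deletion
    stamp: int                 # local counter value when this was recorded


@dataclass
class MappingStore:
    """Hypothetical local store; stamps are a simple per-repository counter."""
    counter: int = 0
    entries: List[Entry] = field(default_factory=list)
    last_sent: Dict[str, int] = field(default_factory=dict)  # remote -> stamp

    def record(self, obj: str, annotation: Optional[str]) -> None:
        # Every change, including a deletion, gets the next counter value.
        self.counter += 1
        self.entries.append(Entry(obj, annotation, self.counter))

    def delete(self, obj: str) -> None:
        # Deletions must be recorded too, or the other side never learns of them.
        self.record(obj, None)

    def to_send(self, remote: str, reachable: Set[str]) -> List[Entry]:
        # Entries newer than the last stamp sent to this remote, limited to
        # objects reachable from what is being pushed.
        since = self.last_sent.get(remote, 0)
        return [e for e in self.entries
                if e.stamp > since and e.obj in reachable]

    def mark_sent(self, remote: str) -> None:
        # Remember how far this remote has been brought up to date.
        self.last_sent[remote] = self.counter


store = MappingStore()
store.record("c1", "n1")
store.record("c2", "n2")
first = store.to_send("origin", {"c1", "c2"})
store.mark_sent("origin")
store.delete("c1")
second = store.to_send("origin", {"c1", "c2"})
```

In this sketch the second batch contains only the deletion record for
"c1". Once deletions are entries in their own right, the store is
effectively an ordered log of changes, which is the first step toward
the history-like structure described above.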