The following proposal is a check to see whether this approach is sane
and whether someone is already doing similar work. If not, I offer to
implement this solution.

THE PROBLEM

Say I have a master from which I have branched locally, and that this
private branch has four commits:

  a   b   c
  o---o---o
           \
            o---o---o---o
            1   2   3   4

I then decide to cherry-pick commit 3 onto master. Please believe that
my situation is such that I cannot immediately rebase the private branch
to drop the now-duplicated change. I end up with this:

  a   b   c   3'
  o---o---o---o
           \
            o---o---o---o
            1   2   3   4

Later, there is work on master which changes the same lines of code that
3' changed. The commit which changes them is e*:

  a   b   c   3'  d   e*  f
  o---o---o---o---o---o---o
           \
            o---o---o---o
            1   2   3   4

At a later date, I want to rebase the private branch onto master. What
will happen is that the changes in 3 will conflict with the rewritten
changes in e*. However, I'd like Git to know that 3 was already
incorporated at some earlier time, and *not consider it during the
rebase*, since it doesn't need to.

THE SOLUTION

For the purposes of this discussion, I'd like to define the term
"aggregate identity" (insert better name here) as a set including: a
commit's sha, and zero or more shas stored in a new field named
"Origin-Ids".

If, when cherry-picking, the originating commit's id is stored in the
Origin-Ids field of the cherry-picked commit, then rebase could know
whether a given commit's changes had already been applied. The logic
would look like this:

 1. When rebasing a branch A onto B, find the common ancestor of A
    and B.

 2. Examine every commit on B since that common ancestor, collecting a
    set of their aggregate identities.

 3. For each commit on A, ignore it if its aggregate identity occurs in
    that set.

This would cause commit 3 to be ignored during the rebase above, since
3' would have an origin id referring to 3.
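As a rough illustration of the three steps above, here is a minimal
Python sketch. It is not Git code: the Commit class, the origin_ids
attribute, and the function names are hypothetical stand-ins, and
"occurs in that set" is assumed to mean that any sha in the commit's
aggregate identity appears in the collected set. Finding the common
ancestor (step 1) is assumed to have been done already, so both inputs
are the commits made since that ancestor.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """Hypothetical stand-in for a commit; not Git's real object model."""
    sha: str
    origin_ids: frozenset = field(default_factory=frozenset)


def aggregate_identity(commit):
    """Return the commit's own sha plus every sha in its Origin-Ids field."""
    return {commit.sha} | set(commit.origin_ids)


def commits_to_replay(branch_commits, upstream_commits):
    """Return the commits of branch A that a rebase would still replay.

    Both arguments list the commits made since the common ancestor.
    A commit on A is dropped when its aggregate identity shares any sha
    with the identities collected from B (an assumed reading of step 3).
    """
    seen = set()
    for commit in upstream_commits:      # step 2: collect identities on B
        seen |= aggregate_identity(commit)
    # step 3: skip commits on A that are already represented on B
    return [c for c in branch_commits if not (aggregate_identity(c) & seen)]


# The situation from the diagrams: 3' on master records 3 as its origin.
master = [Commit("3prime", frozenset({"3"})), Commit("d"),
          Commit("e"), Commit("f")]
private = [Commit("1"), Commit("2"), Commit("3"), Commit("4")]
replay = [c.sha for c in commits_to_replay(private, master)]
```

With these inputs, replay holds 1, 2 and 4: commit 3 is skipped because
3' lists it as an origin, so it never gets the chance to conflict with
e*.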

IMPLEMENTATION

A few things need to be done:

 - Extend commit objects to have an Origin field, which can hold zero,
   one, or a list of hashes.

 - Add an option to git commit so that one or more origin ids can be
   specified at the time any commit is made. There may be occasions
   when it's useful to explicitly state that a new commit should
   somehow 'override' the contents of another during a rebase.

 - git cherry-pick and git am should add this Origin field, showing the
   commit their contents originated from.

 - git merge --squash would store the commit ids, and the origin ids,
   of every commit involved in the merge into the resulting commit's
   Origin field.

   Note that nothing can be done about rebasing a squashed merge commit
   onto another squashed merge commit, even though it could be detected
   that they had common changes. I don't believe it would even be
   useful to warn about this; the user would just have to resolve the
   conflicts manually.

 - git log could be extended to show the "parentage" (really, the
   aunt/uncle) of commits with origin info, assuming those origin
   commits are not dangling (which is OK, and likely to occur after the
   originating branch is deleted, or if the originating branch is in
   another repository).

   Where there are multiple Origin ids, a search could be done to find
   the set of most-descendant commits, so that history could be
   usefully shown after an octopus squash, for example.

QUESTIONS

Is it allowable to add new metadata fields to a commit, and would this
require bumping the repository version number? Or should this be
implemented by appending a header-style textual field at the end of the
commit message?

-- 
John Wiegley
BoostPro Computing
http://www.boostpro.com