In message <4C9A66AF.5000302@xxxxxxxxx>, Artur Skawina writes:

    This started in a thread about locating dead topic branches

    Isn't that pretty easy to do?
    `git fsck --unreachable master | grep commits`?
    Post-processing that to assemble branches would seem to be
    fairly simple.

But yes, I wanted something completely different.  Something more
like: if a bug was introduced in commit X, what releases or branches
has it contaminated (or more positively, if a feature was introduced,
where was it made available).  The simple case is figuring out on
which branch a commit was originally made.

I was unhappy when I realized that another way code could get out was
through cherry-picks, and that there doesn't seem to be any
non-brute-force method to discover them (brute force meaning computing
checksums of patches for every patch in the tree).

    Two things make the above trivial history a bit more complicated.

    A) one side-branch can merge another, and build on top of changes
    that are not yet available on 'master'; the result can then appear
    in master via either one or both paths.  This is why showing when
    and how a change became visible on every side branch can be
    interesting.

Quite.  I encountered this a few different ways, and even when I fixed
it during the reverse parse, I failed to learn my lesson and it was a
problem during the forward parse.  I think the latest version is
fairly bullet-proof.

    B) when a side branch does not contain any new changes, but is
    made up to date wrt master, the resulting history could end up
    like this:

    m-> m -> m -> m -> m -> m -> m -> master
     \           /      \       /
      b -> b -> b        c -> c -> side-branch#1

    What happened was -- git "optimized" the simple merge away,
    turning it into a fast-forward, saving one merge commit, but
    losing the link connecting the 'c' and 'b' parts of
    'side-branch#1'.

    Do you (anybody) happen to know a public repo w/ history as above,
    i.e. w/ more than one long-lived branch that has seen some
    fast-forwards?  I wonder how reliable recovering the missing link
    would be...

I have a real (non-public, sorry) tree that did something approaching
this:

  ->m->m->m->m->m---------m
   /     /               /
  b->b->b->b->b------b->b->
   \     \     \    /
    t->t->t->t->t->t

However, due to fast-forwarding, it was turned into something like
this:

  ->m->m->m->m->m---------m
   /     /               /
  b->?->?->?->?------b->b->
   \     \     \    /
    t->t->t->t->t->t
    b  b  b  b  b  b

I don't think there is any way to figure out what happened given
git's available information.

I was just saying on #git a few hours ago, though, that I think git
needs a tree-anonymizing program.  As long as one does not go
overboard, it doesn't seem too difficult.  That probably means I just
have not thought about the problem hard enough.  Of course, it would
only replicate what is, not how you got there.

    And there's no reason why this operation should take ~20 minutes,
    even for the randomly chosen, but real, worst case.  But finding
    a good repo to test w/ would take longer than writing the code...

It only takes 8 seconds per test on the linux kernel, which all things
considered is rather fast.  The real problem is that each test is
treated independently.  If someone got the complete history of the
project and built a tree out of it, it would be extremely fast to run
additional tests, even ignoring the obvious optimizations of not
re-searching known paths.

The question is, will this functionality be needed often enough to
spend the time necessary to optimize it?

                                        -Seth Robertson
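
For what it's worth, the "build the history once, then run many tests"
idea above could look roughly like the sketch below.  This is only an
illustration, not the tool discussed in this thread: the commit names,
the `parents` map, the `refs` map and all three helper functions are
made up.  It assumes the whole commit graph fits in memory as a
commit-to-parents map (something like the output of
`git rev-list --parents --all` could be used to fill it).  It follows
ancestry only, so it will not notice cherry-picks.

```python
from collections import deque


def build_children(parents):
    """Invert a commit -> parents map into a commit -> children map."""
    children = {commit: [] for commit in parents}
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            children.setdefault(parent, []).append(commit)
    return children


def descendants(children, start):
    """Return every commit that has `start` in its ancestry (plus `start`)."""
    seen = {start}
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        for child in children.get(commit, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def refs_containing(children, refs, commit):
    """Names of the refs (branches, tags) whose tip contains `commit`."""
    reached = descendants(children, commit)
    return sorted(name for name, tip in refs.items() if tip in reached)


# Hypothetical toy history: a side branch 'b' merged into master at m3,
# then continued as c1; a topic branch t1 forked from b1.
parents = {
    "m1": [],
    "m2": ["m1"],
    "b1": ["m1"],
    "b2": ["b1"],
    "m3": ["m2", "b2"],  # merge commit
    "c1": ["m3"],
    "t1": ["b1"],
}
refs = {"master": "m3", "side-branch": "c1", "topic": "t1", "v1.0": "m2"}

# Built once; every later query reuses it.
children = build_children(parents)

print(refs_containing(children, refs, "b1"))
print(refs_containing(children, refs, "m2"))
```

The inverted map is the part that is paid for once.  After that, each
"where did commit X end up" question is a single walk over the
descendants of X rather than a fresh search from every branch tip.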