Josh Triplett <josh@xxxxxxxxxxxxxxx> writes:

> Jamey Sharp and I wrote a script called git-split to accomplish this
> repository split. git-split reconstructs the history of a sub-project
> previously stored in a subdirectory of a larger repository. It
> constructs new commit objects based on the existing tree objects for the
> subtree in each commit, and discards commits which do not affect the
> history of the sub-project, as well as merges made unnecessary due to
> these discarded commits.

Very nicely done.

> We would like to acknowledge the work of the gobby team in creating a
> collaborative editor which greatly aided the development of git-split.

> from itertools import izip
> from subprocess import Popen, PIPE
> import os, sys

How recent a Python are we assuming here? Is late 2.4 recent enough?

> def walk(commits, new_commits, commit_hash, project):
>     commit = commits[commit_hash]
>     if not(commit.has_key("new_hash")):
>         tree = get_subtree(commit["tree"], project)
>         commit["new_tree"] = tree
>         if not tree:
>             raise Exception("Did not find project in tree for commit " + commit_hash)
>         new_parents = list(set([walk(commits, new_commits, parent, project)
>                                 for parent in commit["parents"]]))
>
>         new_hash = None
>         if len(new_parents) == 1:
>             new_hash = new_parents[0]
>         elif len(new_parents) == 2: # Check for unnecessary merge
>             if is_ancestor(new_commits, new_parents[0], new_parents[1]):
>                 new_hash = new_parents[0]
>             elif is_ancestor(new_commits, new_parents[1], new_parents[0]):
>                 new_hash = new_parents[1]
>         if new_hash and new_commits[new_hash]["new_tree"] != tree:
>             new_hash = None

This is a real gem. I really like reading well-written Python
programs.

When git-rev-list (or "git-log --pretty=raw", which you use in your
main()) simplifies the merge history based on a subtree, we look at
the merge, and if its tree matches that of any of the parents, we
discard the other parents and make the history a single strand of pearls.
However, for this application that is not what you want, so I can
see why you run the full "git-log" and prune the history by hand like
this. I wonder if using "git-log --full-history -- $project" to let
the core side omit commits that do not change $project (but still
give you all merged branches) would have made your job any easier?

You are handling grafts by hand because --pretty=raw is special in
that it displays the real parents (although traversal does use
grafts). Maybe it would have helped if we had a --pretty format
that is similar to raw but rewrites the parents?
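
For readers following along, the by-hand merge pruning discussed above
can be sketched roughly as follows. This is only an illustration written
for current Python 3, not the 2.4-era dialect of the quoted script. The
name is_ancestor and the "new_tree" key are borrowed from the quoted
code, but the body and argument order of is_ancestor here are my own
assumptions, and simplify_merge is a hypothetical helper that does not
appear in git-split.

```python
def is_ancestor(commits, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant` via parents.

    `commits` maps hash -> {"parents": [...], "new_tree": ...}.
    Assumption of this sketch: a commit counts as its own ancestor.
    """
    stack = [descendant]
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(commits[current]["parents"])
    return False


def simplify_merge(commits, parents, tree):
    """Return an existing commit that can stand in for a new one, or None.

    `parents` are the already-rewritten parent hashes; `tree` is the
    subtree hash of the commit being rewritten. Hypothetical helper.
    """
    parents = list(dict.fromkeys(parents))  # drop duplicates, keep order
    candidate = None
    if len(parents) == 1:
        candidate = parents[0]
    elif len(parents) == 2:
        first, second = parents
        # If one parent already contains the other, the merge adds no history.
        if is_ancestor(commits, second, first):
            candidate = first
        elif is_ancestor(commits, first, second):
            candidate = second
    # Reuse the candidate only if the subtree itself did not change.
    if candidate is not None and commits[candidate]["new_tree"] != tree:
        candidate = None
    return candidate


# Tiny made-up history: "b" and "c" both branch off "a".
example = {
    "a": {"parents": [], "new_tree": "t1"},
    "b": {"parents": ["a"], "new_tree": "t2"},
    "c": {"parents": ["a"], "new_tree": "t3"},
}
```

With this toy history, a merge of "b" and "a" whose subtree is still
"t2" collapses to "b", while a merge of "b" and "c" is kept as a real
merge because neither side contains the other.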