On Fri, 20 Dec 2019 at 10:56, Ed Maste <emaste@xxxxxxxxxxx> wrote:
>
> On Wed, 18 Dec 2019 at 19:57, Tom Clarkson <tqclarkson@xxxxxxxxxx> wrote:
> >
> > > Overall I think your proposed algorithm is reasonable (even though I
> > > think it won't address some of the cases in our repo). Will your
> > > algorithm allow us to pass $dir to git rev-list, for the initial
> > > split?
> >
> > Is this just for performance reasons? As I understand it that was
> > left out because it would exclude relevant commits on an existing
> > subtree, but it could make sense as an optimization for the first
> > split of a large repo.
>
> Yes, it's for performance reasons on a first split that I'd like to
> see it. On the FreeBSD repo the difference is some 40 minutes vs. a
> few seconds.

Following up on this old thread, I plan to revisit the optimization,
implementing something on top of your work in
https://github.com/gitgitgadget/git/pull/493.

I might look at adding a --initial flag to subtree split, having it
essentially auto-detect a revision to use as the value for --onto. For
the common case of an initial merge commit with two parents I think we
can relatively easily determine which is the subtree parent.

If that's not sufficiently general (or broadly useful outside of our
context) we could just create a helper script wrapping `subtree split`
tailored to the FreeBSD cases. We have something like 100 projects
we're looking to split, as part of our svn to git migration.
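
To illustrate the two-parent case, here is a minimal sketch of one way the subtree parent might be detected. It assumes the initial merge imported the subtree unmodified, so that the subtree parent's root tree is identical to the merge commit's tree at $dir. The name find_subtree_parent is hypothetical and not part of git-subtree; this is a heuristic, not the implementation proposed above.

```sh
# Hypothetical helper (not part of git-subtree).
# Prints the parent of merge commit $1 whose root tree equals the
# tree found at path $2 in the merge. Returns non-zero if none match.
find_subtree_parent () {
	merge=$1
	dir=$2

	# Tree object for $dir as recorded in the merge commit.
	subtree_tree=$(git rev-parse --verify --quiet "$merge:$dir") || return 1

	# "<rev>^@" expands to all parents of <rev>.
	for parent in $(git rev-parse "$merge^@")
	do
		# Assumption: the subtree was imported unmodified, so the
		# subtree parent's root tree matches the merge's $dir tree.
		if test "$(git rev-parse "$parent^{tree}")" = "$subtree_tree"
		then
			echo "$parent"
			return 0
		fi
	done
	return 1
}
```

Called as `find_subtree_parent "$merge" "$dir"`, this prints a candidate revision that could be passed as the value for --onto. If the imported tree was altered in the merge itself, no parent matches and the function returns non-zero, so a caller would need a fallback.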