Re: git filter-branch should run git gc --auto

On Jan 23, 2008, at 12:07 AM, Sam Vilain wrote:

> Kevin Ballard wrote:
>> I'm actually considering what the cost would be of switching macports
>> to git (not that it will ever happen - too many anonymous people pull
>> from svn trunk). Right now the svn trunk contains a subfolder for the
>> source code and another subfolder for all ~4400+ Portfiles. In such a
>> theoretical move, I'd want to split that up, probably into two
>> unrelated branches. Doing so would mean running git-filter-branch over
>> a linear commit history that's 31580 objects long, with a tree filter
>> to prune the dports directory away and a msg filter to remove the
>> svn-id stuff that git-svn left behind.

> You could have used git-svn --no-metadata :)

Sure, except I imported the svn repo with the intention of continuing to track it. I'm only floating the idea now of converting the upstream repo to git, but as I said before we have enough anonymous checkouts of people tracking trunk that we probably can't justify switching VCSs, especially when svn is now bundled on Leopard but git isn't.

> Using a commit filter to implement the pruning will be much faster;
> you'll need to make a temporary index, use git-read-tree, git-rm, then
> git-commit.  This way you avoid the expense of checking out the files
> just to delete them in your rewrite hook.

I suspect an index filter would be simpler, and that's really what I meant when I said tree filter.
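
For anyone following along, here is a rough sketch of what an index filter plus msg filter run could look like. It builds a throwaway repository first so it can be tried safely. The base/ directory, the example.invalid URL and the commit contents are invented for the demo; only the dports name comes from the discussion above, and this is not necessarily the exact invocation anyone in the thread used.

```shell
#!/bin/sh
# Sketch only: a throwaway repository so the rewrite can be tried safely.
# The base/ directory and the git-svn-id trailer are made up for the demo.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # silences the notice newer git prints

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.invalid"

mkdir base dports
echo 'source' > base/main.c
echo 'port' > dports/Portfile
git add .
git commit -q -m "Import tree

git-svn-id: https://svn.example.invalid/trunk@1 00000000-0000-0000-0000-000000000000"

# --index-filter edits the index directly, so no files are checked out
# for each rewritten commit.
# --msg-filter gets the commit message on stdin; sed drops the git-svn-id line.
git filter-branch \
    --index-filter 'git rm -r -q --cached --ignore-unmatch dports' \
    --msg-filter 'sed -e "/^git-svn-id:/d"' \
    -- --all >/dev/null 2>&1

remaining_files=$(git ls-tree -r --name-only HEAD)
remaining_msg=$(git log -1 --format=%B)
echo "$remaining_files"
```

The --ignore-unmatch flag matters on a real history: commits that predate the dports directory would otherwise make git rm fail and abort the rewrite.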

>> I'd also have to
>> figure out some way to remove the commit objects entirely that only
>> reference the dports directory.

> This can be done with a parent filter.

Good to know.
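
A note for later readers: git releases newer than this thread ship a --prune-empty option for git-filter-branch, which drops commits that no longer change anything after filtering. That is a different mechanism from the parent filter suggested above, so treat the following as a sketch under that assumption. The repository layout and commit messages are invented for the demo.

```shell
#!/bin/sh
# Sketch only: three commits, the middle one touching nothing but dports/.
# Assumes a git version that supports filter-branch --prune-empty.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # silences the notice newer git prints

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.invalid"

mkdir base dports
echo 'v1' > base/main.c
git add base
git commit -q -m "base: initial source"

echo 'port' > dports/Portfile
git add dports
git commit -q -m "dports: add a Portfile"

echo 'v2' > base/main.c
git add base
git commit -q -m "base: update source"

# Remove dports/ from every commit; --prune-empty then discards commits
# whose tree ends up identical to their parent's tree.
git filter-branch --prune-empty \
    --index-filter 'git rm -r -q --cached --ignore-unmatch dports' \
    -- --all >/dev/null 2>&1

commit_count=$(git rev-list --count HEAD)
subjects=$(git log --format=%s)
echo "$commit_count commits left"
```

In this demo the commit that only added dports/Portfile disappears from the rewritten history, while the two commits touching base/ remain.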

>> I'd suggest a patch to run git gc --auto, but it looks like you just
>> did in a subsequent email. As for your comments about the reflogs,
>> can't I disable recording those, at least temporarily? I'd rather
>> clean up after myself as I work rather than balloon the repository and
>> collapse it in a single operation at the end.

> Honestly, the optimisation I mention above will save you much more time.
> Note that you can run git-repack -d every half hour out of cron; it is
> safe and will let you clean up as you go.

That's a reasonable suggestion. And I'm still just thinking about this, so I have no idea if I'll ever actually have to run git-filter-branch on this massive history.
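
As a rough illustration of the cron idea, a small wrapper script could look like the one below. The function name repack_repo, the script name and both paths in the sample crontab line are placeholders, and the */30 step syntax assumes a cron implementation that supports it.

```shell
#!/bin/sh
# Hypothetical helper meant to be called from cron, for example with a
# crontab entry such as (both paths are placeholders):
#   */30 * * * * /path/to/repack-repo.sh /path/to/repo

repack_repo() {
    # Pack loose objects into a new pack; -d then removes packs and loose
    # object files made redundant by it. Unreachable objects are not pruned.
    git -C "$1" repack -d -q
}

# Only act when a repository path was passed on the command line.
if [ -n "${1:-}" ]; then
    repack_repo "$1"
fi
```

This only keeps the object store compact while the rewrite runs; it does not replace a final cleanup once the old refs and reflogs are gone.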

--
Kevin Ballard
http://kevin.sb.org
kevin@xxxxxx
http://www.tildesoft.com

